[wplug] File systems and Defrag

Bill Moran wmoran at potentialtech.com
Thu Sep 16 18:10:07 EDT 2004


"Teodorski, Chris" <cteodorski at mahoningcountyoh.gov> wrote:

> After a coworker read the article about Sun's new file system ZFS
> (http://www.sun.com/2004-0914/feature/) a conversation about file
> systems started.  The meat of the conversation revolved around why
> Microsoft's OSes all have file systems that need a defragmenter and how come
> *nix file systems do not.  From what I've read it is not so much that
> *nix file systems do not fragment, but that they are virtually
> unaffected by this fragmentation.  Can anyone shed any additional light
> on this?  The information I've found online seems to contradict itself.

Believe it or not, this has been a subject of interest and independent
study of mine for a while now.

First off, I understand Unix-like filesystems better than I do MS-based,
so I may be off a bit on how MS filesystems work.  Secondly, my study
has been of BSD's FFS, which (to my understanding) is _very_ similar to
ext2fs ... 

FFS is based on locality.  Data is intentionally spread across the disk, but
always kept close to related data (i.e. a directory is always kept close
to the data of the files it contains, whereas an unrelated directory is
stored somewhere else on disk).
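
To make that concrete, here is a toy sketch in Python (this is not real
FFS code; the group count and the round-robin policy are just made up
for illustration):

# Toy model of FFS-style locality: the disk is split into "cylinder
# groups", new directories are spread out across groups, and a file's
# data is placed in the same group as its parent directory.
NUM_GROUPS = 8                    # hypothetical number of cylinder groups

class ToyFS:
    def __init__(self):
        self.next_group = 0
        self.dir_group = {}       # directory path -> cylinder group

    def create_dir(self, path):
        # Spread directories across the disk (round-robin here; real
        # FFS picks a group with free space and few directories).
        self.dir_group[path] = self.next_group
        self.next_group = (self.next_group + 1) % NUM_GROUPS

    def create_file(self, dirpath, name):
        # Keep file data close to its directory.
        group = self.dir_group[dirpath]
        print("%s/%s -> cylinder group %d" % (dirpath, name, group))

fs = ToyFS()
fs.create_dir("/home/alice")
fs.create_dir("/var/log")
fs.create_file("/home/alice", "notes.txt")   # lands in alice's group
fs.create_file("/var/log", "messages")       # lands in a different group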

Additionally, a _shitload_ of research at Berkeley demonstrated that small
files are more common than large files, and that large files are seldom
accessed all at once.  As a result, files are stored contiguously up to
a certain size, then intentionally fragmented.  This has a number of
effects.  First, the disk space near a directory entry keeps room free,
so new files added to that directory can be stored close to the
directory data ... thus, when you read a directory to find a file, then
read the file, the head doesn't have to do a long seek.  Second, when a
large file is created, the initial part of the file is close by.  If
the file is a program, it's likely that the on-demand pager won't need
the whole thing right away, so the fact that the whole file isn't in
one spot isn't a problem.  Third, if the file grows, there is a
predictable pattern to how the new blocks get laid out on the disk.
Finally, if a small file is added to the directory alongside the large
file, there will still be space close to the directory to store the new
file's data.
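
Roughly, that allocation policy looks like the following sketch (the
96K threshold and the function name are mine; real FFS keys this off a
blocks-per-cylinder-group limit rather than a fixed byte count):

BLOCK_SIZE = 8 * 1024        # classic FFS block size
CONTIG_LIMIT = 96 * 1024     # hypothetical "keep this much local" limit

def plan_layout(file_size):
    """Decide where each 8K block of a new file should go."""
    layout = []
    offset = 0
    while offset < file_size:
        if offset < CONTIG_LIMIT:
            # Early blocks stay near the directory, effectively contiguous.
            where = "near the directory"
        else:
            # Later blocks are deliberately pushed to other cylinder
            # groups so small files can still be created close by.
            where = "another cylinder group"
        layout.append((offset, where))
        offset += BLOCK_SIZE
    return layout

for off, where in plan_layout(120 * 1024):
    print("offset %4dK -> %s" % (off // 1024, where))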

Additionally, FFS allocates space in big blocks that can be broken into
smaller pieces (fragments) on demand.  As a result, when a big file
does fragment, it breaks into (for example) 8K pieces, which the
disk can retrieve pretty quickly, since it's reading 8K at a time.  NTFS
et al. use 512-byte clusters, so a big file can turn into a real mess if
it fragments, causing seeks all over the disk.  Also, since
NTFS tries to keep everything contiguous all the time, a large file
can separate other files that are normally accessed together, again
causing lots of disk seeks.  Think of it this way: the worst case for
an 800K file on FFS is about 100 long seeks, whereas a maximally
fragmented 800K file on NTFS can run over 1,500 long seeks.  Thus,
fragmentation on FFS doesn't cause as much performance degradation as
is possible on NTFS.
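
The arithmetic behind those numbers is easy to check, using the 8K and
512-byte figures from above (worst case means every piece lands far
from the one before it, so each piece costs one long seek):

FILE_SIZE = 800 * 1024       # the 800K file in the example

ffs_fragment = 8 * 1024      # FFS breaks big files into 8K pieces
ntfs_cluster = 512           # the small-cluster case described above

ffs_pieces  = FILE_SIZE // ffs_fragment   # 100 pieces
ntfs_pieces = FILE_SIZE // ntfs_cluster   # 1600 pieces

print("FFS  worst case: about %d long seeks" % ffs_pieces)   # ~100
print("NTFS worst case: up to %d long seeks" % ntfs_pieces)  # ~1600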

Also, in practice, 100% contiguous files don't really improve disk
performance noticeably.  Since 100% contiguous files are difficult to
maintain, keeping them that way just becomes a management burden.  This
may not be intuitive, but the research at Berkeley determined that the
work involved in keeping files 100% contiguous was not justified by the
performance improvement, and they came up with a better scheme.

It boils down to this:
Tons of research into disk speed has identified which considerations
are important, and which are negligible.  Fact is, certain types of
fragmentation cause negligible performance degradation, while other
types of fragmentation cause major performance degradation.  FFS is
carefully designed to avoid the very bad kinds by deliberately
fragmenting in a manner that doesn't cause noticeable performance
problems, and that keeps the data on the disk organized efficiently.
The FFS code itself optimizes disk layout
during normal use.  It's kind of a Zen thing, I think ... if you
resist something 100%, it will overcome you, but if you flow with
it, you can manage it.
Microsoft grabbed a crappy FAT filesystem and threw it together to
be easy to use, then they upgraded it without much thought (into
NTFS) and never really thought through how to make it self-maintaining.
As a result, NTFS does not optimize disk layout to any real degree, so
a third-party program (a defragmenter) has to be run regularly to
maintain some semblance of an optimized layout.

Why Microsoft chose this route when there was a crapload of published
literature on all the research that was done at Berkeley is beyond me.
It's possible that nobody at Microsoft is smart enough to understand
the work that McKusick did, but that's just my theory.

-- 
Bill Moran
Potential Technologies
http://www.potentialtech.com

