Full Hard Drives and Drive Corruption
I recently read a disturbing note by a technician from Micromat
(the makers of TechTool
Pro) regarding a potential drive corruption issue on Macintosh HFS
and HFS+ volumes. In summary, if your hard drive (or a hard drive partition)
is very full (>70% on HFS and >85% on HFS+ volumes) and is very
severely fragmented, you run the risk of serious file corruption if you
add any more files - ouch! However, before you get too worried, realize
that the conditions under which this can occur are pretty extreme and
require a huge amount of disk fragmentation, so it is really quite unlikely
that you’ll ever run into this. See my detailed technical explanation
below.
To make a long story short, the moral here is to avoid filling your hard
drive partitions to capacity (luckily hard drives are cheap these days!)
and use a disk defragmenting utility on a fairly regular basis. Defragmenting
tools include Micromat’s Drive10, Symantec’s Norton Utilities and Alsoft’s
PlusOptimizer, which is included when you purchase the excellent DiskWarrior
on CD. All are now compatible with OS X volumes, but Norton Utilities and
PlusOptimizer still need to be run while booted into OS 9. At this point,
if you have OS X v10.2.2 (either client or Server) and have file-system
journaling enabled (it is turned off by default), then running a
defragmentation tool is not recommended (see the companies’ respective tech
support departments for more information on this).
After discussing this issue with a friend who’s a PC guru, I can
reassure you Windows users out there that this specific technical issue
cannot occur on PC hard drive file-systems. However, from a performance
perspective, there are many good reasons why you should keep a decent
amount of free space on your hard drives and keep them defragmented as
well.
What the heck is actually going on...
(the long, technical version!)
The technical reason why this HFS/HFS+ corruption can occur requires
a little background on how files are actually kept track of on a Macintosh
disk drive. Each hard drive volume (complete drive or partition) has a
Catalog B-tree (a type of database) of the files that are stored
on the drive - images, documents, fonts, system files, every file on your
hard drive has an entry in the Catalog B-tree database. The operating
system uses this database to keep track of where these files are physically
located on the drive volume. For example, when you double-click on a file
to open it, behind the scenes, Mac OS is looking up the Catalog B-tree
entry for the file you clicked in order to physically locate its data
on the drive. If the file is fragmented, things get a little more complicated...
In order to understand what file fragmentation is, imagine the following
scenario: Let's say you have a hard drive that is totally full - there
is no more free space on it. You decide to free up some space because
you have a 10 MB image you want to save on that drive. You go ahead and
delete ten individual 1 MB files that you don't need anymore from that
drive. Now there is enough room to store your 10 MB image. Well, those
10 files that you deleted were scattered all over your hard drive. Now,
when you go and copy the 10 MB image over, the OS has to split it up into
10 different 1 MB chunks in order to save it to the drive. No problem,
the OS handles all this seamlessly for you - you never know that it's
happened. All you know is that now your image has been successfully copied
to the hard drive and everything is fine. Or is it...?
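For the curious, here is a minimal Python sketch of that scenario. It is only a toy model and not how Mac OS actually allocates space: the 40 MB volume size, the one-list-slot-per-1-MB-block layout and the names free_runs and store are all assumptions I made up for illustration.

```python
def free_runs(disk):
    """Return (start, length) for each run of contiguous free blocks (None)."""
    runs, start = [], None
    for i, owner in enumerate(disk):
        if owner is None and start is None:
            start = i
        elif owner is not None and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(disk) - start))
    return runs


def store(disk, name, size):
    """Write `size` blocks for `name` into the free runs, in disk order.

    Returns the list of (start, length) fragments that were used.
    """
    runs = free_runs(disk)
    if sum(length for _, length in runs) < size:
        raise OSError("disk full")
    fragments, remaining = [], size
    for start, length in runs:
        if remaining == 0:
            break
        used = min(length, remaining)
        disk[start:start + used] = [name] * used
        fragments.append((start, used))
        remaining -= used
    return fragments


# A completely full (hypothetical) 40 MB volume: forty 1 MB files.
disk = [f"file{i}" for i in range(40)]

# Delete ten 1 MB files scattered across the volume (every fourth block).
for i in range(0, 40, 4):
    disk[i] = None

fragments = store(disk, "image", 10)
print(len(fragments))  # 10 separate 1 MB fragments for one 10 MB file
```

Had the ten deleted files been sitting next to each other, free_runs would have returned a single 10-block hole and the image would have been stored in one piece.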
In this scenario, what would actually happen is that the positions of
the first 3 chunks, or fragments of data, for this image will be kept
track of by the Catalog B-tree. Since a Catalog B-tree entry is designed to
only handle up to a maximum of 3 fragments per file (that is the HFS limit;
an HFS+ entry can hold up to 8), another database
is called into play to handle any additional fragments - the Extents
B-tree. When the Extents B-tree database starts getting very large,
as it would on a severely fragmented drive, this file corruption issue
can arise. What came as a surprise to me was that apparently the Macintosh
operating system takes some error-checking shortcuts - potentially disastrous
ones - when this Extents B-tree grows large and the hard drive starts
reaching capacity! Let me explain...
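To make the bookkeeping concrete, here is a small Python sketch of how a file's fragments would be divided between the two databases. The limit of 3 is the figure used in this article, and the function name split_extents and the sample fragment list are hypothetical, made up purely for illustration.

```python
INLINE_LIMIT = 3  # fragments tracked in the Catalog B-tree entry (article's figure)


def split_extents(fragments, inline_limit=INLINE_LIMIT):
    """Split a file's fragments into catalog entries and overflow entries."""
    inline = fragments[:inline_limit]     # tracked in the Catalog B-tree
    overflow = fragments[inline_limit:]   # tracked in the Extents B-tree
    return inline, overflow


# Ten made-up 1 MB fragments, given as (start block, length in blocks).
fragments = [(i * 4, 1) for i in range(10)]

inline, overflow = split_extents(fragments)
print(len(inline), len(overflow))  # 3 7
```

A file with 3 fragments or fewer never touches the Extents B-tree at all: its overflow list comes back empty.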
Imagine, in our little scenario, that those 10 files we deleted from
the drive were not fragmented. This means that they would not
have had entries in the Extents B-tree since all the data for each file
was being kept track of in the Catalog B-tree. Now we come along and copy
this 10 MB image over and suddenly the OS has to keep track of 7 additional
fragments in the Extents B-tree - remember, the first 3 are being tracked
in the Catalog B-tree. Well, our hard drive is now full, so there is no
more room for the Extents B-tree to grow. So, rather than report an error
and say the disk is full, the Macintosh OS simply overwrites 7 other entries
in the Extents B-tree with the data locations for this 10 MB image that
you've just copied to the drive! So effectively, the system has just clobbered
some Extents B-tree data that it would need to find fragments for some
other files elsewhere on the hard drive! So, the next time you try to
open some other file, you might suddenly find that it's corrupt, even
though it opened fine the last time around. Or, maybe even worse, if a
critical system file gets clobbered, you may not even be able to boot
your computer anymore! Not a very pretty picture, is it?
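Here is a toy Python model of that failure, contrasted with the "report an error" behaviour you would expect. To be clear, this is not Apple's code and I have no idea what the real implementation looks like: the OverflowTable class, its fixed capacity of 7 entries and the overwrite-the-oldest policy are all assumptions invented for illustration.

```python
class OverflowTable:
    """Toy fixed-capacity stand-in for the Extents B-tree (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []  # list of (file_name, (start, length))

    def add_checked(self, name, extents):
        """Refuse the write when the table cannot hold the new entries."""
        if len(self.entries) + len(extents) > self.capacity:
            raise OSError("disk full: no room for extent records")
        self.entries.extend((name, extent) for extent in extents)

    def add_unchecked(self, name, extents):
        """Model the failure: when out of room, overwrite the oldest entries."""
        for extent in extents:
            if len(self.entries) >= self.capacity:
                self.entries.pop(0)  # another file's record is silently lost
            self.entries.append((name, extent))

    def extents_for(self, name):
        return [extent for owner, extent in self.entries if owner == name]


# The table is already full with 7 records belonging to some other file.
table = OverflowTable(capacity=7)
table.add_checked("old_file", [(100 + i, 1) for i in range(7)])

# Copying the image adds 7 more records with no room left for them.
table.add_unchecked("image", [(i * 4, 1) for i in range(7)])

print(table.extents_for("old_file"))  # [] - its fragment locations are gone
```

Calling add_checked for the image instead would have raised the "disk full" error and left the records for old_file untouched.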
So why aren't we constantly having file corruption problems? Well, I
oversimplified the above scenario, but I think it serves to illustrate the
point. Realistically, it is rare that a file will have more than 3 fragments
and the Catalog B-tree, which stores the location of these first fragments,
seems to be implemented in a more robust fashion than the Extents B-tree.
Also, since the amount of data needed to keep track of fragment locations
is generally tiny in comparison to the actual data being stored, there
is usually plenty of room for the Extents B-tree database to grow to whatever
size is needed. Problems will only occur with a severely fragmented drive
where the Extents B-tree has grown to an enormous size and the hard drive
has only many small "holes" scattered around the drive in which
to store fragments for newly added files.
Because of all the variables involved, it is nearly impossible to say
when exactly a problem such as this could occur. That is why it is not
recommended to fill an HFS+ disk to more than 85% capacity since we generally
don't keep track of how fragmented our hard drives are from day to day.
If a drive has absolutely no fragmentation at all, then you could likely
fill it to 99% capacity or more without encountering any problems whatsoever.
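If you would rather have the computer keep an eye on this for you, here is a minimal Python sketch that compares a volume's usage against the fill limits quoted in this article (70% for HFS, 85% for HFS+). The function names over_limit and check_volume are my own, and the script assumes you tell it which file-system the volume uses since it does not detect that itself. It says nothing about fragmentation, which is the other half of the problem.

```python
import shutil

# Fill limits quoted in this article, as fractions of total capacity.
LIMITS = {"HFS": 0.70, "HFS+": 0.85}


def over_limit(used, total, filesystem="HFS+"):
    """Return True when used/total exceeds the limit for the file-system."""
    return used / total > LIMITS[filesystem]


def check_volume(path, filesystem="HFS+"):
    """Check the volume that contains `path` against the article's limit."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return over_limit(usage.used, usage.total, filesystem)


if __name__ == "__main__":
    # "/" is just an example path; point this at any mounted volume.
    if check_volume("/"):
        print("Volume is past the recommended fill level")
    else:
        print("Volume is within the recommended fill level")
```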
Why did Apple design things this way, you might be asking? Well, that's
a good question and believe me, I've been asking myself this as well!
I can only imagine that this software design decision, compromise,
design flaw or whatever you want to call it, exists because somewhere
along the line, many years ago, a software engineer decided that the possibility
of this kind of file corruption was too remote and the speed benefits
of performing less error checking on Extents B-tree operations were too
appealing to pass up. Of course now, in order to retain full backward
compatibility, it is likely difficult for Apple to engineer out this little
problem without losing the ability to work with older HFS or HFS+ volumes.
Hopefully they will find a way to address this issue in the future, however...
On a positive note, I have read that under OS X, a freshly formatted
drive has more room preallocated for its Catalog and Extents B-tree database
files, so this problem may be less likely to occur.
Again, you shouldn't lose too much sleep over all of this. Just make
sure that you are not filling your hard drive volumes to capacity and
consider investing in one of the defragmentation utilities mentioned at
the beginning of this article. Just don't start de-fragging your drives
every other day - it really isn't all that necessary! Play it safe, back up
your data frequently and you shouldn't have any problems.
Mike Mander
Beau Photo Supplies
November 2002