File Systems: FAT, FAT16, FAT32

Now FAT32 solves this problem by removing the limit of 65,535 clusters per disk. FAT32 uses 32-bit numbers, that is, numbers with 32 binary digits, to count clusters. That allows it to count much higher. (In practice only 28 of those 32 bits carry the cluster number; the top four are reserved.) And since it can handle a far bigger number of clusters, its cluster size on bigger disks is much smaller than that of FAT16. In fact, FAT32’s maximum disk size is 2 terabytes.

To get this number, you take the total number of sectors addressable (and I do mean sectors), which would be 2^32 - 1, and multiply that by 512 bytes per sector. That’s a whopping 2048 gigabytes (near enough), or 2 terabytes.

At this point, some of you may be scratching your heads trying to figure out the inconsistencies in my explanation. The first item to address is that even though the file system accesses the sectors by a cluster count first, that still doesn’t remove the need to number the sectors individually. Even in FAT16, the sectors are numbered. And that leads to the second concern some of you may have. Since FAT16 uses 16-bit numbers, doesn’t that mean that there can be only 2^16 - 1 sectors? Wouldn’t that translate into 32 megs? Yes. You are right. But unknown to most is the fact that since DOS 4.0, the underlying sector numbering had already been changed to a 32-bit value! The limit placed on the disk size was purely due to the 16-bit numbering of the clusters and the limit on the number of sectors in each cluster, as discussed above.
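If you want to check that arithmetic yourself, here is a minimal Python sketch. The function and constant names are my own, made up for illustration, and it assumes the 512-byte sectors used in the text.

```python
# Sketch of the disk-size arithmetic; names are illustrative only.
BYTES_PER_SECTOR = 512  # assumed sector size, as in the text


def max_volume_bytes(sector_bits: int, bytes_per_sector: int = BYTES_PER_SECTOR) -> int:
    """Largest volume addressable with sector numbers of the given width."""
    max_sectors = 2 ** sector_bits - 1  # the same count the text uses
    return max_sectors * bytes_per_sector


limit_32bit = max_volume_bytes(32)
limit_16bit = max_volume_bytes(16)

print(limit_32bit / 2 ** 30)  # size in gigabytes (2^30 bytes each), a hair under 2048
print(limit_16bit / 2 ** 20)  # size in megabytes (2^20 bytes each), a hair under 32
```

Both results land just below the round figures because the count stops at 2^n - 1 rather than 2^n.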

OK, so we know what sectors and clusters are. But how does that get translated into files? That is where the File Allocation Table comes in. The FAT, together with the directory entries that point into it, is a huge database that records where each file is on the disk. In fact, it would not be too much of a stretch to think of it as a table with several columns that each record something about the files on the drive. Each of these records (strictly speaking a directory entry, although I will loosely call it a record in the FAT) takes up 32 bytes of space. In other words, if I had 100 files on the computer, it would take the system roughly 3200 bytes to record all of that information. Just for fun, let’s take a look at what is stored in these 32 bytes:

Byte Range    Info Stored
1 to 8        Filename
9 to 11       Extension
12            Attributes (e.g. read-only, archive, hidden)
13 to 22      Reserved bytes for later features
23 to 24      Time Written
25 to 26      Date Written
27 to 28      Starting cluster
29 to 32      File Size
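
To make the layout concrete, here is a rough Python sketch that unpacks one 32-byte record with the standard struct module. It assumes the usual on-disk order (name, extension, attributes, reserved bytes, time, date, starting cluster, size) stored little-endian. The sample record and the parse_entry helper are invented for illustration; they are not taken from any real tool.

```python
import struct

# Little-endian layout: 8-byte name, 3-byte extension, 1 attribute byte,
# 10 reserved bytes, then time, date, starting cluster (2 bytes each), 4-byte size.
ENTRY_FORMAT = "<8s3sB10sHHHI"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # should come out to 32


def parse_entry(raw: bytes) -> dict:
    """Split one raw 32-byte record into named fields."""
    name, ext, attrs, _reserved, time, date, cluster, size = struct.unpack(ENTRY_FORMAT, raw)
    return {
        "name": name.decode("ascii").rstrip(),
        "ext": ext.decode("ascii").rstrip(),
        "read_only": bool(attrs & 0x01),
        "hidden": bool(attrs & 0x02),
        "archive": bool(attrs & 0x20),
        "time": time,            # left as the raw packed value
        "date": date,            # left as the raw packed value
        "start_cluster": cluster,
        "size": size,
    }


# A made-up record for README.TXT: archive bit set, starts at cluster 5, 1234 bytes long.
sample = struct.pack(ENTRY_FORMAT, b"README  ", b"TXT", 0x20, bytes(10), 0, 0, 5, 1234)
entry = parse_entry(sample)
print(entry["name"], entry["ext"], entry["start_cluster"], entry["size"])
```

Names shorter than eight characters are padded with spaces on disk, which is why the sketch strips trailing spaces after decoding.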

Interesting list, isn’t it? Some of the entries are self-explanatory. But there are two that are rather interesting. The first thing to look at is the Starting Cluster field. Some of you may have been wondering how the system translates cluster and sector indices into filenames and such. The answer is that each file’s directory entry has a field that gives the first cluster of the file. The system reads that entry, finds the starting cluster, and starts reading the file there. Now the question is how does the system know when to stop reading? Furthermore, even before that, how does the system know where to read next after this cluster? The answer lies in the allocation table itself: for every cluster on the disk, the FAT holds one entry that gives the number of the next cluster containing data from the same file. So the computer reads the current cluster and looks up its FAT entry to see whether another cluster follows. If there is one, it skips to that cluster, reads it, and checks for the next one. This process repeats until it reaches an entry that holds a special end-of-chain marker instead of a cluster number. The CS majors reading this will recognize this as a linked list implementation.
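
For the non-CS majors, here is a toy Python version of that walk. It models the next-cluster pointers as a lookup table called fat (on a real FAT volume they are kept in the allocation table, one slot per cluster). The single END_OF_CHAIN value is a simplifying assumption of mine, and the cluster numbers are made up.

```python
END_OF_CHAIN = 0xFFFF  # assumed single marker; real FAT16 reserves a small range of values


def cluster_chain(fat: dict, start: int) -> list:
    """Return the clusters that make up a file, in reading order."""
    chain = []
    current = start
    while current != END_OF_CHAIN:
        if current in chain:  # guard against a corrupted, looping chain
            raise ValueError("loop in cluster chain")
        chain.append(current)
        current = fat[current]  # look up which cluster comes next
    return chain


# Toy table: a file that starts at cluster 2 and occupies clusters 2, 3, and 7.
toy_fat = {2: 3, 3: 7, 7: END_OF_CHAIN}
print(cluster_chain(toy_fat, 2))  # [2, 3, 7]
```

Notice that the clusters of one file do not have to sit next to each other on disk, which is exactly how fragmentation comes about.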

The other interesting feature of this table is that each directory entry uses 4 bytes to store the size of the file. This may not seem like much at first. But what it actually tells you is the maximum size possible for any single file. The fact that we use 4 bytes to store a file size tells us that the largest size that can be recorded is a 32-bit number (recall that there are 8 bits per byte). So what is the largest 32-bit number? That would be 2^32 - 1. So a file can have a maximum of 2^32 - 1 bytes, or just under 4 gigabytes. This calculation is obviously done under the assumption that we are using FAT32.
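
You can watch the 4-byte limit in action with a few lines of Python. This is only an illustration using the standard struct module; the helper name fits_in_size_field is made up.

```python
import struct

MAX_FILE_SIZE = 2 ** 32 - 1  # largest value a 4-byte unsigned field can hold

packed = struct.pack("<I", MAX_FILE_SIZE)  # fits exactly: four 0xFF bytes
print(len(packed), packed.hex())


def fits_in_size_field(size: int) -> bool:
    """True if the size can be stored in a 4-byte unsigned field."""
    try:
        struct.pack("<I", size)
        return True
    except struct.error:
        return False


print(fits_in_size_field(MAX_FILE_SIZE))      # True
print(fits_in_size_field(MAX_FILE_SIZE + 1))  # False: one byte too many
```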

The last two fields I’d like to take a look at are the filename field and the reserved bytes field. The interesting thing about the filename field is that DOS uses it to perform undelete. When you erase a file in DOS, you aren’t actually erasing the file. All you are doing is changing the first letter of the filename field into a special character (the byte value 0xE5). As far as the file system is concerned, the file isn’t there anymore, and the next time a file is written to those clusters, the old data is overwritten. The way DOS performs an undelete is to simply change that first letter back to something else. That is why, when you used undelete in DOS, it always asked for the first letter of the filename before it could restore the file. Mystery solved.
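
Here is a small Python sketch of that trick, working on a 32-byte record held in memory rather than on a real disk. The function names are invented, and the sketch deliberately ignores everything else a real undelete has to worry about, such as whether the file’s clusters have been reused.

```python
DELETED_MARKER = 0xE5  # byte written over the first character of a deleted name


def mark_deleted(entry: bytes) -> bytes:
    """Return a copy of the record with its first name byte replaced by the marker."""
    return bytes([DELETED_MARKER]) + entry[1:]


def is_deleted(entry: bytes) -> bool:
    return entry[0] == DELETED_MARKER


def undelete(entry: bytes, first_letter: str) -> bytes:
    """Put a user-supplied first letter back, as the DOS prompt asked you to."""
    if not is_deleted(entry):
        raise ValueError("entry is not marked as deleted")
    if len(first_letter) != 1:
        raise ValueError("exactly one character is needed")
    return first_letter.upper().encode("ascii") + entry[1:]


original = b"PAPER   DOC" + bytes(21)  # 11 name bytes + 21 zeroed bytes = 32
erased = mark_deleted(original)
restored = undelete(erased, "p")
print(is_deleted(erased), restored[:11])
```

Everything after the first byte survives the delete untouched, which is why the rest of the name never had to be typed in again.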

Now let me just make a quick mention of the reserved bytes. They didn’t do much in FAT16, but they became rather useful in FAT32. Since FAT32’s cluster numbering uses 32-bit numbers instead of the 16-bit numbers of FAT16, the system needed two extra bytes to hold the upper half of the starting cluster number. Those two bytes (bytes 21 and 22 of the entry) were taken out of the reserved field. NTFS, on the other hand, does not use FAT directory entries at all; things like compression attributes and security information are kept in NTFS’s own structures.
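
Putting the two halves of a FAT32 starting cluster back together is a one-liner. This sketch assumes the split just described, with the upper 16 bits stored in the formerly reserved bytes and the lower 16 bits in the original Starting Cluster field. The function names and sample values are made up.

```python
def fat32_start_cluster(high_word: int, low_word: int) -> int:
    """Combine the two 16-bit halves into one cluster number."""
    return (high_word << 16) | low_word


def split_cluster(cluster: int) -> tuple:
    """Inverse operation: split a cluster number into (high, low) 16-bit halves."""
    return (cluster >> 16) & 0xFFFF, cluster & 0xFFFF


print(fat32_start_cluster(0x0000, 0x0005))       # 5, same as it would be on FAT16
print(hex(fat32_start_cluster(0x0012, 0x3456)))  # 0x123456
```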

Before I move on, I’d like to point out a few of the other differences between FAT16 and FAT32. In FAT32, the root directory is no longer fixed in size. What this means is that you can have as many files and directories in the root of C:\ as you’d like. In the days of FAT16, the root directory had a fixed maximum, typically 512 directory entries on a hard disk. That means that if you had normal filenames of 8 letters + 3 for the extension, you had a maximum of 512 directories + files. That may seem like more than you’d need to put in the root directory. And it probably is, if you had 8.3 filenames. But in Win95, the system can support long filenames. The trick is that Win95 combines multiple directory entries to support long filenames, with each extra entry holding up to 13 characters of the name. So consider a file that’s named “My English Paper”. That is 16 characters long, so it takes 2 directory entries at least. Actually, it takes 3 directory entries: 2 for the long filename, and another one for the short 8.3 filename to stay compatible with DOS and Win3.1. As you can see, long filenames can quickly deplete directory entries.
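
The counting rule is easy to sketch in Python. It assumes each long-filename entry carries up to 13 characters and that every long name also gets one ordinary 8.3 entry; the function name is made up, and the second filename is just an invented example.

```python
import math

CHARS_PER_LFN_ENTRY = 13  # characters carried by one long-filename entry (assumed)


def entries_needed(long_name: str) -> int:
    """Directory entries used by a name that does not fit the 8.3 form."""
    lfn_entries = math.ceil(len(long_name) / CHARS_PER_LFN_ENTRY)
    return lfn_entries + 1  # plus the short 8.3 entry kept for DOS and Win3.1


print(entries_needed("My English Paper"))                  # 3
print(entries_needed("Quarterly Budget Report 1997.xls"))  # 4
```

A plain 8.3 name such as README.TXT still costs only a single entry, so this sketch applies only to names that need the long form.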

Another nice feature is that FAT32 has better FAT redundancy. Both FAT32 and FAT16 store two copies of the file allocation table on disk. But traditionally, the file system only read from one of them. In FAT32, the system could choose to read from either one, which provides a better failsafe for freak accidents involving corrupt file tables.

It is apparent that FAT32 is a superior file system to FAT16. Unfortunately, FAT32 is not supported by every operating system. The original version of Windows 95 couldn’t read FAT32. It wasn’t until version B (OSR2) that Win95 gained that ability. And all versions of WinNT before 5.0 (released as Windows 2000, or Win2K for short) could not read FAT32 drives either.

– Xin Li –

Page 1: Introduction to file systems

Page 2: This Page

Page 3: New Technology File System (NTFS)

Page 4: File Auditing / Data Recovery / Conclusion
