The file systems of Windows
As explained in the article on filesystem basics, operating systems tend to limit the list of file systems they are compatible with. In the case of the Microsoft Windows family, the choice is usually made between two major FS types: NTFS, the primary format that most modern versions of this OS use by default, and FAT, which was inherited from the old MS-DOS and has exFAT as its later extension. ReFS was also introduced by Microsoft as a new-generation format for server computers, starting with Windows Server 2012. HPFS, developed by Microsoft together with IBM, can be found only on extremely old machines running Windows NT up to version 3.51. Read on to learn about these formats in more detail and find out how they compare with each other.
FAT (short for File Allocation Table) is one of the simplest FS types. It has been around since the 1980s and traces its roots back to Microsoft’s old MS-DOS operating system. It therefore comes as no surprise that FAT was originally designed with low-capacity storage in mind.
As its name suggests, this file system is actually based on a table that acts as an index for its content. The overall FS structure is arranged into three separate areas:
The boot sector;
The File Allocation Table (FAT);
The data storage area.
The boot sector is the very first sector in any partition formatted with FAT, which contains important information about its organization.
Next comes the primary File Allocation Table (FAT), followed by its backup copy, which can be used if a problem occurs when reading the original one.
The majority of the partition belongs to the data storage area, which is divided into clusters. A cluster consists of adjacent sectors and is used as the minimal unit for the allocation of files. Its size is fixed for a given volume, but can range from 512 bytes to 64 kilobytes, depending on the volume’s size and FAT version. A file, no matter how small, occupies at least one whole cluster, and the unused remainder of its last cluster gets wasted. When multiple clusters are required for a file, they may be allocated in a consecutive chain or scattered all over the volume, resulting in the file’s fragmentation.
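The wasted space described above is simple arithmetic. The following minimal sketch illustrates it; the function names are hypothetical, and the 4096-byte cluster size is only an example value.

```python
def allocated_size(file_size: int, cluster_size: int) -> int:
    """Return the bytes reserved on disk for a file of `file_size` bytes."""
    # Ceiling division: a partly filled last cluster still counts as a whole one.
    clusters = -(-file_size // cluster_size)
    return clusters * cluster_size


def slack_space(file_size: int, cluster_size: int) -> int:
    """Return the bytes left unused in the file's last cluster."""
    return allocated_size(file_size, cluster_size) - file_size


if __name__ == "__main__":
    # Example cluster size of 4096 bytes (an assumption, not a fixed FAT value).
    for size in (1, 4096, 5000):
        print(size, allocated_size(size, 4096), slack_space(size, 4096))
```

With 4096-byte clusters, a 5000-byte file needs two clusters (8192 bytes), so 3192 bytes stay unused, while a 1-byte file wastes 4095 bytes.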
Each cluster has an associated entry in the File Allocation Table. A zero value in it denotes that the cluster is currently unused, whereas a non-zero one may point to the next cluster of the same file or a special indicator for its end.
Directories, just like files, exist in the data storage area. They are composed of 32-byte-long directory entries, each of which describes a file (or a subdirectory) stored in this directory. Besides the file’s name, size and other attributes, the directory entry contains the number of the first cluster of the file. Consequently, it is possible to discover where the necessary file begins by referring to the corresponding directory entry, and every subsequent cluster can be found through the File Allocation Table by using it as a linked list.
In the course of time, FAT has undergone several revisions. The original version was followed by FAT12, next came FAT16 and, finally, FAT32. The numbers in their names stand for the number of bits in a single table entry: 12 bits in FAT12, 16 bits in FAT16, and 32 bits in FAT32 (of which only 28 are actually used for the cluster number).
FAT12 and FAT16 were used on old floppy disks and are rarely encountered nowadays. In contrast, FAT32 is still widely used, mainly due to its broad compatibility. It can be accessed by almost any operating system, including macOS and Linux, which makes it a good choice for portable devices, like memory cards and USB sticks. The format is also supported by smartphones, digital cameras, video recorders, gaming consoles and other gadgets.
However, the built-in formatting tool of Windows does not create FAT32 volumes larger than 32 GB. For this reason, FAT32 can be used on Windows-compatible external storage devices or disk partitions of up to 32 GB when they are formatted with the standard means of this OS, or of up to 2 TB when other tools are employed to format the storage. The file system also doesn't allow creating files of 4 GB or larger.
To overcome these limits, exFAT (Extended File Allocation Table) was introduced. It has no practical limitations on the volume or file size and is frequently used on external hard drives, SSDs, larger USB thumb drives, etc. Yet, the underlying FAT technology has already become outdated and has a lot of restrictions that make it unsuitable as the main file system in modern computing environments.
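The lookup described above can be sketched as follows. This is a simplified illustration rather than a real FAT driver: the table is a plain Python list, the directory is a hypothetical dictionary mapping a name to the first cluster, and the end-of-chain range is the one used by FAT16. Bad-cluster markers are not handled.

```python
FREE = 0x0000              # entry value of an unused cluster
END_OF_CHAIN_MIN = 0xFFF8  # FAT16 treats 0xFFF8-0xFFFF as "last cluster of the file"


def cluster_chain(fat: list[int], first_cluster: int) -> list[int]:
    """Follow the FAT like a linked list and return the file's clusters in order."""
    chain: list[int] = []
    visited: set[int] = set()
    cluster = first_cluster
    while True:
        if cluster in visited:
            raise ValueError(f"loop in FAT at cluster {cluster}")
        entry = fat[cluster]
        if entry == FREE:
            raise ValueError(f"chain runs into free cluster {cluster}")
        chain.append(cluster)
        visited.add(cluster)
        if entry >= END_OF_CHAIN_MIN:
            return chain          # this was the last cluster of the file
        cluster = entry           # otherwise the entry is the next cluster number


# Toy table: entries 0 and 1 are reserved, the file occupies clusters 2 -> 5 -> 3.
fat = [0xFFF8, 0xFFFF, 5, 0xFFFF, 0, 3, 0, 0]
directory = {"REPORT.TXT": 2}     # hypothetical directory: name -> first cluster

if __name__ == "__main__":
    print(cluster_chain(fat, directory["REPORT.TXT"]))
```

The clusters of this example file are not adjacent (2, 5, 3), which is exactly what a fragmented file looks like in the table.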
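The 4 GB and 2 TB figures follow from the width of the fields that store them. The sketch below assumes that both the file size and the total sector count of the volume are kept in 32-bit fields and that a sector is 512 bytes long; the constant and function names are hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed)

# Largest value that fits into a 32-bit file size field: 4 GiB minus one byte.
MAX_FILE_SIZE = 2**32 - 1

# Upper bound for the volume: a 32-bit sector count multiplied by the sector size.
MAX_VOLUME_SIZE = 2**32 * SECTOR_SIZE


def fits_on_fat32(file_size: int) -> bool:
    """Check whether a file of `file_size` bytes can be stored on FAT32."""
    return 0 <= file_size <= MAX_FILE_SIZE


if __name__ == "__main__":
    print(MAX_VOLUME_SIZE // 1024**4, "TiB")   # volume bound in tebibytes
    print(fits_on_fat32(4 * 1024**3))          # a file of exactly 4 GiB
```

Under these assumptions the volume bound comes out at 2 TiB, and a file of exactly 4 GiB is already one byte too large.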
NTFS (New Technology File System) was introduced in 1993 with Windows NT and is currently the most common file system for end-user computers running Windows. The operating systems of the Windows Server line use this format as well.
NTFS is a significant improvement over FAT in numerous aspects. It is quite reliable thanks to its journaling capabilities and supports many features, including access control, encryption and file compression. It also uses more advanced data structures that enable better utilization of storage space and make it far less prone to fragmentation. The entire filesystem relies on several service files:
The $Boot file;
The $MFT file (Master File Table);
The $Bitmap file;
The $LogFile and others.
The $Boot file takes part in the booting process and contains many important FS parameters.
The Master File Table has an entry for each and every file in the filesystem. Each entry is made up of records called attributes, which can hold all sorts of information, from the file’s name, size, permissions and creation/last modification times to the actual data content. When this content is too large to fit into the MFT entry (which is 1024 bytes in size), NTFS allocates clusters for it outside the MFT and stores pointers to their locations in the entry. Other attributes may also be too large for the MFT entry, for example, long file names. Such attributes then get separate clusters as well.
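The decision between keeping the content inside the MFT entry and moving it to separate clusters can be illustrated with a small sketch. The function, its parameters and the 4096-byte cluster size are hypothetical, and the space taken by other attributes is simply passed in; the real threshold depends on what else is stored in the entry.

```python
MFT_ENTRY_SIZE = 1024  # bytes, as stated in the text


def place_content(content_size: int,
                  used_by_other_attributes: int,
                  cluster_size: int = 4096) -> tuple[str, int]:
    """Return where the content would go and how many clusters it needs."""
    free_in_entry = MFT_ENTRY_SIZE - used_by_other_attributes
    if content_size <= free_in_entry:
        # Small content stays inside the MFT entry itself.
        return ("resident", 0)
    # Larger content moves to clusters outside the MFT (ceiling division).
    clusters = -(-content_size // cluster_size)
    return ("non-resident", clusters)


if __name__ == "__main__":
    print(place_content(200, 400))
    print(place_content(10_000, 400))
```

Under these assumptions, a 200-byte file stays inside its entry, whereas a 10,000-byte file gets three clusters elsewhere on the volume.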
The clusters are usually allocated in sequences referred to as extents. NTFS always attempts to place the content into a single extent. Yet, if contiguous clusters are not available, it creates a new extent somewhere else, dividing a file into fragments.
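The allocation strategy above can be sketched roughly as follows. This is a toy allocator, not the actual NTFS algorithm: the free-space map is a list of 0/1 values, an extent is a hypothetical (start, length) pair, and the first sufficiently large free run is taken.

```python
def free_runs(bitmap: list[int]) -> list[tuple[int, int]]:
    """Return all runs of free clusters as (start, length) pairs."""
    runs, start = [], None
    for i, bit in enumerate(bitmap):
        if bit == 0 and start is None:
            start = i
        elif bit == 1 and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(bitmap) - start))
    return runs


def allocate(bitmap: list[int], needed: int) -> list[tuple[int, int]]:
    """Allocate `needed` clusters, preferring one extent over several."""
    runs = free_runs(bitmap)
    for start, length in runs:
        if length >= needed:
            extents = [(start, needed)]      # the content fits into one extent
            break
    else:
        extents, remaining = [], needed      # no run is big enough: fragment
        for start, length in runs:
            take = min(length, remaining)
            extents.append((start, take))
            remaining -= take
            if remaining == 0:
                break
        if remaining:
            raise OSError("not enough free clusters")
    for start, length in extents:            # mark the chosen clusters as occupied
        for cluster in range(start, start + length):
            bitmap[cluster] = 1
    return extents


if __name__ == "__main__":
    print(allocate([1, 0, 0, 1, 0, 0, 0, 1], 3))  # one extent is enough
    print(allocate([1, 0, 0, 1, 0, 0, 0, 1], 4))  # must be split into two
```

In the first call the three clusters fit into the free run starting at cluster 4; in the second call no run of four exists, so the file ends up in two extents.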
Directories in NTFS are stored as files, but instead of typical data content, such files keep lists of file names and references identifying those files.
The $Bitmap file keeps track of the status of clusters. Each bit in it represents one cluster and has the value of 1 when the cluster is occupied or 0 when the cluster is free.
Before altering any of its crucial structures, NTFS records these changes to the $LogFile. Such a journal makes it possible to restore these structures to a consistent state if a crash interrupts their update. When an error is encountered during normal operation, NTFS identifies the faulty cluster, records it in the $BadClus file and copies the data to another location.
In view of its feature-rich and effective organization, NTFS is well suited for internal use in Windows computers. On the other hand, devices like memory cards or USB flash drives may need a more lightweight filesystem that would remain accessible outside the Windows-only environment.
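Reading such a bitmap boils down to a little bit arithmetic. The sketch below assumes that cluster 0 corresponds to the least significant bit of the first byte; the function names are hypothetical.

```python
def is_allocated(bitmap: bytes, cluster: int) -> bool:
    """Return True if the bit for `cluster` is set (cluster occupied)."""
    # Assumption: bits are numbered from the least significant bit of each byte.
    byte_index, bit_index = divmod(cluster, 8)
    return bool((bitmap[byte_index] >> bit_index) & 1)


def count_free(bitmap: bytes, total_clusters: int) -> int:
    """Count the clusters whose bit is 0."""
    return sum(not is_allocated(bitmap, c) for c in range(total_clusters))


if __name__ == "__main__":
    # 16 clusters: clusters 0, 2 and 15 are occupied, the rest are free.
    sample = bytes([0b00000101, 0b10000000])
    print([c for c in range(16) if is_allocated(sample, c)])
    print(count_free(sample, 16))
```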
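The general idea of journaling can be shown with a toy example. It is a heavily simplified, redo-only sketch and not the actual $LogFile format: the class, its fields and the simulated crash flag are all hypothetical.

```python
class JournaledStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self) -> None:
        self.data: dict[str, int] = {}   # stands in for the on-disk structures
        self.log: list[dict] = []        # stands in for the journal

    def update(self, key: str, value: int, crash_before_apply: bool = False) -> None:
        record = {"key": key, "new": value, "applied": False}
        self.log.append(record)          # 1. describe the change in the journal
        if crash_before_apply:
            return                       # simulated crash: change never applied
        self.data[key] = value           # 2. apply the change
        record["applied"] = True         # 3. mark the journal record as done

    def recover(self) -> int:
        """Redo every logged change that was not applied; return their count."""
        redone = 0
        for record in self.log:
            if not record["applied"]:
                self.data[record["key"]] = record["new"]
                record["applied"] = True
                redone += 1
        return redone


if __name__ == "__main__":
    store = JournaledStore()
    store.update("a", 1)
    store.update("b", 2, crash_before_apply=True)
    print(store.data, store.recover(), store.data)
```

Because the intended change is written to the journal first, the interrupted update of "b" can be completed after the simulated crash instead of leaving the store half-updated.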
ReFS (Resilient File System) is Microsoft's newest file system, released with Windows Server 2012 and later added to Windows 8.1. It is now also available in Windows 11.
ReFS was designed to address certain shortcomings of NTFS, particularly with respect to data corruption. It has a much higher tolerance to failures thanks to the Copy-on-Write (CoW) mechanism. When existing metadata needs to be edited, ReFS does not overwrite it in place. Instead, it writes the modified copy to another area on the storage medium and then links this copy to the corresponding file. As a result, older copies remain stored in different locations, which makes it easier to restore the filesystem integrity and prevent data loss. ReFS also employs checksums that allow it to promptly detect possible data corruption.
The architecture of ReFS differs fundamentally from that of other Windows formats. It employs B+-trees as a common on-disk structure to represent both metadata and the data of files. Such a tree is composed of a root, internal nodes and leaves. Each internal node holds an ordered list of keys together with pointers to the nodes of the lower level, down to the leaves.
Such a design makes ReFS well suited to large storage and high-availability systems. But despite its clear advantages, it is not yet as mature as NTFS and is not supported as widely across Windows-based devices.
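A rough sketch of how Copy-on-Write and checksums work together is given below. It is only an illustration of the principle: the append-only block list, the pointer table and the use of CRC-32 from Python's zlib module are assumptions made for this example and do not reflect the actual on-disk format or checksum algorithm of ReFS.

```python
import zlib


class CowStore:
    """Toy store that never overwrites a block in place."""

    def __init__(self) -> None:
        self.blocks: list[tuple[int, bytes]] = []  # (checksum, payload), append-only
        self.pointers: dict[str, int] = {}         # name -> index of current block

    def write(self, name: str, payload: bytes) -> int:
        # The new version goes to a fresh location; the old block stays intact.
        self.blocks.append((zlib.crc32(payload), payload))
        self.pointers[name] = len(self.blocks) - 1  # relink to the new copy
        return self.pointers[name]

    def read(self, name: str) -> bytes:
        checksum, payload = self.blocks[self.pointers[name]]
        if zlib.crc32(payload) != checksum:
            raise OSError(f"checksum mismatch for {name!r}")
        return payload


if __name__ == "__main__":
    store = CowStore()
    store.write("meta", b"v1")
    store.write("meta", b"v2")
    print(store.read("meta"), store.blocks[0][1])  # current and preserved old copy
```

If the newest block turns out to be damaged, the checksum comparison reveals it on reading, and the pointer can be switched back to an older, still intact copy.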
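A lookup in such a tree can be sketched as follows. The node classes and the sample keys are hypothetical, the tree is built by hand, and insertion and node splitting are left out, so this is only an illustration of how a search descends from the root to a leaf.

```python
from bisect import bisect_left, bisect_right


class Leaf:
    def __init__(self, keys: list[int], values: list[str]) -> None:
        self.keys, self.values = keys, values


class Internal:
    def __init__(self, keys: list[int], children: list) -> None:
        # children[i] covers keys below keys[i]; the last child covers the rest.
        self.keys, self.children = keys, children


def search(node, key: int):
    """Descend from the root to a leaf and return the value stored for `key`."""
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


# Hand-built two-level tree: a root with three leaves.
root = Internal(
    [7, 12],
    [Leaf([1, 4], ["a", "b"]), Leaf([7, 9], ["c", "d"]), Leaf([12, 15], ["e", "f"])],
)

if __name__ == "__main__":
    print(search(root, 9), search(root, 5))
```

Since the keys in every node are ordered, each step only needs a binary search within one node, which keeps lookups fast even in very large trees.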
HPFS (High Performance File System) was created by Microsoft in cooperation with IBM and brought to the market with OS/2 1.20 in 1989 as a file system for servers that could provide much better performance when compared to FAT.
In contrast to FAT, which simply allocates the first free cluster on the disk for a file fragment, HPFS seeks to arrange files in contiguous blocks, or at least ensure that their fragments (referred to as extents) are placed as close to each other as possible.
At the beginning of HPFS, there are three control blocks occupying 18 sectors: the boot block, the super block and the spare block.
The remaining storage space is divided into bands, groups of contiguous sectors that take 8 MB each. A band has its own sector allocation bitmap showing which sectors in it are occupied (1 – taken, 0 – free).
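The band layout lends itself to simple arithmetic. The sketch below assumes 512-byte sectors and, for simplicity, counts bands from sector 0 without accounting for the control blocks at the beginning; the names are hypothetical.

```python
SECTOR_SIZE = 512                            # bytes per sector (assumed)
BAND_SIZE = 8 * 1024 * 1024                  # 8 MB per band, as stated in the text
SECTORS_PER_BAND = BAND_SIZE // SECTOR_SIZE  # sectors covered by one band
BITMAP_BYTES = SECTORS_PER_BAND // 8         # one bit per sector in the band bitmap


def locate(sector: int) -> tuple[int, int]:
    """Return (band number, bit position inside that band's bitmap)."""
    # Simplifying assumption: band 0 starts at sector 0.
    return divmod(sector, SECTORS_PER_BAND)


if __name__ == "__main__":
    print(SECTORS_PER_BAND, BITMAP_BYTES, locate(40_000))
```

Under these assumptions one band covers 16,384 sectors, so its bitmap takes 2048 bytes, and sector 40,000 falls into band 2 at bit 7232.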
Each file and directory has its own F-Node located close to it on the disk – this structure contains the information about the location of a file and its extended attributes. A special directory band located in the center of the disk is used for storing directories, while the directory structure itself is a balanced tree with alphabetical entries.
Nevertheless, HPFS had significant limitations and eventually became obsolete. Native support for it was removed from Windows starting with NT 4.
Hint: Information on the data recovery prospects of the FS types used by Windows can be found in the articles on data recovery specificities of different OS and chances for data recovery. For detailed instructions and recommendations, please read the manual devoted to data recovery from Windows.
Last update: April 19, 2023