The basics of Apple’s Fusion Drive: data organization and recovery principles
Seeking a compromise that offers better performance than a traditional HDD while keeping the cost per gigabyte relatively low, Apple Inc. launched Fusion Drive: a combination of a solid state drive and a platter-based hard drive intended to take the best of both worlds, i.e. modern flash and ordinary spinning disk storage. However, the substantial performance boost it provides is sometimes accompanied by issues that result in data loss and the need for recovery. This article explains the peculiarities of the technology and suggests some techniques that may be applied to recover data lost from it.
What is Fusion Drive?
Fusion Drive was first introduced in late 2012 on Macs running OS X Mountain Lion. It is now supported by two of Apple's desktop computers, the iMac and the Mac Mini, running OS X 10.8 and later.
Fusion Drive is Apple's intelligent automatic data management system that integrates two different storage media, a traditional rotating HDD and non-volatile flash storage in the form of an SSD, which function as a single logical unit and are presented to the end user in Finder as a single volume.
Similarly to RAID 0, in which data is split across the drives of the array, data on such storage is spanned across its two constituents, with one key difference: storage space is dynamically reallocated according to how frequently data is used in order to achieve maximum performance. The most frequently accessed files, together with system files, reside on the faster flash drive, while rarely used ones are moved to the more capacious HDD. As a result, the system boots faster and regularly used applications launch more quickly.
In essence, the technology is a purely software-based implementation of automated storage tiering, a concept in which data migrates between connected storage devices of different types according to performance and capacity requirements. The only software component it relies on is a logical volume manager called CoreStorage. CoreStorage serves as an extra layer of abstraction between macOS and the Mac's drives and partitions, which are arranged into Logical Volume Groups instead of being handed over to the operating system directly. It allows spanned volumes to be created, and Fusion Drive is fundamentally a Logical Volume Group consisting of a hard disk drive and a solid state drive.
As already stated, the system is composed of two individual drives: a hard disk drive and a solid state drive. The total capacity of such storage equals the sum of the capacities of both disks. A typical configuration looks as follows:
/dev/disk0 – a physical SSD incorporated into a Logical Volume Group;
/dev/disk1 – a physical HDD incorporated into a Logical Volume Group;
/dev/disk2 – a logical volume which includes both disk0 and disk1.
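As a rough illustration of this layout, the following Python sketch models the two physical disks and the logical volume built on top of them. The class names and the 128 GB and 1000 GB capacities are assumptions chosen for the example; they are not read from a real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalDisk:
    device: str        # device node, e.g. "/dev/disk0"
    kind: str          # "SSD" or "HDD"
    capacity_gb: int   # illustrative capacity in gigabytes


@dataclass(frozen=True)
class LogicalVolumeGroup:
    device: str        # device node of the resulting logical volume
    members: tuple     # the PhysicalDisk objects that make up the group

    @property
    def capacity_gb(self) -> int:
        # Data is spanned, not mirrored, so capacities simply add up.
        return sum(disk.capacity_gb for disk in self.members)


# Assumed example sizes: a 128 GB SSD plus a 1000 GB HDD.
ssd = PhysicalDisk("/dev/disk0", "SSD", 128)
hdd = PhysicalDisk("/dev/disk1", "HDD", 1000)
fusion = LogicalVolumeGroup("/dev/disk2", (ssd, hdd))

print(fusion.device, fusion.capacity_gb, "GB")  # /dev/disk2 1128 GB
```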
Both disk0 and disk1 consist of at least three partitions: a small EFI service partition at the beginning of the disk, a large Fusion Drive data partition in the middle and a macOS system configuration partition at the end.
The data partition occupies up to 99% of the disk space and usually starts at sector 409,640. This is the only partition dedicated to Fusion Drive. It also stores all the metadata needed to assemble the whole system and read its data correctly. Three major areas of metadata can be singled out:
The Encrypted blocks area is found at the end of the data partition and contains the encrypted metadata necessary for data interpretation. The metadata on disk0 and disk1 is encrypted with different keys and the contents of the two copies do not coincide completely, but one copy is enough for correct data reconstruction.
The Volume header area is located in the first (zero) and the last sectors of the data partition. It stores the UUID of the volume and the UUID of the Logical Volume Group it belongs to, the size of the volume, the encryption keys for the blocks in the Encrypted blocks area and the location of the copies of the Disk label.
The Disk label area contains the Volume Descriptor, which stores the location of the encrypted blocks and various information about the Logical Volume Group in XML, including its UUID (which corresponds to the value in the Volume header), its name and the list of volumes it consists of.
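The relationships between these metadata areas can be sketched in Python as follows. This is a simplified model, not the real on-disk format: the class and field names are hypothetical, the 512-byte sector size is an assumption, the UUID strings are placeholders, and the list of volumes is assumed to be a list of volume UUIDs.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512            # assumption: 512-byte logical sectors
DATA_START_SECTOR = 409_640  # usual first sector of the data partition


@dataclass
class VolumeHeader:          # hypothetical, simplified model
    volume_uuid: str         # UUID of this data partition
    group_uuid: str          # UUID of the Logical Volume Group
    volume_size: int         # size of the volume in bytes


@dataclass
class DiskLabel:             # hypothetical, simplified model
    group_uuid: str
    group_name: str
    member_uuids: list       # assumed: volumes are listed by UUID


def header_sectors(start_sector: int, sector_count: int) -> tuple:
    """Absolute sectors of the two Volume header copies:
    the first (zero) and the last sector of the data partition."""
    return start_sector, start_sector + sector_count - 1


def belongs_to_group(header: VolumeHeader, label: DiskLabel) -> bool:
    """Check that the group UUID in the header matches the Disk label
    and that the volume is listed among the group's members."""
    return (header.group_uuid == label.group_uuid
            and header.volume_uuid in label.member_uuids)


# Byte offset of the data partition under the 512-byte assumption.
print(DATA_START_SECTOR * SECTOR_SIZE)  # 209735680

# Placeholder identifiers, for illustration only.
header = VolumeHeader("ssd-volume", "group-1", 1000 * SECTOR_SIZE)
label = DiskLabel("group-1", "Macintosh HD", ["ssd-volume", "hdd-volume"])
print(belongs_to_group(header, label))  # True
```

A check of this kind reflects why both drives have to be identified as members of the same Logical Volume Group before the storage can be assembled.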
All user data is first written to the solid state drive (disk0) until it becomes almost full; a "buffer area" of about 4 GB is kept free for incoming files. After that, the system starts filling up the HDD (disk1), while infrequently accessed items are transferred from the flash drive to the magnetic drive and frequently used ones are moved to the SSD. Data movement between disk0 and disk1 is performed during idle periods in block chains (one block is 128 KB in size, and the number of block chains can reach several million) and depends solely on the data access patterns tracked by CoreStorage: if rarely used data stored on the HDD starts being accessed often, it is migrated to the SSD.
It should also be mentioned that "fusion" in this case is not a synonym for "hybrid": hybrid drive architectures rely on data caching instead, in which information is primarily stored on the HDD element and only some algorithmically determined portions of it are copied to the flash memory to enhance performance.
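The placement and migration behaviour described above can be illustrated with a toy Python model. It is only a sketch under stated assumptions: the class `FusionSim`, its methods and the simple "most accessed blocks first" ranking are hypothetical and do not reproduce the actual CoreStorage algorithm, and the tiny block counts are chosen to keep the example readable.

```python
BLOCK_SIZE = 128 * 1024  # 128 KB migration block, as stated in the text


def blocks_in(size_bytes: int) -> int:
    """Number of whole 128 KB blocks that fit into size_bytes."""
    return size_bytes // BLOCK_SIZE


class FusionSim:
    """Toy two-tier model; not CoreStorage's actual algorithm."""

    def __init__(self, ssd_blocks: int, buffer_blocks: int):
        # Blocks the SSD may hold before new data spills over to the HDD.
        self.ssd_limit = ssd_blocks - buffer_blocks
        self.ssd = set()
        self.hdd = set()
        self.hits = {}  # block number -> access count

    def write(self, block: int) -> str:
        """Place a new block: SSD first, HDD once the SSD quota is used up."""
        self.hits.setdefault(block, 0)
        if block in self.ssd:
            return "ssd"
        if block in self.hdd:
            return "hdd"
        if len(self.ssd) < self.ssd_limit:
            self.ssd.add(block)
            return "ssd"
        self.hdd.add(block)
        return "hdd"

    def read(self, block: int) -> None:
        """Record one access to a block."""
        self.hits[block] = self.hits.get(block, 0) + 1

    def rebalance(self) -> None:
        """Idle-time migration: the most accessed blocks move to the SSD."""
        stored = self.ssd | self.hdd
        ranked = sorted(stored, key=lambda b: (-self.hits.get(b, 0), b))
        self.ssd = set(ranked[: self.ssd_limit])
        self.hdd = stored - self.ssd


print(blocks_in(4 * 1024**3))  # 32768 blocks in a 4 GiB buffer area

sim = FusionSim(ssd_blocks=6, buffer_blocks=2)
placements = [sim.write(b) for b in range(6)]
print(placements)  # ['ssd', 'ssd', 'ssd', 'ssd', 'hdd', 'hdd']

for _ in range(3):
    sim.read(5)    # block 5 on the HDD becomes frequently accessed
sim.rebalance()
print(sorted(sim.ssd), sorted(sim.hdd))  # [0, 1, 2, 5] [3, 4]
```

In the usage example, block 5 is promoted to the SSD after repeated reads, and the least recently useful block is demoted to the HDD to make room for it.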
Advantages and drawbacks
Offering the speed and near-instant start-up of an SSD together with the cheap storage space of an HDD, Fusion Drive ensures that read and write times for frequently used data are as short as possible. Still, the technology has downsides that should also be considered:
Such a set cannot be as fast as a pure SSD, especially when working with older files;
Fusion Drive is supported by the iMac and Mac Mini models only. According to reports, the option is not expected to become available for other Mac systems;
The configuration significantly increases the chances for total data loss or corruption should one of the drives be accidentally disconnected or fail.
What may lead to data loss?
Just like any storage device, Fusion Drive may run into trouble while functioning and lose critical user data. Besides the typical situations in which data is lost through mistaken deletion of files or storage formatting, the most commonly encountered issues are:
The storage is presented as two drives instead of one in Finder
In this case the disks become unlinked and no longer function as a Fusion Drive, rendering the data they contain unusable. This may be caused by the misuse of disk management tools, a software issue or the replacement of one of the drives.
Fusion Drive becomes unbootable
As a rule, this problem appears when either the SSD or the HDD fails. The HDD component is more prone to sudden failure, while the SSD usually fails gradually and provides warnings, such as write errors and SMART notifications. What is peculiar about Fusion Drive is that even if only one of the drives fails, the entire storage becomes inoperable and won’t boot anymore, leaving the data on the intact drive unreadable.
Partitions go missing
There may be a number of reasons for this problem, from file system corruption due to a sudden power failure or software malfunction to incorrect usage of disk management utilities.
Bad sectors on a hard drive
Attempts to fix bad sectors or solve corruption issues using Disk Utility may cause serious logical damage and lead to irreversible data loss. Therefore, important data should be retrieved before the repair procedure.
The specifics of data recovery
Data on Fusion Drive is spanned across the two disks without being duplicated or shadowed and is extensively fragmented. Some of the metadata blocks needed to read it correctly are stored on the SSD only and the rest on the HDD. Working with the storage without proper assembly, or when one of the components is missing, will therefore give no usable result, as it becomes impossible to put all the pieces together. In addition, almost all file system metadata is stored on the SSD part, without which a file and directory tree cannot be built correctly. Moreover, at least one copy of the Encrypted blocks area must be decrypted successfully with the decryption keys stored inside the Volume header.
Consequently, total failure of one of the drives or severe damage to the metadata leaves no chance of data recovery. In other cases lost files can be restored using one of the following approaches:
physical issues with the drive cannot be handled by software means, but some of them can be fixed if both of the drives are sent to a reliable recovery service provider;
in case of a logical issue, the storage can be assembled with the help of UFS Explorer.
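The reasoning of this section can be condensed into a small decision function. The Python sketch below is a simplification of the statements above rather than a diagnostic tool: the enumerations, the function name `triage` and the idea of counting decryptable copies of the Encrypted blocks area are assumptions introduced for illustration.

```python
from enum import Enum


class DriveState(Enum):
    OK = "ok"            # drive works normally
    DAMAGED = "damaged"  # physical problem, but not a total failure
    DEAD = "dead"        # total failure


class Outcome(Enum):
    UNRECOVERABLE = "no chance of recovery"
    RECOVERY_SERVICE = "send both drives to a recovery service"
    SOFTWARE_ASSEMBLY = "assemble the storage with recovery software"


def triage(ssd: DriveState, hdd: DriveState, decryptable_copies: int) -> Outcome:
    """Pick a recovery approach from the state of both drives and the
    number of Encrypted blocks area copies that can be decrypted (0 to 2)."""
    # Total failure of either drive, or no usable metadata copy at all.
    if DriveState.DEAD in (ssd, hdd) or decryptable_copies < 1:
        return Outcome.UNRECOVERABLE
    # Physical issues cannot be handled by software; some may be fixable
    # by a service provider if both drives are sent in.
    if DriveState.DAMAGED in (ssd, hdd):
        return Outcome.RECOVERY_SERVICE
    # Both drives readable and metadata available: a logical issue.
    return Outcome.SOFTWARE_ASSEMBLY


print(triage(DriveState.OK, DriveState.OK, 1).value)
# assemble the storage with recovery software
print(triage(DriveState.OK, DriveState.DEAD, 2).value)
# no chance of recovery
```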
Last update: January 22, 2019