What is data recovery?

The essence of data recovery and the working principles of data recovery software

Despite the constantly growing reliability of storage devices, the loss of digital information remains commonplace. Frequent causes of lost files include human error, software malfunctions, computer viruses, power outages and hardware failures. Luckily, the information stored on a digital medium can often be recovered. The following article explains what data recovery is and describes the most common data loss problems and ways of addressing them.

What is data recovery?

Data recovery can be defined as the process of obtaining information located on a storage device that cannot be accessed by standard means because it was previously deleted or the digital medium was damaged. Different approaches are used to regain the missing files, but only on the condition that their content is still present somewhere within the storage. For instance, data recovery doesn't cover situations in which a file was never written to persistent storage, such as documents that were created but could not be saved to the hard disk drive because of a power failure. Also, none of the existing restoration methods can cope with permanent erasure, which occurs when other information occupies the storage space of the lost file. Under such circumstances, the lost files can only be retrieved from an external backup.

In general, data recovery techniques fall into two types: software-based ones and ones that involve repairing or replacing damaged hardware components in a laboratory setting. A software-based approach is employed in the majority of cases and involves the use of specialized utilities able to interpret the logical structure of the problem storage, read out the required data and deliver it to the user in a usable form for further copying. Physical repairs are conducted by specialists in the most severe instances, for example, when some mechanical or electrical parts of the drive no longer work properly. In this case, all measures are directed towards a one-time extraction of the critical content, without the possibility of continued usage of the affected device.

The most typical cases of data loss

By and large, the overall success of a data rescue procedure depends heavily on the choice of the right method for retrieval and its timely application. That is why it is highly important to understand the nature of the particular loss instance and know what can be done in each specific scenario. In contrast, the wrong actions can lead to the irreversible destruction of the information.

The most common causes of data loss include:

  • Accidental deletion of files or folders

    Each file system acts differently when deleting a file. For instance, in Windows, the FAT file system marks the file's directory entries as "unused" and destroys the information about the allocation of the file (except for the beginning of the file). In NTFS, only the file entry is marked as "unused", the record is deleted from the directory and the disk space is also marked as "unused". Most Linux/Unix file systems destroy the file descriptor (information about the file location, file type, file size, etc.) and mark the disk space as "free".

    Hint: To learn more about file systems and their types, please refer to the basics of file systems.

    The main purpose of file deletion is to release the storage space used by the file so that it can hold new files. For performance reasons, the storage space is not wiped immediately, so the actual file content remains on the disk until this space is reused for saving a new file.

    Hint: Please rely on the following guide if you need to recover deleted files.

  • File system formatting

    File system formatting can be started by mistake, for example, as a result of specifying the wrong disk partition or of mishandling a storage device (e.g. NAS devices usually format the internal storage after an attempt to reconfigure RAID).

    The formatting procedure creates empty file system structures on the storage and overwrites any information after that. If the types of the new and the former file systems coincide, it destroys the existing file system structures by overwriting them with new ones; if the types of the file systems differ, the structures are written to different locations and may wipe the user’s content.

  • Logical damage to the file system

    Modern file systems have a high level of protection against internal errors, yet they often remain helpless against hardware or software malfunctions. Even a small piece of wrong content written to the wrong location on the storage can destroy file system structures, breaking the links between file system objects and making the file system unreadable. Sometimes, this issue occurs due to blackouts or hardware failures.

  • Loss of information about a partition

    This failure may occur because of a failed "fdisk" operation or user error, which usually results in the loss of information about the location and size of a partition.

  • Storage failure

    If you suspect any physical issues with the storage (e.g. the device doesn't boot, makes unusual noises, overheats, has problems with reading, etc.), it is not recommended to perform any data recovery attempts on your own. You should take the storage to a specialized data recovery service.

    If a RAID system has failed within the limits of its redundancy (one drive in RAID 1 or RAID 5, at most two drives in RAID 6, etc.), restoration is possible without the missing drive, as the redundancy of RAID allows recreating the content of a failed component.

    Hint: For more information concerning the possibility of successful retrieval of files depending on the data loss case and the operating system, please refer to the articles describing the specifics of data recovery from different OS and chances for restoring data.
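
The RAID redundancy mentioned in the storage failure case above can be illustrated with XOR parity, the principle RAID 5 relies on. The following is only a minimal sketch, not the way any particular controller or recovery utility is implemented: it assumes a single stripe with equally sized blocks, and the function names xor_blocks and rebuild_missing are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recreate the single missing block of a stripe from all surviving
    blocks (the remaining data blocks plus the parity block)."""
    return xor_blocks(surviving_blocks)


# Hypothetical stripe: three data blocks and the parity computed from them.
d0, d1, d2 = b"ABCD", b"EFGH", b"IJKL"
parity = xor_blocks([d0, d1, d2])

# Simulate the loss of the drive that held d1 and rebuild its content.
recovered = rebuild_missing([d0, d2, parity])
print(recovered == d1)
```

Because XOR is its own inverse, combining every surviving block of a stripe yields the missing one. This holds only while a single block per stripe is lost, which matches the one-drive limit stated for RAID 5.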

How does data recovery software work?

The information remaining on an intact storage can usually be recovered without professional help by means of specialized data recovery software. However, it is important to keep in mind that no information is recoverable after being overwritten. For this reason, nothing should be written to the storage until the last file has been rescued from it.

Most data recovery utilities operate using the algorithms of metadata analysis, the method of raw recovery based on the known content of files or a combination of the two approaches.

Metadata is hidden service information contained within the file system. Its analysis allows the software to locate the principal structures on the storage that keep a record of the placement of file content, file properties and the directory hierarchy. After that, this information is processed and used to restore the damaged file system. This method is preferred over raw recovery because it allows obtaining files with their original names, folders, and date and time stamps. If the metadata wasn’t seriously corrupted, it may be possible to reconstruct the entire folder structure, depending on the specifics of the mechanisms the file system employs to get rid of “unnecessary” items. Yet such analysis cannot succeed when crucial parts of the metadata are missing. That is why it is extremely important to refrain from using file system repair tools or initiating operations that may modify the file system until the data is restored completely.
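
As one narrow illustration of metadata analysis, consider the FAT behaviour described earlier: a deleted file's directory entry is only marked as unused, while the start of the file and its size stay recorded. The sketch below assumes the standard 32-byte short (8.3) FAT directory entry layout, in which a first name byte of 0xE5 marks a deleted entry. The helper names find_deleted_entries and make_entry and the sample directory are hypothetical, and real recovery software does considerably more than this.

```python
import struct

ENTRY_SIZE = 32          # size of one short (8.3) directory entry
DELETED_MARKER = 0xE5    # first name byte of an entry marked as deleted
END_OF_DIRECTORY = 0x00  # first name byte of the first never-used entry
ATTR_LONG_NAME = 0x0F    # attribute value used by long-file-name entries


def find_deleted_entries(directory: bytes):
    """Return (name, first_cluster, size) for entries marked as deleted."""
    found = []
    for offset in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = directory[offset:offset + ENTRY_SIZE]
        if entry[0] == END_OF_DIRECTORY:
            break
        if entry[0] != DELETED_MARKER or entry[11] == ATTR_LONG_NAME:
            continue
        # The marker replaced the first character of the name, so show "?".
        base = "?" + entry[1:8].decode("ascii", "replace").rstrip()
        ext = entry[8:11].decode("ascii", "replace").rstrip()
        name = f"{base}.{ext}" if ext else base
        cluster_high, = struct.unpack_from("<H", entry, 20)
        cluster_low, = struct.unpack_from("<H", entry, 26)
        size, = struct.unpack_from("<I", entry, 28)
        found.append((name, (cluster_high << 16) | cluster_low, size))
    return found


def make_entry(name, ext, cluster, size):
    """Build a synthetic directory entry for the demonstration below."""
    entry = bytearray(ENTRY_SIZE)
    entry[0:8] = name.ljust(8).encode("ascii")
    entry[8:11] = ext.ljust(3).encode("ascii")
    entry[11] = 0x20  # "archive" attribute
    struct.pack_into("<H", entry, 20, cluster >> 16)
    struct.pack_into("<H", entry, 26, cluster & 0xFFFF)
    struct.pack_into("<I", entry, 28, size)
    return entry


# Synthetic directory: one live file, one deleted file, then the end marker.
live = make_entry("NOTES", "TXT", 5, 120)
deleted = make_entry("REPORT", "DOC", 9, 2048)
deleted[0] = DELETED_MARKER
sample_directory = bytes(live + deleted + bytearray(ENTRY_SIZE))

deleted_entries = find_deleted_entries(sample_directory)
print(deleted_entries)
```

The entry still reveals most of the original name, the first cluster and the file size, which is why a file recovered this way can keep its name and folder. What it no longer reveals is the full allocation chain, so the content beyond the first cluster has to be located by other means.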

As a rule, when metadata analysis does not achieve the desired result, a search for files by their known content is performed. In this case, “known content” doesn’t mean the entire raw content of a file, only particular patterns that are typical of files of a given format and may indicate the beginning or the end of the file. These patterns are referred to as “file signatures” and can be used to determine whether a piece of data on the storage belongs to a file of a recognized type. Files recovered with this method receive an extension based on the found signature, are given new names and are assigned to new folders, usually created for files of different types. The main limitation of this approach is that some files may lack identifiable signatures or have only a signature denoting the start of the file, which makes it hard to predict where the file ends, especially when its parts are not stored consecutively.
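
The signature search described above can be sketched for a single format. The example assumes JPEG, whose files begin with the bytes FF D8 FF and end with FF D9. The function name carve_jpegs and the sample buffer are hypothetical, and the sketch deliberately ignores fragmentation and end markers that occur inside a file (for example, in an embedded thumbnail), which practical tools have to handle.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus the next marker byte
JPEG_END = b"\xff\xd9"        # end-of-image marker


def carve_jpegs(raw: bytes):
    """Return byte ranges that begin and end with JPEG signatures."""
    carved = []
    position = 0
    while True:
        start = raw.find(JPEG_START, position)
        if start == -1:
            break
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            # A start signature without an end signature: the end of the
            # file cannot be determined, so stop here.
            break
        end += len(JPEG_END)
        carved.append(raw[start:end])
        position = end
    return carved


# Synthetic raw data: padding, one complete fake image, unrelated bytes,
# and a truncated image that has a start signature but no end signature.
fake_image = JPEG_START + b"\xe0" + b"image-data" + JPEG_END
raw_dump = b"\x00" * 16 + fake_image + b"unrelated bytes" + fake_image[:5]

carved_files = carve_jpegs(raw_dump)
print(len(carved_files), carved_files[0] == fake_image)
```

Only the complete image is returned. The truncated one shows the limitation named above: with a start signature alone, there is no reliable way to tell where the file ends.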

To get the lost files back with maximum efficiency, data recovery software may use the described techniques concurrently during a single scan of the storage. Other details depend mainly on the type of digital medium and can be found in the data recovery solutions section.

Last update: August 08, 2022
