The information below describes general issues of recovering deleted files from Linux/Unix file systems and explains why it is hard or impossible to retrieve deleted files with their exact sizes and names from these file systems. Please take into account that in some cases recovery of deleted data is still possible and may be available with our products. Please read the product information for the specialized software products to check whether deleted file recovery is available for the specific file system.

How is data organized?

Most existing file systems, including Linux file systems, use so-called block data organization. Originally, storage devices could logically operate on small data units called sectors. Usually, the size of a sector is 512 bytes. Each data fragment or file on disk takes one or more of these sectors. To access a file, the storage driver must address its sectors and read data from them. Assuming Logical Block Addressing (LBA), we may think of storage sectors as an array of cells with ordinal numbers.

To optimize disk addressing, a file system implementation logically groups equal-sized sets of sectors into so-called blocks. Each block is a set of sectors that the file system driver addresses as a single unit. The minimum possible block size is one sector; in practice, block sizes over 64KB are not used. Most existing file systems (Windows FAT/NTFS, Linux Ext2/Ext3/XFS etc.) use the block as the smallest logical disk unit. This means that a file (or file tail) smaller than the block size still occupies an entire block. Some file systems (like ReiserFS), however, may use free block 'tails' to store small files and data fragments. Assuming the most common case, files are organized on the disk as follows:
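As a rough illustration of the block arithmetic described above, the following Python sketch computes how many blocks a file occupies, how much of its last block stays unused, and which sectors a block covers. The function names are hypothetical, and the sketch assumes 512-byte sectors and that block 0 starts at sector 0; a real file system also has to account for partition offsets and its own metadata areas.

```python
SECTOR_SIZE = 512  # bytes; the usual sector size mentioned in the text


def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of whole blocks a file of file_size bytes occupies."""
    if block_size <= 0 or block_size % SECTOR_SIZE != 0:
        raise ValueError("block size must be a positive multiple of the sector size")
    # Ceiling division: a partially filled last block still takes a whole block.
    return -(-file_size // block_size)


def slack_bytes(file_size: int, block_size: int) -> int:
    """Unused bytes left in the file's last block (the block 'tail')."""
    return blocks_needed(file_size, block_size) * block_size - file_size


def block_to_sectors(block_number: int, block_size: int) -> range:
    """Sector numbers covered by a block (assumes block 0 starts at sector 0)."""
    sectors_per_block = block_size // SECTOR_SIZE
    first = block_number * sectors_per_block
    return range(first, first + sectors_per_block)


if __name__ == "__main__":
    size, block = 5000, 4096  # illustrative values only
    print(blocks_needed(size, block))         # blocks occupied by the file
    print(slack_bytes(size, block))           # bytes wasted in the last block
    print(list(block_to_sectors(2, block)))   # sectors behind block 2
```

With a 4096-byte block, a 5000-byte file takes two blocks and leaves most of the second one unused, which is the space that tail-packing file systems try to reuse.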
Free space and fragmentation

In practice, many sequential create file/append data/truncate data/delete file requests leave the free space on a file system fragmented.

Figure 1. How fragmentation appears

Note: the file system driver in this example does not wipe blocks 1-2 after 'File 1' is deleted, but just marks them as 'free'. Before 'File 3' is written, blocks 1-2 still hold the data of 'File 1'. Also note that in practice, large files may consist of up to a few hundred unlinked data fragments of a few blocks each.

How are fragments linked?

The file system stores special objects to describe files: information nodes (inodes for short). Among other things, an inode contains the following information:
How is object allocation information organized?

The key part of object allocation information is an array, list or B-tree of pointers to data blocks or to contiguous fragments of blocks. The first part, or root, of this information is stored as part of the inode.

So why no undelete?

Unlike Windows, Linux/BSD/Unix operating system drivers clean part of the inode information after a file is deleted: they fill the object size, object type/mode information and object allocation information with zeros. This means that after a file is deleted, software knows nothing about it. Assume files 2 and 3 in Figure 1 are raw encrypted files (no headers; both look like 'white noise'). Assume both fill whole blocks and both are deleted. This means that no allocation information is left and data recovery software is unable to detect the boundaries of file 2 or file 3: their contents are too similar and their blocks are mixed together. This is a classic example of an 'impossible recovery' situation. The practical situation is not far from this: fragments of binary files look too similar to data recovery software, fragmentation is often heavy, and so on. All this makes the prognosis for file undelete pessimistic.

No chance?

Recovery is still possible. There is a set of recovery methods that can help with file recovery, but none can give a 100%, or even 80%, result (remember the encrypted or compressed file fragments that data recovery software is unable to classify):
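To make the earlier point about inode cleaning concrete, here is a toy Python model of a block device with inodes. It is only a sketch under loud assumptions: `ToyFS`, `Inode`, the 4-byte block size and the lowest-free-block allocation policy are all hypothetical and do not reproduce any real file system driver. It shows the two effects described in the text: deletion zeroes the inode's size, mode and allocation information, while the data blocks keep their old contents until another file overwrites them.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    size: int = 0                                # object size in bytes
    mode: int = 0                                # object type/mode bits
    blocks: list = field(default_factory=list)   # allocation information


class ToyFS:
    """Hypothetical in-memory model; not a real file system layout."""

    def __init__(self, n_blocks: int, block_size: int = 4):
        self.block_size = block_size
        self.data = [b"\x00" * block_size for _ in range(n_blocks)]
        self.free = set(range(n_blocks))

    def create(self, payload: bytes) -> Inode:
        bs = self.block_size
        count = -(-len(payload) // bs)           # ceiling division
        chosen = sorted(self.free)[:count]       # assumed policy: lowest free blocks
        if len(chosen) < count:
            raise OSError("no free space")
        for i, block in enumerate(chosen):
            chunk = payload[i * bs:(i + 1) * bs]
            self.data[block] = chunk.ljust(bs, b"\x00")
            self.free.discard(block)
        return Inode(size=len(payload), mode=0o100644, blocks=chosen)

    def delete(self, inode: Inode) -> None:
        # Blocks are only marked free; their contents are not wiped.
        self.free.update(inode.blocks)
        # The inode loses size, mode and allocation information.
        inode.size = 0
        inode.mode = 0
        inode.blocks = []


fs = ToyFS(8)
file1 = fs.create(b"AAAAAAAA")   # occupies blocks 0 and 1
file2 = fs.create(b"BBBB")       # occupies block 2
fs.delete(file1)
print(file1)                     # inode no longer points anywhere
print(fs.data[0], fs.data[1])    # old data is still in the blocks
file3 = fs.create(b"CCCC")       # reuses block 0 and overwrites it
print(fs.data[0], fs.data[1])    # block 1 still holds File 1 data
```

After `delete`, nothing in the inode says which free blocks belonged to the file or in what order, so a recovery tool has only the block contents themselves to work with.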
Our data recovery products are now developed in two recovery directions: IntelliRAW™ signature search (to recognize file types) and file system structure analysis. This allows combining several stages of data recovery with level-by-level data classification and getting closer to the best recovery rates.

Last update: 04.06.2009
Copyright © 2004-2009 SysDevSoftware, the Development & Research division of SysDev Laboratories LLC. All rights reserved.