Searching for deleted files in NTFS

In this article I describe the algorithms for working with the NTFS file system that we used when developing our data recovery program, Hetman Partition Recovery. It continues the previous post about FAT.

Under the cut I describe the algorithm for finding and restoring deleted files on an NTFS partition that we used in our program. The best description of this algorithm is in the book “File System Forensic Analysis” by Brian Carrier.


NTFS (New Technology File System) was developed by Microsoft for Windows NT. The main goals of its developers were reliability, security and support for high-capacity media.

Perhaps the main feature of this file system is that all service data is stored in files. Files with administrative data can be located anywhere in the volume, just like normal files. Thus, unlike other file systems, NTFS has no rigidly defined layout. The entire file system is considered a data area, and any sector can be allocated to a file. The only fixed requirement is that the first sectors of the volume contain the boot sector and boot code.

NTFS stores all file information in the Master File Table (MFT). Small files can be stored directly in the MFT record. Otherwise, clusters are allocated to the file, and the MFT record contains a list of these clusters. The records themselves are very simple. Each is 1 KB in size, but only the first 42 bytes have a fixed purpose. The remaining bytes store attributes, small data structures that each serve a specialized function. For example, one attribute is used to store the file name and another to store the file content.


Figure 1. The basic structure of an MFT record with a header and three attributes


The MFT record contains a small header, and the remaining bytes are intended for storing various attributes. The entry shown in the figure contains three attributes.

Note that NTFS keeps a backup copy of the first MFT records (the $MFTMirr file), which can be very useful when restoring data.

Contents of the MFT record


The size of each MFT entry is defined in the boot sector, but all Microsoft implementations use 1024-byte entries. The first 42 bytes contain 12 fields; the remaining 982 bytes have no fixed structure and are filled with attributes. In simple terms, an MFT record can be compared to a large storage chest. Basic information about the owner, such as name and address, is written on the outside of the chest (this corresponds to the fixed fields of the MFT record). Inside, you can put any object that is smaller than the chest itself. In the same way, the rest of an MFT record has no fixed structure and holds attributes that store specific information.
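The 12 fixed fields can be decoded with a single `struct` format string. The sketch below follows the header layout given in Carrier's book; the field names are our own labels (an assumption, not official identifiers), and the sample record is synthetic.

```python
import struct
from collections import namedtuple

# Field names are illustrative labels, not official NTFS identifiers.
MftHeader = namedtuple(
    "MftHeader",
    "signature fixup_offset fixup_count lsn sequence link_count "
    "first_attr_offset flags used_size allocated_size base_reference next_attr_id",
)

# Little-endian, no padding: 12 fields occupying 42 bytes in total.
HEADER_FORMAT = "<4sHHQHHHHIIQH"

FLAG_IN_USE = 0x01     # entry is allocated to a file
FLAG_DIRECTORY = 0x02  # entry describes a directory


def parse_mft_header(record: bytes) -> MftHeader:
    """Decode the fixed fields at the start of one MFT record."""
    if len(record) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("record is shorter than the 42-byte header")
    return MftHeader._make(struct.unpack_from(HEADER_FORMAT, record, 0))


# Synthetic 1024-byte record, built only to exercise the parser.
sample = struct.pack(
    HEADER_FORMAT, b"FILE", 48, 3, 0, 2, 1, 56, FLAG_IN_USE, 416, 1024, 0, 4
).ljust(1024, b"\x00")

header = parse_mft_header(sample)
print(header.signature, header.sequence, bool(header.flags & FLAG_IN_USE))
```

A real parser would also apply the fixup array before trusting the rest of the record; that step is left out here.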

Entries in the MFT are addressed sequentially with 48-bit values; the first entry has address 0. The maximum MFT address changes as the MFT grows and is determined by dividing the size of $MFT by the size of each entry.
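As a small illustration of this arithmetic, the sketch below computes the number of entries and the byte offset of a given entry. The function names are hypothetical, and the offset calculation assumes the $MFT data is viewed as one contiguous byte stream.

```python
def mft_entry_count(mft_size_bytes: int, entry_size: int = 1024) -> int:
    """Number of entries that fit in $MFT; the highest address is this minus 1."""
    return mft_size_bytes // entry_size


def mft_entry_offset(address: int, entry_size: int = 1024) -> int:
    """Byte offset of an entry inside the $MFT data stream."""
    return address * entry_size


count = mft_entry_count(1024 * 1024)  # a 1 MiB $MFT
print(count, count - 1, mft_entry_offset(313))
```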

Each MFT record also contains a 16-bit sequence number that is incremented each time the entry is allocated. Consider MFT entry 313 with sequence number 1. The file that was allocated entry 313 is deleted, and the entry is reallocated to a new file. The entry is then assigned a new sequence number, 2. The MFT entry address is combined with the sequence number (which occupies the upper 16 bits) to form the 64-bit base address of the file.


Figure 2. The base address of the file is formed by combining the address of the MFT record and the sequence number
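The combination is plain bit arithmetic: the sequence number goes into the upper 16 bits and the entry address into the lower 48. A minimal sketch, with hypothetical function names, using entry 313 from the example above:

```python
ADDRESS_BITS = 48
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def make_file_reference(entry_address: int, sequence: int) -> int:
    """Pack a 48-bit entry address and a 16-bit sequence number into 64 bits."""
    if not 0 <= entry_address <= ADDRESS_MASK:
        raise ValueError("entry address does not fit in 48 bits")
    if not 0 <= sequence <= 0xFFFF:
        raise ValueError("sequence number does not fit in 16 bits")
    return (sequence << ADDRESS_BITS) | entry_address


def split_file_reference(reference: int) -> tuple:
    """Return (entry_address, sequence) from a 64-bit base address."""
    return reference & ADDRESS_MASK, reference >> ADDRESS_BITS


old_ref = make_file_reference(313, 1)  # the deleted file
new_ref = make_file_reference(313, 2)  # the file that reused entry 313
print(hex(old_ref), hex(new_ref), split_file_reference(new_ref))
```

The two references point at the same entry but are not equal, which is what lets a stale reference to the deleted file be told apart from a reference to the new one.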


NTFS uses the base address to refer to an MFT record because the sequence number makes file system corruption easier to detect. For example, if the system crashes while data structures are being allocated for a file, the sequence number comes to the rescue: it shows whether a data structure holds the MFT record address because the previous file used it or because it belongs to the new file. In addition, the sequence number can be used when recovering deleted content.

As you may have noticed, the fixed structure of an MFT record is minimal, and most of the record is used to store attributes, which are data structures that hold data of a particular type. There are many different attribute types, and each has its own internal structure. For example, there are attributes for the file name, for dates and times, and even for the file content. Here NTFS again differs from other file systems. Typically, a file system reads and writes file content, whereas NTFS reads and writes attributes, one kind of which holds the file content.

Let's return to our analogy, in which the MFT record was compared to a chest and the attributes to small boxes placed inside it. A box can have whatever shape best suits the object it stores. For example, discs are easier to keep in round boxes, and posters in long tubes.


Figure 3. An example of an MFT record showing attribute headers and content areas


Although different types of attributes are designed for different types of data, all attributes have two parts: a header and content. The header is generic and the same for all attributes. The content depends on the attribute type and may be of any size.
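Because every attribute starts with the same generic header, a record can be walked attribute by attribute. The sketch below assumes the commonly documented layout (type at offset 0, total length at 4, non-resident flag at 8; for resident attributes, content size at 16 and content offset at 20) and a `0xFFFFFFFF` end marker. The helper `build_resident_attribute` and the payloads exist only to produce synthetic test data.

```python
import struct

END_MARKER = 0xFFFFFFFF
RESIDENT_HEADER_SIZE = 24


def iter_attributes(record: bytes, first_attr_offset: int):
    """Yield (type_id, non_resident, content); content is None if non-resident."""
    offset = first_attr_offset
    while offset + 8 <= len(record):
        type_id, length = struct.unpack_from("<II", record, offset)
        if type_id == END_MARKER:
            break
        if length < RESIDENT_HEADER_SIZE or offset + length > len(record):
            break  # damaged or truncated attribute, stop walking
        non_resident = record[offset + 8] != 0
        content = None
        if not non_resident:
            size, content_offset = struct.unpack_from("<IH", record, offset + 16)
            start = offset + content_offset
            content = record[start:start + size]
        yield type_id, non_resident, content
        offset += length


def build_resident_attribute(type_id: int, content: bytes) -> bytes:
    """Hypothetical helper: build a synthetic resident attribute for the demo."""
    length = (RESIDENT_HEADER_SIZE + len(content) + 7) & ~7  # 8-byte aligned
    header = struct.pack(
        "<IIBBHHHIHBB",
        type_id, length, 0, 0, RESIDENT_HEADER_SIZE, 0, 0,
        len(content), RESIDENT_HEADER_SIZE, 0, 0,
    )
    return (header + content).ljust(length, b"\x00")


# Synthetic record: 56 bytes of header space, two attributes, end marker.
record = (
    b"\x00" * 56
    + build_resident_attribute(0x30, b"name")   # placeholder payload
    + build_resident_attribute(0x80, b"hello")  # placeholder payload
    + struct.pack("<I", END_MARKER)
).ljust(1024, b"\x00")

attributes = list(iter_attributes(record, 56))
print(attributes)
```

In a real record the starting offset comes from the MFT record header, and the content of each attribute type has its own internal structure that this sketch does not decode.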

Recovering files


Restoring a deleted file in NTFS is easier than in most file systems. But NTFS has one unpleasant feature. When a file is deleted, its name is removed from the parent directory's index, and its MFT entry and clusters are marked as free. When a file name is removed from the parent directory's index, the index is re-sorted, and the name information may be lost. In this case, the name of the deleted file disappears from its original directory.

But don't give up, because this disadvantage is partly compensated by the fact that all MFT records are stored in one table. This greatly simplifies the search through all available records. In addition, each entry contains an attribute with the base address of the parent directory. This means that for a free entry it is usually possible to determine the full path.
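Following the parent addresses upward gives the path. The sketch below assumes the records have already been parsed into a dictionary that maps an entry address to a `(name, parent_base_address)` pair; that dictionary and the function name are hypothetical. Entry 5 is the root directory on NTFS volumes.

```python
ROOT_ENTRY = 5                 # MFT entry of the volume root directory
ADDRESS_MASK = (1 << 48) - 1   # lower 48 bits of a base address


def build_path(address: int, entries: dict, max_depth: int = 256) -> str:
    """Walk parent references from an entry up to the root directory."""
    parts = []
    current = address
    for _ in range(max_depth):
        if current == ROOT_ENTRY:
            return "\\" + "\\".join(reversed(parts))
        if current not in entries:
            # The parent record is unavailable, so only a partial path is known.
            return "<orphan>\\" + "\\".join(reversed(parts))
        name, parent_reference = entries[current]
        parts.append(name)
        current = parent_reference & ADDRESS_MASK
    raise ValueError("parent chain is too deep or contains a cycle")


# Synthetic data: entry 313 lives in directory 40, which lives in the root.
entries = {
    313: ("report.doc", (3 << 48) | 40),
    40: ("docs", (1 << 48) | 5),
}
print(build_path(313, entries))
print(build_path(313, {313: ("report.doc", (3 << 48) | 40)}))
```

A stricter version would also compare the sequence number stored in each parent reference with the parent's current sequence number, since a mismatch suggests the directory entry has been reallocated.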

To restore all deleted files in NTFS, you need to search the MFT for free entries. Having found a free entry, we can determine the file name from its file name attribute, along with the address of the parent directory. The pointers to the clusters still exist, and if the data has not been overwritten, it can be recovered. Recovery is possible even when the file is fragmented. If an attribute value was resident (i.e. stored inside the MFT record itself), the data will not be overwritten until the MFT record is reallocated. If storing a file's attributes requires more than one MFT record, other records may be needed for full recovery.
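The first step, finding free entries, can be sketched as a linear scan. This is a simplified illustration and not the code used in our program: it assumes the $MFT contents are already in memory as one byte string, uses 1024-byte entries, and treats an entry as free when it carries the `FILE` signature but the in-use flag (bit 0 of the 16-bit flags field at offset 22) is clear.

```python
import struct

ENTRY_SIZE = 1024
FLAGS_OFFSET = 22
FLAG_IN_USE = 0x01


def find_free_entries(mft_bytes: bytes, entry_size: int = ENTRY_SIZE) -> list:
    """Return addresses of entries that look valid but are not in use."""
    free = []
    for address in range(len(mft_bytes) // entry_size):
        record = mft_bytes[address * entry_size:(address + 1) * entry_size]
        if record[:4] != b"FILE":
            continue  # never used or damaged, nothing to recover here
        (flags,) = struct.unpack_from("<H", record, FLAGS_OFFSET)
        if not flags & FLAG_IN_USE:
            free.append(address)
    return free


def fake_record(flags: int) -> bytes:
    """Hypothetical helper: a synthetic record with only signature and flags set."""
    return (b"FILE" + b"\x00" * 18 + struct.pack("<H", flags)).ljust(ENTRY_SIZE, b"\x00")


# Entry 0 is in use, entry 1 is a deleted file, entry 2 was never written.
mft = fake_record(FLAG_IN_USE) + fake_record(0) + b"\x00" * ENTRY_SIZE
print(find_free_entries(mft))
```

Each address returned is a candidate for recovery; the next steps are to read its file name attribute and the cluster list of its data attribute, which this sketch does not cover.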

When restoring files or examining deleted content, the file system journal and the change journal can be useful.

The file system journal allows the operating system to quickly bring the file system back to a consistent state. File system corruption usually occurs when the system crashes while writing data to the file system. The journal records information about every upcoming metadata update, and another record is created once the update completes successfully. If an error occurs, the operation can be undone and the system rolled back to its previous state. Note that the journal does not contain non-resident data stored in external clusters, so it cannot be used for file recovery. It does contain the content of resident attributes, which is needed to undo recent changes.

The change journal is a file that records all changes to files and directories. It is useful for determining which files were changed during a certain period of time. Without it, detecting changes requires iterating through all the files and directories in the file system and comparing their time stamps with a threshold value. That procedure can take quite some time, and the change journal greatly simplifies it.

In conclusion, let me emphasize that NTFS is a very complex and powerful file system. It was designed not only to meet the needs of its time but also with future requirements in mind. Despite this complexity, data recovery in NTFS is easier than in most other file systems.

Article based on information from habrahabr.ru
