

4 Algorithm

GNU ddrescue is not a derivative of dd, nor is it related to dd in any way except that both can be used for copying data from one device to another. The key difference is that ddrescue uses a sophisticated algorithm to copy data from failing drives, causing them as little additional damage as possible.

Ddrescue efficiently manages the status of the rescue in progress and tries to rescue the good parts first, scheduling reads inside bad (or slow) areas for later. This maximizes the amount of data that can finally be recovered from a failing drive.

The standard dd utility can be used to save data from a failing drive, but it reads the data sequentially, which may wear out the drive without rescuing anything if the errors are at the beginning of the drive.

Other programs read the data sequentially but switch to small size reads when they find errors. This is a bad idea because it means spending more time at error areas, damaging the surface, the heads and the drive mechanics, instead of getting out of them as fast as possible. This behavior reduces the chances of rescuing the remaining good data.

The algorithm of ddrescue is as follows (the user may interrupt the process at any point, but be aware that a bad drive can block ddrescue for a long time until the kernel gives up):

1) Optionally read a mapfile describing the status of a multi-part or previously interrupted rescue. If no mapfile is specified, or if the mapfile is empty or does not exist, mark the whole rescue domain as non-tried.

2) (First phase; Copying) Read the non-tried parts of the input file, marking the failed blocks as non-trimmed and skipping beyond them. Also skip beyond slow areas. The skipped areas are tried later in two additional passes (before trimming), reversing the direction after each pass until the whole rescue domain is tried. The third pass is a sweeping pass, with skipping disabled. (The purpose is to delimit large errors fast, keep the mapfile small, and produce good starting points for trimming.) Only non-tried areas are read in large blocks. Trimming, scraping and retrying are done sector by sector. Each sector is tried at most two times: the first time in this step as part of a large block read, the second time in one of the steps below as a single sector read.

3) (Second phase; Trimming) Trimming is done in one pass. For each non-trimmed block, read forwards one sector at a time from the leading edge of the block until a bad sector is found. Then read backwards one sector at a time from the trailing edge of the block until a bad sector is found. Then mark the bad sectors found (if any) as bad-sector, and mark the rest of the block as non-scraped without trying to read it.

4) (Third phase; Scraping) Scrape together the data not recovered by the copying or trimming phases. Scraping is done in one pass. Each non-scraped block is read forwards, one sector at a time. Any bad sectors found are marked as bad-sector.

5) (Fourth phase; Retrying) Optionally try to read the bad sectors again until the specified number of retry passes is reached. The direction is reversed after each pass. Every bad sector is tried only once in each pass. Ddrescue can’t know whether a bad sector is unrecoverable or whether it will eventually be read after some retries.

6) Optionally write a mapfile for later use.
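
The trimming and scraping steps above can be illustrated with a small Python simulation. This is only a sketch under stated assumptions, not ddrescue's actual implementation: the drive is modelled as a set of unreadable sector numbers, the helper names (read_sector, trim, scrape) are hypothetical, and "finished" is used here as the status of sectors that were read successfully.

```python
# Illustrative simulation of trimming and scraping (not ddrescue's real code).
# Assumption: a "drive" is just a set of sector numbers that fail to read.
NON_TRIMMED = "non-trimmed"
NON_SCRAPED = "non-scraped"
BAD_SECTOR = "bad-sector"
FINISHED = "finished"  # status assumed here for successfully read sectors


def read_sector(sector, bad):
    """Pretend to read one sector; return True if the read succeeds."""
    return sector not in bad


def trim(start, end, bad):
    """Trim one non-trimmed block covering sectors [start, end)."""
    status = {s: NON_TRIMMED for s in range(start, end)}
    # Read forwards from the leading edge until a bad sector is found.
    lo = start
    while lo < end:
        ok = read_sector(lo, bad)
        status[lo] = FINISHED if ok else BAD_SECTOR
        lo += 1
        if not ok:
            break
    # Read backwards from the trailing edge until a bad sector is found.
    hi = end - 1
    while hi >= lo:
        ok = read_sector(hi, bad)
        status[hi] = FINISHED if ok else BAD_SECTOR
        hi -= 1
        if not ok:
            break
    # The rest of the block is marked non-scraped without trying to read it.
    for s in range(lo, hi + 1):
        status[s] = NON_SCRAPED
    return status


def scrape(status, bad):
    """Read every non-scraped sector forwards, one sector at a time."""
    for s in sorted(status):
        if status[s] == NON_SCRAPED:
            status[s] = FINISHED if read_sector(s, bad) else BAD_SECTOR
    return status


bad = {12, 15, 17}               # hypothetical unreadable sectors
after_trim = trim(10, 20, bad)   # block covering sectors 10..19
trimmed_bad = sorted(s for s, st in after_trim.items() if st == BAD_SECTOR)
unscraped = sorted(s for s, st in after_trim.items() if st == NON_SCRAPED)
after_scrape = scrape(dict(after_trim), bad)
final_bad = sorted(s for s, st in after_scrape.items() if st == BAD_SECTOR)
print(trimmed_bad, unscraped, final_bad)
```

In this example trimming finds the two bad sectors at the edges of the damaged area (12 and 17) and leaves sectors 13 to 16 as non-scraped; scraping then finds the remaining bad sector (15) and recovers the other three.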


The total error size (errsize) is the sum of the sizes of all the bad-sector blocks. It increases during the trimming and scraping phases, and may decrease during the retrying phase. Non-trimmed and non-scraped blocks are not considered errors. Note that as ddrescue retries the failed blocks, the good data found may divide them into smaller blocks, decreasing the total error size but increasing the number of errors.
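
As a rough illustration of the paragraph above, the following Python sketch computes the error size and the number of errors from a list of blocks. The (position, size, status) tuple layout and the function name error_stats are assumptions made for this example, not ddrescue's internal data structures; the block list is assumed to be sorted and contiguous.

```python
def error_stats(blocks):
    """Return (errsize, number_of_errors) for sorted, contiguous blocks.

    Each block is a hypothetical (position, size, status) tuple.
    Only bad-sector blocks count as errors.
    """
    errsize = sum(size for _, size, status in blocks if status == "bad-sector")
    errors = 0
    previous_bad = False
    for _, _, status in blocks:
        is_bad = status == "bad-sector"
        if is_bad and not previous_bad:
            errors += 1  # a new run of bad sectors starts here
        previous_bad = is_bad
    return errsize, errors


# Before retrying: one 2048-byte bad area; the non-trimmed block is not an error.
before = [
    (0, 4096, "finished"),
    (4096, 2048, "bad-sector"),
    (6144, 4096, "finished"),
    (10240, 8192, "non-trimmed"),
]
# After a retry pass recovers 512 bytes in the middle of the bad area.
after = [
    (0, 4096, "finished"),
    (4096, 512, "bad-sector"),
    (4608, 512, "finished"),
    (5120, 1024, "bad-sector"),
    (6144, 4096, "finished"),
    (10240, 8192, "non-trimmed"),
]
print(error_stats(before))  # (2048, 1)
print(error_stats(after))   # (1536, 2)
```

The recovered 512 bytes reduce the error size from 2048 to 1536 bytes, but split the single error into two.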

The ‘remaining time’ is calculated using the average rate of the last 30 seconds and does not take into account that some parts may be excluded from the rescue (for example with ‘--no-trim’), or that some areas may be unrecoverable. Therefore it may be very imprecise, may vary widely during the rescue, and may show a non-zero value at the end of the rescue. In particular it may go down to a few seconds at the end of the first pass, just to grow to hours or days in the following passes. Such is the nature of ddrescue; the good parts are usually recovered fast, while the rest may take a long time.
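
The following Python sketch shows why such an estimate can swing so widely. It assumes the estimate is simply the pending size divided by the average rate over a 30-second window; the class RateWindow, the function remaining_time and the sample figures are hypothetical, and ddrescue's actual calculation may differ in detail.

```python
from collections import deque


class RateWindow:
    """Average rate over the last `window` seconds of (time, bytes) samples."""

    def __init__(self, window=30):
        self.window = window
        self.samples = deque()  # (timestamp in seconds, total bytes rescued)

    def add(self, timestamp, rescued):
        self.samples.append((timestamp, rescued))
        # Drop samples older than the window.
        while timestamp - self.samples[0][0] > self.window:
            self.samples.popleft()

    def average_rate(self):
        if len(self.samples) < 2:
            return 0.0
        (t0, b0), (t1, b1) = self.samples[0], self.samples[-1]
        return (b1 - b0) / (t1 - t0) if t1 > t0 else 0.0


def remaining_time(pending, rate):
    """Seconds left if `pending` bytes keep arriving at `rate` bytes/second."""
    return float("inf") if rate <= 0 else pending / rate


# Fast first pass: 50 MB/s sustained for 40 seconds.
fast = RateWindow()
for t in range(41):
    fast.add(t, t * 50_000_000)
pending = 100_000_000  # hypothetical 100 MB still to be tried
eta_fast = remaining_time(pending, fast.average_rate())

# Later passes inside bad areas: the same 100 MB at 10 kB/s.
eta_slow = remaining_time(pending, 10_000)
print(eta_fast, eta_slow)  # 2.0 10000.0
```

The same 100 MB of pending data gives an estimate of 2 seconds at the first-pass rate and almost 3 hours at the slow rate, which is the kind of jump described above.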

The mapfile is periodically saved to disc, as well as when ddrescue finishes or is interrupted. So in case of a crash you can resume the rescue with little recopying. The interval between saves varies from 30 seconds to 5 minutes depending on mapfile size (larger mapfiles are saved at longer intervals).

Also, the same mapfile can be used for multiple commands that copy different areas of the input file, and for multiple recovery attempts over different subsets. See this example:

Rescue the most important part of the disc first.

ddrescue -i0 -s50MiB /dev/hdc hdimage mapfile
ddrescue -i0 -s1MiB -d -r3 /dev/hdc hdimage mapfile

Then rescue some key disc areas.

ddrescue -i30GiB -s10GiB /dev/hdc hdimage mapfile
ddrescue -i230GiB -s5GiB /dev/hdc hdimage mapfile

Now rescue the rest (does not recopy what is already done).

ddrescue /dev/hdc hdimage mapfile
ddrescue -d -r3 /dev/hdc hdimage mapfile
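
To see why the last commands do not recopy what is already done, the Python sketch below subtracts the areas already rescued from the rescue domain. It is only an illustration: the 250 GiB disc size is an assumption, the function name pending_areas is hypothetical, and it supposes that every area requested by the earlier commands was read successfully.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3


def pending_areas(domain, finished):
    """Return the parts of `domain` not covered by any finished area.

    `domain` is a (start, end) pair and `finished` a list of (start, end)
    pairs, all in bytes with exclusive end positions.
    """
    start, end = domain
    pending = []
    pos = start
    for f_start, f_end in sorted(finished):
        if f_end <= pos or f_start >= end:
            continue  # already covered or outside the domain
        if f_start > pos:
            pending.append((pos, f_start))
        pos = max(pos, f_end)
    if pos < end:
        pending.append((pos, end))
    return pending


disc = (0, 250 * GiB)  # hypothetical disc size
finished = [
    (0, 50 * MiB),            # -i0 -s50MiB
    (30 * GiB, 40 * GiB),     # -i30GiB -s10GiB
    (230 * GiB, 235 * GiB),   # -i230GiB -s5GiB
]
rest = pending_areas(disc, finished)
for a, b in rest:
    print(a, b - a)  # position and size of each area still to be rescued
```

Only the three gaps between the areas already rescued are left for the final commands to read.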
