FROM: Ruben Safir
SUBJECT: [NYLXS - HANGOUT] Data Recovery Tools
Wish I had this article about 3 years ago.
Open Source Data Recovery Tools To The Rescue
Disasters happen to the best of computers. Luckily, open source apps like
SystemRescueCD, dd, Partedmagic, BackTrack, Security Tools Distribution,
Helix, and TestDisk can help recover important data and bring dead
systems back to life.
By Serdar Yegulalp, InformationWeek, June 24, 2008
At some point, it hits all of us. One day you're chugging merrily along,
and then you're staring at a machine that won't boot -- a machine that
just happens to contain everything of importance to you. While some
degree of disaster is inevitable, it's how you cope with it that counts.
While computers and file systems get more robust with each passing year,
there's always going to be room for disaster recovery techniques. With
open source solutions to help you recover from such messes, you're not
tied to a proprietary product's costs or licensing agreements, and if
you're so inclined you can rework the source code to meet your own needs.
I'm going to explore various ways you can use open source solutions
to recover data, bring dead systems back to life, and save your bacon
in general. Many of the solutions described here run cross-platform
(Lin/Win/Mac), but some of them are *NIX-only and will be described
as such.
No two data disasters unfold in quite the same way. To that end, you'll
want to take the appropriate recovery action depending on what went wrong.
* Simple recovery. This is the sort of recovery that you employ
for files that have been freshly deleted -- typically by mistake,
since accidental data deletion is the single biggest reason why
files go missing. Most of the time your local "undelete" function
or recycle bin / trash repository will keep you from doing anything
really stupid, but sometimes that's of no use: for instance, if
you forced a hard delete, or if you deleted a file from a volume
that has no recovery functions enabled. Generally, you can scan
the volume in question offline with a tool designed to look for and
recover specific types of files (see PhotoRec later in this article),
and pick up the pieces that way.
[Image: The BackTrack live CD distribution is packed with recovery tools,
including the powerful Autopsy toolset.]
* Complex recovery. This encompasses things like whole partitions
or boot loaders going missing, along with everything on that
partition. There are two basic approaches you can use here: recover
what data you can from the partition, without trying to reconstruct
the partition information per se; or attempt to reconstruct the
partition's metadata and remount the partition normally.
The first option is easier, although if you're dealing with an
operating system partition, odds are the files you recover from
there probably can't be cobbled back together into a bootable
system. The second option will give you a more complete recovery,
but that'll depend on how much of the actual volume data is
intact. Fortunately, there are ways to recover files even from
partly formatted volumes, as I'll discuss later on.
* Removable storage or optical media. I've had more than my fair share
of CDs or DVDs end up unreadable due to damage or age, but I've been
lucky enough to recover what I needed from them. Most tools work fine
on removable drives or optical media, although in the case of CDs
and DVDs, one important thing to keep in mind is the drive you're
using to read them. If you have access to more than one CD/DVD drive,
try creating images using different drives. You may find that one
has better error correction or jitter tolerance than the others, and
can recover data that other drives might skip over entirely.
Performing the recovery from within Linux rather than Windows may
also yield different results.
* Encrypted media recovery. This is where you say your prayers. If
you're trying to recover something -- like, say, the encryption key --
from an encrypted volume, unless you used a laughably weak encryption
algorithm in the first place (and why were you doing that?), you're
about as likely to get that data back as a falling apple is likely
to evade the law of gravity. Encrypted volumes should be treated
with double the care of standard volumes: back up the volume headers
(encrypted file systems typically have a provision for doing this)
and keep them in a safe place, also encrypted.
In all cases, whenever you start performing recovery, make sure that any
media you mount to recover from is mounted as read-only. This minimizes
the chances of the distro or OS mounting and performing read/write
operations on a drive that's "fragile" (i.e., holding data that needs to
be recovered). Also do as little work with the original media as possible:
make an image file from the media, store it on another drive you know
is good, and then work from the image.
Linux Distros Specifically For Recovery
A recovery-specific Linux distribution is the simplest place to start
if you have a big recovery job ahead of you. You get a great many tools
in one place, along with at least some degree of organization for them.
Many "tiny" distributions like Puppy Linux or DSL (two of my personal
favorites) work nicely for this sort of thing, and are fine for just
mounting a volume and copying files out by hand without doing anything
special. That said, the technically savvy may be better off with a distro
that has as many recovery-specific tools as possible built in.
The Gentoo-based SystemRescueCD distribution, for instance, packs a broad
gamut of tools into a single 200-MB .ISO file. Boot it from a CD or USB
drive and you can perform recovery functions either from a command line
or an optional X desktop. SRCD does require some foreknowledge of Linux,
though; if you're not comfortable doing things like manually mounting
volumes from the command line, you might find yourself somewhat at
sea. That said, it's possible to accomplish just about anything if you
don't mind a bit of a learning curve.
Another recovery distribution that's a touch more user-friendly is
Partedmagic. It features many of the same tools, but boots directly into
an X desktop and provides graphical user interface access to some of the
most common and powerful programs. Again, at least some knowledge of
Linux is helpful, even if it's only basic techniques such as mounting
or unmounting file systems, but less sophisticated users can probably
[Image: Helix offers a markedly different interface than BackTrack or STD.]
Both distributions also give you the freedom to run the whole gamut of
Linux applications out there, connect to the Internet if needed, and do
most anything else that you might care to do. (They don't come with as
broad a range of programs as most desktop distros, but, if you're so
inclined, it's not hard to add applications of your choice or perform
other customizations on the disk image.)
Experts who aren't daunted by the command line and want to get the full
range of tools available can go for one of various live CD distributions
compiled for performing forensic investigations of various kinds (with
data recovery being one of several possible functions).
I liked BackTrack and the Knoppix-based STD (Security Tools Distribution),
which also crams in a good many general security-oriented tools on top
of a roster of forensic / data-recovery apps.
Another Knoppix re-spin for forensic / recovery work is Helix, which by
default runs in what could be called "paranoid mode": it will not mount
any file systems unless specifically commanded to do so. Since each one
of these is a live CD, give each one a try and see which one you're most
comfortable with out of the box.
Dd And Its Variants
One common command-line data recovery tool from the Linux (or, rather,
*NIX) world is dd, which can be used to generate an image file from any
file, partition, or whole device. This is useful if you are dealing with a
file system or device that's damaged in some way and you want to perform
your recovery operations on a backup copy of the data rather than the
original.
The best way to describe how dd works is to give an example of it in
action. The Paulski Pages Web site shows it being used to perform data
recovery from a physically failing drive. Debian Admin offers another
walkthrough, written for Debian-based Linux distributions, but the
basic concepts can be used more or less universally. The user copies
the file system to an image file, mounts the image as a file system,
and then performs recovery operations from the image.
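To make the copy-to-an-image step concrete, here is a minimal Python
sketch of the idea behind dd-style imaging. It is not dd itself, and the
function name image_device and its parameters are made up for
illustration. It reads the source block by block and zero-fills anything
it can't read, which is roughly what dd does when given conv=noerror,sync.

```python
import os

def image_device(src_path, dst_path, block_size=4096):
    """Copy src_path to dst_path block by block (hypothetical helper).

    Unreadable blocks are replaced with zeros instead of aborting the copy.
    Returns the list of byte offsets that could not be read.
    """
    bad_offsets = []
    flags = os.O_RDONLY | getattr(os, "O_BINARY", 0)  # never open the source for writing
    src = os.open(src_path, flags)
    try:
        # Seeking to the end gives the size of a regular file or a Linux block device.
        total = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < total:
                want = min(block_size, total - offset)
                try:
                    os.lseek(src, offset, os.SEEK_SET)
                    chunk = os.read(src, want)
                except OSError:
                    chunk = b""
                    bad_offsets.append(offset)
                # Pad short or failed reads so offsets in the image match the source.
                dst.write(chunk.ljust(want, b"\0"))
                offset += want
    finally:
        os.close(src)
    return bad_offsets
```

A call such as image_device("/dev/sdb1", "/mnt/good/sdb1.img") would need
read permission on the device; both paths here are only examples. A real
rescue tool also retries and shrinks the block size around bad areas,
which this sketch does not attempt.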
The "stock" version of dd comes with just about any Linux distro, but
you may want instead to use a variant named dd_rescue, which sports
some slightly different behaviors. You can also use dd_rhelp on top of
it, which is a utility that makes using dd_rescue all the easier. (Yet
another program called ddrescue [note the lack of the underscore] also
exists, with again a slightly different set of options.)
Note that when you attempt to make an image using one of these programs,
always ensure the target for the image has at least as much free space
as the full capacity of the original device. Also, dd will overwrite
whatever file or device it is given as its output without asking, so get
familiar with the proper use of the program before you try to use it for
anything!
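The free-space check is easy to automate before an imaging run starts.
The following is a small Python sketch, assuming the source is a regular
file or a Linux block device; the helper names source_size and
has_room_for_image are invented for this example.

```python
import os
import shutil

def source_size(path):
    """Size in bytes of a regular file or block device, found by seeking to its end."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)

def has_room_for_image(src_path, target_dir):
    """True if the file system holding target_dir has at least as much
    free space as the source is large."""
    return shutil.disk_usage(target_dir).free >= source_size(src_path)
```

Running has_room_for_image("/dev/sdb", "/mnt/good") before imaging (both
paths are placeholders) avoids finding out halfway through that the
target drive is too small.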
[Image: You can use TestDisk to spelunk a damaged disk and find now-missing
partitions.]
If using dd seems entirely too difficult, you may want to go instead
with something a little more interactive. Consider TestDisk, which can
perform dd-like actions and is also designed for many other whole-disk
recovery operations -- for instance, repairing a partition lost due to
a damaged partition table or manual deletion.
TestDisk is menu-driven and comes with detailed step-by-step instructions
for running it on a variety of platforms and file systems, although
you'll need to have some understanding of disk structures (partitions,
allocation tables, etc.) before you dive in and begin swimming. You
might want to have a guru riding shotgun with you.
Dd images can be mounted directly as file systems within Linux by using
the mount command. However, if the image contains a file system that
you want to boot directly, there are ways to boot that image within a
virtual machine. This is useful if you're worried about, for instance,
a virus on the imaged drive, or want to perform some other kind of work
that requires a live environment but don't have a machine to spare.
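For the plain mounting case, the sketch below builds a read-only loop
mount command in Python. It assumes a Linux system whose mount command
supports the ro, loop, and offset options; the function name and the
paths in the usage note are hypothetical. The offset parameter is for
whole-disk images, where the file system starts partway into the file.

```python
def loop_mount_command(image_path, mount_point, offset=0):
    """Return the argument list for mounting a disk image read-only
    through a loop device (hypothetical helper; it runs nothing itself)."""
    options = ["ro", "loop"]          # ro keeps the image from being modified
    if offset:
        # Byte offset of the partition inside a whole-disk image.
        options.append("offset=%d" % offset)
    return ["mount", "-o", ",".join(options), image_path, mount_point]
```

The resulting list can be handed to subprocess.run(cmd, check=True) as
root, for example with loop_mount_command("sdb1.img", "/mnt/rescue").
Returning the command instead of running it makes it easy to inspect
before anything touches the image.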
The Java-based Live View tool lets you turn any dd image into a VMware
virtual machine, and also caches any changes made to the image so that
the original image is not touched. Note that you need to have one of
the VMware products (like VMware Player) installed for Live View to
work.
File Carving
I mentioned before that it is possible to recover files from file
systems where the partition information has been damaged or where the
volume itself has been partially reformatted. This can be done through a
technique called "data carving" or "file carving," where a program hunts
for specific kinds of files by looking for patterns common to certain
file types. One of the best things about this technique is that it's
almost entirely automatic: all you have to do is point it at a partition
-- or the place where a partition was -- select a place to restore the
files to, and let the program do the heavy lifting.
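To show what "looking for patterns" means in practice, here is a minimal
Python sketch of signature-based carving for JPEG files. It is an
illustration of the general technique, not how any particular tool is
implemented: it looks for the JPEG start-of-image bytes (FF D8 FF) and
the next end-of-image bytes (FF D9) in a raw image file. The function
name, the output naming scheme, and the max_size cap are assumptions
made for this example.

```python
import os

JPEG_HEADER = b"\xff\xd8\xff"   # start-of-image marker plus first byte of the next marker
JPEG_FOOTER = b"\xff\xd9"       # end-of-image marker

def carve_jpegs(image_path, out_dir, max_size=20 * 1024 * 1024):
    """Write every header..footer span found in image_path to out_dir.

    Returns the list of files written. Reads the whole image into memory,
    which is fine for a sketch but not for a large disk.
    """
    with open(image_path, "rb") as f:
        data = f.read()
    os.makedirs(out_dir, exist_ok=True)
    carved = []
    pos = 0
    while True:
        start = data.find(JPEG_HEADER, pos)
        if start == -1:
            break
        # Only look for the footer within max_size bytes of the header.
        end = data.find(JPEG_FOOTER, start + len(JPEG_HEADER), start + max_size)
        if end == -1:
            pos = start + 1           # no footer in range: keep scanning past this header
            continue
        end += len(JPEG_FOOTER)
        out_path = os.path.join(out_dir, "carved_%08d.jpg" % start)
        with open(out_path, "wb") as out:
            out.write(data[start:end])
        carved.append(out_path)
        pos = end
    return carved
```

This naive version assumes each file is stored contiguously and stops at
the first footer it sees, so a photo with an embedded thumbnail or a
fragmented file will come out truncated. Handling those cases is where
real carving tools earn their keep.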
The authors of TestDisk have created an excellent file-carving tool
called PhotoRec, which recovers many common file formats from pretty
much any type of media.
This is about as essential a standalone file-recovery tool as you're going
to get. I tried it out on a camera card I'd written off as wiped out after
a recent trip, and managed to recover literally everything from the card
-- both pictures and video. It wasn't able to recover some of the actual
filenames, but the files themselves were all perfectly readable. Like
TestDisk, by default it will attempt to copy files somewhere other than
the media being rescued to avoid further damage.
The default settings for PhotoRec generally work fine, but if you need
finer-grained control there are a few options you can set. "Paranoid
mode," normally disabled, recovers everything including partially
corrupted files; if you turn this on, you'll get more data recovered,
but the recovery process may take much longer. (Brew some coffee.) In
the same vein, "Keep corrupted files" will recover files that are not
fully readable in the hopes that the user can salvage something further
from them, perhaps with a hex editor or another tool.
[Image: These files recovered with PhotoRec have new names, but the metadata
tells you they are audio files.]
Note that most of the time, files recovered with PhotoRec will not have
their original names, but internal metadata (e.g., MP3 tags or EXIF data)
will still be available. Also note that if you're looking for one very
specific kind of file in a relatively small file system, you can use the
program's internal options to narrow down the search and not waste time
recovering everything under the sun.
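Since recovered files lose their names, it can help to sort them by what
their contents say they are. Below is a small Python sketch that guesses
a file's type from its leading "magic" bytes. The signature table is a
short, deliberately incomplete sample, and the function names are
invented for this example; it is not part of PhotoRec.

```python
import os

# A few well-known leading byte signatures (incomplete on purpose).
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip"),   # also matches ZIP-based formats such as .odt
    (b"ID3", "mp3"),          # MP3 that begins with an ID3v2 tag
]

def guess_type(path):
    """Return a type label based on the file's first bytes, or None if unknown."""
    with open(path, "rb") as f:
        head = f.read(16)
    for magic, kind in SIGNATURES:
        if head.startswith(magic):
            return kind
    return None

def group_by_type(directory):
    """Map each type label (or None) to the sorted file names found in directory."""
    groups = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            groups.setdefault(guess_type(path), []).append(name)
    return groups
```

Pointing group_by_type at a recovery output folder gives a quick
inventory, after which a tag reader or EXIF tool can be used to restore
meaningful names.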
TestDisk and PhotoRec are also both included by default with the
Partedmagic rescue disc, so that's one of the easiest ways to get your
hands on them and put them to work -- but you can also download them as
standalone programs and use them that way. Both can also be integrated
into the BartPE rescue disc if you use it; my trick has been to include
them with my PortableApps installation. It's also possible to copy them
to a removable drive, boot a Vista installation DVD (if you have one),
go to the System Recovery command line, and then run the programs from
there.
Advanced Data-Carving Applications
TestDisk and PhotoRec are only the tip of the iceberg, but more advanced
programs are typically intended for full-blown forensics work and not
The granddaddy of all data-carving programs is probably Foremost,
originally developed for the United States Air Force's Office of Special
Investigations. It's now been released into the public domain, so it can
be used pretty much anywhere and even re-used in other programs. Note
that binaries are not readily available for Foremost; you generally
need to compile the program from source to make it workable. Some Linux
distributions (like Ubuntu Feisty) have a precompiled version of Foremost
available in the software repository, which makes it easy to download
In the same vein is Scalpel, a rewrite of the 0.69 version of Foremost
-- it's faster, makes better use of memory, and has a number of other
handy functions for more advanced file retrieval. It also is not readily
available in a binary edition, and must be compiled from source.
[Image: Output from The Sleuth Kit's fsstat tool details information about a
file system image dumped from a damaged drive.]
One extremely powerful collection of tools that can be run on a variety
of platforms (*NIX, BSD, and Windows using the CYGWIN library) is the
Sleuth Kit. Like Foremost and Scalpel, it can search for deleted files
based on hashes or signatures, but it also sports a tremendous range of
other functions. TSK nominally comes as a set of command-line tools,
which you can use if you're comfortable with that, but you can also
download a graphical interface called Autopsy that provides you with a
powerful front-end for the tools in the kit.
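The hash-based side of that search is simple to picture. The Python
sketch below walks a folder of recovered files and reports any whose
SHA-256 digest appears in a set of known digests. It only illustrates
the general idea and does not use or imitate the Sleuth Kit's own
interfaces; the function names and the choice of SHA-256 are assumptions
for this example.

```python
import hashlib
import os

def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA-256 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_known_files(root, known_hashes):
    """Return the sorted paths under root whose digest is in known_hashes."""
    matches = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if sha256_of(path) in known_hashes:
                matches.append(path)
    return sorted(matches)
```

If you kept checksums of important files before the disaster, a lookup
like this tells you quickly which of them made it through recovery
intact.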
One batch of data-carving and -recovery tools that is not open source
but not commercial either, and worth mentioning as a postscript, is the
PartitionSupport.com utilities, which run on 32-bit Windows and provide
a host of functions for recovering data from damaged file systems of
Among the apps available -- just to give you an idea of what they can
do -- are FindJPG and FindDoc, which comb through a disk, cylinder by
cylinder, and reconstruct all .JPG or .DOC files to the current directory,
much like PhotoRec. Again, source code is not available for these tools,
but they're easy to work with and freely available for individual use,
which makes them fine as a personal or in-house recovery toolkit.
-- http://www.mrbrklyn.com - Interesting Stuff http://www.nylxs.com -
Leadership Development in Free Software
So many immigrant groups have swept through our town that Brooklyn, like
Atlantis, reaches mythological proportions in the mind of the world -
RI Safir 1998
http://fairuse.nylxs.com DRM is THEFT - We are the STAKEHOLDERS -
RI Safir 2002
"Yeah - I write Free Software...so SUE ME"
"The tremendous problem we face is that we are becoming sharecroppers
to our own cultural heritage -- we need the ability to participate in
our own society."
"> I'm an engineer. I choose the best tool for the job, politics be
damned.< You must be a stupid engineer then, because politics and
technology have been attached at the hip since the 1st dynasty in
Ancient Egypt. I guess you missed that one."