Compressed Drive Imaging over a Network
Warning/Disclaimer: You should always back up your data before doing any operations that can put your data at risk such as imaging.
Migrating data between computer drives can be a hassle. Sometimes there aren’t enough spare cables inside your computer to connect both drives at the same time, and often you want to copy data to a drive in a completely separate computer. This article shows how to efficiently copy an entire drive image from one computer running Linux to another over the network.
Acquiring drive information from the host machine (source)
Let’s assume that you want to copy the first hard drive from your host computer to another computer. Let’s first look at the drive that we wish to image.
kain@slickbox:~> mount
...
/dev/sdb1 on /windows/E type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/sda2 on /windows/D type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/sda1 on /windows/C type ntfs (ro,noexec,nosuid,nodev,gid=100,umask=0002,nls=utf8)
In this example the first hard drive, /dev/sda, contains two partitions (sda1 and sda2). We should also check the drive size.
kain@slickbox:~> dmesg|grep sda
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
....
kain@slickbox:~>
We now know that the source drive is roughly 80 GB in size.
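The size that dmesg reports can be double-checked from the sector count. The following is a minimal sketch using the numbers from the output above. The variable names are only illustrative, and it assumes a shell with 64-bit arithmetic (true of bash on current systems).

```shell
# Values taken from the dmesg line above
sectors=156301488
sector_size=512

# Total capacity in bytes, then in decimal megabytes (10^6 bytes)
bytes=$((sectors * sector_size))
megabytes=$((bytes / 1000000))

echo "$bytes bytes, about $megabytes MB"
```

This prints 80026361856 bytes, about 80026 MB, which agrees with the figure dmesg gives in parentheses.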
Setting up the destination machine
The destination machine should have a designated destination drive, at least 80 GB in size, that you fully intend to clobber with the data from the host machine’s drive. Boot this computer into Linux from a live CD. There are many to choose from, including OpenSUSE’s live desktop CD. As you’ll see later, it’s important to choose a system that provides the dd and netcat commands, plus either gzip or bzip2.
Let’s assume that the drive you wish to image to on the destination machine is /dev/hda.
kain@newbox:~/> dmesg|grep hda
...
hda: Maxtor 7040 AT-TTT, 120015MB w/1024kB Cache, CHS=932/5/17
...
It’s also important to make sure that no partition of this drive is mounted during the imaging process (i.e. mount|grep hda should print nothing).
kain@newbox:~/> mount|grep hda
kain@newbox:~/>
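If you would rather have a script refuse to continue than inspect the output by eye, the same check can be wrapped in a small function. This is only a sketch: drive_is_mounted is a hypothetical helper name, and it assumes each line of the mount listing starts with the device path, as in the output shown earlier.

```shell
# Succeeds (exit status 0) if the mount listing on standard input
# contains the drive itself or any of its partitions.
# $1 is a bare device name such as hda (hypothetical helper).
drive_is_mounted() {
    drive=$1
    grep -q "^/dev/$drive"
}

# Usage sketch: feed it the live mount table
# if mount | drive_is_mounted hda; then
#     echo "hda is still mounted, unmount it first"
# fi
```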
Next, we need the network address of this computer so that the host machine can connect to it later.
kain@newbox:~> su -c "ifconfig"
Password:
eth1      Link encap:Ethernet  HWaddr 00:13:02:20:9F:70
          inet addr:192.168.2.4  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::213:2ff:fe20:9f70/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:6745 errors:13187 dropped:14296 overruns:0 frame:0
          TX packets:1732 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:9185101 (8.7 Mb)  TX bytes:45909990 (43.7 Mb)
          Interrupt:18 Memory:8c000000-8c000fff
kain@newbox:~>
Note that the correct network address depends on which network device you are using. In this example, the destination machine is connected via its wireless card, “eth1”, and its IP address is 192.168.2.4.
Now you’re ready to set up the receive-and-write end of the network pipeline, so that bytes read over the network are immediately decompressed and written to the disk. It may seem a little backwards to set up the receive process before the send process has even started. However, this is necessary because the destination machine runs a server that must be started before the other machine can connect to it and send its data.
The command line for the receive-and-write process must be run as root and is as follows:
netcat -l -p 12345 |bzip2 -d | dd of=/dev/hda
The -l switch tells netcat to listen for an incoming connection. The -p switch specifies the port to listen on, which is left to your discretion. The network input, when received, is piped into the bzip2 decompressor, which in turn sends its output to the dd imaging utility and thus to the hard drive itself. The compression saves network bandwidth and is entirely optional. You are also welcome to use gzip in place of bzip2 if bzip2 is unavailable or undesired, as long as both machines use the same compressor.
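As an illustration of the gzip substitution, the receive pipeline could be wrapped as shown here. This is a sketch rather than a tested recipe for every system: receive_image is a hypothetical helper name, and the netcat flags follow the traditional syntax used in this article (some netcat variants are installed as nc and take the port differently).

```shell
# Hypothetical wrapper around the receive pipeline, with gzip swapped in for bzip2.
# $1 = port to listen on, $2 = device (or file) to overwrite.
receive_image() {
    netcat -l -p "$1" | gzip -d | dd of="$2"
}

# Usage sketch (as root; destroys everything on the target drive):
# receive_image 12345 /dev/hda
```

If you go this route, the sending side has to compress with gzip -c instead of bzip2 -c, otherwise the decompressor will reject the stream.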
A graphical overview of the receive pipeline can be seen on the right.
Setting up the host machine
Now that the destination machine is ready, it’s time to send the drive over the network from the host machine. This is done with the one-liner below:
bzip2 -c /dev/sda | netcat 192.168.2.4 12345
Here /dev/sda is, again, the drive you wish to image, 192.168.2.4 is the address of the destination machine obtained above, and 12345 is the port number chosen for the destination machine’s server.
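Before pointing these commands at real drives, it can be reassuring to rehearse the compress, decompress, and write stages on scratch files. The sketch below leaves netcat out entirely and uses temporary files in place of both drives. The file names come from mktemp, and the fallback to gzip is an assumption for systems without bzip2.

```shell
# Dry run of the chain on scratch files: no network and no real drive involved.
# Prefer bzip2 as in the article; fall back to gzip if it is not installed.
if command -v bzip2 >/dev/null 2>&1; then
    compressor=bzip2
else
    compressor=gzip
fi

src=$(mktemp)   # stands in for the source drive
dst=$(mktemp)   # stands in for the destination drive

# Fill the fake source with 1 MiB of zeros
dd if=/dev/zero of="$src" bs=1024 count=1024 2>/dev/null

# Sender stage | receiver stages (netcat would sit between the first two)
"$compressor" -c "$src" | "$compressor" -d | dd of="$dst" 2>/dev/null

# cmp -s is silent and exits 0 only if the files are byte-for-byte identical
if cmp -s "$src" "$dst"; then
    result="identical"
else
    result="different"
fi
echo "copy is $result"

rm -f "$src" "$dst"
```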
This process will take some time. Feel free to use a system monitor such as gkrellm on the host machine to watch the network traffic rate and roughly estimate how long it will take. If your source drive is partly empty, you’ll get a good compression ratio and save heavily on network traffic (at the expense of CPU time, however).
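To get a feel for how much empty space can shrink, you can compress a block of zeros and a block of random bytes and compare the results. This is a rough sketch, not a benchmark. It uses gzip only because it is almost always installed, and the variable names are illustrative. Keep in mind that space freed by deleted files may still hold old data, so it will not necessarily compress as well as never-used, zero-filled space.

```shell
# Compress 1 MiB of zeros and 1 MiB of random bytes, then count the output bytes.
# The $(( ... )) wrapper strips any padding that wc adds on some systems.
zero_size=$(( $(dd if=/dev/zero bs=1024 count=1024 2>/dev/null | gzip -c | wc -c) ))
random_size=$(( $(dd if=/dev/urandom bs=1024 count=1024 2>/dev/null | gzip -c | wc -c) ))

echo "zeros:  1048576 bytes -> $zero_size bytes"
echo "random: 1048576 bytes -> $random_size bytes"
```

The zero-filled block should shrink to a tiny fraction of its size, while the random block should barely shrink at all.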
A graphical overview of the send pipeline can be seen on the right.
Wrapping it up
When finished, you should see something similar to the following on the destination machine’s command line:
netcat -l -p 12345 |bzip2 -d | dd of=/dev/hda
5777+1 records in
5777+1 records out
2957828 bytes (3.0 MB) copied, 110.887 seconds, 26.7 kB/s
Your numbers, of course, will be a lot higher. The send command on the host machine should also have quietly returned to the prompt. At this point, you should be able to access the copied drive on the destination machine. If the destination drive is bigger than the source, you’ll want to create a new partition out of the unallocated space at the end of the drive. Partitioning tools such as cfdisk can do this for you.
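If you want some extra assurance that the copy arrived intact, one option is to checksum the same number of sectors on both machines and compare the results. The sketch below is an optional extra, not part of the procedure above. image_checksum is a hypothetical helper name, and it assumes md5sum from GNU coreutils is available. The sector count is limited to the size of the source drive because the destination drive is larger.

```shell
# Hypothetical helper: MD5 checksum of the first N 512-byte sectors of a drive or file.
# $1 = device or file, $2 = number of sectors to read.
image_checksum() {
    dd if="$1" bs=512 count="$2" 2>/dev/null | md5sum | cut -d ' ' -f 1
}

# Usage sketch, run as root on each machine, then compare the two results:
# image_checksum /dev/sda 156301488    # on the host
# image_checksum /dev/hda 156301488    # on the destination
```

The two checksums can only be expected to match if nothing wrote to the source drive during or after the transfer, so a partition mounted read-write on the host may cause a mismatch even when the copy itself went fine.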