Useless Use Of dd

tl;dr: dd works for reading and writing disks, but it has no "low level I/O" capabilities that make it better suited to this than any other shell utility. Like cat, you should use it where it makes sense, e.g. to take advantage of its wide array of options, rather than try to ensure that all disk related commands begin and end with dd out of fear and superstition.

If you’ve ever used dd, you’ve probably used it to read or write disk images:

# Write myfile.iso to a USB drive
dd if=myfile.iso of=/dev/sdb bs=1M

Usage of dd in this context is so pervasive that it’s being hailed as the magic gatekeeper of raw devices. Want to read from a raw device? Use dd. Want to write to a raw device? Use dd.

This belief adds unnecessary complexity to simple commands. How do you combine dd with gzip? How do you use pv if the source is a raw device? How do you dd over ssh?

People cleverly find ways to insert dd at the front and end of pipelines. dd if=/dev/sda | gzip > image.gz, they say. dd if=/dev/sda | pv | dd of=/dev/sdb.

In both these cases, dd serves no real purpose. It’s purely a superstitious charm trying to ensure safe passage of the data. You can see how silly this is when you replace dd with the functionally equivalent cat: cat /dev/sda | pv | cat > /dev/sdb
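To see that the dd at the front adds nothing, here is a minimal sketch that runs the same compression pipeline with and without it. It assumes a regular temporary file standing in for /dev/sda, so it can be tried without root or a spare disk; the file names are made up for illustration.

```shell
# Stand-in for /dev/sda: 1 MiB of random bytes in a regular file.
disk=$(mktemp)
head -c 1048576 /dev/urandom > "$disk"

# With dd at the front of the pipeline.
dd if="$disk" 2>/dev/null | gzip > "$disk.dd.gz"

# Without dd: gzip reads the source directly through a redirection.
gzip < "$disk" > "$disk.plain.gz"

# Both images decompress to exactly the bytes of the source.
gunzip -c "$disk.dd.gz"    | cmp - "$disk" && echo "dd pipeline: identical"
gunzip -c "$disk.plain.gz" | cmp - "$disk" && echo "plain gzip:  identical"

# The temporary files are left in place for inspection; remove them when done.
```

The comparison is done on the decompressed data rather than on the .gz files themselves, since gzip headers may differ in metadata even when the payload is the same.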

The fact of the matter is, dd is not a disk writing tool. Neither “d” is for “disk”, “drive” or “device”. It does not support “low level” reading or writing. It has no special dominion over any kind of device whatsoever.

dd just reads and writes files.

On UNIX, the adage goes, everything is a file. This includes raw disks. Since raw disks are files, and dd can be used to copy files, dd can be used to copy raw disks.

But do you know what else can read and write files? Everything:

# Write myfile.iso to a USB drive
cp myfile.iso /dev/sdb

# Rip a cdrom to a .iso file
cat /dev/cdrom > myfile.iso

# Create a gzipped image
gzip -9 < /dev/sdb > /tmp/myimage.gz

dd uses the same interface these commands do, and is not any safer or more reliable.

dd can even end up doing a worse job. By specification, its default 512 byte block size has had to remain unchanged for decades. Today, this tiny size makes it CPU bound by default. A script that doesn’t specify a block size is very inefficient, and any script that picks the current optimal value may slowly become obsolete, or start obsolete if it’s copied from

Meanwhile, cat is free to choose whatever buffer size best serves a modern system, and the GNU cat buffer size has grown steadily over the years from 512 bytes in 1991 to 131072 bytes in 2014. src/ioblksize.h in the coreutils source code has benchmarks backing up this decision.
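For a rough feel of what the default block size costs, a sketch like the following pushes the same 64 MiB from /dev/zero to /dev/null twice, once in 512 byte blocks and once in 1 MiB blocks, and prints dd's own summary line for each. The helper name copy_with_bs and the sizes are arbitrary choices for illustration, and the numbers will vary by machine and dd implementation.

```shell
total=67108864   # 64 MiB

copy_with_bs() {
    # $1 = block size in bytes. dd reports its statistics on stderr,
    # so fold that into stdout and keep only the final summary line.
    dd if=/dev/zero of=/dev/null bs="$1" count=$((total / $1)) 2>&1 | tail -n 1
}

echo "bs=512:     $(copy_with_bs 512)"
echo "bs=1048576: $(copy_with_bs 1048576)"
```

Both runs move the same number of bytes; only the number of read and write calls differs (131072 of each versus 64), which is where any difference in elapsed time comes from.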

However, this does not mean that dd should be categorically shunned! The reason why people started using it in the first place is that it does exactly what it’s told: no more and no less.

If an alias specifies -a, cp might try to create a new block device instead of a copy of the file data. If using gzip without redirection, it may try to be helpful and skip the file for not being regular. Neither of them will write out a reassuring status during or after a copy.
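The cp behavior can be tried safely with a named pipe, which is an assumption made here so the sketch runs without root: a FIFO stands in for the block device, since cp -a treats both as special files to be recreated rather than read.

```shell
dir=$(mktemp -d)
mkfifo "$dir/pipe"            # stand-in for /dev/sdb

# -a asks cp to preserve the file type, so it recreates the special file
# itself instead of opening it and copying whatever data it would yield.
cp -a "$dir/pipe" "$dir/copy"

if [ -p "$dir/copy" ]; then
    echo "copy is a brand new FIFO, not a copy of any data"
fi
```

With a real block device and sufficient privileges, the analogous result would be a second device node pointing at the same disk, not an image of its contents.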

dd, meanwhile, has one job*: copy data from one place to another. It doesn’t care about files, safeguards or user convenience. It will not try to second guess your intent, based on trailing slashes or types of files.

However, when this is no longer a convenience, like when combining it with other tools that already read and write files, one should not feel guilty for leaving dd out entirely.

This is not to say I think dd is overrated! Au contraire! It’s one of my favorite Unix tools!

dd is the swiss army knife of the open, read, write and seek syscalls. It’s unique in its ability to issue seeks and reads of specific lengths, which enables a whole world of shell scripts that have no business being shell scripts. Want to simulate an lseek+execve? Use dd! Want to open a file with O_SYNC? Use dd! Want to read groups of three byte pixels from a PPM file? Use dd!
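As a small illustration of the PPM case, the sketch below builds a tiny binary PPM and uses dd to seek to one pixel and read exactly three bytes. The helper read_pixel is hypothetical, and the header length is passed in by hand rather than parsed, which only works because this file is constructed to have a known 11 byte header.

```shell
ppm=$(mktemp)
# 11 byte header: magic "P6", width 2, height 1, maximum channel value 255.
printf 'P6\n2 1\n255\n' > "$ppm"
# Two raw pixels: pure red (255,0,0), then pure green (0,255,0).
printf '\377\000\000\000\377\000' >> "$ppm"

read_pixel() {
    # $1 = file, $2 = header length in bytes, $3 = zero-based pixel index.
    # skip= seeks past the header and earlier pixels; count=3 reads one pixel.
    dd if="$1" bs=1 skip=$(($2 + $3 * 3)) count=3 2>/dev/null | od -An -tu1
}

read_pixel "$ppm" 11 0   # first pixel:  255 0 0
read_pixel "$ppm" 11 1   # second pixel: 0 255 0
```

od is only there to render the three bytes as decimal numbers; the exact spacing of its output depends on the od implementation.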

It’s a flexible, unique and useful tool, and I love it. My only issue is that, far too often, this great tool is being relegated to, and inappropriately hailed for, its most generic and least interesting capability: simply copying a file from start to finish.

* dd actually has two jobs: Convert and Copy. A post on comp.unix.misc (incorrectly) claimed that the intended name “cc” was taken by the C compiler, so the letters were shifted in the same way we ended up with a Window system called X. A more likely explanation is given in that thread, as pointed out by Paweł and Bruce in the comments: the name, syntax and purpose are almost identical to the JCL “Dataset Definition” command found in 1960s IBM mainframes.

24 thoughts on “Useless Use Of dd”

  1. # cat /dev/cdrom > myfile.iso

    Works

    # cat myfile.iso > /dev/cdrom

    Won’t – dd allows you to handle writes in various block sizes, so if a device can’t handle a one-byte write, cat could well end up writing your byte followed by zeros (and because of where the pointer now is, do it again for the next byte).

    Still, all your other points stand – nice piece.

  2. “If an alias specifies -a, cp might try to create a new block device rather than…”

    The above line is not clear, could you please elaborate?

    Also, if dd has only 1 job (convert and copy) why can’t it be used to tar up a directory and its contents into a regular file (similar to what tar does)?

    1. It’s clear if you don’t chop off the ‘rather than…’ bit – you’re making the comparison incomplete!

  3. I had no idea! I was just blindly using dd. Thank you for this wonderful article.

  4. DD is actually a pun on Data Definition in IBM’s JCL language, which contemporary computer users would quite possibly be aware of.

  5. You are absolutely right. I heard that dd was originally intended to be called “carbon copy”, but as you said, cc was already occupied.

  6. The name dd likely comes from JCL SYSIN DD command, which was used for punch cards.
    Also – dd(1) is low level in that it can handle hardware read errors. Look at the source and you can see it re-tries when an error occurs.

  7. @Nobody
    ># cat /dev/cdrom > myfile.iso
    >Works
    ># cat myfile.iso > /dev/cdrom
    >Won’t

    Correct. And you’ll see the exact same behavior if you try with `dd`, because they’re both the same.

    1. Tell us you don’t understand DD without telling us you don’t understand DD. Noob.

  8. You are wrong, dd has “oflag=direct” and “conv=fdatasync” options, which make a real difference when working with USB drives.

  9. > @void:
    > dd has “oflag=direct” and “conv=fdatasync” options, which make a real difference when working with USB drives.

    Which one? ‘cat /path/to/iso-or-img >/dev/sdx’ works fine for me for years. Sure it’s not placebo?

  10. The article doesn’t tribute the tool for its usefulness. This comes from the fact that the author didn’t mention all the other command options besides ifile and ofile. Try doing the same with cat!

  11. Why is this stupid article being resurrected on Hacker News 7 years later?

    1. Simply because I found it really interesting and thought others would think the same.

      In the context of Hacker News I don’t think resurrected is the right word. When enough time has passed, “reposts” of classic articles are allowed and even seem to be encouraged. It exposes the material to new people.

  12. # Rip a cdrom to a .iso file
    cat /dev/cdrom > myfile.iso

    To be clear, the file “myfile.iso” will not necessarily conform to the ISO specs.
