Is it terminal?

Applications often behave differently in subtle ways when stdout is not a terminal. Most of the time, this is done so smoothly that the user isn’t even aware of it.

When it works like magic

Consider ls:

vidar@vidarholen ~/src $ ls
PyYAML-3.09      bsd-games-2.17       nltk-2.0b9
alsa-lib-1.0.23  libsamplerate-0.1.7  pulseaudio-0.9.21
bash-4.0         linux                tmp
bitlbee-1.2.8    linux-2.6.32.8
vidar@vidarholen ~/src $

Now, say we want a list of projects in our ~/src dir, ignoring version numbers:

# For novelty purposes only; parsing ls is a bad idea
vidar@vidarholen ~/src $ ls | sed -n 's/-[^-]*$//p'
PyYAML
alsa-lib
bash
bitlbee
bsd-games
libsamplerate
linux
nltk
pulseaudio
vidar@vidarholen ~/src $

Piece of cake, right?

But think about the magic that actually happened there: We started out with four lines of coloured text, ran them through sed to search and replace on each line, and ended up with nine lines of uncoloured text.

How did sed filter the colours? How did it put each filename on a separate line, when the same does not happen for echo "foo bar" | sed ..?

The answer, of course, is that it didn’t. ls detected that output wasn’t a terminal and altered its output accordingly.

When outputting to a terminal, you can be fairly sure that the user will be reading it directly, so you can make it as pretty and unparsable as you want. When output is not a terminal, it’s likely going to some program or file where pretty output will just complicate things.

Life without magic

Try the previous example with ls -C --color=always instead of just ls, and see how different life would have been without this terminal detection. You can also try this with xargs, to see how colours could break things:

vidar@vidarholen ~/src $ ls -C --color=always | xargs ls -ld
ls: cannot access PyYAML-3.09: No such file or directory
ls: cannot access alsa-lib-1.0.23: No such file or directory
...

The directories obviously exist, but the ANSI escape codes that give them that cute colour also prevent utilities from working with them. For additional fun, copy-pasting this error message from a terminal strips the colours, so anyone you reported it to would be quite stumped.

Magic efficiency tricks

It’s not all about making output pretty or parsable depending on the situation. Read/write syscalls are notoriously expensive; reading anything less than about 4k bytes at a time will make disk reads CPU bound.

glibc knows this, and will alter write buffering depending on the context. If the output is a terminal, a user is probably watching and waiting for it, so it will flush output immediately. If it’s a file, it’s better to buffer it up for efficiency:


vidar@kelvin ~ $ strace -e write -o log grep God text/bible12.txt
01:001:001 In the beginning God created the heaven and the earth.
...
vidar@kelvin ~ $ wc -l log
3948 log

In other words, grep wrote about god 3948 times (insert your own bible forum jokes).


vidar@kelvin ~ $ strace -e write -o log grep God text/bible12.txt > tmp
vidar@kelvin ~ $ wc -l log
64 log

This time, grep produced exactly the same output, but wrote to a file instead. This resulted in 64 writes – under 2% of the write calls from the interactive run!

Spells of confusion

Sometimes magic can confuse and astound. What if output is kinda like a terminal, only not?

ls -l gives the user pretty colours. ls -l | more does not. The reason is not at all obvious for users who just consider “ | more” a way to scroll through output. But it works, even if it’s not as pretty as we’d like.

Here’s a much more confusing example (just go along with the simplified grep):

# Show apache traffic (works)
cat access.log

# Show 404 errors with line numbers (works)
cat access.log | grep 404 | nl

Basic stuff.

# Show apache traffic in realtime (works)
tail -f access.log

# Show 404 errors with line numbers in realtime (FAILS)
tail -f access.log | grep 404 | nl

While the logic is the same as before, our realtime error log doesn’t show anything!

Why? Because grep’s output isn’t a terminal, so it will buffer up about 4k worth of data before writing it all in one go. In the meantime, the command will just seem to hang for no apparent reason!

(Observant readers might ask, “Isn’t tail buffering as well?” It might be, or it might not; it depends on your version and distro patches.)

Mastering magic

Ok, so what can we do to take charge of these useful peculiarities?

Many apps have flags for this, though none of them are POSIX.

GNU ls lets you specify -C for columned mode, and --color=always for colours, regardless of the nature of stdout.

sed has -u, grep has --line-buffered, and awk has an fflush function. tail, if yours buffers at all, has had a -u since about 2008, which as of now isn’t in Debian stable.
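
For example, the realtime pipeline that failed earlier starts working the moment grep is told to flush each line (nl’s stdout is the terminal, so it gets flushed line by line anyway):

# Show 404 errors with line numbers in realtime (works)
tail -f access.log | grep --line-buffered 404 | nl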

If your app doesn’t have such an option, there’s always unbuffer from Expect, the interactive tool scripting package.

unbuffer starts applications within its own pseudo-tty, much like how xterm and sshd do it. This usually tricks the application into not buffering (and perhaps into prettifying its output).
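
A quick way to see it in action, assuming unbuffer is installed:

# ls sees a pseudo-terminal, so columns (and with --color=auto, colours)
# come back even though the output really goes through a pipe
unbuffer ls --color=auto | cat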

Obviously, this depends on the app using standard C stdio, or on it checking for a terminal itself. Apps can unintentionally be written to avoid this, like when setting Java’s System.out to a BufferedOutputStream.

And finally… how can you create such behaviour yourself?

if [[ -t 1 ]] #if stdout is a terminal
then
    tput setaf 3 #Set foreground to yellow
fi
echo "Pure gold"

RAM-loadable Linux on a stick

I wanted to play some SNES games with a friend on one of a dozen public Windows boxes, but I didn’t want to start downloading ROMs and installing zsnes on them. The simple solution was to just make a bootable USB memory stick with Ubuntu and boot from that on whichever box was available at the time.

The boxes turned out to have more horsepower than I had assumed, and conveniently came with Linux-friendly Intel GPUs, so I wanted to try out OpenArena. Of course, then you need multiple Windows boxes, and I just had one memory stick. Time to make it load and run entirely from memory, so the memory stick can be unplugged and used to boot other boxes.

Thanks to the fantastic initramfs mechanism, the best Linux feature since UUID partition selection (initrd wasn’t nearly as sweet), this is very easy to do, even when the distro doesn’t support it. Here are some hints on how to do it:

  1. Install on a memory stick. These days, you can conveniently do this in a VM and still expect it to boot on a PC: kvm -hda /dev/sdb -cdrom somedistro.iso -boot d -m 2200 -net nic -net user -k en-us. A minimal install is preferable; loading GNOME from a slow memory stick just to cover it with OpenArena is a waste.
  2. Ubuntu installs GRUB with UUID partitions, but Debian does not, so in that case you have to update menu.lst: replace root=/dev/hda1 with root=UUID=<uuid from tune2fs -l here>
  3. Debian has a fancy system for adding initramfs hooks (see /etc/initramfs-tools) that will survive kernel upgrades, but for generality (and not laziness at all, no siree), we’ll do it the hacked-up manual way: Make a new directory and unpack the initramfs into it: gzip -d < /boot/initrd.img-2.6.26-2-686 | cpio -i
  4. vim init. Find the place where the root fs has just been mounted, and add some code to mount --move it, mount a tmpfs big enough to hold all the files, copy all the files from the real root and then unmount it:

    echo "Press a key to not load to RAM"
    if ! read -t 3 -n 1 k
    then
        realroot=/tmp/realroot
    
        mkdir "$realroot"
        mount --move "$rootmnt" "$realroot"
        mount -t tmpfs -o size=90% none "$rootmnt"
        echo
        echo "Copying files, wait..."
        cp -a "$realroot"/* "$rootmnt"
        umount "$realroot"
        echo "Done"
    fi
    

    Exercises for the reader: Add a progress meter to make the 1-2 minute load time more bearable (one rough sketch follows after this list).

  5. Pack the initramfs back up: find . | cpio -o -H newc | gzip -9 > /boot/initrd.img-2.6.26-2-686
  6. Boot (still in the VM, if you want) and hit a key when prompted so you're running straight from the stick, install all the packages you want, and configure them the way you want them. In my case, I made the stick boot straight into X, running fluxbox and iDesk to make a big shiny Exit icon that would reboot the box (returning it to Windows), just in case any laymen wandered in on it.
  7. Very important: apt-get clean. I had 500MB of cached packages the first time around, which is half a gig of lost memory and an additional minute of load time.
  8. Try booting it from RAM. Make sure you remember if you're running in RAM or not when configuring, or all changes will be lost.
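
Here is one rough take on the progress meter exercise from step 4, replacing the plain cp line. It assumes the initramfs busybox provides du and cut, which is common but not guaranteed:

total=$(du -sm "$realroot" | cut -f1)
cp -a "$realroot"/* "$rootmnt" &
cppid=$!
while kill -0 "$cppid" 2>/dev/null
do
    copied=$(du -sm "$rootmnt" | cut -f1)
    printf '\rCopied %sMB of %sMB' "$copied" "$total"
    sleep 2
done
wait "$cppid"
echo " - Done"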

Debian required some kludges in the checkroot.sh init script to make it not die when the root fs wasn't on disk and thus failed to check, but Ubuntu was very smooth about it. Still, no big deal.

In the end, I had a 1000MB installation that could easily turn a dull park of Windows web browsing boxes into a LAN party with no headaches for the administrator. Game on.

What’s up with directory hard link counts?

Ever considered the hard link count from ls on directories?

 
vidar@kelvin ~/src $ ls -l
total 108
drwxr-xr-x  4 vidar vidar  4096 2009-11-22 12:52 aml-lsb
drwxr-xr-x 13 vidar vidar  4096 2009-12-13 16:00 delta3d_REL-2.4.0
drwxr-xr-x 23 vidar vidar  4096 2010-02-02 18:22 linux-2.6.32.7
...

For files, this is the number of hard links. You can use find / -samefile filename to find all files that point to the same file inode.

So what does this number mean for directories? Exactly the same thing.

Users, including root, are blocked from creating directory hard links out of the kernel’s mortal fear of cyclical directory trees (or should I say directory graphs?). The kernel still creates them though, specifically in the form of the “.” entry in the directory itself, and “..” in each subdirectory.

An empty directory /foo/bar will have two links, the /foo/bar entry itself and its own /foo/bar/. entry. When creating a subdirectory /foo/bar/baz, you will get the additional hard link /foo/bar/baz/... In other words, the hard link count is the number of subdirectories plus two.
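
A quick sanity check of that formula, with a made-up directory (stat’s %h prints the link count):

mkdir -p demo/{one,two,three}
stat -c %h demo    # prints 5: 3 subdirectories + 2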

Here’s a party trick for listing directory hard links in bash:

vidar@kelvin ~/src $ ls -ld aml-lsb/{,.,*/..}
drwxr-xr-x 4 vidar vidar 4096 2009-11-22 12:52 aml-lsb/
drwxr-xr-x 4 vidar vidar 4096 2009-11-22 12:52 aml-lsb/.
drwxr-xr-x 4 vidar vidar 4096 2009-11-22 12:52 aml-lsb/bin/..
drwxr-xr-x 4 vidar vidar 4096 2009-11-22 12:52 aml-lsb/lib/..
vidar@kelvin ~/src $ 

Clearly, each of them refers to the same thing, and the numbers add up (if they don’t, there are hidden subdirectories the glob missed; shopt -s dotglob will include them).

As a side note, you can use mount --rbind to fake a directory hard link. This will remount a directory and all submounts on some other directory, but will prevent cycles.

You can also use mount --bind to remount without submounts. This can be useful for when you want to copy the contents of a directory that has another file system mounted over it. This is most commonly /dev, which is over-mounted with udev early in the boot process. Many people don’t realize that they have an entire /dev they’ve never seen!
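
For example, as root (the mount point name is arbitrary):

# Bind-mount / elsewhere; submounts like udev's /dev don't come along
mkdir /mnt/rootbind
mount --bind / /mnt/rootbind
ls /mnt/rootbind/dev    # the original, on-disk /dev
umount /mnt/rootbind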

Multithreading for performance in shell scripts

Now that everyone and their grandmother have at least two cores, you can roughly double your throughput by distributing the workload. However, multithreading support in pure shell scripts is terrible, even though you often do things that can take a while, like encoding a bunch of chip tunes to Ogg Vorbis:

mkdir ogg
for file in *.mod
do
	xmp -d wav -o - "$file" | oggenc -q 3 -o "ogg/$file.ogg" -
done

This is exactly the kind of operation that is conceptually trivial to parallelize, but not obvious to implement in a shell script. Sure, you could run them all in the background and wait for them, but that will give you a load average equal to the number of files. Not fun when there are hundreds of files.

You can run two (or however many) in the background, wait and then start two more, but that’ll give terrible performance when the jobs aren’t of roughly equal length, since at the end, the longest running job will be blocking the other eager cores.

Instead of listing ways that won’t work, I’ll get to the point: GNU (and FreeBSD) xargs has a -P for specifying the number of jobs to run in parallel!

Let’s rewrite that conversion loop to parallelize it:

mod2ogg() { 
	for arg; do xmp -d wav -o - "$arg" | oggenc -q 3 -o "ogg/$arg.ogg" -; done
}
export -f mod2ogg
find . -name '*.mod' -print0 | xargs -0 -n 1 -P 2 bash -c 'mod2ogg "$@"' -- 

And if we already had a mod2ogg script, similar to the function just defined, it would have been simpler:

find . -name '*.mod' -print0 | xargs -0 -n 1 -P 2 mod2ogg

Voila. Twice as fast, and you can just increase the -P with fancier hardware.

I also added -n 1 to xargs here, to ensure an even distribution of work. If the work units are so small that starting bash for each one becomes a sizable part of the cost, you can increase it so xargs passes mod2ogg several files at a time (which is why the function loops over its arguments).
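
For instance, if the modules are tiny, something along these lines hands each bash invocation a batch of 16 files instead of one:

find . -name '*.mod' -print0 | xargs -0 -n 16 -P 2 bash -c 'mod2ogg "$@"' --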

Incremental backups to untrusted hosts

There’s no point in encryption, passphrases, frequent updates, system hardening and retinal scans if all the data can be snapped up from the backup server. I’ve been looking for a proper backup system that can safely handle incremental backups to insecure locations, either my personal server or someone else’s.

This excludes a few of the common solutions:

  • Unencrypted backups with rsync. Prevents eavesdropping when done over ssh, but nothing else.
  • Rsync to encrypted partitions/images on the server. Protects against eavesdropping and theft, but not admins and rootkits. Plus, it requires root access on the server.
  • Uploading an encrypted tarball of all my stuff. Protects against everything, but since it’s not incremental, it’ll take forever.

My current best solution: An encrypted disk image on the server, mounted locally via sshfs and loop.

This protects data against anything that could happen on the server, while still allowing incremental backups. But is it efficient? No.

Here is a table of actual traffic when rsync uploads 120MB out of 40GB of files to a 400GB partition:

Setup    Downloaded (MB)    Uploaded (MB)
ext2           580                580
ext3           540               1000
fsck          9000                300

Backups take about 15-20 minutes on my 10 Mbps connection, which is acceptable, even though it’s only a minute’s worth of actual data. To a box on my wired LAN, it takes about 3 minutes.

Somewhat surprisingly, these numbers didn’t vary more than ±10MB with mount options like noatime,nodiratime,data=writeback,commit=3600. Even with the terrible fsck overhead, which is sure to grow worse over time as the fs fills up, ext2 seems to be the way to go, especially if your connection is asymmetric.

As for rsync/ssh compression, encryption kills it (unless you use ECB, which you don’t). File system compression would alleviate this, but ext2/ext3 unfortunately don’t have this implemented in vanilla Linux. And while restoring backups was 1:1 in transfer cost, which as you’ve seen is comparatively excellent, compression would have cut several hours off the restoration time.

It would be very interesting to try this on other FS, but there aren’t a lot of realistic choices. Reiser4 supports both encryption and compression. From the little I’ve gathered though, it encrypts on a file-by-file basis so all the file names are still there, which could leak information. And honestly, I’ve never trusted reiserfs with anything, neither before nor after you-know-what.

ZFS supposedly compresses for read/write speed to disk rather than for our obscure network scenario, and if I had to guess from the array of awesome features, the overhead is probably higher than ext2/3.

However, neither of these two file systems has ubiquitous Linux support, which is a huge drawback when it comes to restoring.

So a bit more about how specifically you go about this:

To set it up:

#Create dirs and a 400gb image. It's non-sparse since we really
#don't want to run out of host disk space while writing.
mkdir -p ~/backup/sshfs ~/backup/crypto
ssh vidar@host mkdir -p /home/vidar/backup
ssh vidar@host dd of=/home/vidar/backup/diskimage \
        if=/dev/zero bs=1M count=400000

#We now have a blank disk image. Encrypt and format it.
sshfs -C vidar@host:/home/vidar/backup ~/backup/sshfs
losetup /dev/loop7 ~/backup/sshfs/diskimage
cryptsetup luksFormat /dev/loop7
cryptsetup luksOpen /dev/loop7 backup
mke2fs /dev/mapper/backup

#We now have a formatted disk image. Sew it up.
cryptsetup luksClose backup
losetup -d /dev/loop7
umount ~/backup/sshfs

To back up:

sshfs -C vidar@host:/home/vidar/backup ~/backup/sshfs
losetup /dev/loop7 ~/backup/sshfs/diskimage
cryptsetup luksOpen /dev/loop7 backup
mount /dev/mapper/backup ~/backup/crypto

NOW=$(date +%Y%m%d-%H%M)
for THEN in ~/backup/crypto/2*; do true; done #beware y3k!
echo "Starting Incremental backup from $THEN to $NOW..."
rsync -xav --whole-file --link-dest="$THEN" ~ ~/backup/crypto/"$NOW"

umount ~/backup/crypto
cryptsetup luksClose backup
losetup -d /dev/loop7
umount ~/backup/sshfs
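
Restoring is just the same mounting steps followed by an rsync in the other direction (the snapshot name below is made up):

sshfs -C vidar@host:/home/vidar/backup ~/backup/sshfs
losetup /dev/loop7 ~/backup/sshfs/diskimage
cryptsetup luksOpen /dev/loop7 backup
mount -o ro /dev/mapper/backup ~/backup/crypto

rsync -av ~/backup/crypto/20100221-0330/ ~/restored/

umount ~/backup/crypto
cryptsetup luksClose backup
losetup -d /dev/loop7
umount ~/backup/sshfs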

If you know of a way to do secure backups with less overhead, feel free to post a comment!

dd is not a backup tool!

Pretty much all Linux newbies will at some point be dazzled by the amazing powers of dd, and consider using it for backups. DON’T! Allow me to elaborate:

  1. dd must be run on an unmounted device. The point of using dd is usually to get a snapshot, but it’s not a snapshot if the system keeps running and modifying the FS while it’s being copied! The “snapshot” will be a random collection of all the states that the data and metadata were in during the 30+ minutes it took to copy.
  2. It’s hard to restore on a file by file basis. You hardly ever want to restore everything, usually you just want one file or directory that was accidentally deleted, or all files except the ones you’ve been working on since the backup was taken.
  3. It’s hard to restore to new hardware. If you suffer a massive disk crash, you will indeed want to restore everything. If you’re restoring to the same size disk, and you don’t decide that you want less swap or a bigger root partition while you’re at it, you can now easily restore and thank the gods that most FS don’t rely on disk geometry anymore. If you try to restore to a smaller disk on a secondary/old computer, you’re just screwed. If you upgrade to a larger disk (by far the most likely scenario), you’ll be playing the partition shuffle for a while to make use of the new space.
  4. It’s highly system dependent, and requires root to extract files. You can’t use your mum’s Wintendo or even your school’s Linux boxes to get out that geography report. And if you’re sick of Linux after it botched your system, you can’t switch to FreeBSD or OSX.
  5. You can’t do incremental backups. You can’t properly back up just the information that has changed. This all but kills network backups, and dramatically reduces the number of snapshots you can keep.

So when is dd a decent choice for backups?

Take a snapshot of a new laptop that doesn’t come with restoration disks, so that you can restore it if you sell the laptop to a non-geek or if the laptop needs servicing (it’ll make life easier for clueless techies, and companies have been known to use Linux as an excuse for not covering hardware repairs).

Create a disk image right before you try something major that you want to be able to reverse, such as upgrading to the latest Ubuntu beta to see if the new video driver works better with your card. Or right before installing Puppy Linux to write a little review about it. Restoring the image will be easier than downgrading/reinstalling, and you won’t have done any work in the meantime.

Image a computer and teach the kids how to install an operating system in a realistic scenario.
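
If you do reach for dd in one of these cases, a compressed image taken from a live CD/USB (so the disk isn’t mounted) is the usual recipe; /dev/sda and the target path here are assumptions:

# Image the whole disk; gzip shrinks the mostly-empty space
dd if=/dev/sda bs=1M | gzip > /mnt/external/laptop-image.gz

# Restoring overwrites the entire disk
gunzip -c /mnt/external/laptop-image.gz | dd of=/dev/sda bs=1M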