Why Bash is like that: Pseudo-syntax

Bash can seem pretty random and weird at times, but most of what people see as quirks have very logical (if not very good) explanations behind them. This series of posts looks at some of them.

 # Why doesn't || work in [ ] ?
 if [ -f /etc/inetd.conf || -d /etc/xinetd.d ]; then .. 

# And why does it work in [[ ]] ?
 if [[ -f /etc/inetd.conf || -d /etc/xinetd.d ]]; then ..

Short answer: [ is a regular command, and can’t override ||. [[ is shell syntax.

[ is a pseudo-syntactical command. That is, it’s a regular command much like cp or grep, the name just happens to be a single opening square bracket (try ls -l /usr/bin/[).

Seeing as how it’s a regular command, it can’t affect shell grammar. In grep -q kittens file || echo kittens >> file, grep can’t know or do anything about the fact that it’s being used with “||”. The same goes for [ in the example.

In Bash (but not necessarily other shells), [ is now a builtin command emulating /usr/bin/[ for efficiency. There’s no reason why Bash couldn’t make [ a || b ] work, but this would break compatibility.

Instead, we have [[ which is not bound by legacy, and does in fact alter shell syntax. [[ interprets ||, &&, globs and unquoted variable expansions in ways that an external [-command couldn’t, and an internal [ therefore can’t either.

To get around it without using [[, we’d do [ a ] || [ b ] or [ a -o b ].

# Why is this always true (should be false when foo is empty)?
if [ -n $foo ]

Short answer: $foo disappears and [ -n ] is shorthand for [ -n “-n” ], which is true

This comes back to the fact that [ is a regular command. If $foo is empty, the shell runs [ -n ], and there’s no way for [ to know that there was a variable that was expanded out.

[ -n x ] checks that x is not empty. [ x ] is shorthand for the same. [ -n ] therefore checks if dash-n is empty, which it isn’t, and thus the expression is always true.

Why doesn’t [ complain if you have -n without a parameter? Same thing again. It wouldn’t be able to tell that you said [ $1 ] rather than [ -n ], and then scripts would break whenever you work with your parameter list or any other strings starting with a dash.

Instead, you’d use [ -n "$foo" ], which will prevent bash from removing foo when it’s empty, and instead send in a zero-length argument. Or you could use [[ -n $foo ]], since [[ is built into the shell grammar and can tell that there is a variable there.

# Why doesn't this work?
if [ grep -q text myfile ]

Short answer: [ is a command, grep is a command. Choose one.

if [ -n "$foo" ] is the same as if test -n "$foo", which makes it more obvious how to use other commands (if grep -q text myfile).

Pseudo-syntactical elements makes it harder to tell language from implementation for better and for worse. It makes code look neat, but can be confusing. Especially when basically all if-statements include [.

A similar effect is seen in here-documents (cmd << EOF), where many people only ever see the delimiter “EOF” and think it’s some kind of keyword. Next time, use cmd << KITTENS to tip off anyone reading your script.

What’s in a SSH RSA key pair?

You probably have your own closely guarded ssh key pair. Chances are good that it’s based on RSA, the default choice in ssh-keygen.

RSA is a very simple and quite brilliant algorithm, and this article will show what a SSH RSA key pair contains, and how you can use those values to play around with and encrypt values using nothing but a calculator.

RSA is based on primes, and the difficulty of factoring large numbers. This post is not meant as an intro to RSA, but here’s a quick reminder. I’ll use mostly the same symbols as Wikipedia: you generate two large primes, p and q. Let φ = (p-1)(q-1). Pick a number e coprime to φ, and let d ≡ e^-1 mod φ.

The public key is then (e, n), while your private key is (d, n). To encrypt a number/message m, let the ciphertext c ≡ m^e mod n. Then m ≡ c^d mod n.

This is very simple modular arithmetic, but when you generate a key pair with ssh-keygen, you instead get a set of opaque and scary looking files, id_rsa and id_rsa.pub. Here’s a bit from the private key id_rsa (no passphrase):


-----BEGIN RSA PRIVATE KEY-----
MIIBygIBAAJhANj3rl3FhzmOloVCXXesVPs1Wa++fIBX7BCZ5t4lmMh36KGzkQmn
jDJcm+O9nYhoPx6Bf+a9yz0HfzbfA5OpqQAyC/vRTVDgHhGXY6HFP/lyWQ8DRzCh
tsuP6eq9RYHnxwIBIwJhAKdf+4oqqiUWOZn//vXrV3/19LrGJYeU

...
-----END RSA PRIVATE KEY-----

How can we get our nice RSA parameters from this mess?

The easy way is with openssl: (I apologize in advance for all the data spam in the rest of the article).


vidar@vidarholen ~/.ssh $ openssl rsa -text -noout < id_rsa
Private-Key: (768 bit)
modulus:
00:d8:f7:ae:5d:c5:87:39:8e:96:85:42:5d:77:ac:
54:fb:35:59:af:be:7c:80:57:ec:10:99:e6:de:25:
...
publicExponent: 35 (0x23)
privateExponent:
00:a7:5f:fb:8a:2a:aa:25:16:39:99:ff:fe:f5:eb:
57:7f:f5:f4:ba:c6:25:87:94:48:64:93:fb:3d:a7:
...
prime1:
...
prime2:
...
exponent1:
...
exponent2:
...
coefficient:
...

Here, modulus is n, publicExponent is e, privateExponent is d, prime1 is p, prime2 is q, exponent1 is dP from the Wikipedia article, exponent2 is dQ and coefficient is qInv.

Only the first three are strictly required to perform encryption and decryption. The latter three are for optimization and the primes are for verification.

It’s interesting to note that even though the private key from RSA’s point of view is (d,n), the OpenSSH private key file includes e, p, q and the rest as well. This is how it can generate public keys given the private ones. Otherwise, finding e given (d,n) is just as hard as finding d given (e,n), except e is conventionally chosen to be small and easy to guess for efficiency purposes.

If we have one of these hex strings on one line, without colons, and in uppercase, then bc can work on them and optionally convert to decimal.


# If you don't want to do this yourself, see end for a script
vidar@vidarholen ~/.ssh $ { echo 'ibase=16'; cat | tr -d ':\n ' | tr a-f A-F; echo; } | bc

00:d8:f7:ae:5d:c5:87:39:8e:96:85:42:5d:77:ac:
54:fb:35:59:af:be:7c:80:57:ec:10:99:e6:de:25:
98:c8:77:e8:a1:b3:91:09:a7:8c:32:5c:9b:e3:bd:
....
Ctrl-d to end input

13158045936463264355006370413708684112837853704660293756254884673628\
63292...

We also need a power-modulo function, since b^e % m is unfeasibly slow if you go by way of b^e. Luckily, bc is programmable.


vidar@vidarholen ~/.ssh $ bc
bc 1.06.94
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.

# Our powermod function:
define pmod(b,e,m) { if(e == 0 ) return 1; if(e == 1) return b%m; rest=pmod(b^2%m,e/2,m); if((e%2) == 1) return (b*rest)%m else return rest; }


#Define some variables (this time unabbreviated)
n=13158045936463264355006370413708684112837853704660293756254884673628\
63292777770859554071108633728590995985653161363101078779505801640963\
48597350763180843221886116453606059623113097963206649790257715468881\
4303031148479239044926138311

e=35

d=10150492579557375359576342890575270601332058572166512326253768176799\
23111571423234513140569517447770196903218153051479115016036905320557\
80231250287900874055062921398102953416891810163858645414303785372309\
5688315939617076008144563059


# Encrypt the number 12345
c=pmod(12345, e, n)


# Show the encrypted number
c

15928992191730477535088375321366468550579140816267293144554503305092\
03492035891240033089011563910196180080894311697511846432462334632873\
53515625


#Decrypt the number
pmod(c, d, n)

12345

Yay, we’ve successfully encrypted and decrypted a value using real life RSA parameters!

What’s in the public key file, then?

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAGEA2PeuXcWHOY6WhUJdd6xU+zVZr758gFfsEJnm3iWYyHfoobORCaeMMlyb472diGg/HoF/5r3LPQd/Nt8Dk6mpADIL+9FNUOAeEZdjocU/+XJZDwNHMKG2y4/p6r1FgefH vidar@vidarholen.spam

This is a very simple file format, but I don’t know of any tools that will decode it. Simply base64-decode the middle string, and then read 4 bytes of length, followed by that many bytes of data. Repeat three times. You will then have key type, e and n, respectively.

Mine is 00 00 00 07, followed by 7 bytes “ssh-rsa”. Then 00 00 00 01, followed by one byte of 0x23 (35, our e). Finally, 00 00 00 61 followed by 0x61 = 97 bytes of our modulus n.

If you want to decode the private key by hand, base64-decode the middle bit. This gives you an ASN.1 encoded sequence of integers.

This is an annotated hex dump of parts of a base64-decoded private key

30 82 01 ca   - Sequence, 0x01CA bytes
    02 01: Integer, 1 byte
        00
    02 61:    - Integer, 0x61 bytes (n).
        00 d8 f7 ae 5d c5 87 39 8e 96 ... Same as from openssl!
    02 01:  - Integer, 1 byte, 0x23=35 (e)
        23
    02 61  - Integer, 0x61 bytes (d)
        00 a7 5f fb 8a 2a aa 25 16 39 ...
    ...

Here’s a bash script that will decode a private key and output variable definitions and functions for bc, so that you can play around with it without having to do the copy-paste work yourself. It decodes ASN.1, and only requires OpenSSL if the key has a passphrase.

When run, and its output pasted into bc, you will have the variables n, e, d, p, q and a few more, functions encrypt(m) and decrypt(c), plus a verify() that will return 1 if the key is valid. These functions are very simple and transparent.

Enjoy!

MP3 to Video using GStreamer visualizations

VLC showing a sparkly shiny visualization
Everyone loves music visualization, but not all apps support it in a sensible way. Maybe you want to shuffle a random assortment of video and audio files in a player that doesn’t handle that well (VLC!), or not at all (mplayer!). Or maybe you want to upload something to youtube, with gorgeous HD visualizations instead of that lame static cover art image?

The few google results on the topic that weren’t spam suggested screencapping software. Yeah, that’s great… until you have more than two files.

Once again, everyone’s favourite multimedia swiss army knife – GStreamer – steps up to the plate.

Here’s an example of encoding an MP3 to an H.264 .mkv file using the gorgeous goom visualizer (requires the mp3 and x264 plugins for gstreamer):


gst-launch filesrc location=input.mp3 ! queue ! tee name=stream ! queue ! mp3parse ! matroskamux name=mux ! filesink location="output.mkv" stream. ! queue ! mp3parse ! mad ! audioconvert ! queue ! goom ! ffmpegcolorspace ! video/x-raw-yuv,width=1280,height=720 ! x264enc ! mux.

It’s beautiful – and the video is pretty sweet as well.

It’s worth noting that this approach does not re-encode the MP3, like some less awesome approaches would do (causing loss of quality). It simply muxes it together with the visualizer’s video stream. x264 even seems to distribute itself well across cores.

No, wait, what? MP3 and H.264? Of course, I meant Vorbis and Theora! Let me rephrase:


gst-launch filesrc location=input.ogg ! queue ! tee name=stream ! queue ! oggdemux ! vorbisparse ! oggmux name=mux ! filesink location="output.ogg" stream. ! queue ! oggdemux ! vorbisdec ! audioconvert ! queue ! goom ! ffmpegcolorspace ! video/x-raw-yuv,width=1920,height=1080 ! theoraenc ! mux.

The same goodness applies, except for the parallelism. If you have a multicore CPU, there’s massive speedup to be had through simple shell script based multithreading. (Why full HD this time? VLC on Windows crashes on 720p Theora!)

And there you have it. A simple, hack-free, modular and flexible way of encoding visualization videos for MP3 and Ogg Vorbis files. Thanks, GStreamer!

Is it terminal?

Applications often behave differently in subtle ways when stdout is not a terminal. Most of the time, this is done so smoothly that the user isn’t even aware of it.

When it works like magic

Consider ls:

vidar@vidarholen ~/src $ ls
PyYAML-3.09      bsd-games-2.17       nltk-2.0b9
alsa-lib-1.0.23  libsamplerate-0.1.7  pulseaudio-0.9.21
bash-4.0         linux                tmp
bitlbee-1.2.8    linux-2.6.32.8
vidar@vidarholen ~/src $

Now, say we want a list of projects in our ~/src dir, ignoring version numbers:

# For novelty purposes only; parsing ls is a bad idea
vidar@vidarholen ~/src $ ls | sed -n 's/-[^-]*$//p'
PyYAML
alsa-lib
bash
bitlbee
bsd-games
libsamplerate
linux
nltk
pulseaudio
vidar@vidarholen ~/src $

Piece of cake, right?

But think about the magic that actually happened there: We started out with three lines of coloured text, ran it through sed to search&replace on each line, and ended up with nine lines of uncoloured text.

How did sed filter the colours? How did it put each filename a separate line, when the same does not happen for echo "foo bar" | sed ..?

The answer, of course, is that it didn’t. ls detected that output wasn’t a terminal and altered its output accordingly.

When outputting to a terminal, you can be fairly sure that the user will be reading it directly, so you can make it as pretty and unparsable as you want. When output is not a terminal, it’s likely going to some program or file where pretty output will just complicate things.

Life without magic

Try the previous example with ls -C --color=always instead of just ls, and see how different life would have been without this terminal detection. You can also try this with xargs, to see how colours could break things:

vidar@vidarholen ~/src $ ls -C --color=always | xargs ls -ld
ls: cannot access PyYAML-3.09: No such file or directory
ls: cannot access alsa-lib-1.0.23: No such file or directory
...

The directories obviously exist, but the ANSI escape codes that give them that cute colour also prevents utilities from working with them. For additional fun, copy-pasting this error message from a terminal strips the colours, so anyone you reported it to would be quite stumped.

Magic efficiency tricks

It’s not all about making output pretty or parsable depending on the situation. Read/write syscalls are notoriously expensive; reading anything less than about 4k bytes at a time will make disk reads CPU bound.

glibc knows this, and will alter write buffering depending on the context. If the output is a terminal, a user is probably watching and waiting for it, so it will flush output immediately. If it’s a file, it’s better to buffer it up for efficiency:


vidar@kelvin ~ $ strace -e write -o log grep God text/bible12.txt
01:001:001 In the beginning God created the heaven and the earth.
...
vidar@kelvin ~ $ wc -l log
3948 log

In other words, grep wrote about god 3948 times (insert your own bible forum jokes).


vidar@kelvin ~ $ strace -e write -o log grep God text/bible12.txt > tmp
vidar@kelvin ~ $ wc -l log
64 log

This time, grep produced the exact same output, but wrote to a file instead. This resulted in 64 writes – about 1% of the more interactive mode!

Spells of confusion

Sometimes magic can confuse and astound. What if output is kinda like a terminal, only not?

ls -l gives the user pretty colours. ls -l | more does not. The reason is not at all obvious for users who just consider ” | more” a way to scroll in output. But it works, even if it’s not as pretty as we’d like.

Here’s a much more confusing example (just go along with the simplified grep):

# Show apache traffic (works)
cat access.log

# Show 404 errors with line numbers (works)
cat access.log | grep 404 | nl

Basic stuff.

# Show apache traffic in realtime (works)
tail -f access.log

# Show 404 errors with line numbers in realtime (FAILS)
tail -f access.log | grep 404 | nl

While the logic is the same as before, our realtime error log doesn’t show anything!

Why? Because grep’s output isn’t a terminal, so it will buffer up about 4k worth of data before writing it all in one go. In the mean time, the command will just seem to hang for no apparent reason!

(Observant readers might ask, “Isn’t tail buffering?”. And it might be or it might not. It depends on your version and distro patches.)

Mastering magic

Ok, so what can we do to take charge of these useful peculiarities?

Many apps have flags for this, though none of them are POSIX.

GNU ls lets you specify -C for columned mode, and --color=always for colours, regardless of the nature of stdout.

sed has -u, grep has a --line-buffered. awk has a fflush function. tail, if yours buffers at all, has a -u since about 2008 which as of now isn’t in debian stable.

If your app doesn’t have such an option, there’s always unbuffer from Expect, the interactive tool scripting package.

unbuffer starts applications within its own pseudo-tty, much like how xterm and sshd does it. This usually tricks the application into not buffering (and maybe to prettify its output).

Obviously, this depends on the app using standard C stdio, or that it checks for a terminal itself. Apps can unintentionally be written to avoid this, like when setting Java’s System.Out to a BufferedOutputStream.

And finally… how can you create such behaviour yourself?

if [[ -t 1 ]] #if stdout is a terminal
then
    tput setaf 3 #Set foreground to yellow
fi
echo "Pure gold"

Why PulseAudio?

PulseAudio has been on the receiving end of quite a bit of flaming. Some deserved, a lot not. Most critics have been appeased as PulseAudio has matured and distros make setup smoother, but there are still a number of troll^Wusers who trash it at every opportunity.

Common objections to Pulseaudio includes:

“We don’t need another sound system!”

First of all, that’s what we said when ALSA was introduced. It played out exactly the same way. I have IRC logs from 2004 where the conversation goes “My sound is gone. God damn ALSA!”. “Yeah, I’m starting to think ALSA is like the emperors new clothes. It never works, but people say it’s because you’re a noob.” Then two people immediately chime in saying “works fine here”. Replace ALSA with PulseAudio, and you could have seen it in any forum today.

Second, we definitely do need another sound system. ALSA works ok as an output mechanism, but it’s not a sound layer worthy of a modern operating system. Even when it came out, it was way behind Windows (but far ahead of OSS). In terms of features, ALSA is to PulseAudio what SVGALib is to X11.

“I just want to play sound, I don’t need all these crap features”

Maybe. This is why OSS worked for so long. 90% of use cases are covered by allowing a single app to play audio at any one time. However, there is a lot more to a smooth audio experience. The following things are a non-exhaustive list of convenient features that a layer like PulseAudio can provide:

  • On-the-fly output device switching: In “the old days”, you could plug in a headset and the hardware would disconnect the speakers and play through the headset only. Today, the headset might be Bluetooth or USB, so no such hacks are possible. PulseAudio lets you handle this in software!
  • Per-application volume control: The ability to turn up the music without also turning up the IM notification sounds. If you use MPlayer with ALSA for example, 9/0 adjusts the system-wide volume, not just MPlayer’s.
  • Automatic muting of other audio on incoming phone calls, such as through Skype
  • Audio forwarding: Like X11 forwarding? Me too. With PulseAudio, you can also get application audio over the network. pax11publish lets you set PulseAudio settings like network forwarding per-display!

 

And here are some features that are less day-to-day useful, but very cool:

  • Networked audio: You can transmit audio from one (or more) apps to another box. Load Spotify on your laptop and play it via the media center’s speakers!
  • Combine two stereo cards into one 4 channel device. The cards don’t even have to be on the same computer!
  • Use the monitor device to record application audio, or connect it to a visualizer (demonstrated in a previous post)!

 

“PulseAudio uses a lot more CPU time than ALSA”

PulseAudio uses a high quality resampling algorithm by default. ALSA supports only one, a simple linear resampler. If, like most people, you can’t hear a difference anyways, you can configure PulseAudio to use a linear algorithm too by adding “resample-method = src-linear” in /etc/pulse/daemon.conf. CPU usage will drop from 10% to 1%!

“Thanks, I love PulseAudio now”

No problem

RAM-loadable Linux on a stick

I wanted to play some SNES games with a friend on one of a dozen public windows boxes, but I didn’t want to start downloading ROMs and installing zsnes on them. The simple solution was to just make a bootable USB memory stick with Ubuntu and boot from that on whichever box was available at the time.

The boxes turned out to have more horsepower than I assumed, and conveniently came with Linux-friendly Intel GPUs, so I wanted to try out OpenArena. Of course, then you need multiple windows boxes, and I just had one memory stick. Time to make it load and run entirely from memory, so the memory stick can be unplugged and used to boot other boxes.

Thanks to the fantastic initramfs mechanism, the best Linux feature since UUID partition selection (initrd wasn’t nearly as sweet), this is very easy to do, even when the distro doesn’t support it. Here are some hints on how to do it:

  1. Install on a memory stick. These days, you can conveniently do this in a VM and still expect it to boot on a PC: kvm -hda /dev/sdb -cdrom somedistro.iso -boot d -m 2200 -net nic -net user -k en-us. A minimal install is preferable, loading GNOME from a slow memory stick just to cover it with OpenArena is a waste.
  2. Ubuntu installs GRUB with UUID partitions, but Debian does not, so in that case you have to update menu.list: replace root=/dev/hda1 with root=UUID=<uuid from tune2fs -l here>
  3. Debian has a fancy system for adding initramfs hooks (see /etc/initramfs-tools) that will survive kernel upgrades, but for generality (and not lazyness at all, no siree), we’ll do it the hacked up manual way: Make a new directory and unpack the initramfs: gzip -d < /boot/initrd.img-2.6.26-2-686 | cpio -i
  4. vim init. Find the place where the root fs has just been mounted, and add some code to mount --move it, mount a tmpfs big enough to hold all the files, copy all the files from the real root and then unmount it:

    echo "Press a key to not load to RAM"
    if ! read -t 3 -n 1 k
    then
        realroot=/tmp/realroot
    
        mkdir "$realroot"
        mount --move "$rootmnt" "$realroot"
        mount -t tmpfs -o size=90% none "$rootmnt"
        echo
        echo "Copying files, wait..."
        cp -a "$realroot"/* "$rootmnt"
        umount "$realroot"
        echo "Done"
    fi
    

    Exercises for the reader: Add a progress meter to make the 1-2 minute load time more bearable.

  5. Pack the initramfs back up: find . | cpio -o -H newc | gzip -9 > /boot/initrd.img-2.6.26-2-686
  6. Boot (still in the VM, if you want) and hit a key when prompted so you're running straight from the stick, install all the packages you want, and configure them the way you want them. In my case, I made the stick boot straight into X, running fluxbox and iDesk to make a big shiny Exit icon that would reboot the box (returning it to Windows), just in case any laymen wandered in on it.
  7. Very important: apt-get clean. I had 500MB of cached packages the first time around, which is half a gig of lost memory and an additional minute of load time.
  8. Try booting it from RAM. Make sure you remember if you're running in RAM or not when configuring, or all changes will be lost.

Debian required some kludges in the checkroot.sh init script to make it not die when the root fs wasn't on disk and thus failed to check, but Ubuntu was very smooth about it. Still, no big deal.

In the end, I had a 1000MB installation that could easily turn a dull park of windows web browsing boxes into a LAN party with no headaches for the administrator. Game on.