MP3 to Video using GStreamer visualizations

VLC showing a sparkly shiny visualization
Everyone loves music visualization, but not all apps support it in a sensible way. Maybe you want to shuffle a random assortment of video and audio files in a player that doesn’t handle that well (VLC!), or not at all (mplayer!). Or maybe you want to upload something to YouTube with gorgeous HD visualizations instead of that lame static cover art image?

The few Google results on the topic that weren’t spam suggested screencapping software. Yeah, that’s great… until you have more than two files.

Once again, everyone’s favourite multimedia swiss army knife – GStreamer – steps up to the plate.

Here’s an example of encoding an MP3 to an H.264 .mkv file using the gorgeous goom visualizer (requires the MP3 and x264 plugins for GStreamer):

gst-launch filesrc location=input.mp3 ! queue ! tee name=stream ! queue ! mp3parse ! matroskamux name=mux ! filesink location="output.mkv" stream. ! queue ! mp3parse ! mad ! audioconvert ! queue ! goom ! ffmpegcolorspace ! video/x-raw-yuv,width=1280,height=720 ! x264enc ! mux.

It’s beautiful – and the video is pretty sweet as well.

It’s worth noting that this approach does not re-encode the MP3, like some less awesome approaches would do (causing loss of quality). It simply muxes it together with the visualizer’s video stream. x264 even seems to distribute itself well across cores.

No, wait, what? MP3 and H.264? Of course, I meant Vorbis and Theora! Let me rephrase:

gst-launch filesrc location=input.ogg ! queue ! tee name=stream ! queue ! oggdemux ! vorbisparse ! oggmux name=mux ! filesink location="output.ogg" stream. ! queue ! oggdemux ! vorbisdec ! audioconvert ! queue ! goom ! ffmpegcolorspace ! video/x-raw-yuv,width=1920,height=1080 ! theoraenc ! mux.

The same goodness applies, except for the parallelism: theoraenc doesn’t seem to spread itself across cores the way x264 does. If you have a multicore CPU, there’s a massive speedup to be had by running several encodes in parallel from a simple shell script. (Why full HD this time? VLC on Windows crashes on 720p Theora!)
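
A minimal sketch of that shell-script parallelism, assuming plain POSIX sh background jobs are good enough. The names `encode_one`, `encode_all`, `JOBS` and `GST_LAUNCH` are made up for this example and are not part of GStreamer; the pipeline itself is the one above, with the output name derived from the input.

```shell
#!/bin/sh
# Sketch: encode several Ogg Vorbis files at once, a few jobs at a time.
# GST_LAUNCH and JOBS are hypothetical knobs, not GStreamer settings.
# Set GST_LAUNCH=echo for a dry run that only prints the pipelines.
GST_LAUNCH=${GST_LAUNCH:-gst-launch}
JOBS=${JOBS:-4}

# Run the Theora pipeline from the post on one file: foo.ogg -> foo.viz.ogg
encode_one() {
    src=$1
    dst="${src%.ogg}.viz.ogg"
    "$GST_LAUNCH" filesrc location="$src" ! queue ! tee name=stream \
        ! queue ! oggdemux ! vorbisparse ! oggmux name=mux \
        ! filesink location="$dst" \
        stream. ! queue ! oggdemux ! vorbisdec ! audioconvert ! queue \
        ! goom ! ffmpegcolorspace \
        ! video/x-raw-yuv,width=1920,height=1080 ! theoraenc ! mux.
}

# Start up to $JOBS encodes in the background, then wait for that batch
# to finish before starting the next one.
encode_all() {
    running=0
    for f in "$@"; do
        encode_one "$f" &
        running=$((running + 1))
        if [ "$running" -ge "$JOBS" ]; then
            wait
            running=0
        fi
    done
    wait
}

# Only encode when files are given on the command line.
if [ "$#" -gt 0 ]; then
    encode_all "$@"
fi
```

Saved under a name of your choosing, it would be run as something like `JOBS=2 sh encode-all.sh *.ogg`. Batching is cruder than a proper job queue (a slow file holds up its whole batch), but it keeps the script tiny.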

And there you have it. A simple, hack-free, modular and flexible way of encoding visualization videos for MP3 and Ogg Vorbis files. Thanks, GStreamer!

Visualization fun with GStreamer

I have a Mini-ITX box connected to my TV. It worked very well on my old CRT TV, but now I have a Full HD TV. It went from 720×576 to 1920×1080: five times the pixel count (exactly!), or over 100MB/s (bytes, not bits of course) of raw video. It’s not all that much, but it’s way more than what the Mini-ITX can handle. With the magic of graphics hardware, however, it can show lower resolutions scaled up to 1920×1080 without breaking a sweat.
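
For the curious, the arithmetic can be checked with a few lines of shell. The pixel counts come straight from the two resolutions; the 25 frames per second and 2 bytes per pixel used for the data rate are assumptions picked for this sketch, not figures stated above.

```shell
#!/bin/sh
# Pixels per frame for a given width and height.
pixels() { echo $(( $1 * $2 )); }

# Raw bytes per second for: width, height, frames/s, bytes/pixel.
raw_rate() { echo $(( $1 * $2 * $3 * $4 )); }

old=$(pixels 720 576)
new=$(pixels 1920 1080)
echo "old: $old px, new: $new px, ratio: $((new / old)) (remainder $((new % old)))"

# 25 fps and 2 bytes/pixel are assumptions, not figures from the post.
echo "raw Full HD: $(raw_rate 1920 1080 25 2) bytes/s"
```

With those assumptions the ratio comes out at exactly 5 with no remainder, and the raw rate at 103680000 bytes per second.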

I had a lot of issues trying to get some music visualization running on it. There’s no way the poor thing can generate 1920×1080 pixels worth of visualization, let alone push it out to the TV. libvisual, the closest thing to a visualization standard there is, didn’t appear to have simple command line apps that you could point to a music file and hardware scale the visualizations to fullscreen.

There was projectM though, which has a clever system of capturing audio from PulseAudio and visualizing it with OpenGL scaling. That way you can use any music playing app you want, at any resolution you care to render and display. Unfortunately, the Openchrome drivers for the VIA hardware and the Qt OpenGL component really hated each other.

But hey, we have GStreamer!

gst-launch-0.10 pulsesrc device=alsa_output.hw_0.monitor ! queue ! audioconvert ! libvisual_infinite ! video/x-raw-rgb,width=640,height=360,framerate=25/1 ! ffmpegcolorspace ! queue ! xvimagesink

Grab audio from the pulse monitor device, run it through libvisual to get a visualization at the specified resolution, and show it through xvideo. All my requirements summed up in about two lines of gstreamer goodness!
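
The monitor device name will differ from machine to machine. As a rough sketch, assuming a PulseAudio recent enough that `pactl list short sources` is available, the candidates can be listed like this. The helper name `monitor_sources` is made up for this example.

```shell
#!/bin/sh
# Filter `pactl list short sources` output (tab-separated: index, name, ...)
# down to the names of sources that end in ".monitor".
monitor_sources() {
    awk -F '\t' '$2 ~ /\.monitor$/ { print $2 }'
}

# Only query PulseAudio when pactl is actually installed.
if command -v pactl >/dev/null 2>&1; then
    pactl list short sources | monitor_sources
fi
```

Any name it prints should be usable as the `device=` value for pulsesrc in the pipeline above.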

Webcam fun with GStreamer

I have yet to find a proper Linux tool for recording video from a webcam while showing it on screen at the same time. The usual hack is to use mencoder to encode and mplayer to play the encoded file, but the latency is typically a full second or more:

{ tail --follow=name -n +0 --retry "lulz.avi" | mplayer -cache 320 -vo x11 -; killall -INT mencoder; } & mencoder tv:// -tv width=640:height=480:fps=15 -ovc lavc -o lulz.avi

GStreamer does to video/audio what Bash does to text and NetPBM does to images, and it’s just as brilliant (possibly more). So let’s use it instead:

gst-launch-0.10 v4l2src ! tee name=videoout ! queue ! videorate ! video/x-raw-yuv,framerate=15/1 ! queue ! theoraenc quality=60 ! queue ! muxout. pulsesrc ! audio/x-raw-int,rate=22000,channels=1,width=16 ! queue ! audioconvert ! vorbisenc ! queue ! muxout. oggmux name=muxout ! filesink location=lulz.ogg videoout. ! queue ! ffmpegcolorspace ! ximagesink

Voilà. While long and seemingly convoluted, it’s not really worse than the mplayer line, and it works a lot better.

While a gst pipeline looks scary to begin with, it’s really self-explanatory when you start reading it. Still, I’ll do a little dance about it:

#Get a v4l2 video source, split it with a tee named videoout, and put
#one branch through the theora encoder (the other branch is used later)
v4l2src ! tee name=videoout ! queue ! videorate ! video/x-raw-yuv,framerate=15/1 \
        ! queue  ! theoraenc quality=60 ! queue ! muxout.   

#Get audio from a pulseaudio stream, run it through the vorbis encoder
pulsesrc ! audio/x-raw-int,rate=22000,channels=1,width=16 \
         ! queue ! audioconvert ! vorbisenc ! queue ! muxout.

#Mux the audio and video together, and put it in "lulz.ogg"
oggmux name=muxout ! filesink location=lulz.ogg

#Put the other end of the video split out on the screen
videoout. ! queue ! ffmpegcolorspace ! ximagesink
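
The annotated pieces can't be pasted into a shell as they stand, since the comments sit in the middle of the command. Here is a sketch of the same pipeline as a small script, with the comments moved out of the way. The function name `record_webcam` and the `GST_LAUNCH` variable are made up for this example, and the rate cap is written as `framerate=15/1`, the same form the visualization pipeline earlier uses for its frame rate.

```shell
#!/bin/sh
# GST_LAUNCH is a hypothetical knob, not a GStreamer setting.
# Set GST_LAUNCH=echo for a dry run that only prints the pipeline.
GST_LAUNCH=${GST_LAUNCH:-gst-launch-0.10}

# Record webcam video and PulseAudio audio into an Ogg file (first
# argument, default lulz.ogg) while previewing the video on screen.
# Branches, in order:
#   1. v4l2 source -> tee "videoout" -> 15 fps -> Theora -> muxer
#   2. PulseAudio source -> Vorbis -> muxer
#   3. Ogg muxer "muxout" -> output file
#   4. other tee branch -> colorspace conversion -> X11 window
record_webcam() {
    "$GST_LAUNCH" \
        v4l2src ! tee name=videoout \
            ! queue ! videorate ! video/x-raw-yuv,framerate=15/1 \
            ! queue ! theoraenc quality=60 ! queue ! muxout. \
        pulsesrc ! audio/x-raw-int,rate=22000,channels=1,width=16 \
            ! queue ! audioconvert ! vorbisenc ! queue ! muxout. \
        oggmux name=muxout ! filesink location="${1:-lulz.ogg}" \
        videoout. ! queue ! ffmpegcolorspace ! ximagesink
}

# Only start recording when an output file is given on the command line.
if [ "$#" -gt 0 ]; then
    record_webcam "$1"
fi
```

Saved under a name of your choosing, it would be run as something like `sh record.sh lulz.ogg`, with Ctrl-C to stop recording.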

Easy to see why this is one of my new favourite toys.