{"id":22,"date":"2010-11-09T09:30:47","date_gmt":"2010-11-09T09:30:47","guid":{"rendered":"http:\/\/www.vidarholen.net\/contents\/blog\/?p=22"},"modified":"2011-03-22T12:47:26","modified_gmt":"2011-03-22T12:47:26","slug":"is-it-terminal","status":"publish","type":"post","link":"https:\/\/www.vidarholen.net\/contents\/blog\/?p=22","title":{"rendered":"Is it terminal?"},"content":{"rendered":"<p>Applications often behave differently in subtle ways when stdout is not a terminal. Most of the time, this is done so smoothly that the user isn&#8217;t even aware of it.<\/p>\n<p><strong>When it works like magic<\/strong><\/p>\n<p>Consider <code>ls<\/code>:<\/p>\n<pre>vidar@vidarholen ~\/src $ ls\r\n<span style=\"color: #000080;\">PyYAML-3.09      bsd-games-2.17       nltk-2.0b9\r\nalsa-lib-1.0.23  libsamplerate-0.1.7  pulseaudio-0.9.21\r\nbash-4.0         <\/span><span style=\"color: #008080;\">linux<\/span>                tmp\r\n<span style=\"color: #000080;\">bitlbee-1.2.8    linux-2.6.32.8<\/span>\r\nvidar@vidarholen ~\/src $<\/pre>\n<p>Now, say we want a list of projects in our ~\/src dir, ignoring version numbers:<\/p>\n<pre># For novelty purposes only; parsing ls is a bad idea\r\nvidar@vidarholen ~\/src $ ls | sed -n 's\/-[^-]*$\/\/p'\r\nPyYAML\r\nalsa-lib\r\nbash\r\nbitlbee\r\nbsd-games\r\nlibsamplerate\r\nlinux\r\nnltk\r\npulseaudio\r\nvidar@vidarholen ~\/src $<\/pre>\n<p>Piece of cake, right?<\/p>\n<p>But think about the magic that actually happened there: We started out with four lines of coloured text, ran it through sed to search&amp;replace on each line, and ended up with nine lines of uncoloured text.<\/p>\n<p>How did sed filter the colours? How did it put each filename on a separate line, when the same does not happen for <code>echo \"foo bar\" | sed ..<\/code>?<\/p>\n<p>The answer, of course, is that it didn&#8217;t. 
<code>ls<\/code> detected that output wasn&#8217;t a terminal and altered its output accordingly.<\/p>\n<p>When outputting to a terminal, you can be fairly sure that the user will be reading it directly, so you can make it as pretty and unparsable as you want. When output is not a terminal, it&#8217;s likely going to some program or file where pretty output will just complicate things.<\/p>\n<p><strong>Life without magic<\/strong><\/p>\n<p>Try the previous example with <code>ls -C --color=always<\/code> instead of just <code>ls<\/code>, and see how different life would have been without this terminal detection. You can also try this with xargs, to see how colours could break things:<\/p>\n<pre>vidar@vidarholen ~\/src $ ls -C --color=always | xargs ls -ld\r\nls: cannot access <span style=\"color: #000080;\">PyYAML-3.09<\/span>: No such file or directory\r\nls: cannot access <span style=\"color: #000080;\">alsa-lib-1.0.23<\/span>: No such file or directory\r\n...<\/pre>\n<p>The directories obviously exist, but the ANSI escape codes that give them that cute colour also prevent utilities from working with them. For additional fun, copy-pasting this error message from a terminal strips the colours, so anyone you reported it to would be quite stumped.<\/p>\n<p><strong>Magic efficiency tricks<\/strong><\/p>\n<p>It&#8217;s not all about making output pretty or parsable depending on the situation. Read\/write syscalls are notoriously expensive; reading anything less than about 4k bytes at a time will make disk reads CPU-bound.<\/p>\n<p>glibc knows this, and will alter write buffering depending on the context. If the output is a terminal, a user is probably watching and waiting for it, so it will flush output after every line. 
If it&#8217;s a file, it&#8217;s better to buffer it up for efficiency:<\/p>\n<p><code><br \/>\nvidar@kelvin ~ $ strace -e write -o log grep God text\/bible12.txt<br \/>\n01:001:001 In the beginning God created the heaven and the earth.<br \/>\n...<br \/>\nvidar@kelvin ~ $ wc -l log<br \/>\n3948 log<br \/>\n<\/code><\/p>\n<p>In other words, grep wrote about God 3948 times (insert your own bible forum jokes).<\/p>\n<p><code><br \/>\nvidar@kelvin ~ $ strace -e write -o log grep God text\/bible12.txt &gt; tmp<br \/>\nvidar@kelvin ~ $ wc -l log<br \/>\n64 log<br \/>\n<\/code><\/p>\n<p>This time, grep produced the exact same output, but wrote to a file instead. This resulted in 64 writes \u2013 under 2% as many as in the interactive case!<\/p>\n<p><strong>Spells of confusion<\/strong><\/p>\n<p>Sometimes magic can confuse and astound. What if output is kinda like a terminal, only not?<\/p>\n<p><code>ls -l<\/code> gives the user pretty colours. <code>ls -l | more<\/code> does not. The reason is not at all obvious to users who just consider &#8220;| more&#8221; a way to scroll in output. But it works, even if it&#8217;s not as pretty as we&#8217;d like.<\/p>\n<p>Here&#8217;s a much more confusing example (just go along with the simplified grep):<\/p>\n<pre># Show Apache traffic (works)\r\ncat access.log\r\n\r\n# Show 404 errors with line numbers (works)\r\ncat access.log | grep 404 | nl<\/pre>\n<p>Basic stuff.<\/p>\n<pre># Show Apache traffic in realtime (works)\r\ntail -f access.log\r\n\r\n# Show 404 errors with line numbers in realtime (<strong>FAILS<\/strong>)\r\ntail -f access.log | grep 404 | nl<\/pre>\n<p>While the logic is the same as before, our realtime error log doesn&#8217;t show anything!<\/p>\n<p>Why? Because grep&#8217;s output isn&#8217;t a terminal, so it will buffer up about 4k worth of data before writing it all in one go. 
In the meantime, the command will just seem to hang for no apparent reason!<\/p>\n<p>(Observant readers might ask, &#8220;Isn&#8217;t tail buffering?&#8221;. And it might be or it might not. It depends on your version and distro patches.)<\/p>\n<p><strong>Mastering magic<\/strong><\/p>\n<p>OK, so what can we do to take charge of these useful peculiarities?<\/p>\n<p>Many apps have flags for this, though none of them are POSIX.<\/p>\n<p>GNU <code>ls<\/code> lets you specify <code>-C<\/code> for columned mode, and <code>--color=always<\/code> for colours, regardless of the nature of stdout.<\/p>\n<p><code>sed<\/code> has <code>-u<\/code>, and <code>grep<\/code> has <code>--line-buffered<\/code>. <code>awk<\/code> has a <code>fflush<\/code> function. <code>tail<\/code>, if yours buffers at all, has a <code>-u<\/code> since about 2008 which as of now isn&#8217;t in Debian stable.<\/p>\n<p>If your app doesn&#8217;t have such an option, there&#8217;s always <code><a href=\"http:\/\/expect.sourceforge.net\/example\/unbuffer.man.html\">unbuffer<\/a><\/code> from <a href=\"http:\/\/expect.sourceforge.net\/\">Expect<\/a>, the interactive tool scripting package.<\/p>\n<p><code>unbuffer<\/code> starts applications within its own pseudo-tty, much like xterm and sshd do. This usually tricks the application into not buffering (and maybe into prettifying its output).<\/p>\n<p>Obviously, this depends on the app using standard C stdio, or checking for a terminal itself. Apps can unintentionally be written to avoid this, like when setting Java&#8217;s System.out to a BufferedOutputStream.<\/p>\n<p>And finally&#8230; how can you create such behaviour yourself?<\/p>\n<pre>if [[ -t 1 ]] #if stdout is a terminal\r\nthen\r\n    tput setaf 3 #Set foreground to yellow\r\nfi\r\necho \"Pure gold\"<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Applications often behave differently in subtle ways when stdout is not a terminal. 
Most of the time, this is done so smoothly that the user isn&#8217;t even aware of it. When it works like magic Consider ls: vidar@vidarholen ~\/src $ ls PyYAML-3.09 bsd-games-2.17 nltk-2.0b9 alsa-lib-1.0.23 libsamplerate-0.1.7 pulseaudio-0.9.21 bash-4.0 linux tmp bitlbee-1.2.8 linux-2.6.32.8 vidar@vidarholen ~\/src $ &hellip; <a href=\"https:\/\/www.vidarholen.net\/contents\/blog\/?p=22\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Is it terminal?&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[5,4],"tags":[29,53,21],"class_list":["post-22","post","type-post","status-publish","format-standard","hentry","category-advanced-linux","category-linux","tag-buffering","tag-linux","tag-shell-script"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/posts\/22","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=22"}],"version-history":[{"count":0,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/posts\/22\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=22"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/
www.vidarholen.net\/contents\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=22"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=22"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}