{"id":1035,"date":"2021-04-12T00:08:48","date_gmt":"2021-04-12T00:08:48","guid":{"rendered":"http:\/\/www.vidarholen.net\/contents\/blog\/?p=1035"},"modified":"2024-10-04T18:17:33","modified_gmt":"2024-10-04T18:17:33","slug":"what-exactly-was-the-point-of-xvar-xval","status":"publish","type":"post","link":"https:\/\/www.vidarholen.net\/contents\/blog\/?p=1035","title":{"rendered":"What exactly was the point of [ &#8220;x$var&#8221; = &#8220;xval&#8221; ]?"},"content":{"rendered":"\n<div class=\"wp-block-jetpack-markdown\"><p>In shell scripting you sometimes come across comparisons where each value is prefixed with &quot;x&quot;. Here are some examples from GitHub:<\/p>\n<pre><code>if [ &quot;x${JAVA}&quot; = &quot;x&quot; ]; then\nif [ &quot;x${server_ip}&quot; = &quot;xlocalhost&quot; ]; then\nif test x$1 = 'x--help' ; then\n<\/code><\/pre>\n<p>I&#8217;ll call this the x-hack.<\/p>\n<p>For any <a href=\"https:\/\/pubs.opengroup.org\/onlinepubs\/9699919799\/utilities\/test.html\">POSIX compliant shell<\/a>, the value of the x-hack is exactly zero: this comparison works without the <code>x<\/code> 100% of the time. But why was it a thing?<\/p>\n<p>Online sources like <a href=\"https:\/\/stackoverflow.com\/questions\/174119\/why-do-shell-script-comparisons-often-use-xvar-xyes\">this stackoverflow Q&amp;A<\/a> are a little handwavy, saying it&#8217;s an alternative to quoting (most definitely NOT the case!), pointing towards issues with &quot;some versions&quot; of certain shells, or generally cautioning against the mystic behaviors of especially ancient Unix system without concrete examples.<\/p>\n<p>To determine whether or not <a href=\"https:\/\/www.shellcheck.net\">ShellCheck<\/a> should warn about this, and if so, what its <a href=\"https:\/\/www.shellcheck.net\/wiki\/SC2268\">long form rationale<\/a> should be, I decided to dig into the history of Unix with the help of <a href=\"https:\/\/www.tuhs.org\/\">The Unix Heritage Society<\/a>&#8216;s archives. I was unfortunately unable to peer into the closely guarded world of the likes of HP-UX and AIX, so dinosaur herders beware.<\/p>\n<p>These are the cases I found that can fail.<\/p>\n<h3>Left-hand side matches a unary operator<\/h3>\n<p>The AT&amp;T Unix v6 shell from 1973, at least as found in PWB\/UNIX from 1977, would fail to run test commands whose left-hand side\nmatched a unary operator. This must have been immediately obvious to anyone who tried to check for command line parameters:<\/p>\n<pre><code>% arg=&quot;-f&quot;\n% test &quot;$arg&quot; = &quot;-f&quot;\nsyntax error: -f\n% test &quot;x$arg&quot; = &quot;x-f&quot;\n(true)\n<\/code><\/pre>\n<p>This was fixed in the AT&amp;T Unix v7 Bourne shell builtin in 1979. However, <code>test<\/code> and <code>[<\/code> were also available as separate\nexecutables, and appear to have retained a variant of the buggy behavior:<\/p>\n<pre><code>$ arg=&quot;-f&quot;\n$ [ &quot;$arg&quot; = &quot;-f&quot; ]\n(false)\n$ [ &quot;x$arg&quot; = &quot;x-f&quot; ]\n(true)\n<\/code><\/pre>\n<p>This happened because the utility used a simple recursive descent parser without backtracking, which gave unary operators precedence over binary operators and ignored trailing arguments.<\/p>\n<p>The &quot;modern&quot; Bourne shell behavior was copied by the Public Domain KornShell in 1988, and made part of POSIX.2 in 1992. GNU Bash 1.14 did the same thing for its builtin <code>[<\/code>, and the GNU shellutils package that provided the external <code>test<\/code>\/<code>[<\/code> binaries followed POSIX, so the early GNU\/Linux distros like SLS were not affected, nor was FreeBSD 1.0.<\/p>\n<p>The x-hack is effective because no unary operators can start with <code>x<\/code>.<\/p>\n<h3>Either side matches string length operator <code>-l<\/code><\/h3>\n<p>A similar issue that survived longer was with the string length operator <code>-l<\/code>. Unlike the normal unary predicates, this one was only parsed as part as part of an operand to binary predicates:<\/p>\n<pre><code>var=&quot;helloworld&quot;\n[ -l &quot;$var&quot; -gt 8 ] &amp;&amp; echo &quot;String is longer than 8 chars&quot;\n<\/code><\/pre>\n<p>It did not make it into POSIX because, as the rationale puts it, &quot;it was undocumented in most implementations, has been removed from some implementations (including System V), and the functionality is provided by the shell&quot;, referring to <code>[ ${#var} -gt 8 ]<\/code>.<\/p>\n<p>It was not a problem in UNIX v7 where <code>=<\/code> took precedence, but Bash 1.14 from 1996 would parse it greedily up front:<\/p>\n<pre><code>$ var=&quot;-l&quot;\n$ [ &quot;$var&quot; = &quot;-l&quot; ]\ntest: -l: binary operator expected\n$ [ &quot;x$var&quot; = &quot;x-l&quot; ]\n(true)\n<\/code><\/pre>\n<p>It was also a problem on the right-hand side, but only in nested expressions.\nThe <code>-l<\/code> check made sure there was a second argument, so you would need an\nadditional expression or parentheses to trigger it:<\/p>\n<pre><code>$ [ &quot;$1&quot; = &quot;-l&quot; -o 1 -eq 1 ]\n[: too many arguments\n$ [ &quot;x$1&quot; = &quot;x-l&quot; -o 1 -eq 1 ]\n(true)\n<\/code><\/pre>\n<p>This operator was removed in Bash 2.0 later that year, eliminating the problem.<\/p>\n<h3>Left-hand side is <code>!<\/code><\/h3>\n<p>Another issue in early shells was when the left-hand side was the negation operator <code>!<\/code>:<\/p>\n<pre><code>$ var=&quot;!&quot;\n$ [ &quot;$var&quot; = &quot;!&quot; ]\ntest: argument expected            (UNIX v7, 1979)\ntest: =: unary operator expected   (bash 1.14, 1996)\n(false)                            (pd-ksh88, 1988)\n$ [ &quot;x$var&quot; = &quot;x!&quot; ]\n(true)\n<\/code><\/pre>\n<p>Again, the x-hack is effective by preventing the <code>!<\/code> from being recognized as a negation operator.<\/p>\n<p>ksh treated this the same as <code>[ ! &quot;=&quot; ]<\/code>, and ignored the rest of the arguments. This quiety returned false, as <code>=<\/code> is not a null string.\nKsh continues to ignore trailing arguments to this day:<\/p>\n<pre><code>$ [ -e \/ random words\/ops here ]\n(true)                              (ksh93, 2021)\nbash: [: too many arguments         (bash5, 2021)\n<\/code><\/pre>\n<p>Bash 2.0 and ksh93 both fixed this problem by letting <code>=<\/code> take precedence in the 3-argument case, in accordance with POSIX.<\/p>\n<h3>Left-hand side is &quot;(&quot;<\/h3>\n<p>This is by far my favorite.<\/p>\n<p>The UNIX v7 builtin failed when the left-hand side was a left-parenthesis:<\/p>\n<pre><code>$ left=&quot;(&quot; right=&quot;(&quot;\n$ [ &quot;$left&quot; = &quot;$right&quot; ]\ntest: argument expected\n$ [ &quot;x$left&quot; = &quot;x$right&quot; ]\n(true)\n<\/code><\/pre>\n<p>This happens because the <code>(<\/code> takes precedence over the <code>=<\/code>, and becomes an invalid parenthesis group.<\/p>\n<p>Why is this my favorite? Behold Dash 0.5.4 up until 2009:<\/p>\n<pre><code>$ left=&quot;(&quot; right=&quot;(&quot;\n$ [ &quot;$left&quot; = &quot;$right&quot; ]\n[: 1: closing paren expected\n$ [ &quot;x$left&quot; = &quot;x$right&quot; ]\n(true)\n<\/code><\/pre>\n<p>That was an active bug when the StackOverflow Q&amp;A was posted.<\/p>\n<p>But wait, there&#8217;s more!<\/p>\n<p>Here&#8217;s Zsh in <a href=\"https:\/\/github.com\/zsh-users\/zsh\/commit\/67877f60552019226e93f56b108f7b61a60ea11b\"><strong>late 2015<\/strong><\/a>, right before version 5.3:<\/p>\n<pre><code>% left=&quot;(&quot; right=&quot;)&quot;\n% [ &quot;$left&quot; = &quot;$right&quot; ]\n(true)\n% [ &quot;x$left&quot; = &quot;x$right&quot; ]\n(false)\n<\/code><\/pre>\n<p>Amazingly, the x-hack could be used to work around certain bugs all the way up until 2015, seven years after StackOverflow wrote it off as an archaic relic of the past!<\/p>\n<p>The bugs are of course increasingly hard to come across. The Zsh one only triggers when comparing left-paren against right-paren, as otherwise the parser will backtrack and figure it out.<\/p>\n<p>Another late holdout was Solaris, whose \/bin\/sh was the legacy Bourne shell as\nlate as Solaris 10 in 2009. However, this was undoubtedly for compatibility, and\nnot because they believed this was a viable shell. A &quot;standards compliant&quot; shell\nhad been an option for a long time before Solaris 11 dragged it kicking and screaming\ninto 21th century &#8212; or at least into the 90s &#8212; by switching to ksh93 by default in 2011.<\/p>\n<p>In all cases, the x-hack is effective because it prevents the operands from being recognized as parentheses.<\/p>\n<h3>Conclusion<\/h3>\n<p>The x-hack was indeed useful and effective against several real and practical\nproblems in multiple shells.<\/p>\n<p>However, the value was mostly gone by the mid-to-late 1990s, and the few\nremaining issues were cleaned up before 2010 &#8212; shockingly late, but still over a\ndecade ago.<\/p>\n<p>The last one managed to stay until 2015, but only in the very specific case of\ncomparing opening parenthesis to a closed parenthesis in one specific non-system shell.<\/p>\n<p>I think it&#8217;s time to retire this idiom, and ShellCheck\n<a href=\"https:\/\/github.com\/koalaman\/shellcheck\/commit\/5669eb22037980c5a6b74b0d420cb452990bcf88\">now offers<\/a> a style suggestion by default.<\/p>\n<h3>Epilogue<\/h3>\n<p>The Dash issue of <code>[ &quot;(&quot; = &quot;)&quot; ]<\/code> was originally reported in a form that affected both Bash 3.2.48 and Dash 0.5.4 in 2008. You can still see this on macOS bash today:<\/p>\n<pre><code>$ str=&quot;-e&quot;\n$ [ \\( ! &quot;$str&quot; \\) ]\n[: 1: closing paren expected     # dash\nbash: [: `)' expected, found ]   # bash\n<\/code><\/pre>\n<p>POSIX fixes all these ambiguities for up to 4 parameters, ensuring that shells conditions work the same way, everywhere, all the time.<\/p>\n<p>Here&#8217;s how Dash maintainer Herbert Xu put it <a href=\"https:\/\/git.kernel.org\/pub\/scm\/utils\/dash\/dash.git\/commit\/?id=4df1e776cd079357f877f0d491a80a234f670452\">in the fix<\/a>:<\/p>\n<pre><code>\/*\n * POSIX prescriptions: he who wrote this deserves the Nobel\n * peace prize.\n *\/\n<\/code><\/pre>\n<\/div>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In shell scripting you sometimes come across comparisons where each value is prefixed with &quot;x&quot;. Here are some examples from GitHub: if [ &quot;x${JAVA}&quot; = &quot;x&quot; ]; then if [ &quot;x${server_ip}&quot; = &quot;xlocalhost&quot; ]; then if test x$1 = &#8216;x&#8211;help&#8217; ; then I&#8217;ll call this the x-hack. For any POSIX compliant shell, the value of &hellip; <a href=\"https:\/\/www.vidarholen.net\/contents\/blog\/?p=1035\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;What exactly was the point of [ &#8220;x$var&#8221; = &#8220;xval&#8221; ]?&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[5,4],"tags":[60,21,46,38],"class_list":["post-1035","post","type-post","status-publish","format-standard","hentry","category-advanced-linux","category-linux","tag-history","tag-shell-script","tag-shellcheck","tag-unix"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1035","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1035"}],"version-history":[{"count":24,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1035\/revisions"}],"predecessor-version":[{"id":1156,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1035\/revisions\/1156"}],"wp:attachment":[{"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1035"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1035"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.vidarholen.net\/contents\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1035"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}