Skip to main content

On the Unreliability of Programs

<div class="abstract">
<blockquote>
<p>I have never written an equation or line of code that I was 100% confident of, or which I thought had less than a 1-in-trillions chance of it being wrong in some important way. Software &amp; real-world systems are too complex &amp; fragile.</p>
<p>Every part of my understanding, the hardware, or the real-world context is less reliable than 1-in-trillions.</p>
<p>Let’s consider potential problems with our understanding of even the most trivial seeming arithmetic comparison checking that ‘<em>x</em> + <em>x</em> = 2<em>x</em>’…</p>
</blockquote>
</div>
<div class="epigraph">
<blockquote>
<p>Numbers that fool the <a href="!W">Fermat primality test</a> are called <a href="!W"><em>Carmichael numbers</em></a>, and little is known about them other than that they are extremely rare. There are 255 Carmichael numbers below 100,000,000…In testing primality of very large numbers chosen at random, the chance of stumbling upon a value that fools the Fermat test is less than the chance that <a href="!W">cosmic radiation</a> will cause the computer to <a href="!W" title="Soft error#Cosmic rays creating energetic neutrons and protons">make an error</a> in carrying out a ‘correct’ algorithm.</p>
<p>Considering an algorithm to be inadequate for the first reason but not for the second illustrates the difference between mathematics and engineering.</p>
<a href="!W">Hal Abelson</a> &amp; <a href="!W">Gerald Sussman</a> (<a href="!W"><em>Structure And Interpretation of Computer Programs</em></a>)
</div>
</blockquote>
<p>Consider a simple-seeming line of conditional code for the arithmetical tautology: <code class="sourceCode haskell">x <span class="op">+</span> x <span class="op">==</span> <span class="dv">2</span><span class="op">*</span>x</code>. How could this <em>possibly</em> ever go wrong? Well…</p>
<ul>
<li><p>Where did you initialize <em>x</em>? Was it ever initialized to a non-null value?</p>
<p>(Or has it been <a href="https://research.swtch.com/openssl">working <em>accidentally</em></a> because it uses uninitialized memory which <a href="https://jsomers.net/blog/gettiers">just happened</a> to have a workable value?)</p></li>
<li><p>Is this <a href="!W" title="Evaluation strategy#Strict binding strategies">comparison by</a> reference, equality, hash, or some other way entirely?</p></li>
<li><p>Which integer type is this? Does that integer type <a href="!W" title="Integer overflow">overflow</a>?</p>
<ul>
<li><p>In some languages, <em>x</em> might be a string being parsed as a number. (JavaScript is infamous for this due to its type coercion and redefining operators; this will evaluate to <code class="sourceCode javascript"><span class="kw">true</span></code>: <code class="sourceCode javascript">x <span class="op">=</span> <span class="st">&quot;1&quot;</span><span class="op">;</span> <span class="dv">2</span><span class="op">*</span>x <span class="op">==</span> <span class="dv">2</span> <span class="op">&amp;&amp;</span> x <span class="op">+</span> x <span class="op">==</span> <span class="st">&quot;11&quot;</span><span class="op">;</span></code>.)
<li><p>In highly dynamic or object-oriented languages, <code>+</code>, <code>==</code>, and <code>*</code> could all have been redefined per <em>x</em> and mean… just about anything, and do anything as side-effects of methods like getters.</p></li>
<li><p><a href="!W">homoglyph attack</a>: ‘x’ might not be ‘х’… because the second one is actually a different letter (CYRILLIC SMALL LETTER HA)</p>
<ul>
<li><a href="!W">“Trojan Source”</a> attack: OK, you are sure ‘x’ is in fact ‘x’; but are you reading the <em>right</em> ‘x’? What you think is a line of source code might actually be a commented out line, and the comment is the source code, due to Unicode right-to-left/bidirectional characters</li>
</ul></li>
</ul></li>
<li><p>Does multiplying integers like this potentially trigger <a href="!W">undefined behavior</a> and arbitrary compiler ‘optimizations’?</p>
<ul>
<li>If it can never overflow because it’s a “big int” with <a href="!W">arbitrary-precision arithmetic</a>, how much RAM does this allocate? What happens if the result is larger than fits in RAM? (How would evaluation order like <a href="!W" title="Lazy evaluation">laziness</a> affect this?)</p>
<p>How much do you know about your big-integer library to begin with? (They <em>do</em> have bugs, like all software.)</li>
</ul></li>
<li><p>If this is <a href="!W">floating point</a> (do you know for
sure?), won’t this usually be <em>false</em> at larger/smaller
numbers?</p>
<ul>
<li>What about floating point rounding or other exotic modes?</li>
<li>Or multiplying special values like <a href="!W">NaN</a> or +Inf vs −Inf?</li>
<li>If you know about all this and really did want that… how sure are you that <em>the compiler</em> isn’t incorrectly equationally-reasoning that they are equal and rewriting them behind your back?</li>
<li>Did the compiler optimization away this check entirely because “obviously” it is true?</li>
</ul></li>
<li><p>What is the <a href="!W" title="Order of operations#Programming languages">operator precedence</a> of this code?</p>
<ul>
<li>By the way, are you <em>sure</em> it’s a conditional at all? Perhaps it was parsed as <code class="sourceCode haskell">(x <span class="op">+</span> x <span class="op">==</span> <span class="dv">2</span>) <span class="op">*</span> x</code>?</li>
<li>Preprocessor/macro-expansion problems: In C, <code class="sourceCode c">x</code> might be a macro that expands to an expression with side effects, evaluated a different number of times on each side. Or <code class="sourceCode c"><span class="dv">2</span><span class="op">*</span>x</code> might lack parenthesizing: <code class="sourceCode c"><span class="pp">#define x </span><span class="dv">1</span><span class="op">+</span><span class="dv">1</span></code> makes <code class="sourceCode c"><span class="dv">2</span><span class="op">*</span>x</code> evaluate to 3.</li>
</ul></li>
<li><p>What is the evaluation order of this code?</p></li>
<li><p>This is serial-threaded code, right? No parallelism anywhere? If there is…</p>
<ul>
<li><p>Signal handlers can interrupt your code and change stuff. So your code is de facto concurrent anyway. Oh, but you solved that? So there’s no more parallelism.</p></li>
<li><p>Trick question: you thought there wasn’t, but there was anyway because <em>all</em> systems are inherently parallel now. So there are dangers around cache coherency &amp; races, leading to many classes of attacks/errors like <a href="!W"
title="Spectre (security vulnerability)">Spectre</a>.</p>
<p>And <em>x</em> here can change: the direct way of computing it would involve at least 5 values being stored &amp; referenced somewhere. (The 3 written <em>x</em> in the equation, then the sum of two, and then the multiplied version.)</p></li>
</ul></li>
<li><p>Are you running the right code at all? Maybe you’re running a cached binary, the build system didn’t rebuild the code, you are in the wrong working directory or the wrong VCS branch…</p></li>
<li><p>How likely is your computation to be corrupted or subverted by an attacker doing something like a <a href="!W">buffer overflow</a> attack or a <a href="!W">row hammer</a> attack, which winds up clobbering your <em>x</em>?</p></li>
<li><p>What happens if the computer halts or freezes or is DoSed or the power goes out halfway through the computation?</p></li>
<li><p>Why do you believe the hardware will always store &amp; execute everything correctly?</p>
<ul>
<li><p>“Neo, you think that’s real hardware you’re running on?” And VMs can be buggy and implement stuff like <a
href="https://bugs.launchpad.net/qemu/+bug/1815423">floating point</a> incorrectly, of course.</p></li>
<li><p>What are the odds that the hardware will be hit by a cosmic ray during any of these operations? Even <a href="!W">ECC RAM</a> is increasingly unreliable.</p></li>
<li><p>Or that your RAM has a permanent fault in it?</p>
  <p>(For several years, compiling the Gwern.net website would occasionally result in strange segfaults in apparently correct old stable regexp code; this turned out to be a bad RAM chip... where ordinary computer use simply didn’t stress RAM enough to crash noticeably often.)</p></li>
<li><p>What are the odds that the <a href="/doc/cs/hardware/2021-hochschild.pdf#google" title="‘Cores that don’t count’, Hochschild et al 2021">CPU core in question</a> is <em>sometimes</em> unable to add <em>or</em> multiply correctly? (If you’re a hyperscaler, they exist in your fleet of servers somewhere!)</p></li>
<ul>
<li>What are the odds you will discover another <a href="!W">Pentium FDIV bug</a>?</li>
</ul></li>
<li><p>How do you know all instances of <em>x</em> were never <a href="https://arxiv.org/abs/2102.11245#facebook" title="‘Silent Data Corruptions at Scale’, Dixit et al 2021">corrupted</a> <a href="https://arxiv.org/abs/2203.08989#facebook" title="‘Detecting silent data corruptions in the wild’, Dixit et al 2022"><em>anywhere</em></a> during storage or transmission?</p></li>
</ul></li>
</ul>
<p>I can safely say that in my programming life, I have written many fewer than trillions of lines of code, and I have made many more of these errors than 0.</p>
<p>So I infer that for even the simplest-seeming code, I am unable to write code merely as reliable as a 1-in-trillions error rate.</p>
<p>(See also: <a href="/esp-hacking">"Hacking Smartphone ESP Apps"</a>, <a href="/blog/2025/pinball-hacking" title="‘Hacking Pinball High Scores’, Gwern 2025">"Pinball High Scores"</a>, <a href="/unseeing">"On Seeing Through and Unseeing: The Hacker Mindset"</a>.)</p>

<div class='text-center' id='return-to-blog-index-link'>[<a href='/blog/index' class='link-page link-tag directory-indexes-upwards ' data-link-icon='arrow-up-left' data-link-icon-type='svg' rel='tag' title='Link to blog directory'>Return to blog index</a>]</div>