Skip to content

Latest commit

 

History

History
233 lines (167 loc) · 7.91 KB

arith-overflow-buglist.md

File metadata and controls

233 lines (167 loc) · 7.91 KB

List of Bugs uncovered in Rust via arithmetic overflow checking

This document is a list of bugs that were uncovered during the implementation and deployment of arithmetic overflow checking.

This list is restricted solely to legitimate bugs. Cases where the overflow was benign (e.g. the computed value is unused), transient (e.g. the computed wrapped value is guaranteed to be brought back into the original range, such as in unsigned - 1 + provably_positive), or silly (random non-functional code in the tests or documentation) are not included in the list.

However, extremely rare or obscure corner cases are considered legitimate bugs. (We begin with such a case.)

impl core::iter::RandomAccessIter for core::iter::Rev

if one calls the iter.idx(index) with index <= amt, then it calls the wrapped inner iterstor with a wrapped around value. The contract for idx does say that it does need to handle out-of-bounds inputs, so this appeared benign at first, but there is the corner case of an iterator that actually covers the whole range of indices, which would then return Some(_) here when (pnkfelix thinks) None should be expected.

reference: rust-lang/rust#22532 (comment)

std::sys::windows::time::SteadyTime

fn ns was converting a tick count t to nanoseconds via the computation t * 1_000_000_000 / frequency(); but the multiplication there can overflow, thus losing the high-order bits.

Full disclosure: This bug was known prior to landing arithmetic overflow checks, and filed as:

rust-lang/rust#17845

Despite being filed, it was left unfixed for months, despite the fact that the overflow would start occurring after 2 hours of machine uptime, according to:

rust-lang/rust#22788

pnkfelix included it on this list because having arithmetic overflow forces such bugs to be fixed in some manner rather than ignored.

std::rt::lang_start

The runtime startup uses a fairly loose computation to determine the stack extent to pass to record_os_managed_stack_bounds (which sets up guard pages and fault handlers to deal with call stack over- or underflows).

In this case, the arithmetic involved was actually overflowing, in this calculation:

let top_plus_20k = my_stack_top + 20000;

pnkfelix assumes that in practice this would lead to us attempting to install a guard page starting from some random location, rather than the actual desired address range. While the lack of a guard page in the right spot is probably of no consequence here (assuming that the OS is already going to stop us from actually attempting to write to stack locations resulting from overflow if that ever occurs), attempting to install a guard page on a random unrelated address range seems completely bogus. pnkfelix only observed this bug when building a 32-bit Rust on a 64-bit Linux host via cross-compilation.

So, probably qualifies a rare bug. reference:

rust-lang/rust#22532 (comment)

UPDATE: In hindsight, one might argue this should be reclassified as a transient overflow, because the whole computation in context is:

let my_stack_bottom =
    my_stack_top + 20000 - OS_DEFAULT_STACK_ESTIMATE;

where OS_DEFAULT_STACK_ESTIMATE is a large value (> 1mb).

However, my claim is that this code is playing guessing games; do we really know that the stack is sufficiently large that the computation above does not underflow?

So pnkfelix is going to leave it on this list, at least for now. (pnkfelix subsequently changed the code to use saturated arithmetic in both cases, though obviously that could be tweaked a bit.)

struct order of evaluation

There is an explanatory story here:

rust-lang/rust#23112

In short, one of our tests was quite weak and not actually checking the computed values. But arithmetic-overflow checking immediately pointed out an attempt to reserve a ridiculous amount of space within a Vec. (This was on an experimental branch of the codebase where we would fill with a series of 0xC1 bytes when a value was dropped, rather than filling with 0x00 bytes.)

It is actually quite likely that this test would still have failed without the arithmetic overflow checking, but it probably would have been much harder to diagnose since the panic would have happened at some arbitrary point later in the control flow.

Invalid Span constructions:

rust-lang/rust#23480

Experts in the syntax system probably already knew about this bug, but overflow checking forced everyone to be reminded of it, repeatedly, in:

rust-lang/rust#23115

See in particular:

rust-lang/rust#23115 (comment)

We have not actually fixed issue #23480 at the time of this writing. Instead we put in a workaround:

rust-lang/rust#23489

Overflow in benchmark iteration increment

The implementation of #[bench] was not prepared to deal with an overflow in the iteration count.

The arithmetic overflow checking forced that to fail on a benchmark like

#[bench]
fn foo(b: &mut test::Bencher) {}

rather than return bogus results.

The iteration count is stored in a u64, so overflow sounds unlikely -- even though n is being multiplied by 2 on every increment, so it sounds plausible that we could get to very large numbers indeed, the fact that we have to actually run that many iterations is what gives me pause.

At this point I am assuming that the compiler is doing something to optimize away the looping when the benchmark code is empty: keep in mind that the #[bench] code above is not calling the black_box method of b. Maybe we should be requring that?

Anyway, the chosen fix was to just return the results so far before hitting the overflow:

rust-lang/rust#23127

multiline error spans

This seems like a combination of a typo (or "think-o" perhaps) doing lo - hi instead of hi - lo, but also a fencepost error.

rust-lang/rust#31281

Type checking of tuple patterns

rust-lang/rust#33864 (comment)

/~https://github.com/rust-lang/rust/pull/33864/commits/0ca9bf394006fe635a4a76ca6fa78a70633666f0

This code passed in an expected_len rather than max_len, a logic error, and the arithmetic checking code caught it. (I'm not quite clear on whether there is a unit test that also could have caught it independently? It sounds like if such a test exists, it was not part of the test suite...)

Underflow when counting registers

rust-lang/rust#29091

Apparently our C ABI code for x86_64 was determining where an argument needed to be passed in memory or in a register via some logic that computes the number of available registers via cumulative subtraction from a running total, and compares the number available to how many registers we need. If available is less than the the number needed, we know its passed in memory.

The problem was that it originally used an unsized integer type for the accumulated number available, and thus if underflowed, we would stop indicating that those values needed to be passed via memory.

(Definitely seems like a legit and subtle bug to me.)

rustc session error count

This was a simple logic error that produced bogus values for the accumulated error count. The underflow checker caught it, our compile-fail/ test suite at some point started hitting it (but only on debug builds, which at that time were not tested by bors/homu I guess).

rust-lang/rust#31257

Servo /dom/html/reflection-embedded.html

I am not an expert on the CSS specification, but my intuition is that this is a legitimate bug (which was uncovered by the overflow checker):

servo/servo#11175

servo/servo#10544 (comment)