Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC for unsafe blocks in unsafe fn #2585

Merged
merged 15 commits into from
Apr 29, 2020
135 changes: 135 additions & 0 deletions text/0000-unsafe-block-in-unsafe-fn.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,135 @@
- Feature Name: unsafe_block_in_unsafe_fn
- Start Date: 2018-11-04
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

No longer treat the body of an `unsafe fn` as being an `unsafe` block. To avoid
a breaking change, this is a warning now and may become an error in a future
edition.

# Motivation
[motivation]: #motivation

Marking a function as `unsafe` is one of Rust's key protections against
undefined behavior: Even if the programmer does not read the documentation,
calling an `unsafe` function (or performing another unsafe operation) outside an
`unsafe` block will lead to a compile error, hopefully followed by reading the
documentation.

However, we currently entirely lose this protection when writing an `unsafe fn`:
If I, say, accidentally call `offset` instead of `wrapping_offset`, or if I
dereference a raw pointer thinking it is a reference, this happens without any
further notice when I am writing an `unsafe fn` because the body of an `unsafe
fn` is treated as an `unsafe` block.

For example, notice how
[this PR](/~https://github.com/rust-lang/rust/pull/55043/files) significantly
increased the amount of code in the thread spawning function that is considered
to be inside an `unsafe` block.

The original justification for this behavior (according to my understanding) was
that calling this function is anyway unsafe, so there is no harm done in
allowing *it* to perform unsafe operations. And indeed the current situation
*does* provide the guarantee that a program without `unsafe` cannot be UB.
However, this neglects the other aspect of `unsafe` that I described above: To
make the programmer aware that they are treading dangerous ground even when they
may not realize they are doing so.

Using some more formal terminology, an `unsafe` block generally comes with a
Centril marked this conversation as resolved.
Show resolved Hide resolved
proof *obligation*: The programmer has to ensure that this code is actually safe
to execute in the current context, because the compiler just trusts the
programmer to get this right. In contrast, `unsafe fn` represents an
*assumption*: As the author of this function, I make some assumptions that I
expect my callees to uphold. Making `unsafe fn` also implicitly play the role
of an `unsafe` block conflates these two dual aspects of unsafety (one party
making an assumption, another party having the obligation to prove that
assumption). There is no reason to believe that the assumption made by the
`unsafe fn` is the same as the obligation incurred by unsafe operations inside
this function, and hence the author of the `unsafe fn` should better carefully
check that their assumptions are sufficient to justify the unsafe operations
they are performing. This is what an `unsafe` block would indicate.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

When you perform an unsafe operation, like dereferencing a raw pointer or
calling an `unsafe` function, you must enclose that code in an `unsafe` block.
The purpose of this is to acknowledge that the operation you are performing here
has *not* been checked by the compiler, you are responsible yourself for
upholding Rust's safety guarantees. Generally, unsafe operations come with
detailed documentation for the conditions that must be met when this operation
is executed; it is up to you to check that all these conditions are indeed met.

When you are writing a function that itself has additional conditions to ensure
safety (say, it accesses some data without making some necessary bounds checks,
or it takes some raw pointers as arguments and performs memory operations based
on them), then you should mark this as an `unsafe fn` and it is up to you to
document the conditions that must be met for the arguments.

Your `unsafe fn` will likely perform unsafe operations; these have to be
enclosed by an `unsafe` block as usual. This is the place where you have to
check that the requirements you documented for your own function are sufficient
to satisfy the conditions required to perform this unsafe operation.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

First of all, we no longer warn that an `unsafe` block is unnecessary when it is
nested immediately inside an `unsafe fn`. So, the following compiles without
any warning:

```rust
unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T {
unsafe { x.get_unchecked(i) }
}
```

However, nested `unsafe` blocks are still redundant, so this warns:

```rust
unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T {
unsafe { unsafe { x.get_unchecked(i) } }
}
```

In a next step, we have a lint that fires when an unsafe operation is performed
inside an `unsafe fn` but outside an `unsafe` block. So, this would trigger the
lint:

```rust
unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T {
x.get_unchecked(i)
}
```

This gets us into a state where programmers are much less likely to accidentally
perform undesired unsafe operations inside `unsafe fn`.

Even later, it might be desirable to turn this warning into an error.

# Drawbacks
[drawbacks]: #drawbacks

This new warning will likely fire for the vast majority of `unsafe fn` out there.
Copy link
Contributor

@oli-obk oli-obk Nov 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can start as allow by default. The fact that const unsafe fn already behaves this way and that clippy can uplift this lint to warn, will already make sure to migrate large parts of the ecosystem.

Oh... you are already mentioning that below in the unresolved questions...

Copy link
Member Author

@RalfJung RalfJung Nov 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had to follow the RFC structure, didn't I? ;)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other possible drawbacks to list:

  1. It will become less ergonomic to write unsafe code (it's justified I think, but worth mentioning...).

  2. People might just do this:

unsafe fn frobnicate(x: T, y: U, ...) -> R {
    unsafe {
        ... // Actual code.
    }
}

and then nothing has been gained. I don't know what the risk of this is, but worth mentioning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added (a variant of) 1.

For 2., I think something has been gained: It is not possible to incrementally improve this function's unsafe locality. Or maybe it is not worth it, then that has at least been explicitly documented in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm not entirely sure 2. is a drawback or not; I usually try to write the section as what someone else might think is a potential drawback (but not necessarily me) -- i.e. this is the section where I try to bring out my inner Dr. Phil / empathy =P

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The drawbacks section now says

Many unsafe fn are actually rather short (no more than 3 lines) and will
likely end up just being one large unsafe block. This change would make such functions less ergonomic to write.

That should cover 2, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RalfJung the concern is actually slightly different here; it is that people will just go and write one big unsafe { ... } when they shouldn't.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that already possible today?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mark-i-m yes; sure -- the concern is that the change we do here might not have any noticable effect cause people could be lazy and...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be good that people can explicitly choose to not write a big unsafe block instead of being forced into one automatically?

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

I explained the rationale in the motivation section.

The alternative is to not do anything, and live with the current situation.
RalfJung marked this conversation as resolved.
Show resolved Hide resolved

# Prior art
[prior-art]: #prior-art

None that I am aware of: Other languages do not have `unsafe` blocks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C# has unsafe blocks in addition to unsafe methods. Though it's not super helpful since I'm not aware of the C# community ever discussing burden of proof issues like this RFC does, probably because 99.99% of the time the answer in that language is "unsafe just isn't worth it". I couldn't even find a C# style guide that mentions the existence of unsafe code, much less has guidelines for making it less dangerous.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Centril But this RFC is specifically about blocks and nesting of unsafe escape hatches. I do not think any of the examples you mention apply there, do they?

@Ixrec Thanks, I had no idea! And it looks like unsafe operations can be used freely in unsafe functions. :/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RalfJung not with that specificity no; the languages have the "block" form, e.g.:

x = unsafePerformIO $ do
    foo
    bar
    ...

what they lack is the unsafe function form.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's still quite different from Rust. It's just a normal function to the compiler, no checks for "unsafe operations" or so are performed. I do not see a close enough relation to this RFC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RalfJung alright; fair enough. Let's leave this bit (the comment thread) open for interested readers who want to see the associated material linked. :)


# Unresolved questions
[unresolved-questions]: #unresolved-questions

Should this lint be in clippy first before in becomes warn-by-default in rustc,
to avoid a huge flood of warnings showing up at once? Should the lint ever
become a hard error (on newer editions), or remain a warning indefinitely?
Centril marked this conversation as resolved.
Show resolved Hide resolved