Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autoreferencing Copy Types #2111

Closed
wants to merge 5 commits into from
Closed
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
200 changes: 200 additions & 0 deletions text/0000-autoref-copy.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,200 @@
- Feature Name: autoref_copy
- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

This RFC allows autoreferencing of `T` where `T: Copy`.

Example:

```rust
[derive(Copy, Clone)]
struct S(u8);

fn by_val(_: S) {}
fn by_ref(_: &S) {}
fn by_mut_ref(_: &mut S) {}

fn main() {
let s = S(5);
by_val(s);
by_ref(s); // This works after this RFC.
// by_mut_ref(s); // ERROR -- expected `&mut S`, found `S`
}
```

# Motivation
[motivation]: #motivation

When working with `Copy` data, the distinction between borrowed and owned data
is often unimportant. However, generic code often results in code which will
only accept a particular variant of a `Copy` type (`T`, `&T`, `&&T`, ...).
This is a frustration in many places:

```rust
// Comparisons:
let v = vec![0, 1, 2, 3];
v.iter().filter(|x| x > 1); // ERROR: expected &&_, found integral variable
// These work:
0 > 1;
&0 > &1;
// But these don't:
&0 > 1;
0 > &1;

// Trait instantiations:
let mut map: HashMap::new();
map.insert(0, "Hello");
map.insert(1, "world!");
map[0]; // ERROR: expected &{integer}, found integral variable
// or
map.get(1); // ERROR: expected `&{integer}`, found integral variable

// Numeric operators:
// These work:
&0 + &1;
0 + &1;
&0 + 1;
// But these don't:
&&0 + 1;
&&0 + &1;
&&0 + &&1;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RFC should note that the proposed coercion will not make the examples with binary operations compile. Or the examples should be removed, they don't really motivate this RFC.

```

These interactions confuse both new and experienced users without providing
any significant value. It's clear what's intended by `map.get(1)` or
`vec.iter().filter(|x| x > 1)`. When users encounter these errors in practice,
the only reasonable thing they can do is add `&` and `*` as necessary to make
their code compile.

This RFC seeks to address one particular variant of this problem: passing
owned `Copy` data where a reference was expected.

Example:

```rust
use std::collections::HashMap;
fn main() {
let mut map = HashMap::new();
map.insert(1, "hello!");

println!("{:?}", map.get(1)); // ERROR: expected `&{integer}`

// Instead, users have to write this:
println!("{:?}", map.get(&1));
}
```

This is an unnecessary frustration. This RFC proposes to prevent this issue
by auto-referencing `Copy` types.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

When calling a function which expects a reference to a `Copy` type, you can
pass an owned `Copy` type and a reference will be created automatically:

```rust
fn print_num(x: &usize) { println!("{:?}", x); }

fn main() {
let x = 5;
print_num(x);
}
```

This is particularly convenient when working with some generic types whose
methods take references as arguments:

```rust
use std::collections::HashMap;
fn main() {
let mut map = HashMap::new();
map.insert(1, "hello!");

// We can write this:
println!("{:?}", map.get(1));

// Instead of having to write this:
println!("{:?}", map.get(&1));
}
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

`Copy` types will autoreference: at
[coercion sites](/~https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md#coercions),
when a reference is expected, but an owned
`Copy` type is found, an `&` will be automatically inserted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this will work?

trait Trait {}
impl Trait for u8 {}

let x: &Trait = 0u8;
// A reference to `Trait` is expected and `u8: Copy` is provided (but `u8 != Trait`) => a `&` is still inserted
// let x: &Trait = &0u8;
// Now coercion `&u8 -> &Trait` can apply.
// Nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my answer below, I would expect this to work since coercions are supposed to apply transitively.


If the `Copy` value is a temporary, its lifetime will be promoted to meet the
required lifetime of the reference (where possible).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this sentence at odds with the behavior specific in RFC 66 etc. Specifically, the sentence as written suggests to me that we will infer the appropriate lifetime for the temporary, which would be quite at odds with the rest of the language. I think I would say it more like this:

If the value to be borrowed is an rvalue (e.g., a constant like 1), then a temporary will be created for it, just as if one explicitly wrote the & (e.g., &1). This temporary will have the same lifetime that would result from an explicit &, which generally means that it lives until the end of the innermost enclosing statement (until the next ;) -- but in some cases it may live longer. See RFC 66 and @nikomatsakis's amendment for more details.

Note that a consequence of this is that in some cases there will be implicit borrows that might be surprising. An example:

fn foo(x: &i32) -> &i32 { x }

let mut x = 22
let p = foo(x); // implicitly: borrows `x`
x += 1; // ERROR -- `x` is borrowed
print(p);

This behavior is specified in detail in
[RFC 66](/~https://github.com/rust-lang/rust/issues/15023) and @nikomatsakis's
[amendment](/~https://github.com/nikomatsakis/rfcs/blob/rfc66-amendment/text/0066-better-temporary-lifetimes.md).

When it is not possible to promote the `Copy` value's lifetime to the lifetime
required by the reference, a customized error will be issued:

```rust
struct u8Ref<'a>(&'a u8);

fn new_static(x: u8) -> u8Ref<'static> {
u8Ref(x) // ERROR: borrowed value does not live long enough
// ^autoreference occurs here, but `x` does not live long enough
}
```

If the lifetime of the reference would overlap with a mutable reference or a
mutation of the referenced value, a custom error will be issued:

```rust
struct u8Ref<'a>(&'a u8);

fn main() {
let mut x = 5;
let y = u8Ref(x);
Copy link
Contributor

@oli-obk oli-obk Aug 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this rather unexpected. I thought this RFC meant that Copy types are copied at the site and create a reference to the copy. That is how such a function call looks to me, I'm moving the value into the function, but since it's Copy, it will be copied and I retain the original.

Copy link

@repax repax Aug 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the owner of x, you would have to trust u8Ref(x) not to mutate x.
(sorry for the misunderstanding)

That x is borrowed is indeed surprising.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@repax: this RFC only allows coercion from T -> &T, not &mut T, so u8Ref can't mutate x.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the lifetime of the borrow was limited to a function call I think I would be less confused:

    let mut x = 5;
    let y = foo(x);
    //          ^ autoreference of `x` occurs here
    x = 7; // `y` is unrelated to `x` so this is ok

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cramertj is there a guarantee that we'll never get a Copy-able UnsafeCell like wrapper?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rust-lang/rust#25053 was active not that long ago, seems there's at least some support for making UnsafeCell implement Copy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the copy-ability of UnsafeCell is a somewhat distinct question. The last time we discussed it in depth, the conclusion was that we ought to address with a lint. i.e., we should have the ability to make things copy, but warn if they are actually copied. =) I still think this makes sense, and it would apply here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am torn about whether to treat this auto-ref as borrowing the actual variable. I too find it somewhat surprising; I think I might prefer that we always create a temporary (with the standard lifetime). It would certainly suffice for the cases that are motivating the RFC; it seems like those cases where it would not suffice are also kind of confusing and unexpected to me (i.e., a reference that lives longer than the current statement).

It also seems more compatible with possibly extending this "auto-ref" to types beyond Copy types. At least the way we had previously talked about that, we wanted foo(x) to be loosely equivalent to { let y = x; foo(&y) }, but with better error messages indicating that one can add an &x if one must retain ownership. I am still interested in thinking that over, but it seems incompatible with the approach of this RFC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently also leaning towards making a copy every time instead of borrowing. That seems like:

  • the only way to prevent the combination of Copy+UnsafeCell and this RFC from allowing "implicit mutability", which seems like a terrible footgun, and
  • the only way to prevent the "found T, expected &T" errors we get today from turning into borrow checker errors (which are typically much harder to understand and fix, no matter how good the error message)

If that extra copy is somehow undesirable, iiuc you can always add the explicit & even after this RFC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make an implicit copy, but promise that the optimizer will almost always eliminate that copy in practice, except in cases like UnsafeCell?

// ^ autoreference of `x` occurs here
x = 7; // ERROR: cannot assign to `x` because it is borrowed
}
```

# Drawbacks
[drawbacks]: #drawbacks

- This increases the special behavior of `Copy` types, making it potentially
confusing to new users. Some simple, non-generic code becomes more confusing
because `fn foo(x: &i32) { ... }` can now be called like `foo(5)`;
- The existing coercion mechanism doesn't allow for coercion of generic
arguments, nor does it use coercions in trait lookup. Because of this, users
will still need to manually add `&` when trying to, for example, index into
a `HashMap<usize, T>`
(users will still need to write `x[&5]` instead of `x[5]`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do coercions in indexing? At least in the "single impl" case? You can write this today:

use std::collections::HashMap;

fn main() {
    let s : HashMap<u32, u32> = HashMap::new();
    let i: &&u32 = &&0;
    s[i];
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. I don't know why I thought we didn't. In particular, AsRef examples like &Rc<i32> work just fine:

use std::collections::HashMap;
use std::rc::Rc;

fn main() {
    let mut x = HashMap::new();
    x.insert(5, 5);
    x[&Rc::new(5)];
}

I will remove indexing from the examples. Thanks for catching me on that!

- Autoreferencing may produce surprising errors when attempting to mutate data.

# Rationale and Alternatives
[alternatives]: #alternatives

One alternative would be to do nothing. However, this issue is frequently
annoying for new users, and the solution is relatively simple.

Another alternative would be to also add the following conversions for
`T: Copy`:
- `T` to `&mut T`: This conversion has the potential to introduce hidden
mutations. With this change, passing a variable to a function could allow
the function to change the value of the variable, even if no `&mut` is present.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already mentioned elsewhere I think, using & does not exclude this. For example, Cell could well be Copy, and it is possible for a user crate to soundly implement a Copy Cell today.

I think this RFC should take the possibility of Copy interior mutable types into account.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically, this is not possible, because UnsafeCell is not Copy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't this get fixed recently? Or am I imagining it?

Copy link
Member

@RalfJung RalfJung Aug 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not Copy yet, but AFAIK there is any good reason for that. So this RFC should work under the assumption that UnsafeCell can be Copy.

EDIT: Also see rust-lang/rust#25053

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find @RalfJung's logic compelling -- we should consider interior mutability. That said, when we last discussed the Copy question, we had thought to add a lint to reflect the "can copy" vs "should copy" distinction (similar to "must use"). I thought this was a good plan, but it never happened. If it had -- and of course it still can! -- it could certainly apply here.

- `&mut T` and `&T` to `T`: This conversion would cause extra copies to occur
which could be difficult to identify when attempting to optimize code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the fact that copying is already implicit in many places, I don't think this is a strong concern. I would argue that if copying a type is actually noticeable in cost, it should not be Copy -- then the clone calls will make it obvious where the cost is incurred.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree with everything you said, it seems like the PoR is for [i128; 1_000_000] to be Copy...


# Unresolved questions
[unresolved]: #unresolved-questions
- Can we make it easier to use copy values where references are expected when
working with traits and generic code? In particular, it would be nice to make
operators such as `Index` or `Add` auto-reference and auto-dereference.
It would also be great if we could find some way to trigger existing deref
coercions in generic cases such as passing `&Rc<MyT>` to
`fn foo<T: SomeTrait>(t: &T)`.