Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Statics in patterns #3305
RFC: Statics in patterns #3305
Changes from 5 commits
840f5ed
e46df7f
2bfc0df
3858c98
7f577d6
24b6ffb
d4cd41c
7a6b6da
672b126
9d7130e
e076ae1
9f60fcb
7e9b3f9
1dc1ad4
efa8579
4b9417f
81232f7
1d564f3
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignoring the
unsafe
issue, is there a reason that the pattern can't reference information that can change at runtime? My understanding of how cosmopolitan is going to use this is that it references information that changes at runtime (but only once, beforemain
), the values of the statics are not available at link time like they normally are. Given that why would it not be possible for a developer to do the same thing usingstatic mut
where they set the value once early inmain
before the pattern is ever tested against.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "patterns can't reference information that can change at runtime" rule is not something that I am proposing, it's a rule that Rust already has and that I see no reason to change. Tearing down this Chesterton's Fence may or may not be a good idea, but is out of scope for this RFC.
This is where the exact definition of "at runtime" comes into play. If you define "runtime" as "the time when Rust code runs", then the values of the statics don't change at runtime. I argue that for the purposes of this feature, it doesn't really matter whether the values of the statics are set by the linker, the operating system, the C runtime, etc; all of these things fall into the common category of "
rustc
doesn't know what the value is, but Rust code can rely on the value being constant." Note that Cosmopolitan sets the values of the statics in assembly (that I don't presume to fully understand); C code written against Cosmopolitan sees them as standard Cextern const
s, and doesn't need to worry about their value changing, either.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fundamentally this is how all rust static values work. Officially Rust "doesn't have life before
main
", but it still requires someone (an OS or an assembly boot script) to initialize all the pre-conditions before Rust code actually begins executing.So this part of the RFC is basically nothing new.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel a good analogy is dynamic linking. When you call a dynamically linked function in Rust, the compiler doesn't know the address of that function; that only gets determined when the program runs. But you can still write a safe wrapper for the dynamically linked function, and then treat it pretty much like any other Rust function (though not like a
const fn
, of course).Extern statics are the same. Unlike
const
s,rustc
doesn't know their value. But, once the necessary#[trusted_static]
ceremonies are performed, it should still be possible (IMO) to treat them"pretty much" like any other constant value.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not normally, the vast majority of Rust code is statically linked which means the values are known at link time (and LTO would then be able to make
const
andstatic
patterns generate the same code).Would it be unsound to use
#[trusted_static]
on a dynamically linked extern static that gets initialized early-in-main
before any code that reads the static runs? It just seems like it should be possible to unsafely promise that a value is constant once it starts being used in patterns, given that is essentially the same behavior you get with dynamically linked statics (where the dynamic loader has unsafely promised to the executable that all statics have been initialized and will no longer be written to).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh whoops, I'm the one in charge of trusted statics? That kinda stalled out because of like, long term correctness discussion XD
uhm, don't expect that one soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a list of the unresolved questions somewhere? (Or does one need to be compiled?) Might it be possible to ship an obviously correct MVP sooner, and consider loosening safety requirements later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the zulip thread, which the project tracking issue should link to.
The problem is really simple: It's hard to tell people that the unsafe-adjacent code they've been writing the past 6 years is just barely wrong, and that they need to start doing everything the opposite way because of small edge cases. That's a hard sell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we figure out the shiny new Right Way of Doing Things ® now, worry about migrating/accommodating old code later once people are used to the new syntax? The longer we wait before offering better tools, the more technically-bad code gets written, and the more painful any necessary changes will be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's easy if migration of code and re-training of programmers isn't a concern ;3
Literally we just would have to say one day "hey declaring extern static/fn is unsafe now". And like, a compiler patch to enforce that. That's the entire end goal for Rust 3015. It's just the migration that needs to be designed really.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this is enough for soundness. The reason
const
are not allowed to even point to immutablestatic
is pattern matching soundness: if you have aconst
of type&i32
, we better make sure that the integer it points to is truly immutable. But if you define astatic X: &i32
by doing&mut MUTABLE_STATIC
and then transmuting that, and then you doconst C: &i32 = X
, now you have yourself a potentially unsound pattern.But for
extern static
we don't know their values, so we can't know if they do shenanigans like that. Also for all we know their type may be a lie, so maybe they actually do get mutated? There is an entire lion's den of issues here that we have so far worked very hard to avoid (e.g. grep the compiler source code forcan_access_statics
and read all the comments associated with it), and all of this work is for naught if we allowextern static
in patterns.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I hope we can fix this in a future edition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this needs a new edition, we just need to be okay with semver hazards and post-monomorphization errors.
But anyway, that won't help for
extern static
. I can see some ways of resolving the structural match situation that would probably also make other kinds of "opaque consts" likeextern static
be fine (*), but as of now I don't know where the structural match stuff will go so I'd rather not rely on it going any particular way.(*) Well, there are also the issues around "evading match arms" by changing the pattern mid-evaluation, but that can probably be avoided by not doing any exhaustiveness checking and clearly documenting when opaque patterns vs valtree patterns are used (because they will probably not always be equivalent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant specifically that with
extern static
(non-mut) I would like declaring them to be the unsafe part, and it would require precise type and truly immutable memory and such, then the accesses themselves can be the safe part and people can worry a lot less.Doesn't help the actual valtree matching (or not) part of it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition we need here is that any pointer we reach through this memory points to immutable memory, recursively. This is required for matching on references. It is why I keep saying this is a bigger soundness issue than the RFC lets on.
I doubt trusted statics want to require that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How? The current stuctural match rules oly allow constants (and, with this feature, statics) if
PartialEq::eq
is guaranteed to give the same results as a valtree-based match.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's probably not correct. You certainly shouldn't rely on it. The rules for UB on references will probably not recursively follow references to arbitrary depth, but instead work on a 'shallow' level -- mutation is only UB one level down.
As a concrete example, Miri sees no UB here.
The current rules are a mess and before anyone relies on them for anything, we need to re-do those rules. See rust-lang/rust#74446.
The future rules will, hopefully, give users the ability to
unsafe impl StructuralEq for MyType
. (And in fact they can already do that, even withoutunsafe
, by using unstable features.)Users of course cannot expect things to behave completely sanely when they do that, but we still need to specify precisely what happens. Note in particular that saying "it is UB to have such an instance" does not work, that's simply not how we are using UB in Rust -- UB can only arise as a consequence of the program executing a statement in the wrong way, and only under conditions that can actually be decided automatically. "This type has a PartialEq that behavesd equivalently to valtrees" is not a decidable question (halting problem, yada yada), and hence cannot be grounds for UB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But "at the point this specific match statement was executed, this specific call to
PartialEq::eq
completed and returned a value without panicking, and that return value was different from what a valtree-based match would have returned" is decidable, no?(This would mean
loop {}
is a validPartialEq::eq
impl, but I don't think that's a problem).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the issues that arise with "a static of type
&&&i32
could change during a pattern match" already occur with "a scrutinee of type&&&i32
could change during a pattern match", no?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now we need a way to express that in MIR, and carefully specify exactly what that does and when.
That might be possible, but it'll be a lot of work, and it'll also make the core language spec more complicated for a niche feature.
Instead I think we should just specify that statics are treated as opaque for pattern matching and always use
PartialEq
no matter their type. We don't have any other "UB due to valtree inconsistencies", and IMO there is no good reason to introduce such UB here.No, patterns and scrutinees are treated very differently in pattern matching, because the pattern is inherently known at compile-time and the scrutinee isn't. This is not a symmetric situation. This RFC is proposing to make it more symmetric, but that is exactly why this is a bug fundamental change to pattern matching.
See here for a long discussion about how to ensure that we have UB when the scrutinee changes during pattern matching. The same approach does not obviously work for patterns, since it relies heavily on the fact that we compare the same value X against a number of things -- Rust has some ingredients that make it possible to ensure that X doesn't change when it is used multiple times. But with patterns that just happen to use the same extern static in several places across several patterns, we have none of that structure, so it'd be even more complicated.
So, again, we should just specify that extern statics are treated as opaque, and we should make no claims that this is equivalent to a const of the same type and value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very subtle guarantee, since the way these patterns are treated will have to be completely different. Matching
const
s relies heavily on knowing their value.const
will be taken apart to a 'valtree' that is matched on field-by-field, as least if the type permits, whereasextern static
will necessarily be compared viaPartialEq
. This may or may not produce the same result, depending on howPartialEq
is implemented.Above you reference the structural match RFC, but the rules in that RFC have little to do with what the compiler actually does. The real rules are quite the mess and need to be cleaned up, see rust-lang/rust#74446. But it seems inevitable that at least in unsafe code it will be possible to have "structural match" types that actually violate the structural match contract, and we need to say what happens then. This is not an operational requirement, so it probably cannot become UB.
So, I think guaranteeing they behave exactly the same is not feasible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If future relaxations of the structural match requirements are somehow impossible to port to static matches, we can always require the stricter rules for static patterns while loosening them everywhere else. It's impossible to say more without knowing what such relaxations might look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Future relaxation" makes it sound like we currently have some rules that we understand, and we want to change them. That is not the case. As I said, what the compiler does and what the RFC says are already different, and last time this came up, there was little interest in adjusting the compiler to the spec -- rather, the spec will be adjusted to the compiler.
So, you simply cannot use that RFC as a basis for further language evolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I referred to the structural match RFC, I was mostly concerned about the guarantee (as of right now) that, for any local
x
andconst
Y
of the same type,match x { Y => true, _ => false }
evaluates totrue
iffPartialEq::eq(&x, &Y)
evaluates totrue
. Do you know of a counter-example to this that works in stable Rust?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have patterns where
PartialEq::eq(&x, &Y)
doesn't even compile, see e.g. this example from the issue that I linked twice already (rust-lang/rust#74446).Also your stated equivalence is insufficient to substantiate the claim in the RFC since it fails to take into account UB. There are some fundamental reasoning principles for patterns that are currently true, that are no longer true with this RFC -- as explained here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I definitely need to account for that then. But are there cases where it does compile, but gives a different answer?
(Also, is "comparing function pointers is weird" just a potentially-fixable LLVM quirk, or is there some deep reason behind it?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I don't think so. But I already explained why that is not sufficient grounds to claim that static and const matching are equivalent.
It's because of
unnamed_addr
, see e.g. rust-lang/rust#70861There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"all safe values" is definitely impossible to implement. I could define
MyWeirdUnit
has only one safe value, but the compiler has no way of knowing that.Maybe you meant "all valid values", but even that seems way too subtle to include here. Do we have any precedence for a rule like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, "is allowed to see" is too weak; I assume you mean that the compiler must accept the code below, not just that it is allowed to accept the code below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And finally, given that this is a large fraction of the RFC, there also needs to be motivation for all this complexity caused by "1-valued types". The motivating example certainly doesn't need any of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I basically interpreted this as exclusively applying to ZSTs, but wanting to be more general in case we somehow added the ability to increase the number of cases we could detect this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, is isn't possible now, but it eventually could be possible to mark fields as ignored when deriving
Eq
, which would allow such a type to be non-zero-sized while still having a zero-sized equivalence class.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this should be "valid". I had a reason in my head for why this should be "safe", but upon writing it out I realized that my logic made no sense.
As stated in the RFC, there's no practical use-case (AFAIK) for the "one valid value" rule. It's just that, on general principle, I feel the compiler should never force you to write useless code that it statically knows to be unreachable, unless there's a good justification for it.