Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miscellaneous cleanup to formatting #69209

Merged
merged 4 commits into from
Feb 26, 2020
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Move the show_usize marker function to a static
Currently, function items are always tagged unnamed_addr, which means that
casting a function to a function pointer is not guaranteed to produce a
deterministic address. However, once a function pointer is created, we do expect
that to remain stable. So, this changes the show_usize function to a static
containing a function pointer and uses that for comparisons.

Notably, a *static* may have 'unstable' address, but the function pointer within
it must be constant.

Resolves issue 58320.
  • Loading branch information
Mark-Simulacrum committed Feb 17, 2020
commit f6bfdc95445180aee579dcacc6e6bdc4e6ecf56f
23 changes: 16 additions & 7 deletions src/libcore/fmt/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -255,12 +255,19 @@ pub struct ArgumentV1<'a> {
formatter: fn(&Opaque, &mut Formatter<'_>) -> Result,
}

impl<'a> ArgumentV1<'a> {
#[inline(never)]
fn show_usize(x: &usize, f: &mut Formatter<'_>) -> Result {
Display::fmt(x, f)
}
// This gurantees a single stable value for the function pointer associated with
// indices/counts in the formatting infrastructure.
//
// Note that a function defined as such would not be correct as functions are
// always tagged unnamed_addr with the current lowering to LLVM IR, so their
// address is not considered important to LLVM and as such the as_usize cast
// could have been miscompiled. In practice, we never call as_usize on non-usize
// containing data (as a matter of static generation of the formatting
// arguments), so this is merely an additional check.
#[unstable(feature = "fmt_internals", reason = "internal to format_args!", issue = "none")]
static USIZE_MARKER: fn(&usize, &mut Formatter<'_>) -> Result = |_, _| loop {};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @eddyb -- am I correct that this reasoning is correct / this should work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how it's much more different from the previous approach.
LLVM will replace loads from USIZE_MARKER with the fn pointer, which is still unnamed_addr.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my reading of your comment in a different issue, my understanding is that the non-determinism is strictly in the casting from a function (or closure) to the function pointer.

But even if the closure here is unnamed_addr, the function pointer produced is guaranteed to be stable. I guess what we don't guarantee is that this closure did not get merged with a different function. But that's fine, we don't technically need to -- anywhere that this could have gotten merged, would have to be associated with a usize in the "value" part of Argument, right? So the safety property would still hold.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what the argument is. Indeed, we shouldn't introduce non-determinism and USIZE_MARKER == USIZE_MARKER should always hold.

The funny thing though is that because you used |_, _| loop {}, anyone doing the same thing (loop {} in the body) could end up being collapsed to this (I don't think the signature matters, because usize is not passed by value).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature not mattering is an interesting point. There's probably no way to prevent that collapsing, though, right? We could plausibly do something weird (e.g., print the address of the passed references shifted to the right or something) and just say "no one else will do this", I guess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should at least load an usize and blackbox it.
Otherwise there is no guarantee there is an usize (or a same-size integer) behind the opaque pointer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to do this in a follow-up PR to avoid getting this bogged down (and r? you on that one).


impl<'a> ArgumentV1<'a> {
#[doc(hidden)]
#[unstable(feature = "fmt_internals", reason = "internal to format_args!", issue = "none")]
pub fn new<'b, T>(x: &'b T, f: fn(&T, &mut Formatter<'_>) -> Result) -> ArgumentV1<'b> {
Expand All @@ -270,11 +277,13 @@ impl<'a> ArgumentV1<'a> {
#[doc(hidden)]
#[unstable(feature = "fmt_internals", reason = "internal to format_args!", issue = "none")]
pub fn from_usize(x: &usize) -> ArgumentV1<'_> {
ArgumentV1::new(x, ArgumentV1::show_usize)
ArgumentV1::new(x, USIZE_MARKER)
}

fn as_usize(&self) -> Option<usize> {
if self.formatter as usize == ArgumentV1::show_usize as usize {
if self.formatter as usize == USIZE_MARKER as usize {
// SAFETY: The `formatter` field is only set to USIZE_MARKER if
// the value is a usize, so this is safe
Some(unsafe { *(self.value as *const _ as *const usize) })
} else {
None
Expand Down