Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

num::WrappingFrom trait for conversions between integers #3703

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

scottmcm
Copy link
Member

@scottmcm scottmcm commented Oct 1, 2024

We have From for infallible, TryFrom for checked, and this proposes WrappingFrom for modular conversions.

Rendered

@ChayimFriedman2
Copy link

An alternative is to have it as inherent methods. That solves the problems with as but not generics.

@clarfonthey
Copy link

A few thoughts:

  1. Why not From<Wrapping<T>> for Wrapping<U>?
  2. What about NonZero types?

@ehuss ehuss added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Oct 1, 2024
@kpreid
Copy link
Contributor

kpreid commented Oct 1, 2024

I look forward to being able to write fewer as casts, and this seems like a well-defined chunk of them to carve off. Thoughts:

  • The RFC does not discuss the tradeoff between being a From-shaped trait (Self = destination type) and an Into-shaped trait (Self = source type). I think From is probably more appropriate (because Into is frequently ambiguous in numeric expressions), but it should be discussed explicitly.
  • Should a trait with exactly this signature be added to num-traits (or another published library) so that the new design can be tested out in stable-targeting code? The RFC mentions num_traits::FromPrimitive, but that is a fallible conversion, which is very different.
  • The trait documentation doesn't say a lot that actually constrains what it can be expected to do, unless you know exactly what “quantized numeric lattice” means. The last paragraph is most concrete and specifically giving a property that should hold, but it is for “integer”s so that's just an example, not a “should”.
    • What are examples of non-integer types that should implement this trait, and what properties should they have?
    • What are examples of non-integer types that should not implement this trait (because they don't have any operation that is a wrapping conversion in this sense)?

Comment on lines +199 to +206
It turns out that
- `NonZero<u8>` has 2⁸ - 1 = 255 × 1 values
- `NonZero<u16>` has 2¹⁶ - 1 = 255 × 257 values
- `NonZero<u32>` has 2³² - 1 = 255 × 16843009 values
- `NonZero<u64>` has 2⁶⁴ - 1 = 255 × 72340172838076673 values
- `NonZero<u128>` has 2¹²⁸ - 1 = 255 × 1334440654591915542993625911497130241 values

(This works for anything that's a multiple of octets, as `0xFF…FF = 0xFF * 0x01…01`.)
Copy link
Member

@programmerjake programmerjake Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works for more than just factors of 0xFF.
also, you should use latex math syntax, since that does work on github and doesn't use a bunch of unicode characters that may not render properly (e.g. they didn't for me with the monospace font on my phone). Rendered example:

  • NonZero<u128> has $2^{128} - 1 = (2^8 - 1) \times (2^8 + 1) \times (2^{16} + 1) \times (2^{32} + 1) \times (2^{64} + 1)$ values
Suggested change
It turns out that
- `NonZero<u8>` has 2⁸ - 1 = 255 × 1 values
- `NonZero<u16>` has 2¹⁶ - 1 = 255 × 257 values
- `NonZero<u32>` has 2³² - 1 = 255 × 16843009 values
- `NonZero<u64>` has 2⁶⁴ - 1 = 255 × 72340172838076673 values
- `NonZero<u128>` has 2¹²⁸ - 1 = 255 × 1334440654591915542993625911497130241 values
(This works for anything that's a multiple of octets, as `0xFF…FF = 0xFF * 0x01…01`.)
It turns out that
- `NonZero<u8>` has $2^8 - 1 = (2^8 - 1)$ values
- `NonZero<u16>` has $2^{16} - 1 = (2^8 - 1) \times (2^8 + 1)$ values
- `NonZero<u32>` has $2^{32} - 1 = (2^8 - 1) \times (2^8 + 1) \times (2^{16} + 1)$ values
- `NonZero<u64>` has $2^{64} - 1 = (2^8 - 1) \times (2^8 + 1) \times (2^{16} + 1) \times (2^{32} + 1)$ values
- `NonZero<u128>` has $2^{128} - 1 = (2^8 - 1) \times (2^8 + 1) \times (2^{16} + 1) \times (2^{32} + 1) \times (2^{64} + 1)$ values
(This works for anything where the number of bits is a multiple of the number of bits of a smaller type, as `0xFF…FF = 0xFF * 0x01…01` and so on for larger groups of `F`s.)

@programmerjake
Copy link
Member

programmerjake commented Oct 1, 2024

  • What are examples of non-integer types that should implement this trait, and what properties should they have?

The only examples I can think of are not exactly well known (e.g. Galois Fields):

impl WrappingFrom<GaloisField<9>> for GaloisField<3> {
    ...
}

though a better example might be:

impl WrappingFrom<Angle> for PointOnUnitCircle { // both 0 and 2 pi radians convert to the same point
   ...
}

@programmerjake
Copy link
Member

programmerjake commented Oct 1, 2024

  • What are examples of non-integer types that should not implement this trait (because they don't have any operation that is a wrapping conversion in this sense)?

that part's easier:

  • impl !WrappingFrom<u8> for ! { ... } since you can't make a value of type !
  • impl !WrappingFrom<String> for char { ... } since it doesn't have a consistent definition and "" contains no chars anyway
  • impl !WrappingFrom<f64> for f32 { ... } -- conversion isn't wrapping...

@clarfonthey
Copy link

clarfonthey commented Oct 1, 2024

The RFC does not discuss the tradeoff between being a From-shaped trait (Self = destination type) and an Into-shaped trait (Self = source type). I think From is probably more appropriate (because Into is frequently ambiguous in numeric expressions), but it should be discussed explicitly.

I think that having a WrappingInto which depends on WrappingFrom might be nice, so you could use a method-call version. It's unlikely you'd want to actually implement this, like with From and Into, but it feels valuable IMHO.

@dhardy
Copy link
Contributor

dhardy commented Oct 2, 2024

Thanks for writing a more focussed RFC @scottmcm.

I agree that something like this should exist, both for usage in generics, and with an eventual goal of making as casts obsolete (or just Clippy-linted for anything that's not a pointer cast).

Not distinguishing between truncating and extending casts is a minor flaw, but likely better than attempting to distinguish these cases with separate traits.

No reflexive impl

My take is that if A: WrappingFrom<B> then we should have reflexive impls A: WrappingFrom<A> and B: WrappingFrom<B>, however it would be preferable to make this a soft expectation on implementations, and not use a blanket impl. Partly because non-numeric types shouldn't get this impl, but mostly because Rust does not (yet) have specialization or any other story for overlapping blanket impls.

  1. Why not From<Wrapping<T>> for Wrapping<U>?

Because From never truncates. And because it would be a pain to use (having to unwrap the target U type). At one point we used Wrapping in rand code; now we mostly use wrapping_add, wrapping_mul etc.

@scottmcm
Copy link
Member Author

scottmcm commented Oct 3, 2024

Not distinguishing between truncating and extending casts is a minor flaw, but likely better than attempting to distinguish these cases with separate traits.

One thing I've been thinking here is that From is an extending cast. So if you want a "definitely a truncation" function you could write something like where A: WrappingFrom<B>, B: From<A>.

But overall it's not obvious to me in which contexts you'd want to require a truncation, and trying to write a trait for truncation specifically reopens a bunch of hard problems about usize that WrappingFrom avoids.

@kpreid
Copy link
Contributor

kpreid commented Oct 3, 2024

But overall it's not obvious to me in which contexts you'd want to require a truncation,

Here’s one I can think of: If you have an integer that’s uniformly distributed across the type’s entire range (e.g. a hash), then truncating it produces a uniformly distributed value of the smaller type. If you convert it to a larger type, then now you have added zero bits, so it no longer covers the entire range.

I don't know of any specific applications that want this property and aren't non-generic or better served by different tools, though.

text/3703-wrapping-from.md Outdated Show resolved Hide resolved
text/3703-wrapping-from.md Outdated Show resolved Hide resolved
@FHTMitchell
Copy link

I like this proposal, but I think there is a more generic need for lossy conversions than just integers, for example you could imagine a trait

trait FromLossy<T> {
     fn from_lossy(t: &T) -> Self;
}

that converts from a wider set into a narrower one, but infallibly, like a "best effort" From. This would work for integers, yes, but also from larger floats to smaller ones, from bytes to utf8 strings, and I'm sure a whole host of other places where we currently use inherent methods.

It would be really useful to be able to, for example, do

let s: String = b"hello\xffworld".into_lossy();

@dhardy
Copy link
Contributor

dhardy commented Oct 4, 2024

@FHTMitchell while that may be true, it is a very different sort of conversion to integer wrapping conversions, so I don't think here is a good place to discuss it. Feel free to go back to #2484 (or possibly start a new, focussed, RFC).

@clarfonthey
Copy link

clarfonthey commented Oct 4, 2024

Yeah, I definitely don't like the idea of combining float and integer conversions into one trait, since for floats, the result is actual rounding, whereas for integers, your result could be totally different from the original number. That's the main purpose of having a dedicated WrappingFrom trait: it's clear what it does, and you're not overburdening the concept of a "lossy conversion" between different methods.

@scottmcm
Copy link
Member Author

scottmcm commented Oct 4, 2024

In particular, as the #2484 discussion shows, it's very hard to define a useful generic concept of "lossy" that isn't just "meh, it does something, good luck" -- because you need a precise definition for generic code to make any sense.

And, for example, f64f32 fits best with "lossy" being something like "the closest representable value in the target type", but that would mean 300_u16255_u8, which is very much not what <u8 as WrappingFrom<u16>>::wrapping_from is proposed to do here.

So as others have said, this is one particular type of conversion. If there are others, those will be other RFCs.

Co-authored-by: teor <teor@riseup.net>
@Aloso
Copy link

Aloso commented Oct 4, 2024

Also prior art: #3415

Summary of the discussion:

  • People like to bikeshed method names

  • wrap is probably a better name than truncate

  • If as for lossy numeric casts is deprecated, some people are concerned about verbosity and want shorter method names, and many are concerned about ecosystem churn

  • Deprecating integer to float casts is harder to justify (they don't wrap or saturate; the worst that can happen is a precision loss that is expected when dealing with floats). The other direction (float to integer) is more controversial.

  • There was no consensus on whether the TruncatingFrom and SaturatingFrom traits should be implemented for lossless conversions (e.g. u8 -> u16) as well.

@programmerjake
Copy link
Member

programmerjake commented Oct 4, 2024

an idea for {int}: WrappingFrom<{float}>: that's how you spell JavaScript-style float -> int conversions which are basically:

if input.is_finite() {
    // can be done without actually needing an intermediate bigint or allocation
    Output::wrapping_from(input.truncate().to_bigint())
} else {
    0
}

@Aloso
Copy link

Aloso commented Oct 5, 2024

{int}: WrappingFrom<{float}> would be confusing and not very useful.

  • The method name wrapping_from doesn't fit since it doesn't just wrap, but also rounds towards zero, and turns ±Infinity and NaN into 0
  • The behavior of {float} as {int} is to saturate (rather than wrap) large numbers, which I think is more useful.

@programmerjake
Copy link
Member

{int}: WrappingFrom<{float}> would be confusing and not very useful.

  • The method name wrapping_from doesn't fit since it doesn't just wrap, but also rounds towards zero, and turns ±Infinity and NaN into 0

if you think of wrapping as returning just the bits that fit in the target integer, JavaScript-style semantics are reasonable as wrapping float -> int:
e.g.
78187493530.73755f64 which in hexadecimal is 0x12_3456_789A.BCD0, the wrapping conversion extracts just the bits that fit giving 0x3456_789Au32

though this only works for negative numbers if you think of treating the float as being in sign-magnitude and extracting the bits as an unsigned integer, then negating the result if the float was negative (kinda like sign-magnitude to twos complement conversion). Infinities are treated as being 0x1...00000 with infinitely many zeros. NaN converting to zero matches Rust float -> int conversion.

  • The behavior of {float} as {int} is to saturate (rather than wrap) large numbers, which I think is more useful.

something else being more useful doesn't mean wrapping can't also be useful...

Comment on lines +62 to +64
/// For an integer `a: Big` and a smaller integer type `Small`, truncating with `wrapping_from`
/// commutes with `wrapping_add`, so `Small::wrapping_from(a).wrapping_add(b)` and
/// `Small::wrapping_from(a.wrapping_add(b))` will give the same result.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "b" here? I'm guessing it is supposed to be something like "for all b in Small"?

```rust
// in core::num (and std::num)

/// Performs potentially-lossy conversions in a quantized numeric lattice.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a "quantized numeric lattice"?

Could this be something like "Performs potentially-lossy conversion for integral types"? Does it make sense for non-integer types? For floating points, I'm not sure when you would ever want it to wrap instead of produce an infinite value if the original is out of range.

This also doesn't really describe in what way the conversion is lossy, although that is discussed elsewhere in the RFC. It would probably be good to have simple description (it truncates), and a more formal description of the semantics, perhaps something like:

For an integer a: Big and a smaller integer type Small, Small::wrapping_from(a) returns a value b in Small such that a == b + (R * n) for some n in Big where R is the number of possible values in Small ( that is effectively Small::MAX - Small::MIN + 1 ).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returns a value b in Small such that a == b + (R * n) for some n in Big where R is the number of possible

I would phrase that:

returns the unique value $b$ in Small such that, in unlimited range mathematical integers, $a = b + n \times R$ where $R$ is the number of possible

note I'm using latex math notation $a = b + n \times R$

@ronnodas
Copy link

ronnodas commented Oct 7, 2024

This talks about From<Wrapping<T>> but I think impl WrappingFrom<T> for U is more akin to impl From<T> for Wrapping<U>. For example in impl WrappingFrom<u32> for u16 we don't care that the u32 itself might wrap, but the output u16 has the property that adding 1 << 16 is a noop.

@programmerjake
Copy link
Member

but the output u16 has the property that adding u16::MAX is a noop.

I think you mean adding 0x10000 since u16::MAX == 0xFFFF and adding that isn't a noop when wrapping, it's equivalent to subtracting 1.

Copy link
Contributor

@PatchMixolydic PatchMixolydic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few typos:


`WrappingFrom` is specific to *numbers*. You can't use it for things like `String: From<&str>` which we've used a bunch.

It's is implemented for the numeric widening conversions that also exist in `From`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It's is implemented for the numeric widening conversions that also exist in `From`.
It's implemented for the numeric widening conversions that also exist in `From`.

fn wrapping_from(value: T) -> Self;
}

impl i8/i16/i32/i64/i128/isize/u8/u16/u32/u64/u128/usize for i8/i16/i32/i64/i128/isize/u8/u16/u32/u64/u128/usize {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
impl i8/i16/i32/i64/i128/isize/u8/u16/u32/u64/u128/usize for i8/i16/i32/i64/i128/isize/u8/u16/u32/u64/u128/usize {
impl WrappingFrom<i8/i16/i32/i64/i128/isize/u8/u16/u32/u64/u128/usize> for i8/i16/i32/i64/i128/isize/u8/u16/u32/u64/u128/usize {


## No *technical* symmetry requirement

If there'a `A: WrappingFrom<B>`, then there *should* be an `B : WrappingFrom<A>` as well.
Copy link
Contributor

@PatchMixolydic PatchMixolydic Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If there'a `A: WrappingFrom<B>`, then there *should* be an `B : WrappingFrom<A>` as well.
If there's `A: WrappingFrom<B>`, then there *should* be a `B: WrappingFrom<A>` as well.

Alternatively,

Suggested change
If there'a `A: WrappingFrom<B>`, then there *should* be an `B : WrappingFrom<A>` as well.
If `A: WrappingFrom<B>`, then `B` *should* implement `WrappingFrom<A>`.

Comment on lines +268 to +269
There could, for example, be some sort of `.truncate()` method similar to `.into()`,
perhaps tweaked for better turbofishing.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably, this would be via a WrappingInto trait, not truncate, and be wrapping_into. I think that it's fine to exclude from the initial RFC, but this should probably be at least sort-of fleshed out here.

@jongiddy
Copy link

I see i32::max as u32 as strange rather than evil, and I don't see how this proposal changes it. i32::max as u32 will still provide the integral address of a function after this proposal is implemented. This proposal does not give an alternative way to express this. I find this example confusing as motivation for the proposal.

On the other hand, it is not clear from the proposal how signed to unsigned conversion (and vice versa) is handled. Will WrappingFrom<i32> for u32 exist? And what does it do? From the name I would guess that it transmutes the representation such that it provides the two's complement. But from the description I could as easily think that it truncates to 31 bits. Or that it is not implemented at all.

My understanding is that WrappingFrom<T> for U is implemented for the primitive integral types where sizeof<T> >= sizeof<U> and takes sizeof<U> bytes from the lowest bytes of T and transmutes them to U. But I cannot recreate this from the definitions in the reference section. While this may be true for the primitive types, I expect that this should be specified in mathematical terminology so that types with other representations give similar results.

@programmerjake
Copy link
Member

programmerjake commented Oct 10, 2024

My understanding is that WrappingFrom<T> for U is implemented for the primitive integral types where sizeof<T> >= sizeof<U>

afaict WrappingFrom<T> for U is implemented for all combinations of integer types regardless of if it's a no-op, just changing type, truncating, sign-extending, or zero-extending.

WrappingFrom<T> for U for an input value $t$ of type T gives the unique value $u$ of type U where $u = t + n \times 2^W$ for the integer $n$ where $u$ is in range and $W$ is the bit-width of U -- so it does the exact same thing as t as U.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.