Add `Iterator::array_chunks` (take N+1) #100026
…ce used to create an `IntoIter` from its parts
(suggested in the review of the previous attempt to add `ArrayChunks`)
As explained in the review of the previous attempt to add `ArrayChunks`, adapters that shrink the length can't implement `TrustedLen`.
It doesn't seem to be used at all.
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
Note that I'm not sure if this is the right approach and some design work is needed (#81615).
+1 that this should just use https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.next_chunk -- conveniently, that should mean that this PR gets substantially simpler, since the …
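For readers without a nightly toolchain, here is a minimal stable-Rust sketch of how a forward-only chunking adapter can be layered on a `next_chunk`-style primitive. `next_chunk_sketch`, `ArrayChunksSketch` and `array_chunks_sketch` are hypothetical names made up for this illustration, and the `Vec`-based buffering is an assumption chosen for simplicity; it is not how `core` implements either API.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the unstable `Iterator::next_chunk`: pulls up to
/// `N` items and returns `Ok` with a full array, or `Err` with whatever
/// (possibly empty) leftover was available.
fn next_chunk_sketch<I: Iterator, const N: usize>(
    iter: &mut I,
) -> Result<[I::Item; N], Vec<I::Item>> {
    let buf: Vec<I::Item> = iter.by_ref().take(N).collect();
    // Converting `Vec<T>` into `[T; N]` succeeds only if the length is
    // exactly `N`; otherwise the vector is handed back as the error.
    <[I::Item; N]>::try_from(buf)
}

/// Hypothetical forward-only adapter built on the helper above.
struct ArrayChunksSketch<I, const N: usize> {
    iter: I,
}

impl<I: Iterator, const N: usize> Iterator for ArrayChunksSketch<I, N> {
    type Item = [I::Item; N];

    fn next(&mut self) -> Option<Self::Item> {
        // A short final chunk is simply dropped in this sketch.
        next_chunk_sketch(&mut self.iter).ok()
    }
}

fn array_chunks_sketch<I: Iterator, const N: usize>(iter: I) -> ArrayChunksSketch<I, N> {
    // With N == 0 every chunk would be "full" and iteration would never end.
    assert!(N != 0, "chunk size must be non-zero");
    ArrayChunksSketch { iter }
}
```

Calling `array_chunks_sketch::<_, 3>(1..=7)` yields `[1, 2, 3]` and `[4, 5, 6]`; the trailing `7` is discarded here, whereas a real adapter would want to keep the remainder accessible.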

    let mut acc = init;
    let mut iter = self.iter.by_ref().rev();

    // NB remainder is handled by `next_back_remainder`, so
    // `next_chunk` can't return `Err` with non-empty remainder
    // (assuming correct `I as ExactSizeIterator` impl).
    while let Ok(mut chunk) = iter.next_chunk() {
        chunk.reverse();
        acc = f(acc, chunk)?
    }

`by_ref` AND a double-reverse? 😬 that must be horrendously slow.
Imo we either need a `next_chunk_back` or only limit it to forward iteration. The former could be done in a separate PR, but we need to decide on an approach.
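To make the "double-reverse" concrete, here is a stable-Rust sketch of the back-to-front strategy under review: drop the remainder from the back, walk the rest in reverse, and flip every chunk so its elements come out in forward order. `chunks_from_back` is a hypothetical helper that works on a slice and buffers through a `Vec`; both are simplifying assumptions, not the PR's code.

```rust
use std::convert::TryFrom;

/// Returns the full `N`-sized chunks of `items`, last chunk first, with the
/// elements inside each chunk in their original (forward) order.
fn chunks_from_back<T: Clone, const N: usize>(items: &[T]) -> Vec<[T; N]> {
    assert!(N != 0, "chunk size must be non-zero");
    let mut iter = items.iter().cloned();

    // Step 1: discard the trailing `len % N` elements so that only whole
    // chunks remain.
    let rem = iter.len() % N;
    for _ in 0..rem {
        iter.next_back();
    }

    // Step 2: read the remaining elements in reverse, N at a time, and
    // reverse each buffer again to restore forward order inside the chunk.
    let mut rev = iter.rev();
    let mut out = Vec::new();
    loop {
        let mut buf: Vec<T> = rev.by_ref().take(N).collect();
        if buf.len() < N {
            break; // the reversed iterator is exhausted
        }
        buf.reverse();
        match <[T; N]>::try_from(buf) {
            Ok(chunk) => out.push(chunk),
            Err(_) => break, // not expected: the length was checked above
        }
    }
    out
}
```

Every element is touched once by the reversed iterator and once more by `reverse()`, which is the extra work the comment objects to; a dedicated `next_chunk_back` could fill each array from its end instead.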
I'm sorry you had to see this impl 😅
@rustbot ready

    self.next_back_remainder();

    let mut acc = init;
    let mut iter = self.iter.by_ref().rev();

suggestion: You have a `Sized` iterator here (since it's a field), so you can avoid some of the `by_ref` optimization penalty by using https://github.com/rust-lang/rust/blob/master/library/core/src/iter/adapters/by_ref_sized.rs instead of `by_ref()`.
r? @scottmcm

        Self: Sized,
        F: FnMut(B, Self::Item) -> B,
    {
        self.try_fold(init, |acc, x| NeverShortCircuit(f(acc, x))).0

To avoid making a closure type generic over `I` too:

    - self.try_fold(init, |acc, x| NeverShortCircuit(f(acc, x))).0
    + self.try_fold(init, NeverShortCircuit::wrap_mut_2(f)).0

(and the same in `rfold`)
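`NeverShortCircuit` is private to `core`, so the idea can only be approximated on stable. The sketch below uses `Result<B, Infallible>` as a stand-in and a hypothetical free function `never_short_circuit` in the role of `wrap_mut_2`. Because the wrapping closure is created in a function whose only generic parameters are `B`, `T` and the callback type, its type does not additionally depend on the iterator type the way a closure written inline in the adapter's `impl` would.

```rust
use std::convert::Infallible;

/// Hypothetical analogue of `NeverShortCircuit::wrap_mut_2`: lifts a plain
/// fold callback into a `try_fold` callback that can never fail.
fn never_short_circuit<B, T>(
    mut f: impl FnMut(B, T) -> B,
) -> impl FnMut(B, T) -> Result<B, Infallible> {
    move |acc, x| Ok(f(acc, x))
}

/// `fold` expressed through `try_fold`, mirroring the shape of the
/// suggestion (an assumption-laden sketch, not the code in `core`).
fn fold_via_try_fold<I, B, F>(mut iter: I, init: B, f: F) -> B
where
    I: Iterator,
    F: FnMut(B, I::Item) -> B,
{
    match iter.try_fold(init, never_short_circuit(f)) {
        Ok(acc) => acc,
        // `Infallible` has no values, so this arm can never run.
        Err(never) => match never {},
    }
}
```

For example, `fold_via_try_fold(1..=4, 0, |a, x| a + x)` sums the range through `try_fold` without any possibility of an early exit.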
I have a few comments, but overall I think this is essentially fine for nightly. (Lots of open questions for stabilization, but that's ok.)
Can you please open a tracking issue, add it to the unstable attributes, and address my other comments?

    // Take the last `rem` elements out of `self.iter`.
    let mut remainder =
        // SAFETY: `unwrap_err` always succeeds because x % N < N for all x.
        unsafe { self.iter.by_ref().rev().take(rem).next_chunk().unwrap_err_unchecked() };

I don't think I'll block the PR on this, but `unwrap_err_unchecked` here scares me -- I added a note to the tracking issue.
My instinct for now is that `%N` is definitely trustworthy, so that combined with `take` is sufficient to say this is ok. But I'm still afraid, since it's not `TrustedLen`, and wonder if someone can come up with an evil example here where it somehow gets a full chunk and thus is UB in safe code.
(At least the obvious things, like a `len()` that always says `usize`, are protected against because of the modulo.)
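The modulo argument can be checked against a deliberately dishonest `ExactSizeIterator` written entirely in safe code. `LyingLen` and `remainder_sketch` below are hypothetical names, and the remainder is collected into a `Vec` rather than through `next_chunk`; the sketch only covers a wrong `len()`, not an inner iterator whose `try_fold` misbehaves.

```rust
use std::ops::Range;

/// An iterator whose `len()` reports whatever it was told to report.
/// Overriding `len` like this is safe code, merely incorrect.
struct LyingLen {
    inner: Range<u32>,
    claimed: usize,
}

impl Iterator for LyingLen {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        self.inner.next()
    }
}

impl DoubleEndedIterator for LyingLen {
    fn next_back(&mut self) -> Option<u32> {
        self.inner.next_back()
    }
}

impl ExactSizeIterator for LyingLen {
    fn len(&self) -> usize {
        self.claimed
    }
}

/// Takes the "remainder" off the back the way the reviewed code does:
/// `rem = len % N` is always below `N`, and `take(rem)` is relied upon to
/// yield at most `rem` items.
fn remainder_sketch<I, const N: usize>(iter: &mut I) -> Vec<I::Item>
where
    I: ExactSizeIterator + DoubleEndedIterator,
{
    assert!(N != 0, "chunk size must be non-zero");
    let rem = iter.len() % N;
    iter.by_ref().rev().take(rem).collect()
}
```

Even with `claimed: usize::MAX` and `N = 4`, `rem` is 3, so the collected remainder can never reach a full chunk as long as `take` honours its limit.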
As already noted in another comment I don't like this entire method. Either we should add `next_chunk_back` or remove the DEI impl.
So, an evil impl can return less than `rem` elements, but `actual <= rem < N` still holds (I don't think there is any way to trick `Take` into returning more than `rem` elements) so the chunk will be an error no matter what.
An evil impl can just return a wrong length, so after this the iterator returns a number of elements not divisible by `N`. But this is fine too, an `Err` will just be ignored:
rust/library/core/src/iter/adapters/array_chunks.rs, lines 116 to 119 in 5fbcde1:

    // NB remainder is handled by `next_back_remainder`, so
    // `next_chunk` can't return `Err` with non-empty remainder
    // (assuming correct `I as ExactSizeIterator` impl).
    while let Ok(mut chunk) = iter.next_chunk() {

That being said, as @the8472 already said, we should probably rewrite the DEI impl anyway (I'll try to do this, after this lands).
Yeah, the `%` is certainly fine. What I'm not yet convinced of is that there's no way to make `take` misbehave somehow -- for example, `Take::try_fold` uses the inner `try_fold`, so maybe there'd be a way for that to be implemented wrongly-but-not-UB that could result in too many things getting put in the array somehow.
But I agree that just making a nicer implementation without the `unsafe{}` is the right plan.
Actually you are right 😨
Overriding `try_fold` with some nasty unsafe to skip over `Break(_)` allows you to trick `Take` into returning more elements than it should (playground).
`n` can underflow, if a bad iterator impl skips over `Break(_)`:
rust/library/core/src/iter/adapters/take.rs, lines 87 to 89 in 2fbc08e:

    *n -= 1;
    let r = fold(acc, x);
    if *n == 0 { ControlFlow::Break(r) } else { ControlFlow::from_try(r) }

The fact that unsafe code can't trust `Take` is scary.
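The counter logic can be modelled in safe, stable Rust to see why ignoring `Break` defeats the limit. This is only a model with hypothetical names (`take_step`, `well_behaved`, `misbehaving`): the accumulator and the break payload are the same concrete type here, so the driver can keep going without any `unsafe`, whereas in the generic `try_fold` they are unrelated types. The subtraction is written as `wrapping_sub` on the assumption that overflow checks are disabled where the real code runs.

```rust
use std::ops::ControlFlow;

/// Model of the per-item closure shown above: count down, fold the item in,
/// and ask the driver to stop once the counter reaches zero.
fn take_step(n: &mut usize, mut acc: Vec<u32>, x: u32) -> ControlFlow<Vec<u32>, Vec<u32>> {
    *n = n.wrapping_sub(1); // assumption: no overflow checks in the real build
    acc.push(x);
    if *n == 0 {
        ControlFlow::Break(acc)
    } else {
        ControlFlow::Continue(acc)
    }
}

/// A well-behaved driver stops as soon as it sees `Break`.
fn well_behaved(items: &[u32], mut n: usize) -> Vec<u32> {
    let mut acc = Vec::new();
    for &x in items {
        match take_step(&mut n, acc, x) {
            ControlFlow::Continue(a) => acc = a,
            ControlFlow::Break(a) => return a,
        }
    }
    acc
}

/// A misbehaving driver treats `Break` like `Continue` and keeps feeding
/// items; the counter wraps past zero and the limit is gone.
fn misbehaving(items: &[u32], mut n: usize) -> Vec<u32> {
    let mut acc = Vec::new();
    for &x in items {
        acc = match take_step(&mut n, acc, x) {
            ControlFlow::Continue(a) | ControlFlow::Break(a) => a,
        };
    }
    acc
}
```

With five input items and `n = 2`, `well_behaved` stops after two items while `misbehaving` lets all five through, which is the "more elements than it should" outcome described above.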
I think the problem there is still in the `unsafe` in `Misbehave` -- it protects against double-drop, but duplicating arbitrary values can violate safety invariants in general. (For example, if the type is `!Clone + !Default`, I can use you moving it into me to track resource consumption, like a ZST tracker for a global semaphore.) So that specific example is just "well unsound `unsafe`-using code breaks everything".
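A sketch of the kind of type this comment has in mind: a zero-sized permit for a global counter that is deliberately neither `Clone` nor `Default`, so safe code can only obtain one through `acquire`. `Permit`, `FREE_PERMITS` and `free_permits` are hypothetical names, and a single permit is assumed for brevity.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of permits currently available (one in this sketch).
static FREE_PERMITS: AtomicUsize = AtomicUsize::new(1);

/// Zero-sized proof that one permit is held. Not `Clone`, not `Default`,
/// and the private field prevents construction outside `acquire`.
struct Permit(());

impl Permit {
    /// Takes a permit if one is free.
    fn acquire() -> Option<Permit> {
        FREE_PERMITS
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |n| n.checked_sub(1))
            .ok()
            .map(|_| Permit(()))
    }
}

impl Drop for Permit {
    /// Returning the permit relies on every `Permit` being dropped exactly
    /// once per successful `acquire`.
    fn drop(&mut self) {
        FREE_PERMITS.fetch_add(1, Ordering::SeqCst);
    }
}

fn free_permits() -> usize {
    FREE_PERMITS.load(Ordering::SeqCst)
}
```

If some `unsafe` code bitwise-duplicated a `Permit`, both copies would eventually run `drop` and the counter would end up above the number of permits that ever existed, even though nothing was dropped twice in the memory-safety sense.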
But yeah, even though I've not been able to come up with a safe trick that would do it, things along those lines are what make me worried about it. (And certainly if specialization existed then it would be possible to do particularly weird things in safe code because it can violate parametricity even worse than in normal code.)
The example could check that `type_id::<B>() == type_id::<usize>()` or something similar, which would IMO make the code sound.
Yeah, I thought about that one, but `TypeId::of::<B>()` needs `B: 'static`, which we don't have here. Regardless, it's yet another chink in the armour that makes me scared we'll break through eventually.
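For reference, a minimal sketch of the check being discussed and of the bound that gets in the way. `is_usize` is a hypothetical helper; the point is only that `TypeId::of` requires its type parameter to be `'static`.

```rust
use std::any::TypeId;

/// Compiles only because of the `'static` bound; with a bare `B` the call
/// to `TypeId::of::<B>()` is rejected by the compiler.
fn is_usize<B: 'static>() -> bool {
    TypeId::of::<B>() == TypeId::of::<usize>()
}
```

`Iterator::try_fold` places no `'static` bound on its accumulator type `B`, so an override of `try_fold` cannot call a helper like this for arbitrary `B`.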
Running over this again I noticed one thing I'm wondering about, but I don't think it needs to block going to nightly. Thanks for pushing this through! @bors r+
Clearly a bug! :D
Rollup of 6 pull requests
Successful merges:
- rust-lang#99582 (Delay a span bug if we see ty/const generic params during writeback)
- rust-lang#99861 (orphan check: rationalize our handling of constants)
- rust-lang#100026 (Add `Iterator::array_chunks` (take N+1))
- rust-lang#100115 (Suggest removing `let` if `const let` or `let const` is used)
- rust-lang#100126 (rustc_target: Update some old naming around self contained linking)
- rust-lang#100487 (`assert_{inhabited,zero_valid,uninit_valid}` intrinsics are safe)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
A revival of #92393.
r? @Mark-Simulacrum
cc @rossmacarthur @scottmcm @the8472
I've tried to address most of the review comments on the previous attempt. The only thing I didn't address is the `try_fold` implementation: I've left the "custom" one for now, since I'm not sure what exactly it should use.