Add slice::ExactChunks and ::ExactChunksMut iterators by sdroege · Pull Request #47126 · rust-lang/rust
Conversation (60) · Commits (11) · Checks (0) · Files changed
Conversation
These guarantee that chunks of exactly the requested size are always returned, with any leftover elements at the end ignored. This allows LLVM to eliminate bounds checks in code using the iterator.
This is inspired by the same iterators provided by ndarray.
Fixes #47115
I'll add unit tests for all this if the general idea and behaviour make sense to everybody.
Also see #47115 (comment) for an example of what this improves.
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @bluss (or someone else) soon.
If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.
Please see the contribution instructions for more information.
kennytm added the T-libs-api label (relevant to the library API team, which will review and decide on the PR/issue).
```rust
} else {
    let start = self.v.len() - self.chunk_size;
    Some(&self.v[start..])
}
```
We can use `next_back` here to save that code duplication.
```rust
}

#[unstable(feature = "exact_chunks", issue = "47115")]
impl<'a, T> ExactSizeIterator for ExactChunksMut<'a, T> {}
```
We can implement `is_empty`, since it is actually simpler than `size_hint`/`len` (it saves the division).
```rust
} else {
    let start = (self.v.len() - self.chunk_size) / self.chunk_size * self.chunk_size;
    Some(&mut self.v[start..])
}
```
Same, we can use `next_back` here. (The code here in `ExactChunksMut::last` has not been updated to take advantage of the property that `self.v` is evenly divisible by the chunk size.)
```rust
}

#[inline]
fn nth(&mut self, n: usize) -> Option<Self::Item> {
```
I think this method can be much simpler if we just use the fact that self.v is evenly divisible by the chunk size. Same for the mutable version.
My first try would be to use `n` to find the new start of `self.v`, then call `self.next()`. Maybe that can be improved upon.
I've submitted a code review, but I think we need the libs team and the tick boxes to weigh in on whether to include this in libcore and libstd. I think these methods seem fine, if a bit obscure (*). An unfortunate point is that these would be even better with const generics, producing `&[T; N]` and `&mut [T; N]` respectively. (Unfortunate since such ideas mean we would need to wait for const generics to be available in Rust.)
cc @rust-lang/libs
(*) Looking closer at the resulting difference, it's not exactly obscure; it's essential functionality for that type of code.
@bluss Thanks for your review comments, I've updated everything accordingly. Still no tests yet; they'll come if this is considered a good idea :)
```rust
let (_, snd) = self.v.split_at(start);
self.v = snd;
assert!(self.v.len() == self.chunk_size);
self.next()
```
Why the assertion? It doesn't look correct as written; maybe `>=` was intended? I'd probably avoid the assertion anyway, as there will be an equivalent bounds check in `next()`.
Indeed, I was confused. Thanks!
The `TrustedRandomAccess` impl minimally changes the assembly in my testcase (same number of instructions, basically equivalent) but has no real effect on the performance.
@sdroege Yep, that's what I was thinking, and it's good info that it doesn't change anything. I think it can still be a good optimization in other cases?
I'm not entirely sure about that, also with regards to #47142. I'll do some benchmarking tomorrow.
Basically (for zip) we assume here that the compiler will optimize away the multiplication on each access. Without the `TrustedRandomAccess` impl there would only be an increment on every iteration, instead of a multiplication if the compiler fails to optimize it away. I assume all this was measured and taken into account when the trait and the specialized zip implementation were added, though.
@sdroege I think you bring up details that matter; maybe a smarter zip specialization could be adopted that helps with that, or maybe even adding a more special case.
In my understanding, the current zip specialization helps with something "dumber" than that. In a completely general `.zip()`, we have two input iterators, and for each iteration we need to ask them both whether they have a next element; this didn't compile well for some common loops over slices, at least at the time the specialization was added. With the zip specialization, we only need to check "is there a next element" once per iteration, and that stays true for n-ary combinations of `.zip()`s.
@bluss If you don't mind, I would suggest dropping the commit that adds the `TrustedRandomAccess` impl from this PR so that we can discuss the API itself instead, and moving that commit to another PR at a later time. In the meantime I'll do some more analysis and benchmarking of the effect of the trait impl for the normal `Chunks` (and the other three).
> @sdroege I think you bring up details that matter; maybe a smarter zip specialization could be adopted that helps with that, or maybe even adding a more special case.
Something that just increments on each iteration would be useful, as it would not rely on the optimizer to get rid of the multiplication: something like a `next_unsafe` that you must only call if you know there is a next item. That also seems like it would solve the original purpose of the trait.
But this all seems like something to discuss in another issue.
Yes, let's split off that commit. FWIW, both the current approach and a next-unchecked approach were considered the first time around. I think the current one was marginally better, but why not look at it again with a wider use case in mind.
Removed that commit. Now this should all be good to be reviewed for the actual new API.
I'm going to do some archaeology about the original implementation and reason for the trait being like this, and then open a new issue about it.
@bluss Ok, you already did that very same investigation back then :) See #33090 (comment) and #33090 (comment).
Basically, the counter-and-index approach can be optimized better by LLVM than pointer increments. So let's keep it at that, and I'll re-add the commit here. The chunks iterators are more or less the same as the normal slice iterators in every other regard.
So in summary, I think the open questions here are the following:
- Does it make sense to add such a specialized chunks variant? #47115 (comment) and #47115 (comment) suggest so, as it can improve performance a lot. It also potentially improves usability a bit, as code using the iterator can really assume that each slice has exactly that many elements.
- Should it use const generics? This would mean using `&[T; N]` instead of `&[T]` (with a fixed size the compiler can infer). The function signature would be rather more complicated, and so would calling it (`exact_chunks(n)` vs `exact_chunks::<n>()`), but it seems more explicit (you have the slice length directly in your types). It also has the disadvantage that the availability and stabilization of the function would be coupled to const generics. But I think the biggest disadvantage, and a reason why having both might be useful, is that the chunk size would always have to be known at compile time.
> and a reason why having both might be useful, is that the chunk size would have to be always known at compile-time.

I agree that having both could be better, despite the increased API surface.
Rebased against latest master to solve a couple of merge conflicts.
Should the documentation mention that these methods may be faster than the existing ones? This seems like important information for helping users choose the appropriate iterator for a given use case.
> Should the documentation mention that these methods may be faster than the existing ones? This seems like important information for helping users choose the appropriate iterator for a given use case.
Done, thanks
I think the name `exact_chunks` for an API that works with dynamic but fixed sizes is fine, and consistent with other std APIs like `read_exact`. In the same sense, if we were to provide an API that works with const generics and `[T; N]`, we should call it `fixed_chunks`, to be consistent with the naming of fixed-size arrays. This naturally leads to providing both APIs, in my opinion.
In the libs meeting the possible concern was raised that it is not obvious that elements might get dropped at the end. I wonder if a solution to that would be what we did for `copy_from_slice`: just panic if the length of the slice is not evenly divisible into chunks, forcing the user to handle that case explicitly up front.
The libs team discussed this today and was overall on board with landing this (@Kimundi commented above) so @bluss feel free to r+ when you're satisfied!
- Simplify nth() by making use of the fact that the slice is evenly divisible by the chunk size, calling next() instead of duplicating it
- Call next_back() in last(), since they are equivalent
- Implement ExactSizeIterator::is_empty()
…ter by the compiler
And also link from the normal chunks iterator to the exact_chunks one.
…t test
This way more useful information is printed if the test ever fails.
…_mut tests
Easy enough to do and ensures that the whole chunk is as expected instead of just the element that was looked at before.
These are basically modified copies of the chunks/chunks_mut tests.
Thanks, changed the "Fixes ..." to "See ...", everything else the same.
📌 Commit 5f4fc82 has been approved by bluss
Thanks @sdroege. The tracking issue you set up for this is #47115, I'll edit it a bit and maybe you can put in the remaining questions.
I've added the open question there now (panic or skip left-over elements). I don't think there were any other questions here
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request
Add slice::ExactChunks and ::ExactChunksMut iterators
These guarantee that chunks of exactly the requested size are always returned, with any leftover elements at the end ignored. This allows LLVM to eliminate bounds checks in code using the iterator.
This is inspired by the same iterators provided by ndarray.
Fixes rust-lang#47115
I'll add unit tests for all this if the general idea and behaviour make sense to everybody. Also see rust-lang#47115 (comment) for an example of what this improves.
kennytm added a commit to kennytm/rust that referenced this pull request
Add slice::ExactChunks and ::ExactChunksMut iterators
bors added a commit that referenced this pull request
```rust
///
/// Due to each chunk having exactly `chunk_size` elements, the compiler
/// can often optimize the resulting code better than in the case of
/// [`chunks`].
```
Whoops, it seems this is not referenced, resulting in a broken link.
Indeed, thanks for noticing. I'll submit a PR later, not sure yet why... or why rustdoc does not error out on broken links. Oh well :)
Ah understood why!
why ? :)
See sdroege@1756f68. It seems you have to provide "full" paths somewhere in your doc comment for "shortened" links to resolve; rustdoc doesn't check the current scope for items with the same name.
I knew why: you didn't define the link references in scope :) I was asking more about:
> why rustdoc does not error out on broken links
it should.....
```rust
///
/// Due to each chunk having exactly `chunk_size` elements, the compiler
/// can often optimize the resulting code better than in the case of
/// [`chunks_mut`].
```
broken too.
sdroege added a commit to sdroege/rust that referenced this pull request
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request
Fix broken links to other slice functions in chunks/chunks_mut/exact_chunk/exact_chunks_mut docs (r=kennytm)
```rust
#[inline]
fn next(&mut self) -> Option<&'a [T]> {
    if self.v.len() < self.chunk_size {
```
This condition can probably be simplified to just `self.v.is_empty()`: we already know that the slice length is a multiple of `chunk_size`, so the only way the slice can be too short is for it to be empty.
```rust
#[inline]
fn nth(&mut self, n: usize) -> Option<Self::Item> {
    let (start, overflow) = n.overflowing_mul(self.chunk_size);
    if start >= self.v.len() || overflow {
```
This condition seems wrong: we must test the overflow before testing whether `start` is greater than `self.v.len()`, because if we have overflowed, `start` has wrapped around and can be smaller than `self.v.len()`, and the returned value would be wrong. We probably want to panic if the computation is impossible and would overflow, no? https://doc.rust-lang.org/std/iter/trait.Iterator.html#panics
EDIT: I was wrong about the order of the conditions; this is a `||`, so the order doesn't matter in this case.
Do you still think that an overflow should panic instead of doing nothing? This is currently the same behaviour as for the other `chunks` iterators, and it is what was stabilized for them.
Yes, I think checked-but-silent overflow is not good behaviour. But an overflow when addressing the nth element is really rare, so it is not something we should worry about.
I agree, but I think it's more problematic to have inconsistent behaviour between the different chunk iterators (and we can't change the stabilized existing ones). If there's disagreement, though, I'd be happy to change it.
If you think consistency between the chunk iterators is important, then I think we should have consistency across all iterators and panic if an overflow occurs, like the Enumerate adapter does. It could be a great improvement to update the current implementation of `Chunks`/`ChunksMut` to follow this rule, don't you think?
Can you open an issue about that? I would generally agree (and also think that panicking would be cleaner here), but changing `Chunks`/`ChunksMut` could be considered a breaking change.
Thanks :)
Labels: Status: Awaiting review from the assignee but also interested parties. Relevant to the library API team, which will review and decide on the PR/issue.