Add Iterator::map_windows
by LukasKalbertodt · Pull Request #82413 · rust-lang/rust (original) (raw)
This method returns an iterator over mapped windows of the starting iterator. Adding the more straight-forward Iterator::windows
is not easily possible right now as the items are stored in the iterator type, meaning the next
call would return references to self
. This is not allowed by the current Iterator
trait design. This might change once GATs have landed.
The idea has been brought up by @m-ou-se here.
I think this method is quite versatile and useful. I hope the examples in the doc comment sufficiently demonstrate that.
One open question: the current design basically requires that the internal array buffer is shifted on each next
call, meaning it has an O(iter.count()
* N
) runtime complexity per next
(where N
is the const parameter). I expect this method to be called with very small N
s only, but of course it could be used with large arrays. To avoid this shifting, we would need an array version of VecDeque
basically. But this would also mean the closure wouldn't get a simple parameter anymore. I think we want this simpler but "slower for large Ns" version, but what do others thing? Maybe the docs should mention this caveat?
r? @sfackler
(rust-highfive has picked a reviewer for you, use r? to override)
Comment on lines 37 to 38
buffer.rotate_left(1); |
---|
buffer[N - 1] = next; |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slightly wasteful in theory: the first element does not need to be copied to the last slot as it's overwritten later anyway. I expect this to be optimized out anyway, but I haven't checked.
I haven't found a better non-unsafe
way to do this. copy_within
requires T: Copy
, swap
would work, but I doubt doing a bunch of swaps
is faster than the current version. Of course, the unsafe
would be straight-forward but I still wanted to avoid it for now. Any ideas or opinions?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using unsafe in std lib is acceptable if it leads to a performance advantage.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to just write to the first element then rotate?
Edit: On reflection I think this is probably not that costly either way, have to look at godbolt for it
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing to the first element doesn't seem to change that much compared to the current approach.
I played a bit on godbolt and I found 3 situations:
- They're optimized almost the same way. This often happen for tiny Ts, but sometimes also for bigger ones.
- The
unsafe
version may be optimized to 4-5x less assembly. This often happen for bigger Ts but small N. - The
unsafe
version is optimized to amemmove
call and the safe version to either a lot of moves or a loop. This often happen for bigger Ts and bigger N
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that, if .rotate_right(1)
and .rotate_left(1)
don't optimize to the obvious memmove
, then that's a bug in those methods that should be fixed.
(I vaguely recall someone on Discord was going to file me a bug about this...)
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SkiFire13 Thanks for testing this stuff. But I agree with @scottmcm in that this should probably optimize to the optimal code basically. The implementation of this can always be changed later, so I wouldn't say it is hugely important to get this as efficient as possible in this PR. Especially before we even agreed we want this method.
This comment has been minimized.
This comment has been minimized.
One open question: the current design basically requires that the internal array buffer is shifted on each next call, meaning it has an O(iter.count() * N) runtime complexity per next (where N is the const parameter). I expect this method to be called with very small Ns only, but of course it could be used with large arrays. To avoid this shifting, we would need an array version of VecDeque basically.
There are a few places where this can be done more efficiently: Vec
, BinaryHeap
and []
.
Adding a []::windowed_iter<const N>()
could then return overlapping slices. Perhaps we should offer it only where it's efficient?
Adding a []::windowed_iter() could then return overlapping slices. Perhaps we should offer it only where it's efficient?
Wouldn't this be the same as array_windows?
IMO a great advantage of map_windows
compared to array_windows
is that it doesn't require to borrow anything, so you can return it from functions that don't or can't take the slice as parameter.
Anyway I don't think this is ready to be in the stdlib. I think it would make more sense to have it in itertools for now. There's also the possibility that we could get the same result by combining an hypotetical StreamingIterator::windows
with StreamingIterator::map_iter
or something like that.
Adding a []::windowed_iter() could then return overlapping slices. Perhaps we should offer it only where it's efficient?
Wouldn't this be the same as array_windows?
Oh. Yeah, exactly what I had in mind.
IMO a great advantage of
map_windows
compared toarray_windows
is that it doesn't require to borrow anything, so you can return it from functions that don't or can't take the slice as parameter.
Vec::into_array_windows()
could work if you need an owned version.
Of course this is still limiting compared to a general iterator adapter, but you could collect into an intermediate Vec
and then create the windows more cheaply from that, e.g.
iter.skip(...).filter_map(...).collect::<Vec<_>>().into_array_windows().fold(...)
A possible way to avoid the excess data movement: use a 2*N
-length internal array. Then most of the time you don't need to move anything, but every N elements you need to move the N-1 things in the buffer back to the beginning (which would be a copy_nonoverlapping
). That way it'd be amortized-O(1) copying per step, and when it does a copy it'd be in a nice big chunk that should copy efficiently.
Anyway I don't think this is ready to be in the stdlib. I think it would make more sense to have it in itertools for now.
Could you elaborate on that? Why do you feel this is not ready yet?
There's also the possibility that we could get the same result by combining an hypotetical
StreamingIterator::windows
withStreamingIterator::map_iter
or something like that.
Absolutely, and I would love to immediately go that path. But as I said, StreamingIterator
(well, hopefully it will be the same as Iterator
) is probably still quite a while out.
A possible way to avoid the excess data movement: [...]
That's a great idea! Do you think I should already change my implementation?
Is there an unsafe way to concat two slices of known size to a new size using transmute
or try_into
? This would help fix the time complexity while still having a nice type &[T; { tail_to_head + head_to_tail }]
?
Is there an unsafe way to concat two slices of known size to a new size using transmute or try_into?
Even ignoring the issue that it isn't safe in general it would only work on linear memory. If the data is out of order (i.e. needs rotation) then the slices aren't pointing to a linear section of memory.
Thinking about this some more, I think this api (too) easily hides that .iter().map_windows()
on a slice doesn't actually do .array_windows()
but instead keeps a separate array of references which it keeps fully shifting at every iteration, which is a lot less efficient.
Maybe .pairwise_map(|a, &b| ..)
would be less of a footgun?
crlf0710 added the T-libs-api
Relevant to the library API team, which will review and decide on the PR/issue.
label
To be honest, I'm not sure. pairwise_map
seems like a useful API as well, but as it's more specific, it is not a fully replacement for map_windows
.
When it comes to map_windows
on a slice iterator: yep, I agree that's unfortunate and would happen in the real world from time to time. But I'm not sure if that's really such a big problem, especially if we point it out in the docs. There are plenty other ways people can end up with iterator-chains that are slower than they could be. vec.iter().cloned()
is a classical example here.
r? @m-ou-se as Steven is now alumni and Mara left some thoughts.
rustbot added S-waiting-on-review
Status: Awaiting review from the assignee but also interested parties.
and removed S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
labels
/// # Panics |
---|
/// |
/// Panics if `N` is 0. This check will most probably get changed to a |
/// compile time error before this method gets stabilized. |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added an item for this to the tracking issue.
// to 0, we treat them as uninitialized and treat their copies |
---|
// as initialized. |
unsafe { |
ptr::copy_nonoverlapping(buffer_ptr.add(N), buffer_ptr, N - 1); |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The index math on this line seems wrong.
Repro:
fn main() { for () in std::iter::repeat("0".to_owned()) .map_windows(|: &[; 3]| {}) .take(4) {} }
free(): double free detected in tcache 2 Aborted (core dumped)
dtolnay added S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-review
Status: Awaiting review from the assignee but also interested parties.
labels
@dtolnay Thanks for catching that logic bug. I was surprised my tests did not catch that. Digging deeper, I now believe my whole logic is flawed. In particular, I prepare out
inside of next
, then modify the underlying array and then return out
. That seems utterly broken. My tests also have very weird behavior that I suspect might be a sign of UB. In addition to that, the comment by @SkiFire13 is also a good one: iterators adaptors usually don't call next
on the underlying iterator on creation.
Soooo I think I will have to rethink my approach. I'm not sure how quickly I can find the time to do that. Just as a heads up that me fixing this could take a while.
JohnCSimon added S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
labels
JohnCSimon added S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
labels
JohnCSimon added S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
labels
Triage:
This PR has sat idle for a while. Maybe close it as inactive?
Yeah let's close it. I don't see myself fixing this PR anytime soon. I still think the API is very interesting and worth playing around with, but right now I don't have the resources to land this.
the8472 added the S-inactive
Status: Inactive and waiting on the author. This is often applied to closed PRs.
label
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request
…, r=Mark-Simulacrum
Add Iterator::map_windows
Tracking issue: rust-lang#87155.
This is inherited from the old PR rust-lang#82413.
Unlike rust-lang#82413, this PR implements the MapWindows
to be lazy: only when pulling from the outer iterator, .next()
of the inner iterator will be called.
Implementaion Steps
- Implement
MapWindows
to keep the iterators' Laziness contract. - Fix the known bug of memory access error.
- Full specialization of iterator-related traits for
MapWindows
.-
Iterator::size_hint
, -
,Iterator::count
-
ExactSizeIterator
(whenI: ExactSizeIterator
), -
,TrustedLen
(whenI: TrustedLen
) -
FusedIterator
, -
,Iterator::advance_by
-
,Iterator::nth
- ...
-
- More tests and docs.
Unresolved Questions:
- Is there any more iterator-related traits should be specialized?
- Is the double-space buffer worth?
- Should there be
rmap_windows
or something else? - Taking GAT for consideration, should the mapper function be
FnMut(&[I::Item; N]) -> R
or something likeFnMut(ArrayView<'_, I::Item, N>) -> R
? WhereArrayView
is mentioned in rust-lang/generic-associated-types-initiative#2.- It can save memory, only the same size as the array window is needed,
- It is more efficient, which requires less data copies,
- It is possibly compatible with the GATified version of
LendingIterator::windows
. - But it prevents the array pattern matching like
iter.map_windows(|_arr: [_; N]| ())
, unless we extend the array pattern to allow matching theArrayView
.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request
…, r=Mark-Simulacrum
Add Iterator::map_windows
Tracking issue: rust-lang#87155.
This is inherited from the old PR rust-lang#82413.
Unlike rust-lang#82413, this PR implements the MapWindows
to be lazy: only when pulling from the outer iterator, .next()
of the inner iterator will be called.
Implementaion Steps
- Implement
MapWindows
to keep the iterators' Laziness contract. - Fix the known bug of memory access error.
- Full specialization of iterator-related traits for
MapWindows
.-
Iterator::size_hint
, -
,Iterator::count
-
ExactSizeIterator
(whenI: ExactSizeIterator
), -
,TrustedLen
(whenI: TrustedLen
) -
FusedIterator
, -
,Iterator::advance_by
-
,Iterator::nth
- ...
-
- More tests and docs.
Unresolved Questions:
- Is there any more iterator-related traits should be specialized?
- Is the double-space buffer worth?
- Should there be
rmap_windows
or something else? - Taking GAT for consideration, should the mapper function be
FnMut(&[I::Item; N]) -> R
or something likeFnMut(ArrayView<'_, I::Item, N>) -> R
? WhereArrayView
is mentioned in rust-lang/generic-associated-types-initiative#2.- It can save memory, only the same size as the array window is needed,
- It is more efficient, which requires less data copies,
- It is possibly compatible with the GATified version of
LendingIterator::windows
. - But it prevents the array pattern matching like
iter.map_windows(|_arr: [_; N]| ())
, unless we extend the array pattern to allow matching theArrayView
.
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request
…, r=Mark-Simulacrum
Add Iterator::map_windows
Tracking issue: rust-lang#87155.
This is inherited from the old PR rust-lang#82413.
Unlike rust-lang#82413, this PR implements the MapWindows
to be lazy: only when pulling from the outer iterator, .next()
of the inner iterator will be called.
Implementaion Steps
- Implement
MapWindows
to keep the iterators' Laziness contract. - Fix the known bug of memory access error.
- Full specialization of iterator-related traits for
MapWindows
.-
Iterator::size_hint
, -
,Iterator::count
-
ExactSizeIterator
(whenI: ExactSizeIterator
), -
,TrustedLen
(whenI: TrustedLen
) -
FusedIterator
, -
,Iterator::advance_by
-
,Iterator::nth
- ...
-
- More tests and docs.
Unresolved Questions:
- Is there any more iterator-related traits should be specialized?
- Is the double-space buffer worth?
- Should there be
rmap_windows
or something else? - Taking GAT for consideration, should the mapper function be
FnMut(&[I::Item; N]) -> R
or something likeFnMut(ArrayView<'_, I::Item, N>) -> R
? WhereArrayView
is mentioned in rust-lang/generic-associated-types-initiative#2.- It can save memory, only the same size as the array window is needed,
- It is more efficient, which requires less data copies,
- It is possibly compatible with the GATified version of
LendingIterator::windows
. - But it prevents the array pattern matching like
iter.map_windows(|_arr: [_; N]| ())
, unless we extend the array pattern to allow matching theArrayView
.
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request
…, r=Mark-Simulacrum
Add Iterator::map_windows
Tracking issue: rust-lang#87155.
This is inherited from the old PR rust-lang#82413.
Unlike rust-lang#82413, this PR implements the MapWindows
to be lazy: only when pulling from the outer iterator, .next()
of the inner iterator will be called.
Implementaion Steps
- Implement
MapWindows
to keep the iterators' Laziness contract. - Fix the known bug of memory access error.
- Full specialization of iterator-related traits for
MapWindows
.-
Iterator::size_hint
, -
,Iterator::count
-
ExactSizeIterator
(whenI: ExactSizeIterator
), -
,TrustedLen
(whenI: TrustedLen
) -
FusedIterator
, -
,Iterator::advance_by
-
,Iterator::nth
- ...
-
- More tests and docs.
Unresolved Questions:
- Is there any more iterator-related traits should be specialized?
- Is the double-space buffer worth?
- Should there be
rmap_windows
or something else? - Taking GAT for consideration, should the mapper function be
FnMut(&[I::Item; N]) -> R
or something likeFnMut(ArrayView<'_, I::Item, N>) -> R
? WhereArrayView
is mentioned in rust-lang/generic-associated-types-initiative#2.- It can save memory, only the same size as the array window is needed,
- It is more efficient, which requires less data copies,
- It is possibly compatible with the GATified version of
LendingIterator::windows
. - But it prevents the array pattern matching like
iter.map_windows(|_arr: [_; N]| ())
, unless we extend the array pattern to allow matching theArrayView
.