update the safety preconditions of from_raw_parts by lolbinarycat · Pull Request #129483 · rust-lang/rust (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
they now reflect the fact that zero-capacity collections do not allocate
fixes #119304
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @workingjubilee (or someone else) some time within the next two weeks.
Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review
and S-waiting-on-author
) stays updated, invoking these commands when appropriate:
@rustbot author
: the review is finished, PR author should check the comments and take action accordingly@rustbot review
: the author is ready for a review, this PR will be queued again in the reviewer's queue
rustbot added S-waiting-on-review
Status: Awaiting review from the assignee but also interested parties.
Relevant to the library team, which will review and decide on the PR/issue.
labels
Pointers backing Vec
s of ZSTs can be dangling without capacity
being 0.
they now reflect the fact that zero-capacity collections do not allocate
fixes rust-lang#119304
Huh. My understanding was always that the zero-length Vec should be obtained specifically from dangling().
I would like it if you could clarify the motivation here.
The argument is that reconstructing a Vec may be unsound if this is not made true, because of Vec::new()
's internals.
However, it isn't actually language UB. This is about the library's contracts, and you're trying to argue the library is breaking its contract with... itself. Which it is allowed to do, as long as this doesn't lead to language UB. And it doesn't, as far as I understand. And more specifically, it is allowed to define an unfair contract, such that e.g. operating on a pointer "allocated" via Vec::new()
is acceptable but you calling dangling()
is not.
So, what do you actually have to say to the response of "I am the standard library. I make the rules."?
@workingjubilee
this singular line right here:
These requirements are always upheld by any
ptr
that has been allocated viaVec<T, A>
.
means that these "prerequisites" are not just that, but they are also a guarantee for Vec::as_raw_ptr
and into_raw_parts
, and the standard library does not uphold these guarantees as they are written today.
By simultaneously requiring that ptr
is allocated by the global allocator and saying that Vec
upholds the requirements, it's also saying that ptr
being allocated by the global allocator is an invariant of Vec
(at least as far as users are concerned).
This means that deallocating pointers obtained from Vec
should be sound, but it's not. It's easy to make a custom global allocator that meets the requirements on GlobalAlloc
but causes UB when trying to dealloc dangling pointers.
If the stdlib wants to prevent users from constructing dangling ptr
manually, an alternative solution would be to require that ptr
meets the existing requirements or is obtained from a Vec
.
Edit: Note that I don't think the alternative solution I mentioned is a good one. I don't see a good motivation for making the use of manually created dangling pointers unsound. I also think that this would make the docs harder to read.
Ah. Noticed that since this PR was opened, this one was also: #129748
That makes it far less likely that Vec::from_raw_parts
should (notionally) deviate.
workingjubilee changed the title
update the saftey preconditions of from_raw_parts update the safety preconditions of from_raw_parts
@@ -913,10 +913,9 @@ impl String { |
---|
/// This is highly unsafe, due to the number of invariants that aren't |
/// checked: |
/// |
/// * The memory at `buf` needs to have been previously allocated by the |
/// same allocator the standard library uses, with a required alignment of exactly 1. |
/// * unless `capacity` is 0, `buf` must have been allocated using the global allocator with an alignment of 1 and a capacity of `capacity`. |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So do we really want to promise that any non-null pointer is okay if capacity is 0?
Also, "allocated using the global allocator with [...] a capacity" is odd -- the allocation ABI doesn't talk about "capacity", it talks about "size". So this should probably say "and a size of capacity
".
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I kinda think we don't, and should specify the NonNull::dangling()
pointer if we accept any.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the point about using "size" rather than "capacity"; note that this is consistent with some of the current language on Vec::from_raw_parts
"capacity
needs to be the capacity that the pointer was allocated with". Maybe that should be changed as well then?
Regarding the requirement on String::from_raw_parts
; what about doing the same as in Vec::from_raw_parts
, making the alignment requirement always present? That would allow making your own dangling pointers, but I don't see the motivation for restricting the creation of dangling pointers to a specific API here.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the requirement on
String::from_raw_parts
; what about doing the same as inVec::from_raw_parts
, making the alignment requirement always present? That would allow making your own dangling pointers, but I don't see the motivation for restricting the creation of dangling pointers to a specific API here.
So that we can change the value that we're expecting as the dangling pointer.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that we can change the value that we're expecting as the dangling pointer.
NonNull::dangling
does specifically specify that it should never be used as a sentinel value, as it may also be a valid pointer.
what about doing the same as in Vec::from_raw_parts, making the alignment requirement always present?
is it possible to have a pointer that isn't byte aligned? I considered saying "the pointer must have an alignment of 1", but isn't that a given? maybe it can change the behavior of allocators? but, that shouldn't matter for a dangling pointer that is never freed.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NonNull::dangling
does specifically specify that it should never be used as a sentinel value, as it may also be a valid pointer.
See prior comments about the standard library being allowed to write unfair contracts and rely on its own internal implementation details in ways you cannot.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be being a bit silly here, but it just still doesn't seem like a good idea to bless all pointers thusly.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See prior comments about the standard library being allowed to write unfair contracts and rely on its own internal implementation details in ways you cannot.
except that the behavior of allocators is not an internal implementation detail of the standard library. even if it was, NonNull::dangling
can be manipulated into taking any value by changing the size of T
. therefore, even with the special privilege of being the standard library, using NonNull::dangling
as a sentinel value is still a logic error.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel I can just r+ on my own recognizance a doc fix to make it clear that whatever pointer is produced via String::new()
or Vec::new()
and then decomposing the type via into_raw_parts
or a similar API can then be passed into String::from_raw_parts
or Vec::from_raw_parts
. That is obviously a fix. Saying it's not allowed is silly, because you shouldn't have to check to make sure std didn't pass you the dangling pointer or anything.
...but I feel anything more runs into a question about stabilizing API details for container types, so lmk if you want this I-libs-api-nominated.
because you shouldn't have to check to make sure std didn't pass you the dangling pointer or anything.
it would be impossible to do this check, since there's not just one dangling pointer in existence, and also because the return value of NonNull::dangling
may coincidentally be a valid pointer.
Requiring libs-api signoff for this seems fair to me. For that I would like to bring up std::slice::from_raw_parts as relevant, which (from my reading) already accepts manually created dangling pointers and whose language is much clearer to me (though that is in part because the functionality is less complex).
slices aren't container types, they're pointers.
Even though slices aren't container types they are integrated with the Vec
API through slice::into_vec
.
You can use std::slice::from_raw_parts
(well actually, using std::ptr::slice_from_raw_parts
is better) together with Box::from_raw
and slice::into_vec
to implement your own Vec::from_raw_parts
that accepts any smaller than isize::MAX, aligned and non-null pointer when capacity
or size_of::<T>()
are 0: playground.
pub fn into_vec<A: Allocator>(self: Box<Self, A>) -> Vec<T, A> {
// N.B., see the `hack` module in this file for more details.
hack::into_vec(self)
}
I love it when std's code has comments like this.
Hmm, I see. That does constrain things a lot. I don't think we have a realistic choice otherwise, then.
I will re-review this for conformance with changes like #129748 and those formerly tracked by the now-closed #117945
@lolbinarycat It might be appropriate to add comments in this file that capture the fact that our implementation choices are stapled down by things like slice::into_vec
.
@oskgo Thank you for finding that example, it was basically what I was wondering if it already existed or not. It didn't occur to me to go find it on another type that isn't Vec.
rustbot added S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-review
Status: Awaiting review from the assignee but also interested parties.
labels
You can use std::slice::from_raw_parts (well actually, using std::ptr::slice_from_raw_parts is better) together with Box::from_raw and slice::into_vec to implement your own Vec::from_raw_parts that accepts any smaller than isize::MAX, aligned and non-null pointer when capacity or size_of::() are 0: playground.
We don't guarantee that the Vec
you end up with has the same pointer as what you started with, do we?
Though at this point we're getting deep into rules lawyering territory and there is probably little justification for not also guaranteeing this on Vec::from_raw_parts
.
alex-semenyuk added S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
labels
ping from triage - can you post your status on this PR? There hasn't been an update in a few months. Thanks!
FYI: when a PR is ready for review, send a message containing@rustbot ready
to switch to S-waiting-on-review
so the PR is in the reviewer's backlog.
status: i think i might be a bit over my head on this one, at the very least i would need a summary of the consensus of the relevant teams, at which point it might make more sense for someone else to just drive this forward.
Labels
Status: This is awaiting some action (such as code changes or more information) from the author.
Relevant to the library team, which will review and decide on the PR/issue.