SGX: Fix fuzzy provenance casts with AtomicUsize
by thaliaarchi · Pull Request #139775 · rust-lang/rust (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Fix a pattern of #![allow(fuzzy_provenance_casts)]
for SGX which uses an AtomicUsize
as an AtomicPtr<_>
. These symbols are linked to be available externally, but I think AtomicUsize
and AtomicPtr<_>
have the same layout.
I have not addressed the other provenance issues for SGX.
cc @jethrogb @raoulstrackx @mzohreva
rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.
Use r?
to explicitly pick a reviewer
Target: SGX
Status: Awaiting review from the assignee but also interested parties.
Relevant to the library team, which will review and decide on the PR/issue.
labels
Thanks for the PR! I think both of these cases can probably just be replaced by OnceLock
/LazyLock
.
What about the linkage? Surely that would change the layout. But, if that's fine, then we could unbox them too.
What about the linkage?
That's just there to ensure there's a single instance of the static
between std
(the library crate) and its unit tests. See #139795.
Stepping back a little bit, these are used for initializing the args and env. Examining that might yield a better design.
For args, every other platform allocates them on demand, not in init
like SGX. Is there a particular reason that they are copied up front? Can argv be modified outside, so this is to capture the originals? Is copying from User::<[ByteBuffer]>
much more expensive than cloning Vec<OsString>
and should be reduced? I'm afraid I'm still rather nebulous on the exclave. If there's not a strong reason, I'd like to move it closer to how Unix does it.
As for env, I plan to do work on that next across all platforms. For now, at least, I see that iteration order for SGX is non-deterministic and differs between calls, because it collects from a HashMap
.
For args, every other platform allocates them on demand, not in init like SGX. Is there a particular reason that they are copied up front?
I don't think there's any particular problem with delaying this until first use, as long as everything is copied into the enclave at once. Thoughts on this @raoulstrackx?
It does mean the user memory can't be freed, but that's cheaper memory than enclave memory anyway.
If you want to go this route, I recommend using a OnceLock<alloc::User::<[ByteBuffer]>>
combined with a LazyLock<Vec<OsString>>
.
As for env, I plan to do work on that next across all platforms.
I don't know how much overlap there'll really be with other platforms. The enclave env starts out empty, and can only be populated using std::env::set_var
.
I see that iteration order for SGX is non-deterministic and differs between calls, because it collects from a HashMap.
For a single instantiation of a HashMap iteration order should be constant?
I've switched to OnceLock
, but haven't yet explored the copying part. @rustbot author
rustbot added S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-review
Status: Awaiting review from the assignee but also interested parties.
labels
Reminder, once the PR becomes ready for a review, use @rustbot ready
.
I think I understand how the user memory is managed. main_entry
documents that the userspace args should be (but not must be) deallocated by the enclave. The drop glue for alloc::User::<[ByteBuffer]>::from_raw_parts(argv as _, argc as _)
is what deallocates the args.
To make init
cheap for programs which don't use args, it would store only the user reference to the args and not copy from it. It could use a pair of atomics with relaxed ordering like Unix to be the most efficient. Alternatively, a simple OnceLock
.
Then, on Args
construction, it would copy to a vec::IntoIter<OsString>
with copy_user_buffer
. Each instance of Args
already makes fresh OsString
s, so this saves needing to keep around a OnceLock<Vec<OsString>>
, as long as calls to copy_user_buffer
every time is preferable over retaining a Vec
copy.
Then in the cleanup runtime hook, the user buffers could be manually freed. Is it bad to not deallocate them? Is it good practice to deallocate them as early as possible or does that not matter?
Each instance of Args already makes fresh OsStrings, so this saves needing to keep around a OnceLock
No, as mentioned, you must copy the data once, to avoid equivocation.
#[cfg_attr(test, linkage = "available_externally")] |
---|
#[unsafe(export_name = "_ZN16__rust_internals3std3sys3sgx2os8ENV_INITE")] |
static ENV_INIT: Once = Once::new(); |
static ENV: OnceLock = OnceLock::new(); |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to use a LazyLock and get rid of get_env_store/create_env_store.
Fixes fuzzy provenance casts with AtomicUsize
.
I've fixed merge conflicts and switched to LazyLock
for Env
. I had originally avoided LazyLock
, because explicitly initializing allows get-without-set flows to not allocate and the cosmetic clutter seems minor. I'll revert that if you agree.
@rustbot blocked. I want to wait on #140143.
Status: Blocked on something else such as an RFC or other implementation work.
and removed S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
labels
HashMap::new
doesn't allocate
Ah, right. So the only reason a deferred initialization is needed is because the RandomState
prevents HashMap::new
from being const
. If we used BTreeMap
instead, we would have const BTreeMap::new
and consistent env orderings between executions (though, as you said, in the same execution, it maintains the same random state). Alternatively, we could avoid the second synchronization by using Mutex<Option<HashMap<OsString, OsString>>>
instead of LazyLock<Mutex<HashMap<_, _>>>
.
I just noticed that Xous does essentially the same thing as SGX here, but they've fixed only the provenance issues, so I'll mirror the changes over to there.
Labels
Target: SGX
Status: Blocked on something else such as an RFC or other implementation work.
Relevant to the library team, which will review and decide on the PR/issue.