Change Wasm's `cdylib` etc. to be a "reactor". by sunfishcode · Pull Request #108097 · rust-lang/rust
I realize that this PR is two years old, and I also realize that what I'm asking below is a big ask, especially relative to the size of this PR. In that sense I'm mostly curious to explore an extension to this PR, which may or may not need to happen before this lands. In any case...
One of the main consequences of this, I think, is that `_initialize` is going to be exported by default for modules within Rust-based components, for example on the `wasm32-wasip2` target. As @sunfishcode pointed out, the tooling has support for this and the component will indeed invoke `_initialize` before any other export is called. That part I'm not worried about; what I am worried about is that there's no optimization to turn this off.
To me the majority of components/code won't be using this feature and won't need `_initialize`, meaning that the function is effectively dead code. To compare before/after this PR, this Rust source:

    #[unsafe(no_mangle)]
    pub extern "C" fn foo() {}

currently generates this module on `wasm32-wasip1` with the `cdylib` crate type:

    (module $foo.wasm
      ;; ...
      (export "foo" (func $foo.command_export))
      (func $foo (;0;) (type 0)
        return
      )
      (func $dummy (;1;) (type 0))
      (func $__wasm_call_dtors (;2;) (type 0)
        call $dummy
        call $dummy
      )
      (func $foo.command_export (;3;) (type 0)
        call $foo
        call $__wasm_call_dtors
      )
      ;; ...
    )

This is broken if `__wasm_call_dtors` actually does something, because each invocation of `foo` would run destructors, which is probably going to result in surprising behavior. I believe this is something that @sunfishcode wants to fix, and I agree this should be fixed! Of note, though, is that this module has no other exports; it's just the `foo` function.
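
To make the problem concrete, here is a small Rust model of the wrapper above. It's only an illustration: `Instance`, `dtor_runs`, and the method names are hypothetical stand-ins for the wasm functions, not anything rustc or wasm-ld emits.

```rust
/// Hypothetical model of one module instance.
/// `dtor_runs` counts how often the destructor hook has executed.
#[derive(Default)]
pub struct Instance {
    pub dtor_runs: u32,
}

impl Instance {
    /// Stand-in for the user's `foo` body (a no-op in the example).
    fn foo(&mut self) {}

    /// Stand-in for `__wasm_call_dtors`.
    fn wasm_call_dtors(&mut self) {
        self.dtor_runs += 1;
    }

    /// Mirrors `$foo.command_export`: call the real function,
    /// then unconditionally run destructors.
    pub fn foo_command_export(&mut self) {
        self.foo();
        self.wasm_call_dtors();
    }
}
```

Calling `foo_command_export` twice on the same `Instance` leaves `dtor_runs` at 2, which is the "destructors run on every call" behavior described above.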
With this PR the generated wasm looks like:

    (module $foo.wasm
      ;; ...
      (export "foo" (func $foo))
      (export "_initialize" (func $_initialize))
      (func $__wasm_call_ctors (;0;) (type 0))
      (func $foo (;1;) (type 0)
        return
      )
      (func $_initialize (;2;) (type 0)
        block ;; label = @1
          global.get $GOT.data.internal.__memory_base
          i32.const 1048576
          i32.add
          i32.load
          i32.eqz
          br_if 0 (;@1;)
          unreachable
        end
        global.get $GOT.data.internal.__memory_base
        i32.const 1048576
        i32.add
        i32.const 1
        i32.store
        call $__wasm_call_ctors
      )
      ;; ...
    )

Here, as expected, `_initialize` is exported as well. This has some extra code to figure out it doesn't actually need to do anything, but that's not the end of the world.
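
For readers who don't want to trace the instructions by hand, my reading of the `$_initialize` body is: load a flag from memory, trap if it's already set, otherwise set it and call `__wasm_call_ctors`. A rough Rust sketch of that logic follows. The `Reactor` type and its fields are hypothetical, and an `Err` stands in for the wasm `unreachable` trap.

```rust
/// Hypothetical model of the state touched by `_initialize`.
pub struct Reactor {
    /// Stand-in for the i32 flag loaded and stored in linear memory.
    initialized: bool,
    /// Counts calls to the constructor hook, for illustration only.
    pub ctor_runs: u32,
}

impl Reactor {
    pub fn new() -> Self {
        Reactor { initialized: false, ctor_runs: 0 }
    }

    /// Stand-in for `__wasm_call_ctors` (empty in the module above).
    fn wasm_call_ctors(&mut self) {
        self.ctor_runs += 1;
    }

    /// Mirrors `$_initialize`: a second call "traps" (modeled as `Err`),
    /// the first call sets the flag and runs constructors.
    pub fn initialize(&mut self) -> Result<(), &'static str> {
        if self.initialized {
            return Err("_initialize called more than once");
        }
        self.initialized = true;
        self.wasm_call_ctors();
        Ok(())
    }
}
```

In this model the first `initialize()` returns `Ok(())` and runs the constructor hook once; a second call returns `Err` without running it again.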
What I'm specifically worried about is the component-level cost involved in supporting `_initialize`. The process of creating a component means that if `_initialize` is present, a new core module must be instantiated to actually run `_initialize`. This is more costly relative to today, where no extra core module is needed.
With that as background, my thinking is that we have before/after states of:
- Before this change ctors/dtors are most definitely broken, but they're (I think) rarely used. Also before this change a Rust-based component does not need the extra instance in a component to run `_initialize`, as it doesn't exist.
- After this change dtors would work (they wouldn't be run) and static ctors also work (they actually get run). All Rust-based components, however, would generate an extra instance in components to run `_initialize`, which ends up doing nothing.
To me this feels kind of unfortunate, and it's something where I'd prefer to, for example, land something in wasm-ld and/or wasi-libc which skips `_initialize` altogether if there aren't actually any static constructors. That would mean that we could preserve the majority status quo while also adding support for static ctors at the same time. What I'm mostly afraid of otherwise is that we're disrupting the status quo to add support for a feature which isn't otherwise widely used yet. That's not to say it's not useful to support, nor to downplay that having it not work can be very surprising, but I'm afraid of the larger impact this will have on folks who aren't even aware of static ctors/dtors and aren't using them.
@sunfishcode do you know if it would be possible to implement such an optimization to conditionally export `_initialize`? I know `wasm-ld` has knowledge that `__wasm_call_ctors` is a noop, but I'm not sure how the `_initialize` function, defined in wasi-libc, could be conditionally exported depending on whether `__wasm_call_ctors` is a noop or not. I suspect that would require more wasm-ld integration than currently exists.
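
To spell out the decision I have in mind, here's a toy Rust sketch. It is not wasm-ld or wasi-libc code: `final_exports` and `ctor_count` are made-up names, and it assumes the linker already knows how many static constructors the module has.

```rust
/// Hypothetical sketch of the export decision: keep the user's exports
/// and add `_initialize` only when there is at least one static ctor.
pub fn final_exports(user_exports: &[&str], ctor_count: usize) -> Vec<String> {
    let mut exports: Vec<String> =
        user_exports.iter().map(|name| name.to_string()).collect();

    // Assumption: when `__wasm_call_ctors` would be a noop, nothing needs
    // to run at startup, so the extra export (and the extra core instance
    // in the component) can be skipped.
    if ctor_count > 0 {
        exports.push("_initialize".to_string());
    }
    exports
}
```

With no constructors, `final_exports(&["foo"], 0)` yields just `foo`, matching today's output. With one or more, `_initialize` is appended.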