Change Wasm's cdylib etc. to be a "reactor". by sunfishcode · Pull Request #108097 · rust-lang/rust

I realize that this PR is two years old, and I also realize that what I'm asking below is a big ask, especially relative to the size of this PR. In that sense I'm mostly curious to explore an extension to this PR which may or may not need to happen before this lands. In any case...

One of the main consequences of this, I think, is that _initialize is going to be exported by default for modules within Rust-based components, for example on the wasm32-wasip2 target. As @sunfishcode pointed out, the tooling has support for this and the component will indeed invoke _initialize before any other export is called. That part I'm not worried about; what I am worried about is that there's no optimization to turn this off.

To me the majority of components/code won't be using this feature and won't need _initialize, meaning that the function is effectively dead code. To compare before/after this PR, this Rust source:

    #[unsafe(no_mangle)]
    pub extern "C" fn foo() {}

currently generates this module on wasm32-wasip1 with the cdylib crate type:

    (module $foo.wasm
      ;; ...
      (export "foo" (func $foo.command_export))
      (func $foo (;0;) (type 0)
        return
      )
      (func $dummy (;1;) (type 0))
      (func $__wasm_call_dtors (;2;) (type 0)
        call $dummy
        call $dummy
      )
      (func $foo.command_export (;3;) (type 0)
        call $foo
        call $__wasm_call_dtors
      )
      ;; ...
    )

This is broken if __wasm_call_dtors actually does something, because every call to the export would run destructors, which is probably going to result in surprising behavior. I believe this is something that @sunfishcode wants to fix, and I agree this should be fixed! Of note, though, is that this module has no other exports; it's just the foo function.
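To make the failure mode concrete, here is a toy model of the wrapper above, a sketch only. `ModuleModel` and its counters are hypothetical names I'm using for illustration; they are not anything rustc, wasm-ld, or wasi-libc emits. The point is just that the command-style wrapper runs the destructor hook once per call, while a directly exported function never does.

```rust
/// Toy model of the two export styles discussed here; not real linker output.
#[derive(Default)]
pub struct ModuleModel {
    /// How many times the constructor hook (`__wasm_call_ctors`) ran.
    pub ctor_runs: u32,
    /// How many times the destructor hook (`__wasm_call_dtors`) ran.
    pub dtor_runs: u32,
    /// How many times the user's `foo` body ran.
    pub foo_calls: u32,
}

impl ModuleModel {
    /// Stand-in for the user's `foo` function.
    pub fn foo(&mut self) {
        self.foo_calls += 1;
    }

    /// Mirrors `$foo.command_export`: call `foo`, then run destructors.
    pub fn foo_command_export(&mut self) {
        self.foo();
        self.dtor_runs += 1;
    }

    /// Mirrors `_initialize` calling the constructor hook.
    pub fn initialize(&mut self) {
        self.ctor_runs += 1;
    }

    /// Reactor-style export: `foo` is exported directly, with no wrapper.
    pub fn foo_reactor_export(&mut self) {
        self.foo();
    }
}
```

Calling `foo_command_export` three times leaves `dtor_runs` at 3, which is the "destructors on every invocation" behavior; in the reactor style the count stays at 0 no matter how often `foo` is called.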

With this PR the generated wasm looks like:

    (module $foo.wasm
      ;; ...
      (export "foo" (func $foo))
      (export "_initialize" (func $_initialize))
      (func $__wasm_call_ctors (;0;) (type 0))
      (func $foo (;1;) (type 0)
        return
      )
      (func $_initialize (;2;) (type 0)
        block ;; label = @1
          global.get $GOT.data.internal.__memory_base
          i32.const 1048576
          i32.add
          i32.load
          i32.eqz
          br_if 0 (;@1;)
          unreachable
        end
        global.get $GOT.data.internal.__memory_base
        i32.const 1048576
        i32.add
        i32.const 1
        i32.store
        call $__wasm_call_ctors
      )
      ;; ...
    )

Here, as expected, _initialize is exported as well. It carries some extra code (a guard that traps if it is called a second time, plus a call to the empty __wasm_call_ctors) even though there is nothing to initialize, but that's not the end of the world.
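For anyone not used to reading the disassembly, the body of _initialize amounts to roughly the following, written as a Rust sketch. `ReactorState` and `Trap` are hypothetical names used only for this illustration, and the trap from `unreachable` is modeled as an `Err` rather than an actual abort.

```rust
/// Stand-in for a wasm trap (the `unreachable` instruction).
#[derive(Debug, PartialEq, Eq)]
pub struct Trap;

/// Toy model of the state touched by `_initialize`.
#[derive(Default)]
pub struct ReactorState {
    /// Models the flag word loaded and stored by `_initialize`.
    initialized: bool,
    /// Counts calls to the constructor hook, for observation only.
    pub ctor_runs: u32,
}

impl ReactorState {
    /// Mirrors `$_initialize`: trap if already initialized, else set the
    /// flag and run the constructor hook.
    pub fn initialize(&mut self) -> Result<(), Trap> {
        if self.initialized {
            // Corresponds to falling through `br_if` into `unreachable`.
            return Err(Trap);
        }
        // Corresponds to `i32.const 1` / `i32.store`.
        self.initialized = true;
        self.wasm_call_ctors();
        Ok(())
    }

    /// Mirrors `$__wasm_call_ctors`, which is empty in the module above.
    fn wasm_call_ctors(&mut self) {
        self.ctor_runs += 1;
    }
}
```

The first call succeeds and runs the constructor hook once; a second call reports the trap without running it again.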

What I'm specifically worried about is the component-level cost involved for supporting _initialize. The process of creating a component means that if _initialize is present a new core module must be instantiated to actually run _initialize. This is more costly relative to today where no extra core module is needed.


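With that as background, my thinking is that we have before/after states of: before this PR, static ctors/dtors are broken (destructors run after every export call), but a module like the one above exports only foo and needs no extra core module in its component; after this PR, static ctors/dtors work, but the module always exports _initialize and its component pays for an extra core module instantiation to call it, whether or not there are any static constructors.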

To me this feels kind of unfortunate, and I'd prefer to, for example, land something in wasm-ld and/or wasi-libc which skips _initialize altogether if there aren't actually any static constructors. That would mean that we could preserve the majority status quo while also adding support for static ctors at the same time. What I'm mostly afraid of otherwise is that we're disrupting the status quo to add support for a feature which isn't otherwise widely used yet. That's not to say it's not useful to support, nor to downplay that having it not work can be very surprising, but I'm afraid of the larger impact this will have on folks who aren't even aware of static ctors/dtors and aren't using them.
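As a sketch of the behavior I have in mind, and only that: the function below is a hypothetical model of the decision, not an existing wasm-ld or wasi-libc API, and `planned_exports` and `static_ctor_count` are names made up for this illustration.

```rust
/// Hypothetical model of the proposed optimization: export `_initialize`
/// only when the module actually has static constructors to run.
pub fn planned_exports(user_exports: &[&str], static_ctor_count: usize) -> Vec<String> {
    // Start from the exports the user asked for (e.g. `foo`).
    let mut exports: Vec<String> = user_exports.iter().map(|s| s.to_string()).collect();

    // Assumption: the toolchain knows how many static constructors exist.
    if static_ctor_count > 0 {
        exports.push("_initialize".to_string());
    }
    exports
}
```

With no static constructors the export list is just `foo`, matching today's output; with one or more, `_initialize` is added and the component tooling would pick it up as it does with this PR.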

@sunfishcode do you know if it would be possible to implement such an optimization to conditionally export _initialize? I know wasm-ld has knowledge that __wasm_call_ctors is a noop, but I'm not sure how the _initialize function, defined in wasi-libc, could be conditionally exported depending on whether __wasm_call_ctors is a noop or not. I suspect that would require more wasm-ld integration than currently exists.