Add __divtf3 by tgross35 · Pull Request #622 · rust-lang/compiler-builtins (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation12 Commits5 Checks24 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

tgross35

Division for f128. Still pretty buggy

@tgross35

Unfortunately this algorithm has an exponent rounding error with subnormals for f128, llvm/llvm-project#93401.

__divtf3(0x00000000000000000000000000000001, 0x0001ffffffffffffffffffffffffffff): std: 0x3f8e0000000000000000000000000001, builtins: 0x3f8f0000000000000000000000000001

@tgross35

The f128 issue is proving pretty tricky to fix. To help slim down the surface area, I am going to split the generic refactor to another PR.

@tgross35

f128 division is working with full iterations 🎉 I need to figure out what is wrong with the half-width iterations, and do some more documentation.

@tgross35

It looks like there are some platform-specific issues with the system libraries:

@beetrees

The i686 issue appears to be an ABI issue: GCC aligns f128 arguments to 16 bytes on the stack, whereas Clang/LLVM does not. Looking at the 32-bit x86 System V ABI specification, GCC's behaviour appears to be correct:

Padding may be needed to increase the size of each parameter to enforce alignment according to the values in Table 2.1.

I believe the relevant LLVM issue is llvm/llvm-project#77401.

@tgross35

Interesting, I am surprised that doesn't show up for the other functions.

I suspect that the aarch64 division issue is another case of llvm/llvm-project#91840 that is still waiting on your fix to be merged.

@tgross35

I think this is ready for a look when you get the chance @Amanieu.

f128 seems to require one more half-width iteration than expected, and this still isn't as tidy as I would like. However, tests pass so as-is seems reasonable enough to unblock division in std as well as #614. I plan to cycle back for the other fixes.

@tgross35 tgross35 marked this pull request as ready for review

August 19, 2024 21:16

@tgross35

Even with the extra iteration performance isn't bad. Not sure why we are that much faster here.

div_f128/compiler-builtins
                        time:   [261.10 µs 262.32 µs 264.06 µs]
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) high mild
  9 (9.00%) high severe
div_f128/system         time:   [329.93 µs 330.71 µs 331.47 µs]
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

f32 and f64 are essentially unchanged.

Amanieu

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, though I didn't check the algorithm in detail.

@tgross35

Float division requires some shift operations on big integers; implement right shift here.

@tgross35

Add some bounds to integer types that allow our function trait bounds to be slightly less verbose. Also clarify documentation and remove a redundant operation.

@tgross35

Float division currently has a separate div32 and div64 for f32 and f64, respectively. Combine these to make use of generics. This will make it easier to support f128 division, and reduces a lot of redundant code.

This includes a simplification of division tests.

@tgross35

Use the new generic division algorithm to expose __divtf3 and __divkf3.

@tgross35

@tgross35

Thanks for taking a look! I'll have some more improvements to do here but at least this gets us off the ground.

tgross35 added a commit to tgross35/rust that referenced this pull request

Sep 24, 2024

@tgross35

tgross35 added a commit to tgross35/rust that referenced this pull request

Sep 27, 2024

@tgross35

tgross35 added a commit to tgross35/rust that referenced this pull request

Sep 28, 2024

@tgross35

tgross35 added a commit to tgross35/rust that referenced this pull request

Sep 28, 2024

@tgross35

bors added a commit to rust-lang-ci/rust that referenced this pull request

Sep 28, 2024

@bors

cberner pushed a commit to cberner/rust that referenced this pull request

Sep 28, 2024

@tgross35 @cberner

RalfJung pushed a commit to RalfJung/miri that referenced this pull request

Oct 3, 2024

@bors

cberner pushed a commit to cberner/rust that referenced this pull request

Oct 13, 2024

@tgross35 @cberner