[llvm-dev] [cfe-dev] [RFC] Loading Bitfields with Smallest Needed Types

Arthur O'Dwyer via llvm-dev llvm-dev at lists.llvm.org
Tue May 26 17:31:13 PDT 2020


On Tue, May 26, 2020 at 7:32 PM John McCall via cfe-dev <cfe-dev at lists.llvm.org> wrote:

On 26 May 2020, at 18:28, Bill Wendling via llvm-dev wrote:

> [...] The store is a byte:
>
> orb $0x1,0x4a(%rbx)
>
> while the read is a word:
>
> movzwl 0x4a(%r12),%r15d
>
> The problem is that between the store and the load the value hasn't
> been retired / placed in the cache. One would expect store-to-load
> forwarding to kick in, but on x86 that doesn't happen because x86
> requires the store to be of equal or greater size than the load. So
> instead the load takes the slow path, causing unacceptable slowdowns.

[...]

Clang used to generate narrower loads and stores for bit-fields, but a long time ago it was intentionally changed to generate wider loads and stores, IIRC by Chandler. There are some cases where I think the “new” code goes overboard, but in this case I don’t particularly have an issue with the wider loads and stores. I guess we could make a best-effort attempt to stick to the storage-unit size when the bit-fields break evenly on a boundary. But mostly I think the frontend’s responsibility ends with it generating same-size accesses in both places, and if inconsistent access sizes trigger poor performance, the backend should be more careful about intentionally changing access sizes.

FWIW, when I was at Green Hills, I recall the rule being "Always use the declared type of the bitfield to govern the size of the read or write." (There was a similar rule for the meaning of volatile. I hope I'm not just getting confused between the two. Actually, since MSVC is the only compiler on Godbolt that follows this rule <https://godbolt.org/z/Aq_APH>, I'm probably wrong.) That is, if the bitfield is declared int16_t, then use 16-bit loads and stores for it; if it's declared int32_t, then use 32-bit loads and stores. This gives the programmer a reason to prefer one declared type over another. For example, in

    template<class T>
    struct A {
        T w : 5;
        T x : 3;
        T y : 4;
        T z : 4;
    };

the only differences between A instantiated with an 8-bit type and A instantiated with a 16-bit type are

  - whether the struct's alignment is 1 or 2, and
  - whether you use 8-bit or 16-bit accesses to modify its fields.
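
A minimal sketch of that comparison follows. The aliases A8 and A16, and the choice of std::uint8_t and std::uint16_t as the two declared types, are assumptions made for illustration. Bit-field layout is implementation-defined, so the sizes and alignments mentioned in the comments are what the common x86-64 ABIs are expected to produce, not something the language guarantees.

```cpp
#include <cstddef>
#include <cstdint>

// Same field widths as in the message; only the declared type T varies.
template <class T>
struct A {
    T w : 5;
    T x : 3;
    T y : 4;
    T z : 4;
};

// Hypothetical aliases, chosen for illustration only.
using A8  = A<std::uint8_t>;   // w,x fill one byte; y,z fill the next
using A16 = A<std::uint16_t>;  // all four fields share one 16-bit unit

// Both hold 16 bits of fields. On typical x86-64 ABIs both are 2 bytes
// wide, and they differ only in alignment (1 for A8, 2 for A16).
constexpr std::size_t size_of_A8   = sizeof(A8);
constexpr std::size_t size_of_A16  = sizeof(A16);
constexpr std::size_t align_of_A8  = alignof(A8);
constexpr std::size_t align_of_A16 = alignof(A16);
```

Whether a given compiler also lets the declared type decide the width of the loads and stores is a separate question; per the Godbolt link above, most of them apparently do not.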

"The backend should be more careful about intentionally changing access sizes" sounds like absolutely the correct diagnosis, to me.
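
For reference, the access pattern under discussion can be sketched roughly as below. The struct Flags, its field names, and the functions mark_and_count and demo are hypothetical and are not taken from the code in the original report. The sketch only shows a one-bit store followed immediately by a read of a neighboring field in the same 16-bit storage unit. Which instruction widths are actually emitted depends on the compiler, its version, and the optimization level.

```cpp
#include <cstdint>

// Hypothetical layout: a 1-bit flag sharing a 16-bit storage unit
// with a 15-bit counter.
struct Flags {
    std::uint16_t dirty : 1;
    std::uint16_t count : 15;
};

// Sets the flag, then reads the neighboring field.
// A backend may narrow the store to a single byte (like the quoted
// `orb`) while keeping the load at 16 bits (like the quoted `movzwl`).
// That store/load size mismatch is the situation described in the report.
constexpr unsigned mark_and_count(Flags& f) {
    f.dirty = 1;      // one-bit update inside the storage unit
    return f.count;   // read of the other 15 bits of the same unit
}

// Small driver so the behavior can be checked without any I/O.
constexpr unsigned demo() {
    Flags f{0, 1234};
    return mark_and_count(f);
}
```

The function is semantically trivial; the interesting part is only visible in the generated assembly, for example by compiling with optimizations and inspecting the store and load widths.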

my $.02,
Arthur
