[llvm-dev] [cfe-dev] [RFC] Loading Bitfields with Smallest Needed Types
Arthur O'Dwyer via llvm-dev llvm-dev at lists.llvm.org
Tue May 26 17:31:13 PDT 2020
On Tue, May 26, 2020 at 7:32 PM John McCall via cfe-dev <cfe-dev at lists.llvm.org> wrote:

> On 26 May 2020, at 18:28, Bill Wendling via llvm-dev wrote:
> > [...] The store is a byte:
> >
> >   orb $0x1,0x4a(%rbx)
> >
> > while the read is a word:
> >
> >   movzwl 0x4a(%r12),%r15d
> >
> > The problem is that between the store and the load the value hasn't
> > been retired / placed in the cache. One would expect store-to-load
> > forwarding to kick in, but on x86 that doesn't happen because x86
> > requires the store to be of equal or greater size than the load. So
> > instead the load takes the slow path, causing unacceptable slowdowns. [...]
>
> Clang used to generate narrower loads and stores for bit-fields, but a long time ago it was intentionally changed to generate wider loads and stores, IIRC by Chandler. There are some cases where I think the “new” code goes overboard, but in this case I don’t particularly have an issue with the wider loads and stores. I guess we could make a best-effort attempt to stick to the storage-unit size when the bit-fields break evenly on a boundary. But mostly I think the frontend’s responsibility ends with it generating same-size accesses in both places, and if inconsistent access sizes trigger poor performance, the backend should be more careful about intentionally changing access sizes.
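For readers who want something concrete to feed a compiler, here is a minimal sketch of the source-level pattern the quoted report describes: a write to a one-bit field followed closely by a read of a wider field in the same storage unit. The names Flags, pad, ready, count and the three functions are hypothetical and are not taken from the code in the report. Whether a given compiler really narrows the store to a byte while keeping the load at 16 bits depends on the compiler, its version, and the optimization level, so treat the comments as possibilities to check, not guarantees.

```cpp
#include <cstdint>

// Hypothetical layout: a 1-bit flag and a 15-bit counter sharing one
// 16-bit storage unit, preceded by an unrelated member.
struct Flags {
    std::uint32_t pad;
    std::uint16_t ready : 1;
    std::uint16_t count : 15;
};

// Setting the 1-bit field touches only the low byte, so a backend may
// choose a byte-sized read-modify-write (the report shows an `orb`).
void mark_ready(Flags& f) { f.ready = 1; }

// Reading the 15-bit field needs both bytes, so this is typically a
// 16-bit load (the report shows a `movzwl`).
unsigned read_count(const Flags& f) { return f.count; }

// Narrow store followed by a wider load of the same location: the
// combination that, per the report, defeats store-to-load forwarding.
unsigned mark_then_read(Flags& f) {
    mark_ready(f);
    return read_count(f);
}
```

Compiling this at -O2 and comparing the access widths in mark_ready and read_count is one way to see whether a particular toolchain produces the mismatch.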
FWIW, when I was at Green Hills, I recall the rule being "Always use the
declared type of the bitfield to govern the size of the read or write."
(There was a similar rule for the meaning of volatile. I hope I'm not
just getting confused between the two. Actually, since of the compilers on
Godbolt, only MSVC follows this rule <https://godbolt.org/z/Aq_APH>, I'm
probably wrong.) That is, if the bitfield is declared int16_t, then use
16-bit loads and stores for it; if it's declared int32_t, then use
32-bit loads and stores. This gives the programmer a reason to prefer one
declared type over another. For example, in

    template<class T> struct A { T w : 5; T x : 3; T y : 4; T z : 4; };

the only differences between A<T> for an 8-bit T and A<T> for a 16-bit T are
- whether the struct's alignment is 1 or 2, and
- whether you use 8-bit or 16-bit accesses to modify its fields.
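
The layout half of that claim can be checked at compile time. The sketch below assumes std::uint8_t and std::uint16_t as the 8-bit and 16-bit declared types (the message does not say which types were meant), and the aliases A8, A16 and the set_y helpers are hypothetical. Bit-field layout is implementation-defined, so the static_asserts reflect what common x86-64 ABIs do and may not hold everywhere. The access-size half cannot be asserted in source; it has to be read off the generated assembly.

```cpp
#include <cstdint>

// Same shape as the struct in the message; T is the declared bit-field type.
template<class T>
struct A { T w : 5; T x : 3; T y : 4; T z : 4; };

using A8  = A<std::uint8_t>;   // assumed 8-bit instantiation
using A16 = A<std::uint16_t>;  // assumed 16-bit instantiation

// Both hold 16 bits of fields, so on common ABIs both occupy two bytes...
static_assert(sizeof(A8) == 2 && sizeof(A16) == 2, "unexpected size");
// ...and differ only in alignment.
static_assert(alignof(A8) == 1, "unexpected alignment for 8-bit fields");
static_assert(alignof(A16) == 2, "unexpected alignment for 16-bit fields");

// Under the declared-type rule, the first store would be an 8-bit
// read-modify-write and the second a 16-bit one. Compilers that do not
// follow the rule are free to pick other widths.
void set_y(A8& a, unsigned v)  { a.y = static_cast<std::uint8_t>(v & 0xF); }
void set_y(A16& a, unsigned v) { a.y = static_cast<std::uint16_t>(v & 0xF); }
```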
"The backend should be more careful about intentionally changing access sizes" sounds like absolutely the correct diagnosis, to me.
my $.02,
Arthur