RFC: supporting restrict
in libc++ (options) (original) (raw)
Continuing the preliminary discussion in issue 87178, please take a look over there for some comments by people with relevant experience.
Motivating example: restrict
pointers are not std::indirectly_readable
So we can’t use std::advance
and related functions from the standard library (compiler explorer). Working with multiple contiguous memory chunks containing values of a single integral or floating point type is quite common when implementing mathy stuff; this makes c++'s rules regarding no aliasing of different types and thus TBAA irrelevant, while avoiding unnecessary loads plays a very important part of optimizing such code (including auto-vectorization). It’s probably a good thing if libc++ could be used here.
The underlying issue: restrict
and the c++ type system
The c++ type system does not include a restrict
qualifier (unlike the type system for the c language). Nevertheless, the c++ frontend does currently consider it part of the type in c++ sources, impacting template instantiations and overload resolutions. This behavior isn’t limited to libc++ sources, naturally, but libc++ is implemented as if there are no such types in the language, again naturally.
Some additional relevant context
There seems to be no clear consensus regarding the semantics of a restrict
qualifier for the c++ virtual machine. For example, consider a restrict
class member and a default copy constructor for that class. AFAIK there’s no effort underway currently to standardize restrict
in c++.
LLVM has infrastructure in place for supporting the non-aliasing semantics (slides). Whether this gets triggered by a __restrict
keyword, a [[noalias]]
attribute, or some other mechanism is probably orthogonal to this RFC.
The -fstrict-aliasing
/-fno-strict-aliasing
flags allow only translation unit resolution, so cumbersome-to-impossible for applying to only some arguments of some methods.
Options
Please see some initial reactions in the issue.
- Do nothing.
- Keep
__restrict
as an LLVM extension and change libc++ code to support the intended functionality (adopting the c language type system for c++ sources, circumventing the underlying issue for libc++ only), by either- Add the necessary templates and concepts.
- Remove the
__restrict
qualifier in libc++ code (something like this). We’d lose the non-aliasing semantics and the associated optimization opportunities in this option, since the frontend will never see the qualifier
- Remove/ignore the
restrict
type qualifier in the c++ frontend logic (but let’s keep the non-aliasing semantics), potentially only for template instantiations and overload resolutions purposes. - Another option?
Unfortunately, __restrict
is widely supported as a compiler extension even in C++, so just ignoring it sounds like a non-option to me because it would, amongst other things, break both GCC and MSVC compatibility. I think this was already pointed out in the issue, but this would also affect cases such as
void f(int* a, int* __restrict b) {
// GCC and Clang accept this program
static_assert(!__is_same(decltype(a), decltype(b)));
}
Support for __restrict
is already fairly wonky in a few places (e.g. __restrict
on member functions is a mess, both in Clang and in general…), and I don’t think anyone is helped if we start to introduce even more divergent behaviour.
That said, since you mentioned overload resolution, one option would be to allow implicitly dropping __restrict
in implicit conversions in C++ only (possibly accompanied by a warning). I’m not too familiar with libc++, but would that help ameliorate this issue?
As an aside, I’m not entirely sure the concerns about it being unsound to do so in C apply to C++ seeing as __restrict
isn’t part of the C++ type system to begin with. At the same time, I also don’t know too much about how __restrict
is (or will be) handled in the middle/backend, so this may run into other issues that I’m not aware of.
To wit, I don’t think we can get away with treating e.g. int* __restrict
as being identical to int*
because that has rather far-reaching consequences, but allowing e.g.
int* __restrict* p;
int** q = p;
might be worth considering (this is also what MSVC does); iirc some people have voiced concern about this in the past, though.
Having already spent far too much time on __restrict
-related matters, I’d be fine with dealing with the implementation of this if we do end up agreeing on this approach.
tal June 14, 2024, 11:21pm 3
Thanks a lot for your comments!
As far as I can tell (compiler explorer), GCC and the GNU standard C++ library show the same inconsistency – namely that __restrict
is a c++ extension as far as the compiler is concerned, but it’s not properly supported by the standard library.
The issue with __restrict
pointers in the standard library is mainly due to missing templates (and concepts), in this case failing to define value_type
in std::indirectly_readable_traits<double * __restrict>
. Unless I’m missing something, an implicit conversion shouldn’t be able to “save” a method that falls out of the candidate set due to concept or requires
clause mismatch, it’s too late…
As far as I understand, the only intended implication of __restrict
down the pipeline is optimizing away unnecessary reads after stores, enabled by the no-aliasing promise. It’s almost the opposite of volatile
in the sense that we promise the value at that location will not change unless we issue a store using this specific __restrict
variable (and in that case we’d usually already have the new value in a register), so there’s really no need to re-fetch anything from memory.
Yeah, that is about what I would have expected…
Yeah, in that case it does look like the only solutions I can think of are
- using something like a
remove_restrict_t
, or - just not supporting
__restrict
on the grounds that, after all, it isn’t really part of C++.
I think all of that has already been pointed out in the issue, from what I recall, but yeah, I don’t think there is a ‘good’ solution to this problem unfortunately… (at least none I can think of; maybe someone else has an idea).
brunodf June 18, 2024, 12:17pm 5
I think this discussion focuses too much on restrict
specifically; I observe that the main concrete problem (cannot use std::advance
on a restrict-qualified pointer) is not limited to restrict: only the const
qualification is handled in the standard library specification (specifically in the specification of indirectly_readable_traits) so any other outer qualification on the pointer will not work (and of course, advancing a const
pointer is not possible either, but this is for other reason). This restriction prohibits non-standard qualification such as restrict
, but also other extensions such as address space qualification etc. (you can go over the list in clang Qualifiers to see which make sense for pointers). It also prohibits standard qualification such as volatile
(as outer qualifier on the pointer): while volatile
is deprecated in some contexts, it is certainly still present.
IMO, this issue should be addressed in the standard itself. As an outsider, I do naively wonder why the standard library is designed with a limited (closed world) view on type qualifiers (i.e. only const
and volatile
) when non-standard qualification is a fact in mainstream compilers, including clang. In the language itself, there are situations where outer qualifiers are dropped, and these extend to non-standard qualification. It seems possible to align the standard library with the language (including possible extensions), for example, a type trait that removes outer qualification as done in the language when passing by value, and not limited to const
and volatile
as defined for std::decay
.
Endill June 18, 2024, 9:10pm 6
Given the title of the RFC, this is intentional. Discussion other qualifiers here is not really on the topic, and should be done in a new thread.
Compiler extensions are to be specified by the compiler itself. The Standard has a very hard time speaking about outside things, beyond saying “implementation-defined” and “extensions are allowed in ill-formed space”.
That said, libc++ has hard enough time dealing with the standardized volatile
, as you yourself pointed out to. We shouldn’t burden current libc++ maintainers with even more work than they currently have.
brunodf June 18, 2024, 9:55pm 7
I was merely pointing out that, while the OP started this thread for the restrict
qualifier, the same issue exists for other qualifiers as well, and it thus helps to look at this from a perspective broader than restrict
.
That is a fair point, it is very hard. But the current (“closed world”) approach in the standard library poses problems for any additional type qualification added as an extension. That seems a common kind of extension to me, I count that clang has accumulated no less than 6 such extensions: restrict, unaligned, GC, lifetime, address space, and ptrauth.
I was not suggesting that the libc++ maintainers should do anything on their own, even for volatile
it seems they are following the standard.
tal June 18, 2024, 10:42pm 8
I’m not sure I understand where the difficulty is coming from – I’m not saying there’s isn’t one, just trying to understand the challenge in a little more details, so we can have a more informed discussion on the cost of properly supporting it in libc++.
My thinking is that the restrict
qualifier is completely irrelevant for the correctness of the vast majority (all?) of the standard library functionality, it just needs to be “passed along” to be consumed down the LLVM pipeline for potential performance optimizations. This makes it much simpler than volatile
, which actually rules out some implementations: caching something in a local variable and re-using the value later is incorrect for volatile
somethings. In other words, an implementation that is correct for some non-restrict
type should be correct for the restrict
qualified type as well (except potentially less performant) – am I missing anything?
If that’s true, simply removing the restrict
qualifier (like here) in the relevant concepts and requires clauses should get us >99% there. No need for lots more code to maintain in the library – it’s a single, fairly trivial change, yet in many places; plus maybe duplicating the relevant tests to use restrict
types, though I’m not convinced that’s actually called for. Of course, where there’s a good reason to have a restrict
specialized implementation, for performance rather than correctness reasons, we’d like do that – but such opportunities should be rare in libc++ (selecting memcpy
over memmove
), and those surely could wait for a later phase.
Does any of this make sense? Is there anything else required that I’m missing? Is there a way to quantify such effort?
philnik June 20, 2024, 8:30pm 9
The problem is that it’s not “just put it in a couple places”. It requires proper consideration where it should be used. e.g. your example would probably work if we provided a specialization of iterator_traits
for __restrict
. That’s probably not everything, but I don’t know. You’d have to survey the entire library, including making sure that there are no places where we pass the same pointer multiple time into a function, since that would almost certainly break. You also have to write tests to ensure things work as expected. This is a significant amount of work. For reference, there are ~6 lines of tests for every line of code in the library, and we still have aspects of the code base without good test coverage. If you want to survey the library and write tests, feel free to write write a detailed explanation of what has to be done to add proper support in libc++.
Adding a new attribute to clang to convey the same (or even better, more clearly defined) semantics is probably the easier and less error-prone way to go.
tal June 21, 2024, 4:33pm 10
I’m not sure I understand why. In this example indeed the two implementations give different results; but the implementation where we “removed” the restrict
qualifier actually gets the correct answer. I think that’s true for all valid c++ code, so it should be safe to apply throughout libc++ code without the detailed consideration and extensive testing you mentioned – what am I missing? Can you think of an example where erasing the restrict
qualifier from a type leads to a correctness issue?
Absolutely, but that’s somewhat orthogonal to this RFC – I’m trying to explore options for libc++ to support restrict
, since we support it in as a c++ extension in clang. Ideally, no-aliasing shouldn’t be expressed through the type system, but we have our legacy to consider, including the c language and quite a lot of existing c/c++ code that uses restrict
for performance.
philnik June 22, 2024, 11:03am 11
We have to add tests to ensure that we actually remove the restrict
qualifier where we have to. Miscompilations are some of the worst things to debug that exist.
I don’t thinks that’s orthogonal at all. This isn’t a problem for legacy code, since that never worked. This is only a problem for new code, which could simply use an attribute instead of restrict
.
tal June 22, 2024, 3:57pm 12
If we miss it somewhere, or explicitly decide not to support restrict
somewhere, wouldn’t the user simply get a compile error – just like today?
A friendlier message (“libc++ does not support restrict
in this context”) is nice to have, not a must IMHO.
Right
There’s a bunch of c/c++ code that uses raw pointers and restrict
for performance; what I meant to say is that without libc++ support for restrict
, it’s harder to take advantage of libc++ when working with (and modernizing) such code.
philnik June 22, 2024, 7:34pm 13
Do we get diagnostics everywhere things are broken right now?
And why wouldn’t it be possible to replace restrict
with an attribute when modernizing the code? That seems like just as much of a modernization to me as using standard library facilities.
tal June 22, 2024, 8:55pm 14
“Broken” in the sense of not supporting restrict
, you mean? In the motivating example, if you uncomment the offending line like this, we get:
In file included from <source>:2:
In file included from /opt/compiler-explorer/clang-trunk-20240622/bin/../include/c++/v1/iterator:684:
/opt/compiler-explorer/clang-trunk-20240622/bin/../include/c++/v1/__iterator/advance.h:71:3: error: no matching function for call to '__advance'
71 | std::__advance(__i, __n, typename iterator_traits<_InputIter>::iterator_category());
| ^~~~~~~~~~~~~~
<source>:17:10: note: in instantiation of function template specialization 'std::advance<double *__restrict, int, int, 0>' requested here
17 | std::advance(ptr, 1);
| ^
/opt/compiler-explorer/clang-trunk-20240622/bin/../include/c++/v1/__iterator/advance.h:39:1: note: candidate function template not viable: no known conversion from 'typename iterator_traits<double *__restrict>::iterator_category' (aka 'std::output_iterator_tag') to 'input_iterator_tag' for 3rd argument
39 | __advance(_InputIter& __i, typename iterator_traits<_InputIter>::difference_type __n, input_iterator_tag) {
| ^ ~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/clang-trunk-20240622/bin/../include/c++/v1/__iterator/advance.h:46:1: note: candidate function template not viable: no known conversion from 'typename iterator_traits<double *__restrict>::iterator_category' (aka 'std::output_iterator_tag') to 'bidirectional_iterator_tag' for 3rd argument
46 | __advance(_BiDirIter& __i, typename iterator_traits<_BiDirIter>::difference_type __n, bidirectional_iterator_tag) {
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/clang-trunk-20240622/bin/../include/c++/v1/__iterator/advance.h:57:1: note: candidate function template not viable: no known conversion from 'typename iterator_traits<double *__restrict>::iterator_category' (aka 'std::output_iterator_tag') to 'random_access_iterator_tag' for 3rd argument
57 | __advance(_RandIter& __i, typename iterator_traits<_RandIter>::difference_type __n, random_access_iterator_tag) {
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Compiler returned: 1
I think it’s not immediately obvious why a double * __restrict
isn’t eligible for the random_access_iterator_tag
. If we end up converging on the “do nothing” option, that’s a good example of where a friendlier message might be nice.
Sure, but then the user has to change the type of things, something like std::bit_cast<remove_restrict_t<T>>(ptr)
whenever they want to call libc++, otherwise the code wouldn’t compile.
But perhaps you’re suggesting a new option, which I think is great: the c++ frontend can replace every restrict
qualifier with a [[noalias]]
attribute (will currently get ignored until we implement it, as is the usual case for attributes), so we keep the restrict
c++ extension with all the desired no-aliasing performance optimizations but outside the type system, with libc++ fully supporting it without any code change. It seems like most of the objections to this option (a better variant of the last option in the original post) would be around GCC and MSVC compatibility, which do include restrict
in the c++ type system.