Another response to N4074; explicit should never be implicit
Document #: | N4131 |
---|---|
Date: | 2014-08-09 |
Reply to: | Filip Roséen <filip.roseen@gmail.com> |
Summary: | Arguments for not allowing return {expr} to call an explicit constructor. |
Contents
- Introduction
- The Meaning of explicit
- The Current Praxis of braced-init-list
- Narrowing Conversions
- The return-statement
- Conclusion
Introduction
If one were to agree with the contents of N4074, the following snippet should compile without diagnostics:
struct Type1 { explicit Type1 (int); };
Type1 example_f1 () { return { 0 }; }
The main arguments of N4074:
- return { expr } cannot mean anything besides that we, explicitly, want to initialize the return-value.
- Whoever is authoring and maintaining a function also knows about the return type, meaning that a developer is well aware of what is being initialized; and with what.
- Asking a developer to explicitly state that explicit initialization is allowed when writing the return-statement is redundant; both the compiler and the developer know what is going on.
This paper argues that the change to ISO C++ proposed in N4074 should not be adopted, using several methods, among them:
- Discussion of the, sometimes hidden, implications of such a change,
- Arguments regarding how such initialization would differ from the current praxis of C++, and
- Proofs of concept that directly show why such a proposal is not sane.
The Meaning of explicit
Marking a constructor as explicit is often equivalent to saying: "such initialization sure is possible, but it's potentially not what you want; if you really want to do this, go ahead, but I won't let it happen without your explicit consent."
If a developer would like to use our explicit constructor, we'd like him to go the extra mile and explicitly show us that this is the case. We'd like him to show some effort and, more specifically, to consider whether this is really what he wants. By the invisible contract involved, explicit constructors are potentially dangerous.
// meaning-of-explicit.example.1
std::unique_ptr<T> func () {
  static T x;
  return { &x }; // error: chosen constructor is explicit in copy-initialization
}
There's no way for an implementation to force a developer to actually walk around the block every time he tries to initialize an object using an explicit constructor; instead, we require him to explicitly state his request by writing out the type he'd like to initialize at the point where such initialization takes place.
"I'll refuse to do this unless you show some effort."
Implications if N4074 is approved:
N4074 will effectively make the previously described contract disappear in the context of return { expr }, which further means that we completely disregard the original intent expressed by the author of said constructor.
// meaning-of-explicit.example.1
std::unique_ptr<T> func () {
  static T x;
  return { &x }; // compiles, but triggers undefined-behavior
                 // if/when the unique_ptr is destroyed
}
If the author didn't want the user to "walk the extra mile", the author wouldn't have marked the constructor as explicit.
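The contract described in this section can be observed with the standard type traits. The following is a minimal sketch, not part of N4074 or of the original examples; make_owned is a hypothetical helper introduced only for illustration.

```cpp
#include <memory>
#include <type_traits>

// The initialization is possible ...
static_assert ( std::is_constructible<std::unique_ptr<int>, int*>::value,
               "a unique_ptr can be initialized from a raw pointer");

// ... but it never happens without explicit consent.
static_assert (!std::is_convertible<int*, std::unique_ptr<int>>::value,
               "a raw pointer does not implicitly become a unique_ptr");

// Hypothetical helper: ownership is taken by naming the type at the point
// of initialization, which is the "extra mile" the constructor asks for.
inline std::unique_ptr<int> make_owned (int value) {
  return std::unique_ptr<int> { new int (value) };
}
```

Writing return { new int (value) }; instead is rejected under the current rules, because the chosen constructor is explicit.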
The Current Praxis of braced-init-list
A braced-init-list is often referred to as a means of uniform initialization, meaning that all types can be initialized using the same syntax. It doesn't matter if we are initializing a fundamental type, or a user-defined type that is initialized with one, or several, arguments; the initialization is uniform.
The current praxis, backed up by the Standard, does not state that uniform initialization is a way to bypass the rules associated with initialization of an object of type T; we merely have a way to express initialization of any type.
Another point worth noting is that developers often state that one of the greatest perks of a braced-init-list is that it is equivalent to saying: "Dear compiler, if you know what type I'm trying to initialize... please, go ahead."
It is important to note the usage of "you know"; nowhere does it imply that both the compiler and the developer know the type. When an initialization requires the use of an explicit constructor, the compiler certainly knows, but with the meaning of explicit in mind, an implementation should be worried that the developer doesn't, which is why we get a diagnostic in such a case.
Implications if N4074 is approved:
There are many rules in C++, some more complicated than others, but what really makes people go "hmpf" is when seemingly equivalent constructs behave differently.
Allowing return { ... } to use an explicit constructor contradicts the previous, far simpler explanation: "Unless a braced-init-list has a {type, object, cast} explicitly stated where it is being used, a potential conversion must be one that can happen implicitly."
C++ has enough rules that are cluttered with "but if this applies, that doesn't hold". We don't need another such rule, especially when it impedes type-safety and the only real gain is to prolong the lifetime of keyboards. Laziness doesn't go well with writing safe initializations.
Is the proposed change by N4074 really worth it?
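The simple explanation quoted above can be sketched as follows. Type1 mirrors the type from the introduction, but the constructor body, the value member, and the names object and make_stated are assumptions added so that the sketch is complete.

```cpp
// Hypothetical variant of Type1 from the introduction, given a body.
struct Type1 {
  explicit constexpr Type1 (int v) : value (v) { }
  int value;
};

// The type is stated where the braces are used, so the explicit
// constructor may be selected.
constexpr Type1 object { 1 };                          // object
constexpr Type1 make_stated () { return Type1 { 2 }; } // type / cast

// No type is stated next to the braces, so only implicit conversions are
// allowed; both of the following are rejected by the current rules.
// Type1 copy = { 1 };
// Type1 make_unstated () { return { 2 }; }
```

Under N4074 only the last commented-out line would start to compile, while the one above it would stay ill-formed; that asymmetry is the "but if this applies, that doesn't hold" the section warns about.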
Narrowing Conversions
There is a very close relationship between narrowing conversions and the use of a constructor marked as explicit.
If a fundamental type T is initialized using a braced-init-list with a compile-time known value which isn't representable in that type, or with an object of type U which can potentially hold a value that isn't representable in T, a diagnostic is required.
The introduction of narrowing conversions in C++ was, and is, a very good step towards increased type-safety. It prevents developers from making mistakes that can potentially result in a program that behaves in a manner which was never intended.
// narrowing-conversions.example.1
std::size_t multiply (int x, int y) {
  return { x * y }; // error: non-constant-expression cannot be narrowed
                    //        from type 'int' to 'std::size_t'
}
It is certainly possible to initialize a std::size_t with the result of x * y, but since std::size_t cannot represent negative numbers, this is potentially unsafe.
If we play with the idea of writing a wrapper around std::size_t, we could end up with something like the below:
// narrowing-conversions.example.2
struct SizeType {
  explicit SizeType (  signed int);
           SizeType (unsigned int);
  …
};
SizeType multiply (int x, int y) {
  return { x * y }; // error: chosen constructor is explicit in
                    //        copy-initialization
}
The reason SizeType (signed int) is marked explicit is the same reason we rely on diagnostics to inform us of potential narrowing conversions: we rely on the compiler to tell us when we are doing something that might lead to unforeseen consequences.
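A self-contained variant of the wrapper could look like the sketch below. The members value and from_signed, and the two multiply overloads, are assumptions made for illustration; the original leaves the body of SizeType elided.

```cpp
// Hypothetical completion of SizeType; the original elides its members.
struct SizeType {
  explicit constexpr SizeType (signed int v)
    : value (static_cast<unsigned int> (v)), from_signed (true) { }
  constexpr SizeType (unsigned int v)
    : value (v), from_signed (false) { }

  unsigned int value;
  bool         from_signed; // bookkeeping for this sketch only
};

// Unsigned operands: the implicit constructor is chosen, braces suffice.
constexpr SizeType multiply (unsigned int x, unsigned int y) {
  return { x * y };
}

// Signed operands: the type has to be named to accept the risk.
constexpr SizeType multiply (int x, int y) {
  return SizeType { x * y };
}
```

With the current rules the signed overload cannot be written as return { x * y }; which is exactly the point at which the compiler asks the developer to reconsider.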
Implications if N4074 is approved:
Since C++11, the use of return { expr } has become almost synonymous with "safe initialization of any return-type"; if N4074 is approved, this will no longer be true. This would be one of the scarier forms of breaking change: one that cannot be caught by anything other than a watchful eye.
The return-statement
T func1 () { return expression-or-braced-init-list; }
As the name implies, a return-statement is used to return a value to the caller of a function. However, it is of the utmost importance that we understand that we never directly return the value of the expression-or-braced-init-list associated with the statement; we merely say that it is to be used as the initializer for the returned value.
The return-type of a function is by definition a distant type; one cannot know the actual return-type by only interpreting the expression-or-braced-init-list used to initialize it. The opposite also applies; one cannot know the initializers for the return-value by only inspecting the return-type.
With the mentioned relation between the return-type and its initializer(s), there are side-effects that one has to properly consider:
- A developer should be allowed to change the return-type of a function without having to review every return-statement in its body. The expected behavior is that such a change results in a diagnostic unless every initialization of the new return-type follows the rules of strict type-safety (meaning that a potentially dangerous initialization should not implicitly apply).
In the example below, a developer mistakenly thought "ms" was the SI symbol for microseconds; long story short, it is not (ms denotes milliseconds). The error is, however, caught during compilation, since std::chrono::microseconds cannot be implicitly initialized from a plain integer.
// return-statement.example.1
/*!
 * \brief  Benchmark f()
 * \return The duration in ms spent evaluating f()
 */
unsigned long benchmark (std::function<void()> f) {
  …
}
commit message:
- updating codebase to C++11, benchmark now returns the appropriate duration type from <chrono>
commit diff:
--- benchmark.cpp 2014-07-28 03:56:32.255764544 +0200
+++ benchmark.cpp 2014-07-28 03:56:53.175682956 +0200
@@ -5,6 +5,6 @@
  * \return The duration in ms spent evaluating f()
  */
-unsigned long benchmark (std::function<void()> f) {
+std::chrono::microseconds benchmark (std::function<void()> f) {
   …
 }
- A developer might not know the return-type of a function when he writes his return-statement; therefore he should have a mechanism to disable initializations that potentially do something which was never intended, no matter if such initialization makes use of one, or several, arguments.
// return-statement.example.2
template<class T>
struct Vector {
  explicit Vector (int size, int capacity = 0);
  Vector (std::initializer_list<T> data);
};
template<class T, class... Ts>
Vector<T> make_vector (Ts... args) {
  return { args... };
}
int main () {
  using secs = std::chrono::seconds;
  auto x = make_vector<int> (1, 5, 10);
  auto y = make_vector<secs> (10, 20); // error: chosen constructor is explicit in copy-initialization
}
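The following sketch shows which constructor the second call would reach if explicit constructors were allowed, by naming the type in the return-statement. It assumes C++14 or later for constexpr use of std::initializer_list; the sized flag and the name make_vector_stated are hypothetical and exist only to make the selected constructor observable.

```cpp
#include <chrono>
#include <initializer_list>

template<class T>
struct Vector {
  explicit constexpr Vector (int /*size*/, int /*capacity*/ = 0) : sized (true) { }
  constexpr Vector (std::initializer_list<T>) : sized (false) { }

  bool sized; // hypothetical flag: true when the (size, capacity) constructor ran
};

// Same shape as make_vector, except that the type is named, which permits
// the explicit constructor to be selected.
template<class T, class... Ts>
constexpr Vector<T> make_vector_stated (Ts... args) {
  return Vector<T> { args... };
}
```

With T being int, the arguments 1, 5, 10 become three elements. With T being std::chrono::seconds, the arguments 10, 20 cannot be converted implicitly to durations, so they are interpreted as a size and a capacity instead; this is the outcome that would no longer be diagnosed in make_vector.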
Implications if N4074 is approved:
Even though I agree with the opinion raised by N4074, that a developer should know the return-type and the return-paths of the function he is working on, I find it of higher value that the compiler is able to stop potential brainfarts from ever making it to runtime.
Neither of the two previous examples would be caught during compilation if N4074 were approved. These somewhat trivial errors would leak into runtime, something which the strict type-safety of C++ has saved us from in the past.
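For the benchmark example, the safeguard that exists today can be sketched with type traits, followed by one possible correction. The function to_duration and the parameter elapsed_ms are hypothetical; the original elides the body of benchmark.

```cpp
#include <chrono>
#include <type_traits>

// A raw count never silently becomes a duration ...
static_assert (!std::is_convertible<unsigned long, std::chrono::microseconds>::value,
               "no implicit conversion from a raw count");
// ... although the initialization is possible when requested.
static_assert ( std::is_constructible<std::chrono::microseconds, unsigned long>::value,
               "explicit construction from a raw count is allowed");

// Hypothetical correction: state the unit the count is really expressed in
// and let <chrono> perform the lossless conversion to microseconds.
constexpr std::chrono::microseconds to_duration (unsigned long elapsed_ms) {
  return std::chrono::milliseconds { elapsed_ms };
}
```

Had return { elapsed_ms }; been accepted for a function returning std::chrono::microseconds, the result would have been off by a factor of 1000 without any diagnostic.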
Conclusion
The changes proposed by N4074 are a violation of one of the fundamental type-safety philosophies of C++; if it's not clear that a potentially unsafe conversion can happen, we - as developers - would like the compiler to diagnose the potential error. It doesn't make sense for the rules of copy-list-initialization to differ in return-statements since we are by definition initializing a distant type - and with that, a distant value.
If N4074 is approved, there are other cases to which such a change would need to propagate for it to make sense. Following the philosophy expressed by N4074, private member-functions of a class are maintained by the same developer who is calling them (as they are implementation details); should we then allow explicit constructors to be used when invoking such a function with copy-list-initialization of its arguments? After all, the developer should know what is going on.
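The private member-function case can be sketched as follows, using the rules as they stand today. Token and Widget are hypothetical names introduced for illustration.

```cpp
// Hypothetical type with an explicit constructor.
struct Token {
  explicit constexpr Token (int v) : value (v) { }
  int value;
};

class Widget {
public:
  // Caller and callee are maintained together, yet the type still has to
  // be named for the explicit constructor to be used.
  constexpr int run () const { return helper (Token { 42 }); }

  // Rejected by the current rules, and untouched by N4074:
  // constexpr int run_unstated () const { return helper ({ 42 }); }

private:
  constexpr int helper (Token t) const { return t.value; }
};
```

If knowing the target type were sufficient reason to relax the rule for return-statements, the same reasoning would apply to the commented-out call.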