A case against direct initialisation


I would like to start by extending an apology to You-Know-Who-You-Are🙂. Even though your coding style prompted me to revisit it, I have been pondering this issue for some time now. It was only when I went back to the textbooks to read up on it that I finally understood how to describe this in accessible terms. Take solace in the fact that the “you” above is the plural form🙂.

Over the years, I have repeatedly come across colleagues who are obviously familiar with Herb Sutter’s book “More Exceptional C++” and its Item 36 (or his internet column, Guru of the Week #1), which contains the guideline

Prefer using the form “T t(u)” over “T t = u” for variable initialisation.

While I myself try to follow Sutter/Alexandrescu/Meyers in my everyday coding, I’ve always been slightly uncomfortable with a few of their guidelines, such as the suggestion to add virtual in front of reimplemented virtuals. That particular issue is fodder for another blog post. Today, I’d like to talk about why I never follow the guideline to prefer direct over copy initialisation, unless I have to use direct initialisation.

Direct Initialisation, of course, is the construction of variables by calling one of their constructors (T t(u)), while Copy Initialisation is construction of variables by first constructing a temporary of the variable’s type, and then copy-constructing the variable from that temporary (T t = u, which is equivalent to T t( T(u) )).

At face value, direct initialisation is preferable, because it avoids the extra copy constructor call. However, the standard allows compilers to elide that call (the copy constructor still has to be accessible, though), and every half-decent compiler implements that optimisation, thus making the two all but identical at runtime.

So, if avoiding premature pessimisation isn’t the reason to prefer direct initialisation, what is? Interestingly, there appear to be no reasons left. While the original GotW cited “works in more cases” as a reason, the corresponding Item 36 of More Exceptional C++, written a few years later, no longer gives any rationale for the guideline. Even more interestingly, The Good Book, written some time later still, doesn’t even include that guideline! Clearly, something’s wrong with that guideline. Don’t you, too, sometimes wish the Good Book had an appendix on “items that didn’t make it, and why”?🙂

As far as I can make out, there are three main reasons not to prefer direct initialisation: two objective, and one highly subjective.

Let’s start with the subjective one, so you end up remembering the objective ones better🙂

It just looks plain weird.

Especially for people coming from C, which has no syntax for direct initialisation.

OK, with the subjective reason out of the way, let’s look at the two objective reasons:

First, direct initialisation sometimes ends up looking like a declaration, and the standard requires that everything that looks like a declaration is also parsed as one. That’s the issue behind C++’s most vexing parse (a term coined by Scott Meyers in Effective STL, Item 6):

const U u;      // ok, defines a variable u of type U, and default-initialises it
const T t(u);   // ok, defines a variable t of type T, initialised from u
const T t(U()); // oops, declares a function t, taking a (pointer to a) function returning U as argument and returning const T!

Second, direct initialisation disables the protection provided by explicit constructors. Consider the classical mistake that explicit constructors are meant to prevent:

class Stack {
public:
    explicit Stack( size_t maxSize );
    // ...
};

Stack * stack = 0; // ok
// same line, but forgot the '*'
Stack stack = 0; // error: Stack(size_t) is explicit -> good
// same line, now with direct initialisation:
Stack stack(0); // oops, compiles!

In other words, copy initialisation, by virtue of potentially involving an implicit call to a conversion constructor (remember that T t = u is really T t( T(u) )), cannot be used when the conversion constructor is explicit. Direct initialisation, on the other hand, by making an explicit call to the conversion constructor (T t(u)), will succeed whether or not the constructor is explicit.

So, if you always prefer direct initialisation over copy initialisation, you’re more likely to hit “C++’s most vexing parse”, and more likely to call explicit constructors unintentionally. Don’t go there. Prefer copy initialisation.

That said, there are, of course, situations where you should use direct initialisation. E.g. when calling a constructor with more than one argument:

// this is overly verbose:
const QDate date = QDate( 2010, 8, 16 );
// better:
const QDate date( 2010, 8, 16 );

Likewise, when you want to call an explicit constructor, you have to use direct initialisation, too:

std::vector<std::string> v( 10 );

If you default to using copy initialisation, direct initialisation stands out in your code, and marks places where something potentially dangerous, or expensive happens.

About marcmutz
Marc Mutz is a Senior Software Engineer, Trainer, and Consultant with KDAB.

22 Responses to A case against direct initialisation

  1. christoph says:

    I have seen code using this kind of initialization for basic data types, such as “int k(3);” instead of “int k = 3;”. This looks like an array declaration, does not allow a search for “k = ” to find places where k is assigned a value, and (as you already pointed out), does not offer any advantage over the traditional assignment.

  2. LS says:

    I don’t really buy that argument. Yes, it’s a bit funny looking to C programmers, but considering that direct initialization in constructors with the same syntax isn’t going away anytime soon, they’ll have to learn anyway.

    So if you’re using initialization for class members, then you might as well use it for other objects when possible, and not rely on a compiler fixing your code for you, or a copy constructor being non-broken (yes I’ve seen it).

    As for C++’s most vexing parse, I don’t see how it’s a problem at all. If you mistype that you’ll catch it at compile time, unless you happen to be fantastically unlucky and have all the right things defined.

    Rules are always easier if they have fewer exceptions, and the fact that for multiple arguments you need to fall back to direct initialization means you don’t gain any benefit from dropping it for other cases. You just get a confusing mess. I’d say a better exception is to use assignment for basic types as christoph said (haven’t seen anyone do otherwise, but I guess everything is out there) and direct initialization for everything else.

    • marcmutz says:

      It’s interesting that you say you haven’t seen anyone use direct initialisation for basic types. Yet you argue for uniform use of direct initialisation. So, we’re really only of different opinion about how to treat initialisations of user-defined types with one argument. I’d love to hear your opinion on the explicit issue in that case, because that’s the real dealbreaker in my book, and it surfaces in exactly these situations.

      • LS says:

        >> Yet you argue for uniform use of direct initialisation.

        Read again. I said not for basic types. Basic types are an easy exception to make since they are already treated differently than object instances.

        >> So, we’re really only of different opinion about how to treat initialisations of user-defined types with one argument.

        Right. Very strange to make a special case for “objects where the constructor has one argument”.

        >> I’d love to hear your opinion on the explicit issue in that case

        I don’t see the problem with explicit, so I didn’t comment on it.
        If someone writes Stack stack(0); they probably meant exactly what it looks like. I really don’t see the issue of it compiling. I assume you’re saying they meant to write Stack* stack(0), and their mistake goes unnoticed. Well two things there
        a) I treat pointers like I treat basic types, so I (and all other code I’ve seen) would write Stack* stack = 0; and would not make this mistake.
        b) If you do make the mistake of forgetting the * and writing Stack stack(0), you will invariably still end with a compile error as soon as you try to dereference stack, or assign it a pointer to a Stack object. So no safety lost there.

        • marcmutz says:

          >> Yet you argue for uniform use of direct initialisation.

          > Read again. I said not for basic types. Basic types are an easy exception to make since they are already treated differently than object instances.

          Could you name an example for basic types being “treated differently” from “object instances”? I can’t find one offhand. In fact, one of C++’s design goals was to allow user-defined types to seamlessly blend in as if they were built-in types.

          >> So, we’re really only of different opinion about how to treat initialisations of user-defined types with one argument.

          > Right. Very strange to make a special case for “objects where the constructor has one argument”.

          Not stranger than to make a special case for “basic types”. Think about complex doubles, which are built-in types in C99, but a user-defined type in C++. In C++0x, I will be able to write

            const complex<double> c = 3i;
          

          And why shouldn’t I? complex<> is supposed to be used as if it was a built-in type (and nothing prevents compilers from secretly mapping them to their implementation of complex C99 numbers).

          At least with the one-argument rule, you’re not treating entities differently that were painfully designed not to be treated differently from each other.

          > If you do make the mistake of forgetting the * and writing Stack stack(0), you will invariably still end with a compile error as soon as you try to dereference stack, or assign it a pointer to a Stack object. So no safety lost there.

          Unfortunately, I can’t share your optimism here. I’ve seen too many students in entry-level Qt courses fall into the following trap:

          QDialog * d = new QDialog;
          QTextEdit ed = new QTextEdit( d );
          // ...
          d->exec();
          

          which happened to compile in Qt 3 and older compilers that didn’t check for the availability of the copy constructor. In Qt 4, it doesn’t anymore, of course, but when you use direct initialisation, it suddenly does again. Yes, I’m aware that pointers are basic types. This is just to show that you do not always get compile, or even runtime, errors in such situations.

      • LS says:

        >> Could you name an example for basic types being “treated differently” from “object instances”?

        Ok, maybe just in my head. You don’t think most programmers have a very clear mental distinction between basic types and objects? They are seen as two separate things, which is why I argue it is very simple/logical to initialize them differently. Maybe that’s just because our first programming course used Java, where there is a clear difference, and it was stamped into my brain.
        Anyway, even in C++ I suspect there are some significant differences to how basic types are handled internal to the compiler, however I don’t have the expertise to argue that point so I won’t continue.

        >> And why shouldn’t I?

        You should! That’s how I would write it too. Like you said, it is meant to be used like a basic type.

        >> which happened to compile in Qt 3 and older compilers that didn’t check for the availability of the copy constructor.

        Broken older compilers are not an argument for doing something now or in the future.

        >> In Qt 4, it doesn’t anymore, of course

        Great. Problem solved.

        >> but when you use direct initialisation, it suddenly does again.

        Yeah, but you’re not using direct initialization and shouldn’t be. You said you’ve seen mistakes like the one you showed. Fine, missing an * is an easy mistake to make and I make it all the time. However when in your mind you’re wanting to allocate a widget on the heap you’re not suddenly going to write QWidget widget(parent) instead. That is not a likely mistake to make.

        By the way, I compiled your code sample since I wondered what would happen, and everything works just fine if you use direct initialization for the QTextEdit. So this example is also not a trap, since it works as expected. Any more complex example will not compile, because you will invariably want to do something with your instance of QTextEdit and then discover it is not a pointer.

  3. I’m really missing a reference to the RAII idiom here. While it is true that initialization by using a ctor quite often ends up looking like a function declaration, this is something that you have to become aware of as a C++ programmer (unfortunately), so it’s one of the first things I try to teach C++ beginners (when they reach that level). Also, I do strongly disagree with the essence, and the formulation of your last sentence:

    If you default to using copy initialisation, direct initialisation stands out in your code, and marks places where something potentially dangerous, or expensive happens.

    Why exactly do you think it is “dangerous” or “expensive”? You did not offer any explanation for this.
    (And besides, I find “Stack stack(0)” vs. “Stack *stack(0)” to stick out pretty well. The only case when it might NOT be obvious is with multiple declarations in one statement, like “Stack *stack(0), pile(0);”, but this is a whole other topic).

    • marcmutz says:

      > I’m really missing a reference to the RAII idiom here.

      Well, RAII classes should never ever have non-explicit constructors, so the question of which kind of initialisation to use for them never comes up. Incidentally, acquiring resources is exactly one of the “expensive or dangerous” situations I mentioned would be highlighted by the use of direct initialisation.

      > I do strongly disagree with the essence, and the formulation of your last sentence:

      >> If you default to using copy initialisation, direct initialisation stands out in your code, and marks places where something potentially dangerous, or expensive happens.

      > Why exactly do you think it is “dangerous” or “expensive”? You did not offer any explanation for this.

      I never said direct initialisation is dangerous, or expensive. What I said, instead, was that if you prefer copy initialisation wherever the compiler lets you, then those direct initialisations still left in your code stand out as special. Usually, they end up being “potentially dangerous” (as in “involving the call to an explicit constructor”) or “potentially expensive” (ditto, or multi-argument constructor calls). That’s, of course, assuming that all of the implicit conversions that are allowed to be implicit are allowed because they’re (relatively) cheap and safe.

      • LS says:

        >> Usually, they end up being “potentially dangerous” (as in “involving the call to an explicit constructor”)

        Nothing about calling an explicit constructor is dangerous, or even more “potentially dangerous” than any other code.

        >> or “potentially expensive” (ditto, or multi-argument constructor calls)

        Same here. This is an empty statement. You might as well say “code is potentially dangerous and expensive”. Yes, classes that acquire resources should use explicit constructors, but that doesn’t mean explicit constructors and therefore calling them should be labeled as potentially dangerous or expensive. Many completely harmless classes also use explicit constructors. Heck, it’s the default in the new class template in Qt Creator.

        • marcmutz says:

          >> Nothing about calling an explicit constructor is dangerous, or even more “potentially dangerous” than any other code.

          Let me give you an example:

          const shared_ptr<Widget> w( new Widget );
          

          The whole raison d’être for shared_ptr‘s constructor explicitness is the danger of the implied transfer of ownership. If you think that this isn’t dangerous, I think we just have to agree to disagree🙂

      • The User says:

        explicit normally means that there will not occur any implicit conversion, because the semantics of the classes are unrelated…
        But marcmutz is right, too, they may indicate “dangerous“ stuff, that means side-effects, changing ownership etc.

      • LS says:

        >> Let me give you an example:

        Of course there are examples of explicit constructors being also dangerous. The point is that explicit constructors do not by themselves indicate danger.

  4. Could you name an example for basic types being “treated differently” from “object instances”?
    The only difference, afaik, is that they are not default-constructed when not explicitly doing so (e.g. this will initialize integer to 0: struct foo { foo(): integer() { } int integer; }; whereas it will be “random” when not doing the initialization). With user-defined types, the default ctor would’ve been called.

  5. The User says:

    Of course direct initialization is more widely available, if there is no copy-constructor.
    I think it is natural to use direct initialization if you want to say “hey, I am constructing an object of class xy here“, if there is a more simple implementation e.g. for numeric data-types or pointers or when using C++0x’s auto-keyword (“this new variable should be that”) or just wanting to copy an object, use =.
    That is not very precise advice, but I am sure everybody is able to write code which is easy to understand when deciding that way.
    Some examples:
    int x = 0; // primitive
    vector vec(12, "bla"); // semantics: create a vector
    complex c = 3+2i; // simply looks more natural
    KApplication app(false); // there is no copy-constructor
    auto i = vec.begin(); // iterators like pointers
    vector backup = vec; // it is clear that we want to copy
    auto stuff = getStuff(); // get the stuff and get the type of the stuff and assign it, we want to copy
    ………

  6. The User says:

    @marcmutz
    Yes it is, and bananas are yellow, what is the point? I think especially for those ones direct initialization should be used.

    • marcmutz says:

      > I think especially for those ones direct initialization should be used.

      The connection you’re missing is that you cannot use copy initialisation if it would call an explicit constructor.
      I’m fixing the post as we speak.

      • The User says:

        Sorry, of course you are right, but even if it would be copyable, I would use direct initialization…

  7. Anon says:

    ” I’ve always been slightly uncomfortable with a few of their guidelines, such as the suggestion to add virtual in front of reimplemented virtuals”

    I can’t imagine why anyone would object to this: it saves people from A) having to look further up the inheritance stack than necessary to see if something is virtual and B) having to know the “once virtual, always virtual” tidbit of information, and appears to come with no cost, besides 7 or so extra characters, and some abstract notion of “redundancy” from the POV of the compiler (but then, comments and meaningful variable names are also redundant from the POV of the compiler, too!).

  8. Pingback: A case against direct initialisation — the sequel « -Wmarc
