Pimp My Pimpl
This is a translation of a two-part article that originally appeared in Heise Developer. You can find the originals here:
- Part One: http://www.heise.de/developer/artikel/C-Vor-und-Nachteile-des-d-Zeiger-Idioms-Teil-1-1097781.html
- Part Two: http://www.heise.de/developer/artikel/C-Vor-und-Nachteile-des-d-Zeiger-Idioms-Teil-2-1136104.html
You can find Part Two here:
Pimp My Pimpl
Much has been written about this funnily-named idiom, alternatively known as d-pointer, compiler firewall, or Cheshire Cat. Heise Developer highlights some angles of this practical construct that go beyond the classic technique.
Part One
This article first recapitulates the classic Pimpl idiom (pointer-to-implementation), points out its advantages, and goes on to develop further idioms on its basis. The second part will concentrate on how to mitigate the disadvantages that inevitably arise through Pimpl use.
The Classic Idiom
Every C++ programmer has probably stumbled across a class definition akin to the following:
    class Class {
        // ...
    private:
        class Private; // forward declaration
        Private * d;   // hide impl details
    };
Here, the programmer moved the data fields of his class Class into a nested class Class::Private. Instances of Class merely contain a pointer d to their Class::Private instance.
To understand why the class author used this indirection, we need to take a step back and look at the C++ module system. In contrast to many other languages, C++, as a language of C descent, has no built-in support for modules (such support was proposed for C++0x, but did not make it into the final standard). Instead, one factors module function declarations (but not usually their definitions) into header files, and makes them available to other modules using the #include preprocessor directive. This, however, leaves the header files filling a double role: on the one hand, they serve as the module interface; on the other, they are the declaration site for potentially internal implementation details.
In times of C, this worked well: Implementation details of functions are encapsulated perfectly by the declaration/definition split; one could either merely forward-declare structs (in which case they were private), or define them directly in the header file (in which case they were public). In “object-oriented C”, class Class from above would maybe look like the following:
    struct Class;                             // forward declaration
    typedef struct Class * Class_t;           // -> only private members
    void Class_new( Class_t * cls );          // Class::Class()
    void Class_release( Class_t cls );        // Class::~Class()
    int Class_f( Class_t cls, double num );   // int Class::f( double )
    //...
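The article does not show the definition side of such a C interface. The following is a minimal sketch of how it might look, written so that it also compiles as C++. Only the declarations come from the text above; the struct layout (a single field n), the error code -1, and the convention of leaving the handle null on allocation failure are assumptions made for illustration.

```cpp
#include <cstdlib>

// --- what the header exposes: names only, no layout ---
struct Class;
typedef struct Class * Class_t;
void Class_new( Class_t * cls );
void Class_release( Class_t cls );
int Class_f( Class_t cls, double num );

// --- what stays in the source file (assumed layout) ---
struct Class {
    double n;
};

void Class_new( Class_t * cls ) {
    // The "constructor" assigns through the pointer-to-handle;
    // *cls is null afterwards if the allocation failed (assumed convention).
    *cls = static_cast<Class_t>( std::malloc( sizeof( struct Class ) ) );
    if ( *cls )
        (*cls)->n = 0.0;
}

void Class_release( Class_t cls ) {
    std::free( cls ); // free( NULL ) is a no-op
}

int Class_f( Class_t cls, double num ) {
    if ( !cls || num == 0 )
        return -1;    // assumed error code
    cls->n = num;
    return 0;
}
```

Callers only ever hold a Class_t, so the struct layout can change without recompiling them.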
Unfortunately, that doesn’t work in C++. Methods must be declared inside the class. Since classes without methods are rather boring, class definitions usually appear in C++ header files. Since classes, unlike namespaces, cannot be reopened, the header file must contain declarations for all (data fields, as well as) methods:
    class Class {
    public:
        // ... public methods ... ok
    private:
        // ... private data & methods ... don't want these here
    };
The problem is evident: The module interface (header file) necessarily contains implementation details, which is always a bad idea. That is why one resorts to a rather ugly trick and factors all implementation details (data fields as well as private methods) into a separate class:
    // --- class.h ---
    class Class {
    public:
        Class();
        ~Class();
        // ... public methods ...
        void f( double n );
    private:
        class Private;
        Private * d;
    };

    // -- class.cpp --
    #include <class.h>

    class Class::Private {
    public:
        // ... private data & methods ...
        bool canAcceptN( double num ) const { return num != 0 ; }
        double n;
    };

    Class::Class() : d( new Private ) {}
    Class::~Class() { delete d; }

    void Class::f( double n ) {
        if ( d->canAcceptN( n ) )
            d->n = n;
    }
Since Class::Private is used only in the declaration of a pointer variable, i.e. “in name only” (Lakos) and not “in size”, a forward declaration suffices, as in the C case. All methods of Class now access private methods and data members of Class::Private through d only.
In this way, one gains the convenience of a fully-encapsulating module system in C++, too. Because of the recourse to indirection, the developer pays for these benefits with an additional memory allocation (new Class::Private), the indirection on accessing data fields and private methods, as well as the complete loss of (at least public) inline methods. As the second part will show, the semantics of const methods also change.
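The change in const semantics can be seen in a few lines. This is only a sketch: the method names bump() and value() are invented for illustration, and copying is disabled to keep the example short. Inside a const method the pointer d itself is const, but the Private object it points to is not.

```cpp
class Class {
public:
    Class();
    ~Class();
    Class( const Class & ) = delete;             // copying is out of scope here
    Class & operator=( const Class & ) = delete;
    void bump() const;    // hypothetical: const, yet changes logical state
    double value() const; // hypothetical accessor
private:
    class Private;
    Private * d;
};

class Class::Private {
public:
    Private() : n( 0 ) {}
    double n;
};

Class::Class() : d( new Private ) {}
Class::~Class() { delete d; }

// In a const method, d has type "Private * const": the pointer cannot be
// reseated, but the pointee is writable. This compiles without mutable
// or const_cast.
void Class::bump() const { d->n += 1; }

double Class::value() const { return d->n; }
```

With a plain data member double n, the body of bump() would be rejected by the compiler; with a d-pointer, const-correctness has to be enforced by convention.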
Before the second part of this article addresses the issue of how to rectify, or at least mitigate, the above downsides, the remainder of this article will first shed some light on the idiom’s benefits.
Benefits of the Pimpl Idiom
They are substantial. Encapsulating all implementation details yields a slim and long-term stable interface (header file). The former leads to more readable class definitions; the latter helps maintain binary compatibility even through extensive changes to the implementation.
For instance, Nokia’s “Qt Development Frameworks” department (formerly Trolltech) has carried out profound changes to the widget rendering at least twice during the development of their “Qt 4” class library without the need to so much as relink programs using Qt 4.
Particularly in larger projects, the tendency of the Pimpl Idiom to dramatically speed up builds should not be underestimated. This is accomplished both by a reduction of #include directives in header files and through the considerably reduced frequency of changes to header files of Pimpl classes in general. In “Exceptional C++”, Herb Sutter reports regular doubling of compilation speeds; John Lakos even claims build speed-ups of two orders of magnitude.
Another virtue of the design: classes with d-pointers are well-suited for transaction-oriented, exception-safe code. For instance, the developer may use the Copy-Swap Idiom (Sutter/Alexandrescu: C++ Coding Standards, Item 56) to create a transactional (all-or-nothing) copy assignment operator:
    class Class {
        // ...
        void swap( Class & other ) {
            std::swap( d, other.d );
        }
        Class & operator=( const Class & other ) {
            // this may fail, but doesn't change *this
            Class copy( other );
            // this cannot fail, commits to *this:
            swap( copy );
            return *this;
        }
Implementation of C++0x move operations is trivial as well (and, in particular, identical across all Pimpl classes):
        // C++0x move semantics:
        Class( Class && other ) : d( other.d ) { other.d = 0; }
        Class & operator=( Class && other ) {
            std::swap( d, other.d );
            return *this;
        }
        // ...
    };
Both member swap and assignment operators may be implemented inline in this model, without compromising the class’ encapsulation; developers should make good use of this fact.
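The inline operators rely on a copy constructor and a destructor that live out of line, next to Class::Private. A minimal self-contained sketch of how the pieces might fit together follows; the accessors setN() and n() and the null check for moved-from objects are assumptions added for illustration, not part of the original listing.

```cpp
#include <utility> // std::swap, std::move

class Class {
public:
    Class();
    Class( const Class & other );
    Class( Class && other ) : d( other.d ) { other.d = nullptr; }
    ~Class();

    void swap( Class & other ) { std::swap( d, other.d ); }

    Class & operator=( const Class & other ) {
        Class copy( other ); // may throw, but leaves *this untouched
        swap( copy );        // cannot throw: commits the change
        return *this;
    }
    Class & operator=( Class && other ) {
        std::swap( d, other.d );
        return *this;
    }

    void setN( double n ); // hypothetical accessors, for demonstration only
    double n() const;      // must not be called on a moved-from object

private:
    class Private;
    Private * d;
};

// --- everything below would live in the .cpp file ---
class Class::Private {
public:
    Private() : n( 0 ) {}
    double n;
};

Class::Class() : d( new Private ) {}

// Deep copy; a moved-from source (d == nullptr) yields another empty object.
Class::Class( const Class & other )
    : d( other.d ? new Private( *other.d ) : nullptr ) {}

Class::~Class() { delete d; } // deleting a null pointer is a no-op

void Class::setN( double n ) { d->n = n; }
double Class::n() const { return d->n; }
```

Only the copy constructor needs to see Private; swap and both assignment operators work on the bare pointer and can therefore stay in the header.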
Extended Means of Composition
As a last benefit, the option to cut down on some of the extra dynamic memory allocations through direct aggregation of data fields should be mentioned. Without Pimpl, aggregation would customarily have been through a pointer in order to decouple classes from each other (a kind of Pimpl per data field). By “pimpling” the whole class once, the need to hold private data fields of complex type only through pointers can be dispensed with.
For instance, the idiomatic Qt dialog class
    class QLineEdit;
    class QLabel;
    class MyDialog : public QDialog {
        // ...
    private:
        // idiomatic Qt:
        QLabel * m_loginLB;
        QLineEdit * m_loginLE;
        QLabel * m_passwdLB;
        QLineEdit * m_passwdLE;
    };
turns into
    #include <QLabel>
    #include <QLineEdit>
    class MyDialog::Private {
        // ...
        // not idiomatic Qt, but less heap allocations:
        QLabel loginLB;
        QLineEdit loginLE;
        QLabel passwdLB;
        QLineEdit passwdLE;
    };
Qt aficionados may argue that the QDialog destructor already destroys the child widgets; direct aggregation would therefore trigger a double-delete. Indeed, usage of this technique poses the threat of allocation sequence errors (double-delete, use-after-free, etc.), particularly if data fields are also owned by the class, and vice versa. The transformation shown, however, is safe here, since Qt always allows children to be deleted before their parents.
This approach is especially effective in case data fields aggregated this way are themselves instances of “pimpled” classes. This is the case in the example shown, and usage of the Pimpl Idiom saves four dynamic memory allocations of size sizeof(void*) while incurring only one additional (larger) allocation. This can lead to more efficient use of the heap, since small allocations regularly create especially high overhead in the allocator.
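The allocation arithmetic can be checked without Qt. The sketch below uses a hypothetical pimpled Widget as a stand-in for QLabel and QLineEdit, and a hypothetical Counted base class whose operator new counts heap allocations; none of these names come from Qt or from the article.

```cpp
#include <cstddef>
#include <new>

static int g_allocations = 0; // heap allocations made via Counted

// Hypothetical helper: a class-specific operator new that counts calls.
struct Counted {
    static void * operator new( std::size_t size ) {
        ++g_allocations;
        return ::operator new( size );
    }
    static void operator delete( void * p ) { ::operator delete( p ); }
};

// Stand-in for a pimpled widget class such as QLabel.
class Widget : public Counted {
public:
    Widget() : d( new Private ) {}
    ~Widget() { delete d; }
    Widget( const Widget & ) = delete;
    Widget & operator=( const Widget & ) = delete;
private:
    struct Private : Counted { int state = 0; };
    Private * d;
};

// Aggregation by pointer: four small allocations plus four d-pointers.
// (Sketch only: not exception-safe if a later new throws.)
class PointerDialog {
public:
    PointerDialog()
        : m_loginLB( new Widget ), m_loginLE( new Widget ),
          m_passwdLB( new Widget ), m_passwdLE( new Widget ) {}
    ~PointerDialog() {
        delete m_loginLB; delete m_loginLE;
        delete m_passwdLB; delete m_passwdLE;
    }
private:
    Widget * m_loginLB; Widget * m_loginLE;
    Widget * m_passwdLB; Widget * m_passwdLE;
};

// Pimpl with direct aggregation: one larger allocation plus four d-pointers.
class PimplDialog {
public:
    PimplDialog() : d( new Private ) {}
    ~PimplDialog() { delete d; }
private:
    struct Private : Counted {
        Widget loginLB, loginLE, passwdLB, passwdLE;
    };
    Private * d;
};

// Number of counted allocations caused by constructing one Dialog.
template <typename Dialog>
int countAllocations() {
    const int before = g_allocations;
    { Dialog dialog; }
    return g_allocations - before;
}
```

Under these assumptions, countAllocations<PointerDialog>() yields 8 and countAllocations<PimplDialog>() yields 5: four pointer-sized allocations are traded for one larger one.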
In addition, the compiler is much more likely to “de-virtualise” calls to virtual functions in this scenario, i.e. it removes the double indirection caused by the virtuality of the function call. With aggregation by pointer, this requires interprocedural optimisation. Whether this indeed constitutes a win in runtime performance, given the extra indirection through the d-pointer, has to be checked by profiling the concrete classes as needed.
In case profiling shows that the dynamic memory allocation turns into a bottleneck, the “Fast Pimpl” Idiom (Exceptional C++, Item 30) may provide relief. In this variant, a fast allocator, e.g. boost::singleton_pool, is used to create Private instances instead of global operator new().
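The mechanism behind this variant is a class-specific operator new on Private. The sketch below substitutes a deliberately simple hand-written free list, FixedPool, for boost::singleton_pool, so that it has no dependencies; FixedPool and the accessors setN() and n() are hypothetical. It assumes single-threaded use, no classes derived from Private, and no Class objects with static storage duration.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for a real pool allocator: caches freed blocks
// of one fixed size on a singly linked free list. Not thread-safe.
class FixedPool {
    struct Node { Node * next; };
public:
    explicit FixedPool( std::size_t blockSize )
        : m_blockSize( blockSize < sizeof( Node ) ? sizeof( Node ) : blockSize ),
          m_free( nullptr ) {}
    FixedPool( const FixedPool & ) = delete;
    FixedPool & operator=( const FixedPool & ) = delete;
    ~FixedPool() {
        while ( m_free ) { // hand cached blocks back to the global allocator
            Node * next = m_free->next;
            ::operator delete( m_free );
            m_free = next;
        }
    }
    void * allocate() {
        if ( !m_free )
            return ::operator new( m_blockSize ); // free list empty: fall back
        Node * node = m_free;                      // reuse the last freed block
        m_free = node->next;
        return node;
    }
    void deallocate( void * p ) {
        Node * node = ::new ( p ) Node; // the dead block becomes a list node
        node->next = m_free;
        m_free = node;
    }
private:
    std::size_t m_blockSize;
    Node * m_free;
};

class Class {
public:
    Class();
    ~Class();
    Class( const Class & ) = delete;
    Class & operator=( const Class & ) = delete;
    void setN( double n ); // hypothetical accessors
    double n() const;
private:
    class Private;
    Private * d;
};

class Class::Private {
public:
    Private() : n( 0 ) {}
    // Every "new Private" and "delete d" is routed through the pool.
    static void * operator new( std::size_t ) { return pool().allocate(); }
    static void operator delete( void * p ) { if ( p ) pool().deallocate( p ); }
    double n;
private:
    static FixedPool & pool() {
        static FixedPool instance( sizeof( Private ) );
        return instance;
    }
};

Class::Class() : d( new Private ) {}
Class::~Class() { delete d; }
void Class::setN( double n ) { d->n = n; }
double Class::n() const { return d->n; }
```

The public class is untouched by this change; whether the pool actually beats the global allocator still has to be confirmed by profiling.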
Interim Findings
As a well-known C++ idiom, Pimpl allows class authors to separate class interface and implementation to an extent not directly provided for by C++. As a positive side-effect, the use of d-pointers speeds up compilation runs, eases implementation of transaction semantics, and allows, through extended means of composition, implementations that potentially are more runtime-efficient.
Not everything is shiny when using d-pointers, though: In addition to the extra Private class and its dynamic memory allocation, modified const method semantics as well as potential allocation sequence errors are cause for concern.
For some of these, the author will show solutions in the second part of the article. Complexity will increase further, though, so that for each concrete case one has to verify anew that the benefits of the idiom outweigh the downsides. If in doubt, this needs to be done per class in question. As usual, there can be no blanket judgements.
Outlook
The second and last part of this article will take a closer look under the hood of Pimpl, uncover the rust-streaked areas, and pimp the idiom using a whole array of accessories.
Good job! Awesome article!
I found typo:
void Class_new( Class_t * cls ); // Class::Class()
probably must be:
void Class_new( Class_t cls ); // Class::Class()
because Class_t is already pointer.
It’s not your typo, just retyped from original.
P.S. Think I make translation of this article into russian, so it will be amazing to have part 2 in English as I don’t know Deutsch.
No, the code is correct as it is. It needs to be a pointer-to-Class_t, so the “constructor” can assign to it. Alternatively, Class_new() could return the new pointer, but C class libraries usually reserve the return value for error codes.
You are right
> P.S. Think I make translation of this article into russian, so it will be amazing to have part 2 in English as I don’t know Deutsch.
Be my guest. The only requirement I have is that you link back to both the German (on heise.de) and English versions (this one), and let me link to your Russian one from here.
Yes, I intend to translate the second part, too, at some point.
Ok, I will make it so
>typedef Class * Class_t;
I guess it should be:
typedef struct Class * Class_t;
Fixed, thanks!
I suggest to turn code
#include <QLabel>
#include <QLineEdit>
class MyDialog::Private {
// ...
// not idiomatic Qt, but less heap allocations:
QLabel loginLB;
QLineEdit loginLE;
QLabel passwdLB;
QLineEdit passwdLE;
};
to this one:
#include <QLabel>
#include <QLineEdit>
class MyDialog::Private {
// ...
public:
// not idiomatic Qt, but less heap allocations:
QLabel loginLB;
QLineEdit loginLE;
QLabel passwdLB;
QLineEdit passwdLE;
};
to explicitly show that data fields locates at public section
> to explicitly show that data fields locates at public section
I don’t think they should be public. Esp. if you do the Qt-style DerivedPrivate : public BasePrivate from part II, you don’t want all data members to be at most protected. Just make Class::Private a friend of Class.
Great part one. Can’t wait for part two, in English. In my opinion d-ptr a.k.a Pimpl is not nearly used enough.
Another area when I’ve used this technique is when using actors (or active objects) to make it explicitly clear what belongs to the API for “users” and what are worker thread internals…
There is link to my translation: http://habrahabr.ru/blogs/cpp/118010/
It contains link to original (deutsch) article and your translation, as I promised.
Thanks for this (still) very useful article. FYI, both links to part 2 are broken, the correct link is:
https://marcmutz.wordpress.com/translated-articles/pimp-my-pimpl-reloaded/
Thanks, fixed.