Policies/Binary Compatibility Issues With C++: Difference between revisions

From KDE Community Wiki
m (Add two sub-headers, "You can" and "You cannot", to "The Do's and Don'ts" section)
 
(39 intermediate revisions by 20 users not shown)
Line 1: Line 1:
<languages />
<translate>
== Definition ==
== Definition ==


Line 9: Line 12:
* don't allow the program to benefit from bugfixes or extensions in the libraries
* don't allow the program to benefit from bugfixes or extensions in the libraries


In the KDE project, we will provide binary compatibility within the life-span of a major release for the core libraries (kdelibs kdepimlibs).
In the KDE project, we will provide binary compatibility within the life-span of a major release for the core libraries (kdelibs, kdepimlibs).


== Note about ABI ==
== Note about ABI ==


This text applies to most C++ ABIs used by compilers which KDE can be built with. It is mostly based on the [http://www.codesourcery.com/public/cxx-abi/abi.html Itanium C++ ABI], which is used by the GCC C++ compiler since version 3.4 in all platforms it supports.
This text applies to most C++ ABIs used by compilers which KDE can be built with. It is mostly based on the [https://itanium-cxx-abi.github.io/cxx-abi/abi.html Itanium C++ ABI Draft], which the GCC C++ compiler has used since version 3.4 on all platforms it supports. Information about the Microsoft Visual C++ mangling scheme mostly comes from [http://www.agner.org/optimize/calling_conventions.pdf this article on calling conventions] (the most complete information found so far on the MSVC ABI and name mangling).


Some of the constraints specified here may not apply to a given compiler. The goal here is to list the most restrictive set of conditions when writing cross-platform C++ code, meant to be compiled with several different compilers.
Some of the constraints specified here may not apply to a given compiler. The goal here is to list the most restrictive set of conditions when writing cross-platform C++ code, meant to be compiled with several different compilers.
Line 20: Line 23:


== The Do's and Don'ts ==
== The Do's and Don'ts ==
===You can...===


You can...
* Add new non-virtual functions, including signals and slots and constructors, that do not overload non-overloaded functions.
 
* Add a new enum to a class.
* add new non-virtual functions including signals and slots and constructors.
* Append new enumerators to an existing enum.
* add a new enum to a class.
** Exception: if that leads to the compiler choosing a larger underlying type for the enum, that makes the change binary-incompatible. Unfortunately, compilers have some leeway to choose the underlying type, so from an API-design perspective it's recommended to
* append new enumerators to an existing enum.
*** make enums have an explicit underlying type, or, in case C compatibility is required,
* reimplement virtual functions defined in the primary base class hierarchy (that is, virtuals defined in the first non-virtual base class, or in that class's first non-virtual base class, and so forth) '''if''' it is safe that programs linked with the prior version of the library call the implementation in the base class rather than the new one. ''This is tricky and might be dangerous. Think twice before doing it. Alternatively see below for a workaround.''
*** add a '''Max....''' enumerator with an explicit large value ('''=255''', '''=1<<15''', etc) to create an interval of numeric enumerator values that is guaranteed to fit into the chosen underlying type, whatever that may be.
* Reimplement virtual functions defined in the primary base class hierarchy (that is, virtuals defined in the first non-virtual base class, or in that class's first non-virtual base class, and so forth) '''if''' it is safe that programs linked with the prior version of the library call the implementation in the base class rather than the derived one. ''This is tricky and might be dangerous. Think twice before doing it. Alternatively see below for a workaround.''
** Exception: if the overriding function has a [http://en.wikipedia.org/wiki/Covariant_return_type covariant return type], it's only a binary-compatible change if the more-derived type always has the same pointer address as the less-derived one. ''If in doubt, do not override with a covariant return type.''
** Exception: if the overriding function has a [http://en.wikipedia.org/wiki/Covariant_return_type covariant return type], it's only a binary-compatible change if the more-derived type always has the same pointer address as the less-derived one. ''If in doubt, do not override with a covariant return type.''
* change an inline function or make an inline function non-inline '''if''' it is safe that programs linked with the prior version of the library call the old implementation. ''This is tricky and might be dangerous. Think twice before doing it.''
* Change an inline function or make an inline function non-inline '''if''' it is safe that programs linked with the prior version of the library call the old implementation. ''This is tricky and might be dangerous. Think twice before doing it.''
* remove private non-virtual functions '''if''' they are not called by any inline functions (and have never been).
* Remove private non-virtual functions '''if''' they are not called by any inline functions (and have never been).
* remove private static members '''if''' they are not called by any inline functions (and have never been).
* Remove private static members '''if''' they are not called by any inline functions (and have never been).
* add new '''static''' data members.
* Add new '''static''' data members.
* change the default arguments of a method. It requires recompilation to use the actual new default argument values, though.
* Change the default arguments of a method. It requires recompilation to use the actual new default argument values, though.
* add new classes.
* Add new classes.
* export a class that was not previously exported.
* Export a class that was not previously exported.
* add or remove friend declarations to classes.
* Add or remove friend declarations to classes.
* rename reserved member types
* Rename reserved member types
* extend reserved bit fields, provided this doesn't cause the bit field to cross the boundary of its underlying type (8 bits for char & bool, 16 bits for short, 32 bits for int, etc.)
* Extend reserved bit fields, provided this doesn't cause the bit field to cross the boundary of its underlying type (8 bits for char & bool, 16 bits for short, 32 bits for int, etc.)
* Add the Q_OBJECT macro to a class if the class already inherits from QObject
* Add a Q_PROPERTY, Q_ENUMS or Q_FLAGS macro as that only modifies the meta-object generated by moc and not the class itself
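
The advice above about enum underlying types can be sketched as follows. This is a minimal illustration, not code from any KDE library: the names <code>Mode</code>, <code>LegacyMode</code> and their enumerators are hypothetical, and the explicit underlying type assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical enum with an explicit underlying type (C++11).
// The size is fixed by the declaration, so appending enumerators
// in a later release cannot change it.
enum class Mode : std::uint16_t {
    Read = 0,
    Write = 1,
    Append = 2 // appended later; sizeof(Mode) is unchanged
};

// C-compatible alternative: a "Max" enumerator with a large explicit value
// reserves a range of values up front, whatever underlying type is chosen.
enum LegacyMode {
    LegacyRead = 0,
    LegacyWrite = 1,
    LegacyModeMax = 1 << 15 // every value in 0..32768 is guaranteed to fit
};

static_assert(std::is_same<std::underlying_type<Mode>::type, std::uint16_t>::value,
              "the underlying type is exactly what the declaration says");
static_assert(sizeof(Mode) == sizeof(std::uint16_t),
              "the size follows from the explicit underlying type");
```

With the explicit type, the compiler has no leeway left; with the <code>Max</code> enumerator, new values are safe as long as they stay below the reserved maximum.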


You cannot...
===You cannot...===
* For existing classes:
* For existing classes:
** [[Policies/Binary_Compatibility_Examples#Unexport_or_remove_a_class|unexport or remove]] an exported class.
** [[Policies/Binary_Compatibility_Examples#Unexport_or_remove_a_class|Unexport or remove]] an exported class.
** [[Policies/Binary_Compatibility_Examples#Change_the_class_hierarchy|change the class hierarchy]] in any way (add, remove, or reorder base classes).
** [[Policies/Binary_Compatibility_Examples#Change_the_class_hierarchy|Change the class hierarchy]] in any way (add, remove, or reorder base classes).
** [[Policies/Binary_Compatibility_Examples#Remove_class_finality|Remove]] <code>final</code>ity
* For template classes:
* For template classes:
** [[Policies/Binary_Compatibility_Examples#Change_the_template_arguments_of_a_template_class|change the template arguments]] in any way (add, remove or reorder).
** [[Policies/Binary_Compatibility_Examples#Change_the_template_arguments_of_a_template_class|Change the template arguments]] in any way (add, remove or reorder).
* For existing functions of any type:
* For existing functions of any type:
** [[Policies/Binary_Compatibility_Examples#Unexport_a_function|unexport]] it.
** [[Policies/Binary_Compatibility_Examples#Unexport_a_function|Unexport]] it.
** remove it.
** Remove it.
** [[Policies/Binary_Compatibility_Examples#Inline_a_function|inline]] it (this includes moving a member function's body to the class definition, even without the inline keyword).
*** Remove the implementation of existing declared functions. The symbol comes from the implementation of the function, so removing the implementation effectively removes the function.
** change its signature. This includes:
** [[Policies/Binary_Compatibility_Examples#Inline_a_function|Inline]] it (this includes moving a member function's body to the class definition, even without the inline keyword).
*** changing any of the types of the arguments in the [[Policies/Binary_Compatibility_Examples#Change_the_parameters_of_a_function|parameter list]] (instead, add a new method)
** Add an overload (binary compatible, but not source compatible: it makes <code>&func</code> ambiguous). Adding overloads to already overloaded functions is ok (since any use of <code>&func</code> already needed a cast).
*** changing the [[Policies/Binary_Compatibility_Examples#Change_the_return_type|return type]]
** Change its signature. This includes:
*** changing the [[Policies/Binary_Compatibility_Examples#Change_the_access_rights|access rights]] to some functions or data members, for example from <tt>private</tt> to <tt>public</tt>. With some compilers, this information may be part of the signature. If you need to make a private function protected or even public, you have to add a new function that calls the private one.
*** Changing any of the types of the arguments in the [[Policies/Binary_Compatibility_Examples#Change_the_parameters_of_a_function|Parameter list]], including changing the <code>const</code>/<code>volatile</code> qualifiers of the existing parameters (instead, add a new method)
*** changing the [[Policies/Binary_Compatibility_Examples#Change_the_CV-qualifiers|CV-qualifiers of a member function]]:  the const and/or volatile that apply to the function itself.
*** Changing the <code>const</code>/<code>volatile</code> qualifiers of the function
*** extending a function with another parameter, even if this parameter has a default argument. ''See below for a suggestion on how to avoid this issue''
*** Changing the [[Policies/Binary_Compatibility_Examples#Change_the_access_rights|access rights]] to some functions or data members, for example from <tt>private</tt> to <tt>public</tt>. With some compilers, this information may be part of the signature. If you need to make a private function protected or even public, you have to add a new function that calls the private one.
*** Exception: non-member functions declared with extern "C" can change parameter types (be very careful).
*** Changing the [[Policies/Binary_Compatibility_Examples#Change_the_CV-qualifiers_of_a_member_function|CV-qualifiers of a member function]]:  the <code>const</code> and/or <code>volatile</code> that apply to the function itself.
*** Extending a function with another parameter, even if this parameter has a default argument. ''See below for a suggestion on how to avoid this issue''
*** Changing the [[Policies/Binary_Compatibility_Examples#Change_the_return_type|return type]] in any way
*** Exception: non-member functions declared with <code>extern "C"</code> can change parameter types ('''be very careful''').
* For virtual member functions:
* For virtual member functions:
** [[Policies/Binary_Compatibility_Examples#Add_a_virtual_member_function_to_a_class_without_any|add a virtual function]] to a class that doesn't have any virtual functions or virtual bases.
** [[Policies/Binary_Compatibility_Examples#Add_a_virtual_member_function_to_a_class_without_any|Add a virtual function]] to a class that doesn't have any virtual functions or virtual bases.
** [[Policies/Binary_Compatibility_Examples#Add_new_virtuals_to_a_non-leaf_class|add new virtual functions]] to non-leaf classes as this will break subclasses. ''See below for some workarounds or ask on mailing lists.''
** [[Policies/Binary_Compatibility_Examples#Add_new_virtuals_to_a_non-leaf_class|Add new virtual functions]] to non-leaf classes as this will break subclasses. Note that a class designed to be sub-classed by applications is '''always''' a non-leaf class. ''See below for some workarounds or ask on mailing lists.''
** [[Policies/Binary_Compatibility_Examples#Change_the_order_of_the_declaration_of_virtual_functions|change the order]] of virtual functions in the class declaration.
** Add new virtual functions for any reason, even to leaf classes, ''if the class is intended to remain binary compatible on Windows''. Doing so may [http://lists.kde.org/?l=kde-core-devel&m=139744177410091&w=2 reorder existing virtual functions] and break binary compatibility.
** [[Policies/Binary_Compatibility_Examples#Override_a_virtual_that_doesn.27t_come_from_a_primary_base|override an existing virtual function if that function is not in the primary base class]] (first non-virtual base class, or the primary base class's primary base class and upwards).
** [[Policies/Binary_Compatibility_Examples#Change_the_order_of_the_declaration_of_virtual_functions|Change the order]] of virtual functions in the class declaration.
** [[Policies/Binary_Compatibility_Examples#Override_a_virtual_with_a_covariant_return_with_different_top_address|override an existing virtual function]] if the overriding function has a [http://en.wikipedia.org/wiki/Covariant_return_type covariant return type] for which the more-derived type has a pointer address different from the less-derived one (usually happens when, between the less-derived and the more-derived ones, there's multiple inheritance or virtual inheritance).
** [[Policies/Binary_Compatibility_Examples#Override_a_virtual_that_doesn.27t_come_from_a_primary_base|Override an existing virtual function if that function is not in the primary base class]] (the first non-virtual base class, or the primary base class's primary base class and upwards).
* For static non-private members:
** [[Policies/Binary_Compatibility_Examples#Override_a_virtual_with_a_covariant_return_with_different_top_address|Override an existing virtual function]] if the overriding function has a [http://en.wikipedia.org/wiki/Covariant_return_type covariant return type] for which the more-derived type has a pointer address different from the less-derived one (usually happens when, between the less-derived and the more-derived ones, there's multiple inheritance or virtual inheritance).
** Remove a virtual function, even if it is a reimplementation of a virtual function from the base class
** Remove <code>final</code>ity.
* For static non-private members or for non-static non-member public data:
** Remove or unexport it
** Remove or unexport it
** Change its [[Policies/Binary_Compatibility_Examples#Change_the_type_of_global_data|type]]
** Change its [[Policies/Binary_Compatibility_Examples#Change_the_CV-qualifiers_of_global_data|CV-qualifiers]]
* For non-static members:
* For non-static members:
** add new, data members to an existing class.
** Add new data members to an existing class.
** change the order of non-static data members in a class.
** Change the order of non-static data members in a class.
** change the type of the member, except for signedness
** Change the type of the member, except for signedness (or more generally if the types are guaranteed to have the same size, and the member is not used by any inline method)
** remove existing non-static data members from an existing class.
** Remove existing non-static data members from an existing class.
* Return (or take as parameter) an iterator to a Qt container, in public API. This is not binary compatible if the library doing that is compiled with QT_STRICT_ITERATORS and the lib/app using that API isn't, or vice-versa. Return a reference to (or a copy of) the container instead.
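
The source-compatibility note about overloads in the list above (adding an overload makes <code>&func</code> ambiguous) can be illustrated with a small sketch. The function <code>scale</code> and the two pointer variables are hypothetical names chosen for this example.

```cpp
#include <cassert>

// Hypothetical library functions. Version 1 shipped only scale(int);
// version 2 adds the two-argument overload.
int scale(int a) { return a * 2; }
int scale(int a, int b) { return a * b; }

// While only scale(int) existed, "auto p = &scale;" compiled. With the
// overload present, that line is ambiguous, so callers that take the
// address have to name the signature they want.
int (*scaleOne)(int) = static_cast<int (*)(int)>(&scale);
int (*scaleTwo)(int, int) = static_cast<int (*)(int, int)>(&scale);
```

Existing binaries are unaffected, because each overload has its own mangled symbol; only code that is recompiled against the new header can break.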


If you need to extend or modify the parameter list of an existing function, you need to add a new function with the new parameters instead. In that case, you may want to add a short note that the two functions shall be merged with a default argument in later versions of the library:
If you need to extend or modify the parameter list of an existing function, you need to add a new function with the new parameters instead. In that case, you may want to add a short note that the two functions shall be merged with a default argument in later versions of the library:


<code cpp>
<source lang="cpp-qt">
void functionname( int a );
void functionname( int a );
void functionname( int a, int b ); //BCI: merge with int b = 0
void functionname( int a, int b ); //BCI: merge with int b = 0
</code>
</source>
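
One possible way to implement the pair of declarations above is to let the old function forward to the new one. The bodies and the <code>int</code> return type are assumptions made so the sketch can be checked; the declarations above return <code>void</code> and do not show any implementation.

```cpp
#include <cassert>

// Hypothetical body for the new function added in the later release.
int functionname(int a, int b) // BCI: merge with int b = 0
{
    return a + b;
}

// The old function keeps its exported symbol, so existing binaries still
// link, and forwards to the new one with the intended default value.
int functionname(int a)
{
    return functionname(a, 0);
}
```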


You should...
<big>You should...</big>


To make a class easy to extend in the future, you should follow these rules:
To make a class easy to extend in the future, you should follow these rules:
* add d-pointer. ''See below''.
* Add d-pointer. ''See below''.
* add non-inline virtual destructor even if the body is empty.
* Add non-inline virtual destructor even if the body is empty.
* reimplement <tt>event</tt> in widget classes, even if the body for the function is empty.
* Reimplement <code>event</code> in QObject-derived classes, even if the body for the function is just calling the base class' implementation. This is specifically to avoid problems caused by adding a reimplemented virtual function as discussed below.
* make all constructors non-inline.
* Make all constructors non-inline.
* write non-inline implementations of the copy constructor and assignment operator unless the class cannot be copied by value (e.g. classes inherited from QObject can't be)
* Write non-inline implementations of the copy constructor and assignment operator unless the class cannot be copied by value. (E.g. classes inherited from QObject can't be.)


== Techniques for Library Programmers ==
== Techniques for Library Programmers ==
Line 93: Line 110:
One exception is bitflags. If you use bitflags for enums or bools, you can safely round up to at least the next byte minus 1. A class with members
One exception is bitflags. If you use bitflags for enums or bools, you can safely round up to at least the next byte minus 1. A class with members


<code cpp>
<source lang="cpp-qt">
uint m1 : 1;
uint m1 : 1;
uint m2 : 3;
uint m2 : 3;
uint m3 : 1;
uint m3 : 1;
</code>
</source>
<code cpp>
<source lang="cpp-qt">
uint m1 : 1;
uint m1 : 1;
uint m2 : 3;
uint m2 : 3;
uint m3 : 1;
uint m3 : 1;
uint m4 : 2; // new member
uint m4 : 2; // new member
</code>
</source>
without breaking binary compatibility. Please round up to a maximum of 7 bits (or 15 if the bitfield was already larger than 8). Using the very last bit may cause problems on some compilers.
without breaking binary compatibility. Please round up to a maximum of 7 bits (or 15 if the bitfield was already larger than 8). Using the very last bit may cause problems on some compilers.
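
The bitflag rule can be checked at compile time. In this sketch the struct names <code>FlagsV1</code> and <code>FlagsV2</code> are hypothetical, and <code>unsigned int</code> is written out in place of Qt's <code>uint</code> so the example stands alone. The equal sizes are what mainstream compilers produce for this layout, not something the C++ standard promises.

```cpp
#include <cassert>

// Layout shipped in the first release: 5 of the available bits are used.
struct FlagsV1 {
    unsigned int m1 : 1;
    unsigned int m2 : 3;
    unsigned int m3 : 1;
};

// Layout after the extension: 7 bits used, still inside the same storage unit.
struct FlagsV2 {
    unsigned int m1 : 1;
    unsigned int m2 : 3;
    unsigned int m3 : 1;
    unsigned int m4 : 2; // new member
};

// If this fails on some compiler, the extension is not safe there.
static_assert(sizeof(FlagsV1) == sizeof(FlagsV2),
              "adding m4 must not change the size of the class");
```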


Line 115: Line 132:


In your class definition for class Foo, define a forward declaration
In your class definition for class Foo, define a forward declaration
<code cpp>
<source lang="cpp-qt">
class FooPrivate;
class FooPrivate;
</code>
</source>
and the d-pointer in the private section:
and the d-pointer in the private section:
<code cpp>
<source lang="cpp-qt">
private:
private:
     FooPrivate* d;
     FooPrivate* d;
</code>
</source>
The FooPrivate class itself is purely defined in the class implementation file (usually *.cpp ), for example:
The FooPrivate class itself is purely defined in the class implementation file (usually *.cpp ), for example:
<code cpp>
<source lang="cpp-qt">
class FooPrivate {
class FooPrivate {
public:
public:
Line 134: Line 151:
     QString s;
     QString s;
};
};
</code>
</source>


All you have to do now is to create the private data in your constructors or your init function with
All you have to do now is to create the private data in your constructors or your init function with
<code cpp>
<source lang="cpp-qt">
   d = new FooPrivate;
   d = new FooPrivate;
</code>
</source>
and to delete it again in your destructor with
and to delete it again in your destructor with
<code cpp>
<source lang="cpp-qt">
delete d;
delete d;
</code>
</source>


In most circumstances you will want to make the dpointer constant to catch situations where it's accidentally getting modified or copied over so you'd loose ownership of the private object and create a memory-leak:
In most circumstances you will want to make the dpointer constant to catch situations where it's accidentally getting modified or copied over so you'd lose ownership of the private object and create a memory leak:
<code cpp>
<source lang="cpp-qt">
private:
private:
     FooPrivate* const d;
     FooPrivate* const d;
</code>
</source>
This allows you to modify the object pointed to by d but not the value of the pointer after it has been initialized.
This allows you to modify the object pointed to by d but not the value of the pointer after it has been initialized.


You may not want all member variables to live in the private data object, though. For very often used members, it's faster to put them directly in the class, since inline functions cannot access the d-pointer data. Also note that all data covered by the d-pointer is obviously private. For public or protected access, provide both a set and a get function. Example
You may not want all member variables to live in the private data object, though. For very often used members, it's faster to put them directly in the class, since inline functions cannot access the d-pointer data. Also note that all data covered by the d-pointer is "private", despite being declared public in the d-pointer itself. For public or protected access, provide both a set and a get function. Example
<code cppqt>
<source lang="cpp-qt">
QString Foo::string() const
QString Foo::string() const
{
{
Line 159: Line 176:
}
}


void Foo::setString( const QString& s )
void Foo::setString(const QString &s)
{
{
     d->s = s;
     d->s = s;
}
}
</code>
</source>
 
It is also possible (but not recommended) to declare the private class for the d-pointer as a nested private class (e.g. Foo::Private). If you use this technique, remember that the nested private class will inherit the public symbol visibility of the containing exported class. This will cause the functions of the private class to be named in the dynamic library's symbol table. You can use <code>Q_DECL_HIDDEN</code> in the implementation of the nested private class to manually re-hide the symbols. (For an existing class, this is technically an ABI change, but it does not impact the public ABI supported by the KDE developers, so private symbols that were mistakenly exposed may be re-hidden without further warning.) Other downsides of the nested private class include the lack of consistency with Qt and its Q_D/Q_Q macros, and the fact that it can no longer be forward-declared in unrelated headers (which can be useful to declare it as a friend class). For all these reasons, prefer FooPrivate.
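
Put together, the d-pointer fragments from this section might look like the following single-file sketch. It uses <code>std::string</code> in place of <code>QString</code> so it compiles without Qt, and the copy constructor and assignment operator are one possible way to follow the "non-inline" rules above, not the only one.

```cpp
#include <cassert>
#include <string>

// --- public header: only a forward declaration of the private class ---
class FooPrivate;

class Foo {
public:
    Foo();
    ~Foo();                           // non-inline, defined below
    Foo(const Foo &other);            // non-inline copy constructor
    Foo &operator=(const Foo &other); // non-inline assignment operator

    std::string string() const;
    void setString(const std::string &s);

private:
    FooPrivate *const d; // the pointer itself never changes after construction
};

// --- implementation file: members can be added here without changing sizeof(Foo) ---
class FooPrivate {
public:
    std::string s;
};

Foo::Foo() : d(new FooPrivate) {}
Foo::~Foo() { delete d; }
Foo::Foo(const Foo &other) : d(new FooPrivate(*other.d)) {}

Foo &Foo::operator=(const Foo &other)
{
    *d = *other.d; // copy the data, keep our own private object
    return *this;
}

std::string Foo::string() const { return d->s; }
void Foo::setString(const std::string &s) { d->s = s; }
```

Because <code>Foo</code> contains nothing but the pointer, its size and layout stay the same however <code>FooPrivate</code> changes.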


<h2>Trouble shooting</h2>
<h2>Trouble shooting</h2>
Line 174: Line 193:
The basic trick in your class implementation of class Foo is:
The basic trick in your class implementation of class Foo is:
* Create a private data class FooPrivate.
* Create a private data class FooPrivate.
* Create a static QHash&lt;Foo *, FooPrivate&gt;.
* Create a static QHash&lt;Foo *, FooPrivate *&gt;.
*Note that some compilers/linkers (almost all, unfortunately) do not manage to create static objects in shared libraries. They simply forget to call the constructor. Therefore you should use the  <tt>Q_GLOBAL_STATIC</tt> macro to create and access the object:
*Note that some compilers/linkers (almost all, unfortunately) do not manage to create static objects in shared libraries. They simply forget to call the constructor. Therefore you should use the  <tt>Q_GLOBAL_STATIC</tt> macro to create and access the object:


<code cppqt>
<source lang="cpp-qt">
// BCI: Add a real d-pointer
// BCI: Add a real d-pointer
Q_GLOBAL_STATIC(QHash<Foo *,FooPrivate>, d_func);
typedef QHash<const Foo *, FooPrivate *> FooPrivateHash; // const key: d() and delete_d() look up with a const Foo *
static FooPrivate* d( const Foo* foo )
Q_GLOBAL_STATIC(FooPrivateHash, d_func)
static FooPrivate *d(const Foo *foo)
{
{
     FooPrivate* ret = d_func()->value( foo, 0 );
     FooPrivate *ret = d_func()->value(foo);
     if ( ! ret ) {
     if (!ret) {
         ret = new FooPrivate;
         ret = new FooPrivate;
         d_func()->insert( foo, ret );
         d_func()->insert(foo, ret);
     }
     }
     return ret;
     return ret;
}
}
static void delete_d( const Foo* foo )
static void delete_d(const Foo *foo)
{
{
     FooPrivate* ret = d_func()->value( foo, 0 );
     FooPrivate *ret = d_func()->value(foo);
     delete ret;
     delete ret;
     d_func()->remove( foo );
     d_func()->remove(foo);
}
}
</code>
</source>


* Now you can use the d-pointer in your class almost as simply as in the code before, just with a function call to d(this). For example:
* Now you can use the d-pointer in your class almost as simply as in the code before, just with a function call to d(this). For example:


<code cppqt>
<source lang="cpp-qt">
d(this)->m1 = 5;
d(this)->m1 = 5;
</code>
</source>


* Add a line to your destructor:
* Add a line to your destructor:
<code cppqt>
<source lang="cpp-qt">
delete_d(this);
delete_d(this);
</code>
</source>
* Do not forget to add a BCI remark, so that the hack can be removed in the next version of the library.
* Do not forget to add a BCI remark, so that the hack can be removed in the next version of the library.
* Do not forget to add a d-pointer to your next class.
* Do not forget to add a d-pointer to your next class.
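
For readers who want to try the steps above without Qt, here is a rough standard C++ analogue. It swaps <code>QHash</code> for <code>std::unordered_map</code> and <code>Q_GLOBAL_STATIC</code> for a function-local static; the accessors <code>value()</code> and <code>setValue()</code> and the helper <code>livePrivateCount()</code> are hypothetical names added for the demonstration. Like the original, it is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Existing class that was released without a d-pointer.
class Foo {
public:
    ~Foo();
    int value() const;
    void setValue(int v);
};

// BCI: Add a real d-pointer
struct FooPrivate {
    int m1 = 0;
};

typedef std::unordered_map<const Foo *, FooPrivate *> FooPrivateHash;

// Constructed on first use; stands in for Q_GLOBAL_STATIC.
static FooPrivateHash &d_func()
{
    static FooPrivateHash hash;
    return hash;
}

static FooPrivate *d(const Foo *foo)
{
    FooPrivate *&ret = d_func()[foo]; // creates a null entry on first access
    if (!ret) {
        ret = new FooPrivate;
    }
    return ret;
}

static void delete_d(const Foo *foo)
{
    FooPrivateHash::iterator it = d_func().find(foo);
    if (it != d_func().end()) {
        delete it->second;
        d_func().erase(it);
    }
}

Foo::~Foo() { delete_d(this); }
int Foo::value() const { return d(this)->m1; }
void Foo::setValue(int v) { d(this)->m1 = v; }

// Helper for the demonstration only: number of private objects still alive.
std::size_t livePrivateCount() { return d_func().size(); }
```

The map is keyed on <code>const Foo *</code> so that both const and non-const member functions can look up their private data.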
Line 212: Line 232:
=== Adding a reimplemented virtual function ===
=== Adding a reimplemented virtual function ===


As already explained, you can safely reimplement a virtual function defined in one of the base classes only if it is safe that the programs linked with the prior version call the implementation in the base class rather than the new one. This is because the compiler sometimes calls virtual functions directly if it can determine which one to call. For example, if you have  
As already explained, you can safely reimplement a virtual function defined in one of the base classes only if it is safe that the programs linked with the prior version call the implementation in the base class rather than the derived one. This is because the compiler sometimes calls virtual functions directly if it can determine which one to call. For example, if you have  
<code cpp>
<source lang="cpp-qt">
void C::foo()
void C::foo()
{
{
     B::foo();
     B::foo();
}
}
</code>
</source>


then B::foo() is called directly. If class B inherits from class A which implements foo() and B itself doesn't reimplement it, then C::foo() will in fact call A::foo(). If a newer version of the library adds B::foo(), C::foo() will call it only after a recompilation.
then B::foo() is called directly. If class B inherits from class A which implements foo() and B itself doesn't reimplement it, then C::foo() will in fact call A::foo(). If a newer version of the library adds B::foo(), C::foo() will call it only after a recompilation.


Another more common example is:
Another more common example is:
<code cpp>
<source lang="cpp-qt">
B b; // B derives from A
B b; // B derives from A
b.foo();
b.foo();
</code>
</source>
then the call to foo() will not use the virtual table. That means that
then the call to foo() will not use the virtual table. That means that
if B::foo() didn't exist in the library but now does, code that was
if B::foo() didn't exist in the library but now does, code that was
Line 232: Line 252:


If you can't guarantee things will continue to work without a recompilation, move functionality from A::foo() to a new protected function A::foo2() and use this code:
If you can't guarantee things will continue to work without a recompilation, move functionality from A::foo() to a new protected function A::foo2() and use this code:
<code cpp>
<source lang="cpp-qt">
void A::foo()
void A::foo()
{
{
     if( B* b = dynamic_cast< B* >( this ))
     if (B *b = dynamic_cast<B *>(this)) {
         b->B::foo(); // B:: is important
         b->B::foo(); // B:: is important
     else
     } else {
         foo2();
         foo2();
    }
}
}
void B::foo()
void B::foo()
Line 245: Line 266:
     A::foo2(); // call base function with real functionality
     A::foo2(); // call base function with real functionality
}
}
</code>
</source>
All calls to A::foo() for objects of type B (or of a class derived from B) will result in calling B::foo(). The only calls that will not work as expected are those that explicitly specify A::foo(); B::foo() calls A::foo2() instead, and no other code should be making such explicit calls.
All calls to A::foo() for objects of type B (or of a class derived from B) will result in calling B::foo(). The only calls that will not work as expected are those that explicitly specify A::foo(); B::foo() calls A::foo2() instead, and no other code should be making such explicit calls.
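
A self-contained version of this workaround is sketched below. The <code>std::string</code> return values are an assumption added so the behaviour can be observed; the functions in the text return <code>void</code>. The explicitly qualified call <code>b.A::foo()</code> imitates what an old binary does when it was compiled before B::foo() existed.

```cpp
#include <cassert>
#include <string>

class A {
public:
    virtual ~A() {}
    virtual std::string foo();
protected:
    std::string foo2(); // new helper holding the former body of A::foo()
};

class B : public A {
public:
    std::string foo() override; // reimplementation added in the newer release
};

std::string A::foo()
{
    if (B *b = dynamic_cast<B *>(this)) {
        return b->B::foo(); // B:: is important, it bypasses virtual dispatch
    } else {
        return foo2();
    }
}

std::string A::foo2()
{
    return "base";
}

std::string B::foo()
{
    // Must not call A::foo() here, that would recurse through the dynamic_cast.
    return "derived+" + A::foo2();
}
```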


Line 254: Line 275:
=== Adding new virtual functions to leaf classes ===
=== Adding new virtual functions to leaf classes ===
This technique is one case of using a new class. It can help if there's a need to add new virtual functions to a class that should stay binary compatible and there is no class inheriting from it that should also stay binary compatible (i.e. all classes inheriting from it are in applications). In that case, it's possible to add a new class inheriting from the original one that will add them. Applications using the new functionality will of course have to be modified to use the new class.
This technique is one case of using a new class. It can help if there's a need to add new virtual functions to a class that should stay binary compatible and there is no class inheriting from it that should also stay binary compatible (i.e. all classes inheriting from it are in applications). In that case, it's possible to add a new class inheriting from the original one that will add them. Applications using the new functionality will of course have to be modified to use the new class.
<code cpp>
<source lang="cpp-qt">
class A {
class A {
public:
public:
Line 266: Line 287:
{
{
     // here it's needed to call a new virtual function
     // here it's needed to call a new virtual function
     if( B* this2 = dynamic_cast< B* >( this ))
     if (B *this2 = dynamic_cast<B *>(this)) {
         this2->bar();
         this2->bar();
    }
}
}
</code>
</source>
It is not possible to use this technique when there are other inherited classes that should also stay binary compatible because they'd have to inherit from the new class.
It is not possible to use this technique when there are other inherited classes that should also stay binary compatible because they'd have to inherit from the new class.
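
As a runnable illustration of the technique, the sketch below fills in the parts of the example that are not shown here. The counter <code>barCalls</code> is hypothetical and exists only to make the call observable, and the choice of A::foo() as the place where the new virtual is needed is an assumption.

```cpp
#include <cassert>

// Original class; its vtable layout must not change.
class A {
public:
    virtual ~A() {}
    virtual void foo();
};

// Newly added class that carries the new virtual function.
class B : public A {
public:
    virtual void bar() { ++barCalls; } // newly added virtual function
    int barCalls = 0;                  // hypothetical counter for the demonstration
};

void A::foo()
{
    // here it's needed to call a new virtual function
    if (B *this2 = dynamic_cast<B *>(this)) {
        this2->bar();
    }
}
```

Objects that are plain <code>A</code> (or old application subclasses of <code>A</code>) simply skip the call, while applications ported to <code>B</code> get the new virtual.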


Line 275: Line 297:
Qt's signals and slots are invoked using a special virtual method created by the Q_OBJECT macro and it exists in every class inherited from {{qt|QObject}}. Therefore adding new signals and slots doesn't affect binary compatibility and the signals/slots mechanism can be used to emulate virtual functions.
Qt's signals and slots are invoked using a special virtual method created by the Q_OBJECT macro and it exists in every class inherited from {{qt|QObject}}. Therefore adding new signals and slots doesn't affect binary compatibility and the signals/slots mechanism can be used to emulate virtual functions.


<code cppqt>
<source lang="cpp-qt">
class A : public QObject {
class A : public QObject
Q_OBJECT
{
    Q_OBJECT
public:
public:
     A();
     A();
     virtual void foo();
     virtual void foo();
signals:
signals:
     void bar( int* ); // added new "virtual" function
     void bar(int *); // added new "virtual" function
protected slots:
protected slots:
     // implementation of the virtual function in A
     // implementation of the virtual function in A
     void barslot( int* );
     void barslot(int *);
};
};


A::A()
A::A()
{
{
     connect(this, SIGNAL( bar(int*)), this, SLOT( barslot(int*)));
     connect(this, SIGNAL(bar(int *)), this, SLOT(barslot(int *)));
}
}


Line 296: Line 319:
{
{
     int ret;
     int ret;
     emit bar( &ret );
     emit bar(&ret);
}
}


void A::barslot( int* ret )
void A::barslot(int *ret)
{
{
     *ret = 10;
     *ret = 10;
}
}
</code>
</source>


Function bar() will act like a virtual function, barslot() implements the actual functionality of it. Since signals have void return value, data must be returned using arguments. As there will be only one slot connected to the signal returning data from the slot this way will work without problems. Note that with Qt4 for this to work the connection type will have to be Qt::DirectConnection.
Function bar() will act like a virtual function, barslot() implements the actual functionality of it. Since signals have void return value, data must be returned using arguments. As there will be only one slot connected to the signal returning data from the slot this way will work without problems. Note that with Qt4 for this to work the connection type will have to be Qt::DirectConnection.


If an inherited class will want to re-implement the functionality of bar() it will have to provide its own slot:
If an inherited class will want to re-implement the functionality of bar() it will have to provide its own slot:
<code cppqt>
<source lang="cpp-qt">
class B : public A {
class B : public A
Q_OBJECT
{
    Q_OBJECT
public:
public:
     B();
     B();
protected slots: // necessary to specify as a slot again
protected slots: // necessary to specify as a slot again
     void barslot( int* ); // reimplemented functionality of bar()
     void barslot(int *); // reimplemented functionality of bar()
};
};


B::B()
B::B()
{
{
     disconnect(this, SIGNAL(bar(int*)), this, SLOT(barslot(int*)));
     disconnect(this, SIGNAL(bar(int *)), this, SLOT(barslot(int *)));
     connect(this, SIGNAL(bar(int*)), this, SLOT(barslot(int*)));
     connect(this, SIGNAL(bar(int *)), this, SLOT(barslot(int *)));
}
}


void B::barslot( int* ret )
void B::barslot(int *ret)
{
{
     *ret = 20;
     *ret = 20;
}
}
</code>
</source>


Now B::barslot() will act like virtual reimplementation of A::bar(). Note that it is necessary to specify barslot() again as a slot in B and that in the constructor it is necessary to first disconnect and then connect again, that will disconnect A::barslot() and connect B::barslot() instead.
Now B::barslot() will act like virtual reimplementation of A::bar(). Note that it is necessary to specify barslot() again as a slot in B and that in the constructor it is necessary to first disconnect and then connect again, that will disconnect A::barslot() and connect B::barslot() instead.
Line 334: Line 358:


[[Category:Policies]] [[Category:C++]]
[[Category:Policies]] [[Category:C++]]
</translate>

Latest revision as of 16:18, 14 June 2024

<languages /> <translate>

Definition

A library is binary compatible if a program linked dynamically to a former version of the library continues running with newer versions of the library without the need to recompile.

If a program needs to be recompiled to run with a new version of the library but doesn't require any further modifications, the library is source compatible.

Binary compatibility saves a lot of trouble. It makes it much easier to distribute software for a certain platform. Without ensuring binary compatibility between releases, people will be forced to provide statically linked binaries. Static binaries are bad because they

  • waste resources (especially memory)
  • don't allow the program to benefit from bugfixes or extensions in the libraries

In the KDE project, we will provide binary compatibility within the life-span of a major release for the core libraries (kdelibs, kdepimlibs).

Note about ABI

This text applies to most C++ ABIs used by compilers which KDE can be built with. It is mostly based on the Itanium C++ ABI Draft, which has been used by the GCC C++ compiler since version 3.4 on all platforms it supports. Information about the Microsoft Visual C++ mangling scheme mostly comes from this article on calling conventions (it's the most complete information found so far on the MSVC ABI and name mangling).

Some of the constraints specified here may not apply to a given compiler. The goal here is to list the most restrictive set of conditions when writing cross-platform C++ code, meant to be compiled with several different compilers.

This page is updated when new binary incompatibility issues are found.

The Do's and Don'ts

You can...

  • Add new non-virtual functions, including signals and slots and constructors, that do not overload non-overloaded functions.
  • Add a new enum to a class.
  • Append new enumerators to an existing enum.
    • Exception: if that leads to the compiler choosing a larger underlying type for the enum, that makes the change binary-incompatible. Unfortunately, compilers have some leeway to choose the underlying type, so from an API-design perspective it's recommended to
      • make enums have an explicit underlying type, or, in case C compatibility is required,
      • add a Max.... enumerator with an explicit large value (=255, =1<<15, etc) to create an interval of numeric enumerator values that is guaranteed to fit into the chosen underlying type, whatever that may be.
  • Reimplement virtual functions defined in the primary base class hierarchy (that is, virtuals defined in the first non-virtual base class, or in that class's first non-virtual base class, and so forth) if it is safe that programs linked with the prior version of the library call the implementation in the base class rather than the derived one. This is tricky and might be dangerous. Think twice before doing it. Alternatively see below for a workaround.
    • Exception: if the overriding function has a covariant return type, it's only a binary-compatible change if the more-derived type always has the same pointer address as the less-derived one. If in doubt, do not override with a covariant return type.
  • Change an inline function or make an inline function non-inline if it is safe that programs linked with the prior version of the library call the old implementation. This is tricky and might be dangerous. Think twice before doing it.
  • Remove private non-virtual functions if they are not called by any inline functions (and have never been).
  • Remove private static members if they are not called by any inline functions (and have never been).
  • Add new static data members.
  • Change the default arguments of a method. It requires recompilation to use the actual new default argument values, though.
  • Add new classes.
  • Export a class that was not previously exported.
  • Add or remove friend declarations to classes.
  • Rename reserved member types
  • Extend reserved bit fields, provided this doesn't cause the bit field to cross the boundary of its underlying type (8 bits for char & bool, 16 bits for short, 32 bits for int, etc.)
  • Add the Q_OBJECT macro to a class if the class already inherits from QObject
  • Add a Q_PROPERTY, Q_ENUMS or Q_FLAGS macro as that only modifies the meta-object generated by moc and not the class itself
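The two recommendations for enums in the list above can be sketched as follows. This is only an illustration: the enum names Mode and LegacyMode and their enumerators are hypothetical and do not come from any KDE library, and the first variant assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Variant 1 (C++11): fix the underlying type explicitly, so appending
// enumerators can never make the compiler pick a larger type.
enum class Mode : std::uint8_t {
    Read,
    Write,
    Append // hypothetical enumerator appended in a later release
};
static_assert(std::is_same<std::underlying_type<Mode>::type, std::uint8_t>::value,
              "the underlying type is pinned by the declaration");

// Variant 2 (C-compatible): reserve a value range up front with a Max
// enumerator, so later enumerators below that value fit the chosen type.
enum LegacyMode {
    LegacyRead,
    LegacyWrite,
    LegacyModeMax = 1 << 15 // reserves the interval [0, 32768]
};

// Small usage check: the scoped enum occupies exactly one byte.
void checkEnums()
{
    assert(sizeof(Mode) == 1);
    assert(LegacyModeMax == 32768);
}
```

With the second variant the exact underlying type is still chosen by the compiler; the Max enumerator only guarantees that it is wide enough for every value in the reserved interval.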

You cannot...

  • For existing classes:
  • For template classes:
  • For existing functions of any type:
    • Unexport it.
    • Remove it.
      • Remove the implementation of existing declared functions. The symbol comes from the implementation of the function, so removing the implementation effectively removes the function.
    • Inline it (this includes moving a member function's body to the class definition, even without the inline keyword).
    • Add an overload (binary compatible, but not source compatible: it makes &func ambiguous). Adding overloads to already overloaded functions is ok (since any use of &func already needed a cast).
    • Change its signature. This includes:
      • Changing any of the types of the arguments in the parameter list, including changing the const/volatile qualifiers of the existing parameters (instead, add a new method)
      • Changing the const/volatile qualifiers of the function
      • Changing the access rights to some functions or data members, for example from private to public. With some compilers, this information may be part of the signature. If you need to make a private function protected or even public, you have to add a new function that calls the private one.
      • Changing the CV-qualifiers of a member function: the const and/or volatile that apply to the function itself.
      • Extending a function with another parameter, even if this parameter has a default argument. See below for a suggestion on how to avoid this issue
      • Changing the return type in any way
      • Exception: non-member functions declared with extern "C" can change parameter types (be very careful).
  • For virtual member functions:
  • For static non-private members or for non-static non-member public data:
  • For non-static members:
    • Add new data members to an existing class.
    • Change the order of non-static data members in a class.
    • Change the type of the member, except for signedness (or more generally if the types are guaranteed to have the same size, and the member is not used by any inline method)
    • Remove existing non-static data members from an existing class.
  • Return (or take as parameter) an iterator to a Qt container, in public API. This is not binary compatible if the library doing that is compiled with QT_STRICT_ITERATORS and the lib/app using that API isn't, or vice-versa. Return a reference to (or a copy of) the container instead.

If you need to extend or modify the parameter list of an existing function, you need to add a new function with the new parameters instead. In that case, you may want to add a short note that the two functions shall be merged with a default argument in later versions of the library:

void functionname( int a );
void functionname( int a, int b ); //BCI: merge with int b = 0
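A minimal sketch of how the two overloads might be implemented so that the old symbol stays exported while sharing one implementation. The function bodies are placeholders invented for illustration; only the two declarations come from the text above.

```cpp
#include <cassert>

// Existing exported function: its symbol must remain in the library.
int functionname(int a);
// Newly added overload. BCI: merge with int b = 0
int functionname(int a, int b);

int functionname(int a)
{
    // The old entry point forwards to the new one with the value that
    // will later become the default argument.
    return functionname(a, 0);
}

int functionname(int a, int b)
{
    return a * 10 + b; // placeholder body, purely illustrative
}

// Small usage check: old and new entry points agree for b == 0.
void checkForwarding()
{
    assert(functionname(4) == functionname(4, 0));
}
```

Programs compiled against the old library keep resolving the one-argument symbol, while new code can call either overload.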

You should...

To make a class easy to extend in the future, you should follow these rules:

  • Add d-pointer. See below.
  • Add non-inline virtual destructor even if the body is empty.
  • Reimplement event in QObject-derived classes, even if the body for the function is just calling the base class' implementation. This is specifically to avoid problems caused by adding a reimplemented virtual function as discussed below.
  • Make all constructors non-inline.
  • Write non-inline implementations of the copy constructor and assignment operator unless the class cannot be copied by value. (E.g. classes inherited from QObject can't be.)

Techniques for Library Programmers

The biggest problem when writing libraries is that one cannot safely add data members, since this would change the size and layout of every class, struct, or array containing objects of the type, including subclasses.

Bitflags

Bitflags are one exception. If you use bitflags for enums or bools, you can safely round up to at least the next byte minus 1. A class with members

uint m1 : 1;
uint m2 : 3;
uint m3 : 1;

can become

uint m1 : 1;
uint m2 : 3;
uint m3 : 1;
uint m4 : 2; // new member

without breaking binary compatibility. Please round up to a maximum of 7 bits (or 15 if the bitfield was already larger than 8). Using the very last bit may cause problems on some compilers.
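The claim can be checked at compile time. The following sketch uses two hypothetical structs, FlagsV1 and FlagsV2, standing in for the old and new versions of the same class, and assumes an ABI where unsigned int is at least 8 bits wide and adjacent bit-fields are packed into one storage unit (true for the common GCC, Clang and MSVC targets).

```cpp
#include <cassert>

// Old version of the class: 5 bits used.
struct FlagsV1 {
    unsigned int m1 : 1;
    unsigned int m2 : 3;
    unsigned int m3 : 1;
};

// New version: 7 bits used, still within the first byte minus one bit.
struct FlagsV2 {
    unsigned int m1 : 1;
    unsigned int m2 : 3;
    unsigned int m3 : 1;
    unsigned int m4 : 2; // new member
};

// If this fails on some compiler, the extension is not safe there.
static_assert(sizeof(FlagsV1) == sizeof(FlagsV2),
              "adding m4 must not change the object size");

// Small usage check: the new 2-bit member can hold values 0..3.
void checkFlags()
{
    FlagsV2 f = FlagsV2();
    f.m4 = 3;
    assert(f.m4 == 3);
    assert(f.m1 == 0);
}
```

The static_assert only covers the size; it does not prove that the existing members keep their bit positions, which is why the text recommends staying below the last bit.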

Using a d-Pointer

Bitflags and predefined reserved variables are nice, but far from being sufficient. This is where the d-pointer technique comes into play. The name "d-pointer" stems from Trolltech's Arnt Gulbrandsen, who first introduced the technique into Qt, making it one of the first C++ GUI libraries to maintain binary compatibility even between bigger releases. The technique was quickly adopted as a general programming pattern for the KDE libraries by everyone who saw it. It's a great trick to be able to add new private data members to a class without breaking binary compatibility.

Remark: The d-pointer pattern has been described many times in computer science history under various names, e.g. as pimpl, as handle/body or as cheshire cat. Google helps finding online papers for any of these, just add C++ to the search terms.

In your class definition for class Foo, define a forward declaration

class FooPrivate;

and the d-pointer in the private section:

private:
    FooPrivate* d;

The FooPrivate class itself is purely defined in the class implementation file (usually *.cpp ), for example:

class FooPrivate {
public:
    FooPrivate()
        : m1(0), m2(0)
    {}
    int m1;
    int m2;
    QString s;
};

All you have to do now is to create the private data in your constructors or your init function with

   d = new FooPrivate;

and to delete it again in your destructor with

delete d;

In most circumstances you will want to make the d-pointer constant to catch situations where it is accidentally modified or copied over, which would make you lose ownership of the private object and create a memory leak:

private:
    FooPrivate* const d;

This allows you to modify the object pointed to by d but not the value of the pointer after it has been initialized.

You may not want all member variables to live in the private data object, though. For very often used members, it's faster to put them directly in the class, since inline functions cannot access the d-pointer data. Also note that all data covered by the d-pointer is "private", despite being declared public in the d-pointer itself. For public or protected access, provide both a set and a get function. Example:

QString Foo::string() const
{
    return d->s;
}

void Foo::setString(const QString &s)
{
    d->s = s;
}
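Putting the pieces above together, a complete d-pointer class might look like the following single-file sketch. To stay compilable without Qt it uses std::string where the text uses QString; the comments mark which part would live in the header and which in the implementation file. Disabling copying is a simplification made here, not something the text prescribes.

```cpp
#include <cassert>
#include <string>

// ---- foo.h (public header): FooPrivate is only forward-declared ----
class FooPrivate;

class Foo {
public:
    Foo();
    ~Foo(); // non-inline, so FooPrivate can stay incomplete in the header
    std::string string() const;
    void setString(const std::string &s);

private:
    Foo(const Foo &) = delete;            // copying disabled in this sketch
    Foo &operator=(const Foo &) = delete;
    FooPrivate *const d;                  // the d-pointer
};

// ---- foo.cpp (implementation file): members can be added freely ----
class FooPrivate {
public:
    FooPrivate()
        : m1(0), m2(0)
    {}
    int m1;
    int m2;
    std::string s;
};

Foo::Foo()
    : d(new FooPrivate)
{
}

Foo::~Foo()
{
    delete d;
}

std::string Foo::string() const
{
    return d->s;
}

void Foo::setString(const std::string &s)
{
    d->s = s;
}

// The public class is exactly one pointer large, whatever FooPrivate holds.
static_assert(sizeof(Foo) == sizeof(FooPrivate *), "Foo holds only the d-pointer");

// Small usage check of the accessor pair.
void checkFoo()
{
    Foo f;
    f.setString("kde");
    assert(f.string() == "kde");
}
```

Because sizeof(Foo) depends only on the pointer, new members can be added to FooPrivate in later releases without affecting applications compiled against the old header.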

It is also possible (but not recommended) to declare the private class for the d-pointer as a nested private class (e.g. Foo::Private). If you use this technique, remember that the nested private class will inherit the public symbol visibility of the containing exported class. This will cause the functions of the private class to be named in the dynamic library's symbol table. You can use Q_DECL_HIDDEN in the implementation of the nested private class to manually re-hide the symbols. (For an existing class, this is technically an ABI change, but does not impact the public ABI supported by the KDE developers, so private symbols mistakenly exposed may be re-hidden without further warning.) Other downsides of the nested private class include the lack of consistency with Qt and its Q_D/Q_Q macros, and the fact that it can't be forward-declared in unrelated headers anymore (which can be useful to declare it as a friend class). For all these reasons, prefer FooPrivate.

Troubleshooting

Adding new data members to classes without d-pointer

If you have no free bitflags, no reserved variables and no d-pointer either, but you absolutely have to add a new private member variable, there are still some possibilities left. If your class inherits QObject, you can for example place the additional data in a special child and find it by traversing the list of children. You can access the list of children with QObject::children(). However, a fancier and usually faster approach is to use a hashtable to store a mapping between your object and the extra data. For this purpose, Qt provides a pointer-based dictionary called QHash (or QPtrDict in Qt3).

The basic trick in your class implementation of class Foo is:

  • Create a private data class FooPrivate.
  • Create a static QHash<const Foo *, FooPrivate *>. The key type is const-qualified so that the lookup functions below, which take a const Foo *, compile and can also be called from const member functions.
  • Note that some compilers/linkers (almost all, unfortunately) do not manage to create static objects in shared libraries. They simply forget to call the constructor. Therefore you should use the Q_GLOBAL_STATIC macro to create and access the object:
// BCI: Add a real d-pointer
typedef QHash<const Foo *, FooPrivate *> FooPrivateHash;
Q_GLOBAL_STATIC(FooPrivateHash, d_func)
static FooPrivate *d(const Foo *foo)
{
    FooPrivate *ret = d_func()->value(foo);
    if (!ret) {
        ret = new FooPrivate;
        d_func()->insert(foo, ret);
    }
    return ret;
}
static void delete_d(const Foo *foo)
{
    FooPrivate *ret = d_func()->value(foo);
    delete ret;
    d_func()->remove(foo);
}
  • Now you can use the d-pointer in your class almost as simply as in the code before, just with a function call to d(this). For example:
d(this)->m1 = 5;
  • Add a line to your destructor:
delete_d(this);
  • Do not forget to add a BCI remark, so that the hack can be removed in the next version of the library.
  • Do not forget to add a d-pointer to your next class.
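The steps above can be sketched without Qt as follows. This version swaps in std::unordered_map for QHash and a function-local static for Q_GLOBAL_STATIC, so it shows the shape of the hack rather than the exact Qt code; the accessors value() and setValue() are hypothetical and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <unordered_map>

// Existing class: no d-pointer, and its size must not change.
class Foo {
public:
    Foo();
    ~Foo();
    int value() const;    // hypothetical accessor needing new data
    void setValue(int v);
};

// Extra data that could not be added to Foo directly.
struct FooPrivate {
    FooPrivate() : m1(0) {}
    int m1;
};

// BCI: Add a real d-pointer
typedef std::unordered_map<const Foo *, FooPrivate *> FooPrivateHash;

static FooPrivateHash &d_func()
{
    static FooPrivateHash hash; // constructed on first use
    return hash;
}

static FooPrivate *d(const Foo *foo)
{
    FooPrivate *&slot = d_func()[foo]; // inserts a null entry if absent
    if (!slot) {
        slot = new FooPrivate;
    }
    return slot;
}

static void delete_d(const Foo *foo)
{
    FooPrivateHash::iterator it = d_func().find(foo);
    if (it != d_func().end()) {
        delete it->second;
        d_func().erase(it);
    }
}

Foo::Foo() {}
Foo::~Foo() { delete_d(this); }
int Foo::value() const { return d(this)->m1; }
void Foo::setValue(int v) { d(this)->m1 = v; }

// Small usage check: each object gets its own private data.
void checkHashedData()
{
    Foo a;
    Foo b;
    a.setValue(5);
    assert(a.value() == 5);
    assert(b.value() == 0);
}
```

Every access pays for a hash lookup, which is one more reason to replace the hack with a real d-pointer at the next binary-incompatible release.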

Adding a reimplemented virtual function

As already explained, you can safely reimplement a virtual function defined in one of the base classes only if it is safe that the programs linked with the prior version call the implementation in the base class rather than the derived one. This is because the compiler sometimes calls virtual functions directly if it can determine which one to call. For example, if you have

void C::foo()
{
    B::foo();
}

then B::foo() is called directly. If class B inherits from class A which implements foo() and B itself doesn't reimplement it, then C::foo() will in fact call A::foo(). If a newer version of the library adds B::foo(), C::foo() will call it only after a recompilation.

Another more common example is:

B b;		// B derives from A
b.foo();

then the call to foo() will not use the virtual table. That means that if B::foo() didn't exist in the library but now does, code that was compiled with the earlier version will still call A::foo().

If you can't guarantee things will continue to work without a recompilation, move functionality from A::foo() to a new protected function A::foo2() and use this code:

void A::foo()
{
    if (B *b = dynamic_cast<B *>(this)) {
        b->B::foo(); // B:: is important
    } else {
        foo2();
    }
}
void B::foo()
{
    // added functionality
    A::foo2(); // call base function with real functionality
}

All calls to A::foo() for objects of type B (or inherited) will result in calling B::foo(). The only calls that will not work as expected are those that explicitly specify A::foo(); B::foo() itself calls A::foo2() instead, and there should be no other places doing so.
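A compilable sketch of this redirection is shown below. The std::string return values are an addition made here so that the call path becomes observable; the qualified call b.A::foo() in the check stands in for an old binary that was compiled before B::foo() existed and therefore calls A::foo() directly.

```cpp
#include <cassert>
#include <string>

class A {
public:
    virtual ~A();
    virtual std::string foo();
protected:
    std::string foo2(); // the former body of A::foo()
};

class B : public A {
public:
    std::string foo() override; // reimplementation added in the new version
};

A::~A() {}

std::string A::foo()
{
    if (B *b = dynamic_cast<B *>(this)) {
        return b->B::foo(); // B:: is important: no virtual dispatch here
    }
    return foo2();
}

std::string A::foo2()
{
    return "A";
}

std::string B::foo()
{
    // added functionality, then the base function with the real work
    return std::string("B+") + A::foo2();
}

// Small usage check: even a direct call to A::foo() reaches B::foo().
void checkRedirect()
{
    B b;
    assert(b.A::foo() == "B+A");
}
```

Plain A objects fail the dynamic_cast and fall through to foo2(), so their behaviour is unchanged.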

Using a new class

A relatively simple method of "extending" a class is to write a replacement class that also includes the new functionality (and that may inherit from the old class to reuse the code). This of course requires adapting and recompiling applications using the library, so it is not possible this way to fix or extend functionality of classes that are used by applications compiled against an older version of the library. However, especially with small and/or performance-critical classes, it may be simpler to write them without making sure they'll be simple to extend in the future and, if the need arises later, to write a new replacement class that provides new features or better performance.

Adding new virtual functions to leaf classes

This technique is a special case of using a new class. It can help if there is a need to add new virtual functions to a class that should stay binary compatible, and there is no class inheriting from it that should also stay binary compatible (i.e. all classes inheriting from it are in applications). In such a case it is possible to add a new class inheriting from the original one that adds the new virtual functions. Applications using the new functionality will of course have to be modified to use the new class.

class A {
public:
    virtual void foo();
};
class B : public A { // newly added class
public:
    virtual void bar(); // newly added virtual function
};
void A::foo()
{
    // here it's needed to call a new virtual function
    if (B *this2 = dynamic_cast<B *>(this)) {
        this2->bar();
    }
}

It is not possible to use this technique when there are other inherited classes that should also stay binary compatible because they'd have to inherit from the new class.
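As a runnable variant of the example above, the sketch below gives the functions int return values so the dispatch can be observed, and adds a hypothetical application class AppWidget that opts in to the new virtual by deriving from B. None of these names or return values come from a real library.

```cpp
#include <cassert>

class A {
public:
    virtual ~A();
    virtual int foo();
};

class B : public A { // newly added class
public:
    virtual int bar(); // newly added virtual function
};

A::~A() {}

int A::foo()
{
    // here it's needed to call a new virtual function
    if (B *this2 = dynamic_cast<B *>(this)) {
        return this2->bar();
    }
    return 0; // objects of old classes have no bar()
}

int B::bar()
{
    return 42;
}

// Hypothetical application class that was ported to the new base class.
class AppWidget : public B {
public:
    int bar() override { return 7; }
};

// Small usage check: foo() reaches the application's bar() override.
void checkLeafVirtual()
{
    AppWidget w;
    assert(w.foo() == 7);
}
```

Application classes that still derive from A keep working unchanged; they simply never reach bar().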

Using signals instead of virtual functions

Qt's signals and slots are invoked using a special virtual method created by the Q_OBJECT macro and it exists in every class inherited from QObject. Therefore adding new signals and slots doesn't affect binary compatibility and the signals/slots mechanism can be used to emulate virtual functions.

class A : public QObject
{
    Q_OBJECT
public:
    A();
    virtual void foo();
signals:
    void bar(int *); // added new "virtual" function
protected slots:
    // implementation of the virtual function in A
    void barslot(int *);
};

A::A()
{
    connect(this, SIGNAL(bar(int *)), this, SLOT(barslot(int *)));
}

void A::foo()
{
    int ret;
    emit bar(&ret);
}

void A::barslot(int *ret)
{
    *ret = 10;
}

Function bar() will act like a virtual function, and barslot() implements its actual functionality. Since signals have a void return type, data must be returned through arguments. As only one slot is connected to the signal, returning data from the slot this way works without problems. Note that with Qt4 the connection type has to be Qt::DirectConnection for this to work.

If a derived class wants to re-implement the functionality of bar(), it has to provide its own slot:

class B : public A
{
    Q_OBJECT
public:
    B();
protected slots: // necessary to specify as a slot again
    void barslot(int *); // reimplemented functionality of bar()
};

B::B()
{
    disconnect(this, SIGNAL(bar(int *)), this, SLOT(barslot(int *)));
    connect(this, SIGNAL(bar(int *)), this, SLOT(barslot(int *)));
}

void B::barslot(int *ret)
{
    *ret = 20;
}

Now B::barslot() will act like a virtual reimplementation of A::bar(). Note that barslot() must be declared as a slot again in B, and that the constructor must first disconnect and then connect again; this disconnects A::barslot() and connects B::barslot() instead.

Note: the same can be accomplished by implementing a virtual slot. </translate>