Reflection is a very useful tool and if you aren’t familiar with the state of affairs in C++ I’d suggest reading these two excellent blog posts by @jackie.
Unfortunately none of the current solutions really clicks with me:
- using Boost.Fusion or Boost.Hana to annotate my classes:
  - seems a bit ugly and intrusive
  - raises concerns about compile times
  - no way to annotate fields in a custom way
- external tooling (such as libClang - see siplasplas and CPP-Reflection):
  - makes build setups complicated (especially if your main compiler is not Clang but, say, MSVC)
  - not sure about the impact on compile times
  - annotating fields in a custom way is not straightforward (although some developers have reported success using __attribute__)
- explicitly registering each field outside of the class definition:
  - violates the DRY principle
  - some implementations are intrusive - like this one
I also tried a DSL (not in C++) that was parsed by a Python script which emitted C++ code, but I found it too much work to make it support complex C++ types: teaching it about templates, variants and typedefs got hairy very quickly (I’m not a parser expert).
What I came up with
The cmake-reflection-template repository is a small working example of a few source files with added reflection, which generates serialization and deserialization routines (using std::any for simplicity, so it requires C++17, but it can be rewritten to serialize to JSON instead).
- each CMake target that wants to have reflection should have the target_parse_sources() CMake function called on it like so
- each source file in the reflected projects has an attached custom CMake command, so when it gets modified that command gets run
- that command runs the parser on the file (named for example my_type.h), which generates code and dumps it in a file called my_type.h.inl in a gen folder inside CMAKE_BINARY_DIR
- the resulting my_type.h.inl can be included either directly in my_type.h or perhaps elsewhere
- the forward declarations of the generated functions are written inside the classes in my_type.h using the helper FRIENDS_OF_TYPE(MY_TYPE); macro
It is a bit like what Unreal is doing for reflection of properties - C++ source code is parsed (and annotated with preprocessor identifiers) and each source file includes the generated code for itself.
Here is what C++ code looks like with my solution:
class Foo {
    FRIENDS_OF_TYPE(Foo); // friend declarations of the generated functions
    FIELD int a;
    FIELD float b = 42.f;
    FIELD std::string c {""};
    FIELD std::map<int, int> field_with_spaces_in_its_type;
    FIELD int
        field_on_the_next_line_of_its_type;
};
#include <gen/my_type.h.inl>
FIELD is a preprocessor identifier that expands to nothing. It is there to make parsing easier - otherwise detecting comments or distinguishing between class fields and local variables in inline methods would be very complicated.
What gets generated for this class depends entirely on the parser. FIELD, FRIENDS_OF_TYPE() and a few other preprocessor identifiers have to be visible in any source file that wants to use reflection - see them here.
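To make the role of these identifiers concrete, here is a minimal sketch of how they could be defined and why the friend declarations matter. The body of FRIENDS_OF_TYPE() below is an assumption (the real macro in the repository may declare other or additional functions), Point is a made-up type, and the two functions are hand-written stand-ins for what would normally come from the generated .inl file.

```cpp
#include <any>
#include <map>
#include <string>

// Expands to nothing; only the parser cares about it.
#define FIELD

// Assumed shape of the helper: befriend the generated free functions so that
// they can reach private fields. The real macro may look different.
#define FRIENDS_OF_TYPE(type)                         \
    friend std::any serialize(const type& src);       \
    friend void deserialize(const std::any& src, type& dst)

class Point {
    FRIENDS_OF_TYPE(Point);
    FIELD int x = 1; // private by default
    FIELD int y = 2;
public:
    int sum() const { return x + y; }
};

// Stand-ins for the generated code: they compile only because of the friend declarations.
std::any serialize(const Point& src) {
    std::map<std::string, std::any> out;
    out["x"] = src.x;
    out["y"] = src.y;
    return out;
}

void deserialize(const std::any& src, Point& dst) {
    const auto& data = std::any_cast<const std::map<std::string, std::any>&>(src);
    if (data.count("x")) { dst.x = std::any_cast<int>(data.at("x")); }
    if (data.count("y")) { dst.y = std::any_cast<int>(data.at("y")); }
}
```

Since FIELD disappears during preprocessing, the compiler sees an ordinary class; the annotation costs nothing at compile time.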
Currently the parser from the example project would generate the following code for the type Foo written above:
any serialize(const Foo& src) {
    map<string, any> out;
    out["a"] = serialize(src.a);
    out["b"] = serialize(src.b);
    out["c"] = serialize(src.c);
    out["field_with_spaces_in_its_type"] = serialize(src.field_with_spaces_in_its_type);
    out["field_on_the_next_line_of_its_type"] = serialize(src.field_on_the_next_line_of_its_type);
    return out;
}
void deserialize(const any& src, Foo& dst) {
    const auto& data = any_cast<const map<string, any>&>(src);
    if(data.count("a")) { deserialize(data.at("a"), dst.a); }
    if(data.count("b")) { deserialize(data.at("b"), dst.b); }
    if(data.count("c")) { deserialize(data.at("c"), dst.c); }
    if(data.count("field_with_spaces_in_its_type")) { deserialize(data.at("field_with_spaces_in_its_type"), dst.field_with_spaces_in_its_type); }
    if(data.count("field_on_the_next_line_of_its_type")) { deserialize(data.at("field_on_the_next_line_of_its_type"), dst.field_on_the_next_line_of_its_type); }
}
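The generated functions call serialize() and deserialize() on every field, so overloads for the leaf types have to exist somewhere. The sketch below shows what such hand-written overloads might look like, together with a cut-down Foo (public fields, three members) so that it fits in one file. The overloads are assumptions for illustration, not the ones shipped in the repository.

```cpp
#include <any>
#include <map>
#include <string>

// Hypothetical hand-written overloads for leaf types.
inline std::any serialize(int src)                { return src; }
inline std::any serialize(float src)              { return src; }
inline std::any serialize(const std::string& src) { return src; }

inline void deserialize(const std::any& src, int& dst)         { dst = std::any_cast<int>(src); }
inline void deserialize(const std::any& src, float& dst)       { dst = std::any_cast<float>(src); }
inline void deserialize(const std::any& src, std::string& dst) { dst = std::any_cast<std::string>(src); }

// Cut-down version of the type from the text.
struct Foo {
    int a = 0;
    float b = 42.f;
    std::string c;
};

// Same shape as the generated code: one map entry per field.
inline std::any serialize(const Foo& src) {
    std::map<std::string, std::any> out;
    out["a"] = serialize(src.a);
    out["b"] = serialize(src.b);
    out["c"] = serialize(src.c);
    return out;
}

// Missing keys are skipped, so the destination keeps its current values for them.
inline void deserialize(const std::any& src, Foo& dst) {
    const auto& data = std::any_cast<const std::map<std::string, std::any>&>(src);
    if (data.count("a")) { deserialize(data.at("a"), dst.a); }
    if (data.count("b")) { deserialize(data.at("b"), dst.b); }
    if (data.count("c")) { deserialize(data.at("c"), dst.c); }
}
```

A round trip is then just deserialize(serialize(in), out). Note that std::any_cast throws std::bad_any_cast when the stored type does not match, so a field whose type changed is reported loudly rather than silently misread.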
cmake-reflection-template is meant to be used as a starting point, and anyone using it is expected to modify it to better suit their needs - perhaps to tweak the parsing, to use JSON instead of std::any, to add prefixes to the macros, to add the notion of class descriptions or to change what code gets emitted. The CMake part is solid and the rest can be viewed as a proof of concept. One might even rewrite the parser in a different language! I’m no expert in templating engines or writing parsers, so the Python script (located in /scripts/) is nothing special, but what I’ve got is good enough for my needs so far.
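As an illustration of the "JSON instead of std::any" variation, here is a rough sketch of what the emitted code could look like if the parser wrote JSON text directly. The to_json name and the two-field Foo are made up for this example, only quotes and backslashes are escaped, and the deserialization side (which would need a JSON parser) is left out.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical leaf overloads that write JSON text to a stream.
inline void to_json(std::ostream& out, int src) { out << src; }

inline void to_json(std::ostream& out, const std::string& src) {
    out << '"';
    for (char ch : src) {
        if (ch == '"' || ch == '\\') { out << '\\'; } // minimal escaping only
        out << ch;
    }
    out << '"';
}

// Cut-down type for the example.
struct Foo {
    int a = 0;
    std::string c;
};

// What the parser could emit for Foo instead of the std::any version.
inline void to_json(std::ostream& out, const Foo& src) {
    out << "{\"a\":";
    to_json(out, src.a);
    out << ",\"c\":";
    to_json(out, src.c);
    out << '}';
}
```

The structure is the same as with std::any: one call per field, with overload resolution picking the right leaf routine.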
Support for attributes and tags
Currently, besides FIELD, there are a few other preprocessor identifiers in a common header that expand to nothing - used for annotating:
#define FIELD // indicates the start of a field definition inside of a type
#define INLINE // class attribute - emitted functions should be marked as inline
#define CALLBACK // field attribute - call the callback after the field changes
#define ATTRIBUTES(...) // takes a comma-separated list of attributes and tags
And they are used like this:
ATTRIBUTES(INLINE)
class Foo {
    FRIENDS_OF_TYPE(Foo);
    ATTRIBUTES(tag::special, CALLBACK(Foo::some_callback))
    FIELD std::string tagged;
    FIELD std::string normal;
    static void some_callback(Foo& src) { cout << "callback called!" << endl; }
};
This will result in the following codegen:
inline void print(const Foo& in) {
    print(in.tagged, tag::special()); // tag given to the print() function
    print(in.normal);
}
inline void deserialize(const any& src, Foo& dst) {
    const auto& data = any_cast<const map<string, any>&>(src);
    if(data.count("tagged")) { deserialize(data.at("tagged"), dst.tagged); Foo::some_callback(dst); }
    if(data.count("normal")) { deserialize(data.at("normal"), dst.normal); }
}
- The 2 routines are inline because of the class attribute - the codegen can be included in a header, which in turn can be included in many places.
- Both fields are of type std::string but tagging allows different overloads to be chosen. For example, consider 2 string fields which represent paths for assets of different types (meshes and textures): if the reflection generates GUI for editing those fields, one might want different file filters for the 2 kinds of assets (.jpg vs .mesh).
- There is a callback attached to the tagged field and it will be called any time that field is changed in a routine generated by the reflection. This might be useful when 2 fields are related and changing one must result in changes to the other (the other might not even be annotated with FIELD).
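The asset-path scenario can be sketched with empty tag types that exist only to select an overload. Everything here is hypothetical: the tag::texture and tag::mesh tags, the file_filter() routine, the Material type and its revision counter are invented for the example, and the last two functions are hand-written in the shape the codegen might take.

```cpp
#include <string>

namespace tag {
struct texture {}; // empty types, used only for overload selection
struct mesh {};
} // namespace tag

// Untagged strings get no filter; tagged ones pick a specific overload.
inline std::string file_filter(const std::string&)               { return "*"; }
inline std::string file_filter(const std::string&, tag::texture) { return "*.jpg"; }
inline std::string file_filter(const std::string&, tag::mesh)    { return "*.mesh"; }

struct Material {
    std::string texture_path; // would carry ATTRIBUTES(tag::texture, CALLBACK(Material::on_changed))
    std::string mesh_path;    // would carry ATTRIBUTES(tag::mesh)
    std::string name;         // plain field, no tag
    int revision = 0;         // related state, not annotated with FIELD
    static void on_changed(Material& m) { ++m.revision; }
};

// Shape of a generated routine: the tag is passed as an extra argument.
inline std::string describe_filters(const Material& in) {
    return file_filter(in.texture_path, tag::texture()) + " " +
           file_filter(in.mesh_path, tag::mesh()) + " " +
           file_filter(in.name);
}

// Shape of a generated setter: the callback runs right after the field changes.
inline void set_texture_path(Material& dst, const std::string& value) {
    dst.texture_path = value;
    Material::on_changed(dst);
}
```

All three fields have the same C++ type, yet each one ends up in a different overload, and the non-reflected revision member stays in sync through the callback.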
Dependencies and minimal rebuild
The CMake macro target_parse_sources() adds a custom CMake command on each source file, and when a source file is modified the parser is run only on that file - that way there is no explosion in build times. The parser is also very light, and its result will most likely be included by the changed file anyway, so rebuilds stay minimal.
When the parser itself is modified, all generated code in ${CMAKE_BINARY_DIR}/gen is deleted, since the parsing rules and codegen have most likely changed.
Current limitations
- Types in namespaces not supported (but should be easy to extend).
- Nested types not supported (maybe easy).
- Inheritance not supported (but should be easy with class attributes).
- Templated types - haven’t thought about it but a solution must exist.
- FIELD has to be used when declaring class fields - otherwise they will be skipped. This greatly simplifies the parser, but I’d be happy if the parser improved and all fields got parsed by default, with fields skipped only when explicitly annotated to be skipped.
- Currently all generated .inl files go into the gen folder of the CMake build folder - that means that if 2 files in any of the projects (CMake targets) are named identically, there would be a clash. This can be alleviated if the CMake script is improved a bit.
- Cannot declare multiple fields on the same line: FIELD int x, y, z; - but I don’t like that style anyway.
- Parser is not multi-line comment aware - it will parse types in /*...*/ and emit code for them.
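The last limitation could be addressed with a pre-pass that blanks out block comments before the parser looks for FIELD. The actual parser is a Python script, so the following is only a sketch of the idea, written in C++ to stay in the language of the other examples. strip_block_comments() is a hypothetical helper; it assumes that "/*" never appears inside a string literal or a // comment.

```cpp
#include <cstddef>
#include <string>

// Removes every /* ... */ span from the source text. Newlines inside a comment
// are kept so that line numbers of the remaining code do not shift.
// An unterminated comment swallows the rest of the input.
inline std::string strip_block_comments(const std::string& src) {
    std::string out;
    out.reserve(src.size());
    std::size_t i = 0;
    while (i < src.size()) {
        if (src.compare(i, 2, "/*") == 0) {
            const std::size_t end = src.find("*/", i + 2);
            const std::size_t stop = (end == std::string::npos) ? src.size() : end + 2;
            for (std::size_t j = i; j < stop; ++j) {
                if (src[j] == '\n') { out += '\n'; }
            }
            i = stop;
        } else {
            out += src[i++];
        }
    }
    return out;
}
```

After such a pass, a FIELD that only appears inside a commented-out block would no longer be seen by the parser.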
How I use this
In my (cleverly titled) game project I use reflection extensively. So far I generate only serialization/deserialization and GUI binding routines (3 functions per type) on top of which I’ve built the following:
- reloadable .dll plugins (with the help of dynamix) - even with the ability to change the layout of types at runtime! (by serializing object state, recreating the instances and then deserializing the old state)
- generic undo/redo system
I might expand on that in a future post.
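The layout-change trick can be sketched in a few lines. In a real plugin reload the type keeps its name and only its definition differs between the old and the new .dll; here the two layouts are given different made-up names (EnemyV1, EnemyV2) so that the example fits in a single file, and the routines are hand-written in the style of the generated ones.

```cpp
#include <any>
#include <map>
#include <string>

using Blob = std::map<std::string, std::any>;

struct EnemyV1 { int health = 100; int ammo = 5; };       // layout before the reload
struct EnemyV2 { int health = 100; float speed = 1.5f; }; // after: ammo removed, speed added

inline std::any serialize(const EnemyV1& src) {
    Blob out;
    out["health"] = src.health;
    out["ammo"] = src.ammo;
    return out;
}

// Fields are looked up by name: removed fields are ignored and
// new fields keep their default values.
inline void deserialize(const std::any& src, EnemyV2& dst) {
    const auto& data = std::any_cast<const Blob&>(src);
    if (data.count("health")) { dst.health = std::any_cast<int>(data.at("health")); }
    if (data.count("speed"))  { dst.speed = std::any_cast<float>(data.at("speed")); }
}

// Serialize the old state, recreate the instance, deserialize into the new layout.
inline EnemyV2 reload(const EnemyV1& old_instance) {
    const std::any state = serialize(old_instance);
    EnemyV2 fresh;
    deserialize(state, fresh);
    return fresh;
}
```

The data.count() checks in the generated deserialization code are what make this work: a key that is missing from the saved state is simply skipped.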
I also like how composable everything is with the serialization/deserialization routines. In a few headers I provide overloads for the primitive types and for a few containers like std::vector<> (and maybe types from third party libraries), and from then on the generated routines for all my types call overloads for each of their fields. It composes quite nicely - I was originally exposed to the idea of such composability in this talk about hash_append!
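A container overload of that kind might look like the sketch below. It is an assumption about the shape of such overloads, not the code from my headers: elements are stored as a std::vector<std::any>, and each element is handled by whichever serialize()/deserialize() overload matches its type.

```cpp
#include <any>
#include <string>
#include <utility>
#include <vector>

// Leaf overloads, declared before the templates so that unqualified lookup finds them.
inline std::any serialize(int src)                { return src; }
inline std::any serialize(const std::string& src) { return src; }
inline void deserialize(const std::any& src, int& dst)         { dst = std::any_cast<int>(src); }
inline void deserialize(const std::any& src, std::string& dst) { dst = std::any_cast<std::string>(src); }

// Forward declarations so that vectors of vectors resolve to the templates too.
template <typename T> std::any serialize(const std::vector<T>& src);
template <typename T> void deserialize(const std::any& src, std::vector<T>& dst);

template <typename T>
std::any serialize(const std::vector<T>& src) {
    std::vector<std::any> out;
    out.reserve(src.size());
    for (const T& elem : src) { out.push_back(serialize(elem)); }
    return out;
}

template <typename T>
void deserialize(const std::any& src, std::vector<T>& dst) {
    const auto& data = std::any_cast<const std::vector<std::any>&>(src);
    dst.clear();
    dst.reserve(data.size());
    for (const std::any& elem : data) {
        T value{};
        deserialize(elem, value); // recurses into the overload for T
        dst.push_back(std::move(value));
    }
}
```

With these in place a field of type std::vector<std::vector<int>> needs no extra code: the outer call dispatches to the template, which dispatches to itself for the inner vectors and finally to the int overload.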
Thanks for reading :)