Generics in C

As part of my recent tour of languages, I have started using C again. In particular, I've been using C11 and exploring its new additions. One such addition is generics (the _Generic keyword). I decided to use generics to see how far I could push the feature and find its limitations. To do this, I chose to implement some data structures, in my case a port of Zig's slices. While there was some benefit, there were also quite a few headaches and limitations.
What are generics?
Let's establish an (oversimplified) understanding of what generics are in C, and how they differ from other languages. Generics in C are essentially a compile-time switch statement. This switch statement switches on the type of its first parameter, rather than its value. The body of the matching case is then substituted into the code at compile time. This allows for different code depending on the type. Here's an example:
#include <stdio.h>

int main(void) {
    int a = 1;
    float b = 2.0f;
    // Prints "INT"
    printf(_Generic(a, int: "INT\n", float: "FLOAT\n"));
    // Prints "FLOAT"
    printf(_Generic(b, int: "INT\n", float: "FLOAT\n"));
}
Generics do have a default case as well, similar to switch statements. I haven't found much use for it personally. For completeness, here's an example with the default case.
#include <stdio.h>

int main(void) {
    double c = 1.0;
    // Prints "OTHER" because double matches neither int nor float
    printf(_Generic(c,
        int: "INT\n",
        float: "FLOAT\n",
        default: "OTHER\n"));
}
One thing you may have noticed in the examples is that generics aren't all that useful by themselves. Where generics become useful is when they're combined with the preprocessor to create "generic macros". One of the most straightforward uses is defining macros that mimic function overloading. This allows you to group a bunch of type-specialized functions (e.g. sin, sinl, sinf) under one name (e.g. sin). Below is an example:
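#include <math.h>
#include <stdio.h>

// Inside its own expansion, sin is not expanded again, so the
// double case names the real function.
#define sin(X) _Generic((X), \
    long double: sinl, \
    double: sin, \
    float: sinf \
)(X)

int main(void) {
    // The argument must be a floating-point type; an int matches no case
    printf("sin(3.0) = %f\n", sin(3.0));
}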
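The preprocessor has already finished by the time a generic is evaluated, so a generic cannot pick which macro to expand. That ordering is also what allows this use case to work: a macro name is not expanded again inside its own expansion, so the sin in the double case refers to the real function rather than the macro. In effect, generics are a step that happens after preprocessing, at the very start of compilation proper.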
Tangent on Macros
Macros make generics useful, and without macros C generics would be practically pointless. However, this reliance on macros ends up creating some sticking points.
Macros have their own set of problems. Readability and parsability both get hurt substantially by macros. Macros can break many tools, such as automatic binding generators and static analysis tools. Macros also cause issues when trying to debug or modify code. There are enough issues with macros that NASA's "Power of Ten" rules (their guidelines for writing reliable software) expressly restrict the use of the preprocessor to "the inclusion of header files and simple macro definitions."
Generic macros make the situation worse. Since generics aren't really macros, they aren't handled by the preprocessor logic in existing tools. But they also aren't ordinary C code, since they're compile-time conditionals. This adds a new pre-compilation stage that tooling needs to adapt to, and not many tools have adapted yet.
The tooling drawback is felt very quickly, especially since in modern C (C11 and C23) the standards committee is pushing for more generic macros that bring function overloading to commonly used functions whose arguments would otherwise get cast unintentionally, such as the math library. This in turn means that IDE IntelliSense now (correctly) picks up the macro and not the underlying function. However, the new behavior means that the popup help and "go to definition" no longer work properly. While the functions themselves have proper documentation (parameters, return types, etc.), the IDE no longer detects the function. It detects the macro wrapper, so it pulls the documentation from the macro. Using "go to definition" doesn't help either, since that takes me to the macro. And since generic support isn't there yet, the IDE sometimes cannot tell that a function is referenced by the generic, so "go to definition" won't take me any further. This was pretty frustrating, and I'm really questioning whether using C99 would be a better overall experience.
Making the Slices
Macro tangent over. So, now it's time to create a generic data structure in C with _Generic, right? Well, no.
Generics cannot return definitions or declarations - including structs, typedefs, and functions. Generics also cannot be used as a struct body. This limits them to only working on existing types and functions, rather than building new ones. Any "use a macro" workarounds don't work here either because generics resolve after macros, so we cannot conditionally select between macros.
Long story short, we are not able to pass an int to a _Generic and get an int slice as output. Instead, we have to write or generate our different slice types, and then create generic macro wrappers for "function overloading."
With our plan readjusted, let's make the base slices. I ended up with something like the following:
#include <assert.h>
#include <stddef.h>

// _Nonnull is a Clang nullability annotation
#define DECLARE_SLICE_SPECIALIZATION(TYPE, NAME) \
    typedef struct NAME { \
        TYPE* _Nonnull start; \
        size_t len; \
    } NAME; \
    NAME NAME ## _make(TYPE* _Nonnull start, size_t len); \
    TYPE* _Nonnull NAME ## _ptr_at(NAME s, size_t index);

// The last line of a macro must not end in a backslash
#define DEFINE_SLICE_SPECIALIZATION(TYPE, NAME) \
    inline NAME NAME ## _make(TYPE* start, size_t len) { \
        assert(start); \
        NAME res; \
        res.start = start; \
        res.len = len; \
        return res; \
    } \
    \
    inline TYPE* NAME ## _ptr_at(NAME s, size_t index) { \
        assert(s.start); \
        assert(index < s.len); \
        return s.start + index; \
    }
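To see the idea in use, here is a minimal, self-contained sketch. It makes a few assumptions that are not in my original code: a single combined macro (DEFINE_SLICE, a name made up for this example) instead of the declare/define pair, no Clang-specific _Nonnull annotation so it compiles elsewhere, and a small i32s_sum helper to exercise the accessor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical combined macro: declares the slice struct and defines
   its two helpers in one go. */
#define DEFINE_SLICE(TYPE, NAME)                                   \
    typedef struct NAME {                                          \
        TYPE *start;                                               \
        size_t len;                                                \
    } NAME;                                                        \
    static inline NAME NAME##_make(TYPE *start, size_t len) {      \
        assert(start);                                             \
        NAME res = { start, len };                                 \
        return res;                                                \
    }                                                              \
    static inline TYPE *NAME##_ptr_at(NAME s, size_t index) {      \
        assert(s.start);                                           \
        assert(index < s.len);                                     \
        return s.start + index;                                    \
    }

DEFINE_SLICE(int32_t, i32s)

/* Sums a slice by walking it through the bounds-checked accessor. */
int32_t i32s_sum(i32s s) {
    int32_t total = 0;
    for (size_t i = 0; i < s.len; i++) {
        total += *i32s_ptr_at(s, i);
    }
    return total;
}
```

Calling i32s_make(data, 4) on a four-element array gives a slice whose elements can be read or written through i32s_ptr_at, and an out-of-range index trips the assert instead of reading past the end.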
I then used my macros to define slices for int, int64_t, double, float, char, etc. I created generic wrappers around my slice methods (e.g. make and ptr_at). I had wrappers for const and non-const slices (non-const slices had mut_ as the prefix, while const ones didn't have a prefix). My wrappers looked like the following:
#define slice_from(X, LEN) _Generic( \
    (X), \
    char*: chars_make, \
    int64_t*: i64s_make, \
    int32_t*: i32s_make, \
    float*: f32s_make, \
    double*: f64s_make \
)(X, LEN)

#define mut_slice_from(X, LEN) _Generic( \
    (X), \
    char*: mut_chars_make, \
    int64_t*: mut_i64s_make, \
    int32_t*: mut_i32s_make, \
    float*: mut_f32s_make, \
    double*: mut_f64s_make \
)(X, LEN)
#define ptr_at(Collection, Index) _Generic( \
(Collection), \
f64s: f64s_ptr_at, \
mut_f64s: mut_f64s_ptr_at, \
f32s: f32s_ptr_at, \
mut_f32s: mut_f32s_ptr_at, \
chars: chars_ptr_at, \
mut_chars: mut_chars_ptr_at, \
i64s: i64s_ptr_at, \
mut_i64s: mut_i64s_ptr_at, \
i32s: i32s_ptr_at, \
mut_i32s: mut_i32s_ptr_at \
)((Collection), (Index))
This highlights another quirk of generics: they ignore the top-level qualifiers (const, restrict, volatile) on the expression being switched on, so a const int and an int select the same case. Qualifiers on a pointed-to type are a different matter, since char* and const char* are distinct pointer types, but I didn't rely on that to switch automatically between a mutable slice for a char* and an immutable slice for a const char*. Instead, I defined two different "slice from" macros and switch between them manually.
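A caveat on the qualifier rule, as a sketch you can check against your own compiler: as far as I understand the standard, the qualifiers that get dropped are the top-level ones on the controlling expression, so a const int behaves like an int. A qualifier on the pointed-to type, as in const char*, is part of the pointer type and should still select its own case. The macro and function names below are made up for this example, and compilers that predate the relevant standard clarification may behave differently.

```c
/* Top-level qualifiers on the controlling expression are dropped,
   so a const int selects the int case. */
#define IS_PLAIN_INT(X) _Generic((X), int: 1, default: 0)

/* Pointee qualifiers are part of the pointer type, so these two
   cases are distinct. */
#define POINTEE_KIND(P) _Generic((P), \
    char *: "mutable",                \
    const char *: "const")

const char *kind_of_mutable(void) {
    char buf[4] = "abc";
    char *p = buf;
    return POINTEE_KIND(p);
}

const char *kind_of_const(void) {
    const char *p = "abc";
    return POINTEE_KIND(p);
}

int const_int_matches_int(void) {
    const int x = 5;
    return IS_PLAIN_INT(x);
}
```

The two-macro approach described above works regardless of how a given compiler handles this.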
Making it extensible
One of the greatest benefits of generics and templates in other languages is they allow developers to delay type binding and dispatch until the generic code is actually used. The delay allows for generic collections (e.g. maps, vectors, lists) where the elements stored aren't known to the generic code. This helps separate "library code" from "application code." Library developers can build whatever data structure they want, and application developers can tell those same structures what to store.
C generics are different since you must outline every type case up front. This means you can't define a generic in a library and let the application developer add new cases. Or, at least, you can't do that without entering the weird world of variadic macros.
I'm not going to go into too much detail on variadic macros. The essential idea is that when you add "..." to the end of your macro parameter list, anything the developer types after the named parameters gets put into a special list (which can include commas and even text that isn't valid syntax on its own). That list can then be accessed with __VA_ARGS__ from inside the macro. If you type just __VA_ARGS__, it'll paste that list verbatim.
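As a small illustration of that verbatim pasting, here is a sketch with two made-up macros (COUNT_INTS and FIRST_PLUS_COUNT are not from my slice code). The first pastes its whole argument list into an array initializer, and the second shows that named parameters are split off before the rest lands in __VA_ARGS__.

```c
#include <stddef.h>

/* __VA_ARGS__ is pasted verbatim into the compound literal, so
   COUNT_INTS(4, 5, 6) becomes sizeof((int[]){4, 5, 6}) / sizeof(int). */
#define COUNT_INTS(...) (sizeof((int[]){__VA_ARGS__}) / sizeof(int))

/* Named parameters come first; everything after them lands in
   __VA_ARGS__. */
#define FIRST_PLUS_COUNT(FIRST, ...) \
    ((FIRST) + (int)COUNT_INTS(__VA_ARGS__))

size_t count_three(void) {
    return COUNT_INTS(4, 5, 6);        /* three values in the list */
}

int first_plus_rest(void) {
    return FIRST_PLUS_COUNT(10, 7, 8); /* 10 plus a list of two values */
}
```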
With this much variadic macro knowledge, we can allow developers to add their own type cases to our existing _Generic macros. Since developers probably don't want to type every new type case every time, we'll assume they make their own macro wrapping our macro. That way, they can create their own slice types and use them like our slice types from within their macro. Here's an example for the ptr_at wrapper.
// Our library's macro
// There is no comma after __VA_ARGS__: the caller's trailing comma supplies it
#define ptr_at(Collection, Index, ...) _Generic( \
    (Collection), \
    __VA_ARGS__ \
    f64s: f64s_ptr_at, \
    mut_f64s: mut_f64s_ptr_at, \
    f32s: f32s_ptr_at, \
    mut_f32s: mut_f32s_ptr_at, \
    chars: chars_ptr_at, \
    mut_chars: mut_chars_ptr_at, \
    i64s: i64s_ptr_at, \
    mut_i64s: mut_i64s_ptr_at, \
    i32s: i32s_ptr_at, \
    mut_i32s: mut_i32s_ptr_at \
)((Collection), (Index))
// ...
// Someone's application code
// The trailing comma is needed
#define my_ptr_at(Collection, Index) \
    ptr_at(Collection, Index, i16s: i16s_ptr_at, )
// ...
i16s s = i16s_make(...);
// Will use our added type case
my_ptr_at(s, 2);
Wrapping variadic macros with more macros will almost certainly lead to a nested macro mess. I don't really like this solution, and I'm not recommending it. But this is what I could find as a workaround.
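For a version of the pattern that compiles on its own, here is a stripped-down sketch. All the names (ints, doubles, shorts, len_of, my_len_of) are hypothetical stand-ins for the slice types above, and it assumes the library macro leaves out the comma after __VA_ARGS__ so that the caller's trailing comma supplies it.

```c
#include <stddef.h>

/* ---- "Library" side (all names are hypothetical) ---- */
typedef struct { const int *start; size_t len; } ints;
typedef struct { const double *start; size_t len; } doubles;

size_t ints_len(ints s) { return s.len; }
size_t doubles_len(doubles s) { return s.len; }

/* Extra cases are pasted in front of the built-in ones. There is no
   comma after __VA_ARGS__, so callers supply the trailing comma. */
#define len_of(Collection, ...) _Generic( \
    (Collection),                         \
    __VA_ARGS__                           \
    ints: ints_len,                       \
    doubles: doubles_len                  \
)(Collection)

/* ---- "Application" side ---- */
typedef struct { const short *start; size_t len; } shorts;

size_t shorts_len(shorts s) { return s.len; }

/* Wrapper that adds the application's own case. */
#define my_len_of(Collection) len_of(Collection, shorts: shorts_len, )

size_t demo_app_case(void) {
    static const short data[] = {1, 2, 3};
    shorts s = { data, 3 };
    return my_len_of(s);   /* dispatches to shorts_len */
}

size_t demo_builtin_case(void) {
    static const int data[] = {1, 2};
    ints s = { data, 2 };
    return my_len_of(s);   /* the built-in case still works */
}
```

Application code has to call my_len_of everywhere, since the library's own len_of knows nothing about the added case.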
Debugging
Error messages around generics are awful. Every error goes to the line with the _Generic keyword. Errors only say something went wrong with the _Generic and don't say which case was being used. Other details from error messages also aren't clear. For instance, I'd get "invalid syntax" or "invalid match" when I had both const and non-const pointer types. Since the qualifiers are ignored, I ended up getting a type collision which triggered the other warnings. What would have been better is a "qualifier ignored" or "duplicate case detected, <case 1>; <case 2>".
I've also had errors where an unused case wouldn't make a valid statement after substitution. It confused me a lot, since I was expecting it to behave like a macro, where only what is substituted in the end gets checked. I didn't expect unused cases to cause issues, but every case has to be valid code whether or not it is selected. I ran into this the most when I copied and pasted large generics and didn't update a line. The error message I got was "invalid substitution", again without any indication of which case was invalid. I ended up spending way too much time debugging the used cases instead of looking at the unused ones.
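Here is a sketch of that trap with two made-up struct types (text and numbers are not from my slice code). The commented-out macro is the form I would expect a compiler to reject, because the unselected case still has to type-check. The version below it selects a function first and calls it afterwards, which keeps every case valid.

```c
#include <stddef.h>

typedef struct { const char *start; size_t len; } text;     /* hypothetical */
typedef struct { const int *items; size_t count; } numbers; /* hypothetical */

/* Rejected even when X is a text: the numbers case still has to be
   a valid expression, and text has no member named count.
#define size_of(X) _Generic((X), text: (X).len, numbers: (X).count)
*/

size_t text_size(text t) { return t.len; }
size_t numbers_size(numbers n) { return n.count; }

/* Selecting a function and calling it afterwards keeps every case
   valid regardless of the type of X. */
#define size_of(X) _Generic((X), \
    text: text_size,             \
    numbers: numbers_size        \
)(X)

size_t demo_text(void) {
    text t = { "hello", 5 };
    return size_of(t);
}

size_t demo_numbers(void) {
    static const int values[] = {1, 2, 3};
    numbers n = { values, 3 };
    return size_of(n);
}
```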
Are C generics useful?
C generics are sort of useful. There are some "magic macros" I've used from other libraries which ended up using generics under the hood, and they "just worked." In all of these cases, generics were either acting as a function overload, or they were selecting a literal value to go with a type (like a printf format string or the correct numeric suffix like f or ULL).
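As an example of the "select a literal" use, here is a minimal sketch of a format-string picker. FMT and FORMAT_INTO are hypothetical names, not macros from any particular library, and the list of types is only a small sample.

```c
#include <stdio.h>

/* Hypothetical helper: picks a printf conversion from the type of
   the argument. */
#define FMT(X) _Generic((X),    \
    int: "%d",                  \
    unsigned long long: "%llu", \
    double: "%.2f",             \
    char *: "%s",               \
    const char *: "%s")

/* Formats one value into BUF using the selected conversion. */
#define FORMAT_INTO(BUF, SIZE, X) snprintf((BUF), (SIZE), FMT(X), (X))
```

With a 32-byte buffer, FORMAT_INTO(buf, sizeof buf, 42) writes "42" and FORMAT_INTO(buf, sizeof buf, 1.5) writes "1.50", without the caller spelling out a format string.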
I wouldn't try to use C generics the way you use generics or templates in any other language. C generics aren't made for data structures. They're meant for mimicking function overloads without actually having function overloads. So limit your usage to "function overload"-like use cases.