Snapshot Tests in C
Lately I’ve been working on a fairly big C library (it’s still early days, and I’ll write more about it over time). As part of this library, I’ve been creating my own test framework.
Most of my tests are typical unit tests - you have some setup, run your code, and then do a series of assertions with hard-coded values. Nothing too surprising, and the test framework for that isn’t too difficult to make.
For most of my tests, that style is ideal. It’s very focused (both in writing the tests and in what is being tested), and it’s not too hard to set up. Also, since I’m writing the examples, it works really well when I know what the output should be (which is most of the time).
However, it does have a lot of drawbacks. I’m not going to cover all of the drawbacks, or how I’m trying to address each one. Rather, I’m going to focus on one singular drawback: I need to know the answer/behavior ahead of time.
Again, for most use cases, this is fairly trivial. Memory allocators have a very known behavior (allocates memory sufficiently large or returns an error, memory doesn’t overlap, memory isn’t leaked, etc.), so for my memory allocators it’s pretty easy to write some typical unit tests. Similar thing for most of the other code I write. I know what behavior I want it to have.
Where it’s not so true is when it comes to low-level math functions - like sin, cos, tan, etc. I don’t know what the answer should be for every number ahead of time. I’m not someone who regularly calculates sin by hand or has a book full of tan lookup tables sitting on my shelf.
This means that I need to look up or calculate the correct values. Fortunately, the standard library provides such a tool, so I wrote something like the following:
```c
for (int i = 0; i < numInputs; ++i) {
    double expected = sin(inputs[i]);  /* reference value from the platform's libm */
    double actual = mySin(inputs[i]);
    if (!approx_equal(expected, actual, 0.0000001)) {
        fail();
    }
}
pass();
```
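The loop relies on an approximate-equality helper that the post doesn’t show. A minimal absolute-tolerance version (the name `approx_equal` and its NaN handling are my assumptions) might look like:

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: fixed-distance comparison matching the
   tolerance argument used in the test loop above. */
static bool approx_equal(double expected, double actual, double tolerance) {
    /* NaN never compares equal; fail fast so bad values don't slip through. */
    if (isnan(expected) || isnan(actual)) {
        return false;
    }
    return fabs(expected - actual) <= tolerance;
}
```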
This works great - so long as you stick with a single standard library implementation. It turns out there are different ways of implementing sin. A lot of the open-source standard libraries I’ve gone through appear to be based on the Sun Microsystems code in FreeBSD - which, to be fair, is what I’m basing most of my math code on too. This means they’re closely compatible in precision, error tolerance, and bias (sin implementations are approximations, not exact calculations, so a lot can change with the trade-offs each implementation makes).
However, not everything is open source, and Microsoft’s MSVC standard library notably is not. So much so that when I used FreeBSD’s approximation, my above test failed. [1] Which is, honestly, what I expected. The reason I wanted my own sin was simply that I want my code to behave the same way on every platform (or at least as close as I can get). Having a math library that executes the same code on every platform is a huge step towards that determinism.
Fortunately, my laziness got the better of me: I didn’t want to manually generate a table of expected and actual values and try to keep it updated. I have that pattern in a few places in my code, and it’s a nightmare to debug and maintain.
So, for a while I just wrapped whatever was in the standard library, so that my simple test would pass on both open-source and closed-source implementations. In the meantime, I was thinking about how to write a test that better captured the FreeBSD-based version of sin I wanted to have.
One day I read about snapshot tests and was reminded of Jane Street’s blog post on their snapshot testing.
I had known about snapshot tests when I first wrote my library, but I had only used them in the context of React components, and React snapshot tests suck. I’m not going to go into too much detail in this post, but here are my main gripes with React snapshot tests:
- They check the exact HTML output, not the visual output and not the semantic output. This makes tests flaky in the worst way - they fail for changes you don’t care about and pass for changes you do care about.
- The snapshots are incredibly verbose, so when things do change the deltas can be massive - which means no one looks at them.
- Snapshots aren’t stored in the code, but rather in a separate directory. So now there are multiple files you need to look at to understand what a test is doing.
Because of the issues I’ve had with React snapshots, I had initially written off snapshot testing entirely. When I first read Jane Street’s post, I couldn’t comprehend how the mess of snapshot testing could be enjoyable.
And then, after pondering my own predicament, I realized it was precisely what I wanted.
I already had a way to generate correct cases - simply test against GCC/Clang/Musl. All I needed was a way to capture that output and preserve it in the code. And snapshot tests let me do that.
So, I quickly wrote my own snapshot testing macro, based on Jane Street’s methodology of embedding the generated snapshots directly in the source code rather than in separate files. For ease of use with how many cases I was going to generate, I also gave it the ability to update the source files directly, though I normally keep that off by leaving a “return false” at the start. That way the test cases are only updated when I want them to be, not all the time. [2]
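I’m not showing my actual macro here, but the runtime half of such a scheme can be sketched roughly as follows. Everything below is hypothetical - the names, the assumption that the snapshot is the second argument, and the “queue an update” behavior are all mine, not the real implementation:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* The "return false" guard: flip this only when regenerating snapshots. */
static bool snapshot_update_enabled(void) {
    return false;
}

/* Compare an actual value (already formatted as a string) against the
   snapshot embedded in the test source. In update mode, a real version
   would record (file, line, actual) for a source-rewriting pass. */
static bool check_snapshot(const char *actual, const char *snapshot,
                           const char *file, int line) {
    if (strcmp(actual, snapshot) == 0) {
        return true;
    }
    if (snapshot_update_enabled()) {
        printf("update %s:%d -> \"%s\"\n", file, line, actual);
        return true;
    }
    return false;
}

#define CHECK_SNAPSHOT(actual, snapshot) \
    check_snapshot((actual), (snapshot), __FILE__, __LINE__)
```

The file/line pair is what would let an updater locate the snapshot argument to rewrite in place.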
So far it’s been really great! I have a lot of snapshot tests in my math code right now. This allows me to separate my math tests from the standard library implementation, which is needed for me to standardize the math functions I use across platforms.
That said, there are some limitations in my system. For one, my parser is really simple - it basically looks for the phrase “CHECK_SNAPSHOT” followed by an open parenthesis and then a comma (while ignoring commas inside quotes) in the C file. If you’ve ever used C macros, you’ll know this isn’t enough to handle complex use cases. Something simple like the following would break my snapshot test code:
```c
#define COMMA ,

TEST(my_test) {
    CHECK_SNAPSHOT("hello" COMMA "bye");
}
```
Also, it only looks for the first occurrence on a line - it doesn’t try to understand if it’s the “right” occurrence. So each snapshot check must be on a separate line.
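The scan described above can be sketched like this (a hypothetical helper - the real parser isn’t shown here):

```c
#include <stddef.h>
#include <string.h>

/* Find the first "CHECK_SNAPSHOT" on a line, step past its '(', and
   return a pointer to the first comma that is not inside a string
   literal. Returns NULL if the pattern isn't found. */
static const char *find_snapshot_comma(const char *line) {
    const char *p = strstr(line, "CHECK_SNAPSHOT");
    if (!p) return NULL;
    p = strchr(p, '(');
    if (!p) return NULL;
    int in_quotes = 0;
    for (++p; *p; ++p) {
        if (*p == '"' && p[-1] != '\\') {
            in_quotes = !in_quotes;  /* toggle on unescaped quotes */
        } else if (*p == ',' && !in_quotes) {
            return p;
        }
    }
    return NULL;
}
```

As the breaking example shows, a scan like this sees only the literal source text - it has no idea that COMMA expands to a comma after preprocessing.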
Of course, getting that level of robustness takes a lot of work. I’d need a preprocessor, a C compiler front-end, etc. just to identify what code to modify - a lot of effort for something that would only provide marginal benefit.
The bigger issue is that my snapshot testing requires values to be strings - which is traditionally a little annoying in C, since memory is manually managed. Combine this with my limited parser, and it’s a lot harder for someone to macro their way to “elegant” test code. [3]
My solution is just to have a fixed buffer that I reuse and to have macros define how to write to that buffer.
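A minimal sketch of that fixed-buffer approach, with hypothetical formatting macros (these aren’t my actual ones):

```c
#include <stdio.h>

/* One reusable buffer; each formatting macro writes into it and
   evaluates to the buffer, so the result can be passed straight to a
   snapshot check. Reuse means a value must be consumed before the
   next macro call overwrites it. */
static char snapshot_buf[256];

#define SNAPSHOT_DOUBLE(v) \
    (snprintf(snapshot_buf, sizeof snapshot_buf, "%.17g", (double)(v)), snapshot_buf)

#define SNAPSHOT_INT(v) \
    (snprintf(snapshot_buf, sizeof snapshot_buf, "%d", (int)(v)), snapshot_buf)
```

A check might then read CHECK_SNAPSHOT(SNAPSHOT_DOUBLE(mySin(x)), "..."). The %.17g format is enough digits to round-trip a double exactly.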
Overall, I’m really pleased with my snapshot test solution.
1. The numbers are fairly similar, but they’re within a percentage of each other rather than a fixed distance, and my approximate-equality method only handled a fixed distance. I didn’t feel like figuring out what the tolerance value should be, so I just moved on to other options - especially since, at the end of the day, I want to match GCC/Clang/Musl, since those are closer to the FreeBSD library anyway.
2. At some point, this will become a CLI/environment-variable option. However, I get the most bang for my effort by simply using a return statement for now and refactoring later, once my usage and needs get more complex and I know what features I really need.
3. I do have a little trick where it will match a phrase containing “CHECK_SNAPSHOT” rather than doing an exact match - but that may still be too limiting for some use cases. Also, I haven’t actually tested how it works with extended phrases, so it may not actually work.

