Adding Assertions to our Database
Our code has unit tests. Our database should too.
Previously I wrote about how I setup dependencies for my migration tool. This time, we’re going to take things a step further and add in assertions - or unit tests - to our database migrations. But first, we need to figure out how to even do tests.
Choosing a Dialect
To make things easier, we’re going to start with a single dialect to optimize for. We can then pull out the part that can be generalized by the tool, and let developers figure out their specific dialects (or write tutorials for them ourselves).
While we could use something very easy to setup like SQLite, it is unfortunately very limited without creating our own extensions. So, we’re going to just focus on the more “feature-filled” servers out there.
Also, I’m doing this in my spare time, so I’d rather not be paying premium licensing for a database just to test out an idea. Because of that, I’m going to rule out any paid license servers (even if they have a trial version or free tier). Goodbye OracleDB, Microsoft SQL Server, and any proprietary cloud database.
That leaves us with open source databases, like MySQL, MariaDB, and PostgreSQL. I’m going to pick PostgreSQL simply because I like it, and it’s the database I’ve done the most advanced queries in so I know the syntax the best. That said, I won’t be using extensions to do any sort of assertions - I want to make sure the solution is portable.
As for the Postgres version, I’m going to use 18 - but things should work on an older version (so long as it’s not too old). So with that, let’s start figuring out how we want a test to work!
Properties of Tests
Good tests are reliable and say something about whether the system works as intended or not. Better tests also find bugs, but we got to start somewhere before we get that far.
Reliability from tests come from minimizing dependencies. We should only rely on the migration scripts and the database engine - nothing else. We don’t want to rely on data being present (or not present). And we don’t want to rely on things that change constantly - like time - if we can help it.
However, there is one more key aspect that we don’t have with code. Code doesn’t have “data” in it. Data is provided as input and comes as output. This makes it really easy to prevent code under test to accidentally corrupt user data since we just don’t hook it up to the production database when we run tests.
Testing a database is not like that. If we’re trying to test the production database after a migration to make sure things are working, then we are testing a service that stores our customer data! This is very risky since we don’t want our tests to corrupt or change the data in the database in anyway.
And yet, most tests of any interest require interacting with data in some way. We can’t test that a view returns the correct aggregated data without there being data to aggregate. We can’t test that a procedure correctly modifies data without there being data to modify. So, how do we ensure our tests don’t mess with user data?
Fortunately, we’re working with Postgres, and Postgres has transactions. Transactions are an “all-or-nothing” in that the transaction is either committed and all the changes are made permanent, or the transaction is rolled back and none of the changes take place. That rollback is what we’re after - if we run our tests in a transaction that is always rolled back, then our tests cannot create any lasting changes to the database. If there are no lasting changes, then the user data is safe.
The rollback also gives us another really nice property. Since none of our tests can make lasting changes, that means they’re isolated. And that isolation will help us maintain reliability.
There’s one more little detail. I don’t want the start and end of the transaction to happen in the SQL script. Repeating that code in every test is error prone - especially when we’re doing a pattern that isn’t common (aka. writing SQL tests for our database). So, the actual migration tool will start and rollback the transaction. This means our SQL can just focus on a test.
Syntax for an SQL Test
There are a few things we’ll need for tests: storage for local variables that we can reuse across statements (such as ids, results, etc.), branches so we can adjust behavior, and a way to signal a test failure. Fortunately, Postgres provides all of these inside a DO block with the sections DECLARE (for local variables), and BEGIN and END for the code. Inside this block we can use IF … THEN (with the most common variant we’ll use being IF EXISTS (SELECT …) THEN. This lets us test any assertion we can put inside an SQL statement, including calls to functions, queries from tables and views, and even operations on local variables! We can then use RAISE EXCEPTION to throw an error if the assertion fails. If we ever need to invert the condition, we just use IF NOT EXISTS and we have an inversion. Putting this together, we get something like the following:
DO $$
DECLARE
userid UUID;
todoid UUID;
BEGIN
-- Prepare our test data
SELECT uuidv7() INTO userid;
INSERT INTO users (id, "name", age) VALUES (userid, 'bob', 44);
INSERT INTO todo_lists (user_id, item, completed) VALUES (userid, 'Buy eggs', false)
RETURNING id INTO todoid;
-- Now check our views work
IF NOT EXISTS (SELECT 1 FROM unfinished_lists ul WHERE ul.user_id = userid) THEN
RAISE EXCEPTION 'View unfinished_lists is not showing newly inserted items';
END IF;
IF EXISTS (SELECT 1 FROM finished_lists fl WHERE fl.user_id = userid) THEN
RAISE EXCEPTION 'View finished_lists is incorrectly showing unfinished items';
END IF;
-- Now mutate our data for testing the inverse scenario
UPDATE todo_lists SET completed = true WHERE id = todoid;
-- Check our views with the inverse data
IF EXISTS (SELECT 1 FROM unfinished_lists fl WHERE fl.user_id = userid) THEN
RAISE EXCEPTION 'View unfinished_lists is incorrectly showing finished items';
END IF;
IF NOT EXISTS (SELECT 1 FROM finished_lists ul WHERE ul.user_id = userid) THEN
RAISE EXCEPTION 'View finished_lists is not showing newly finished items';
END IF;
-- End of test - cleanup is automatic thanks to the rollback
END;
$$;Here we insert some data, test some views, update the data, and retest the views. This ensures that our views will be correctly filtering on the completed field! While it may not seem that impressive on its own, we can start combining these tests to layer on checking and asserting correct behavior across all the different filters of our views, stored procedures, custom functions, and more!
Incorporating Tests into our Migrations
So, we have our tests in SQL - but couldn’t we have written them as part of our integration tests? Well, yes, we could have. However, integration tests require a lot more than just migrating the database. They also require at least a change to the application code, and often require a deployment (or even a partial deployment) to have the tests run against the environment. This means that, in order to test our database changes, we have to change the code.
To further complicate the issue, database changes are usually made separately from code changes. They are also often reviewed separately, deployed separately and sometimes even developed or managed by different teams. In these types of organization setups, we are delaying database integrity verification, adding risk to database changes, and adding risk that if the database change has a bug, we won’t know or be able to fix it for several days or months - depending on the release cycle.
Not to mention, most database migration pipelines I’ve seen are broken - as-in if you start with a fresh database they either fail to run or the database schema/behavior you get out does not match production. Which means that a change made locally and saved off to the migration pipeline may not behave the same once it hits production.
My solution to this is to move the validation of the database up the pipeline as far as I can - all the way to when the database is migrated. A migration runs, checks run immediately afterwards before the next migration is ran. If a failure is detected, a rollback is automatically triggered, an error is logged, and the migration is marked as “failed”. This ensures that the database always remains in a state that the assertions work. So, let’s now add them to our migration config.
Adding them to our config file
So far, our config file lets us declare migrations with up and down files (or raw SQL). They also let us declare which other migrations we depend on. As a refresher, here’s the current configuration
{
"name": "alter_session_table",
"up": [{
"file": "up/alter_session_table.sql"
}],
"down": [{
"file": "down/alter_session_table.sql"
}],
"deps": ["create_session_table", "create_company_table"]
}This works really well, but there’s some refactoring that should be done before we add in assertions. First, in the event there’s a lot of up/down actions, it quickly becomes unclear if everything is properly paired, so we’ll want to put those in a single object. We’ll call those “actions”. Second, I find if someone uses a file for either up or down, then things stay clearer if the counterpart is a file too. Same for raw SQL literals. In other words, if one string is a file, then the other string is a file.
With that, we get our refactored example.
{
"name": "alter_session_table",
"deps": ["create_session_table", "create_company_table"],
"actions": [
{
"file": {
"up": "up/alter_session_table.sql",
"down": "up/alter_session_table.sql"
}
}
]
}Now that the example is made, we need to put our test files somewhere. We could add then to the action object to tie them there. However, we are very often going to have more tests than we are changes. Especially when we have a migration that adds multiple stored procedures or functions. So, we’ll add a new key. I picked asserts since they will assert if the migration worked and would trigger a rollback if they failed. Here’s the new JSON file:
{
"name": "alter_session_table",
"depsOn": [
"init"
],
"actions": [
{
"file": {
"up": "up/alter_session_table.sql",
"down": "down/alter_session_table.sql"
}
}
],
"asserts": [
{
"file": "assert/alter_session_table.sql"
}
]
}I did leave the option for asserts to be raw SQL. That’s for consistency with the actions. If one can be raw SQL, the other should be as well.
JSON to TOML
Now, while JSON is great for little tutorials like this, it’s pretty terrible for configuration. The thing that bugs me the most is the lack of comments. Fortunately, TOML is another option, and Go supports serializing to and from JSON and TOML pretty easily (Go is what I wrote Rove in). So, I was able to quickly get TOML support added to Rove. It still supports JSON, but TOML is my new default. Here’s the above in TOML:
name = 'sessions'
deps = ['users']
# The list of actions that will happen to the database as part of the migration.
# These actions are persistent and will mutate the database.
[[actions]]
[actions.file]
up = 'up/alter_session_table.sql'
down = 'down/alter_session_table.sql'
# Could also have written
# [[actions]]
# file = { up = 'up/alter_session_table.sql', down = 'down/alter_session_table.sql}
# A list of assertions to run against the database post-migration.
# These assertions are not persistent and will be rolled back.
[[asserts]]
file = 'assert/alter_session_table.sql'The addition of comments is really nice!
If you’re interested in my project the git repository is available online.

