Erlang Deployment: First Impressions

Written by Matt Tolman
Published: Apr. 25, 2025
Estimated reading time: 17 min

Lately I've been experimenting with deploying to bare metal servers. No Kubernetes. No Docker. No AWS Console. Just the physical hardware, mounted in a rack, with no OS pre-installed and no RAID preconfigured. Just that beautiful metal casing and lots of cables to plug in.

During this process, I've been learning a lot about servers, hardware, availability, backups, redundancy, configuration, Linux, and more. There's a ton that I still don't know. I've also gained a lot more appreciation for what cloud providers do, along with plenty of "oh, that was annoying, but it wasn't actually that hard" moments. I can see why people like cloud providers, but I can also see why there's a growing movement away from them. It's been a great eye-opening experience.

As part of this journey I've run into a fun little "problem". I only have two servers, and I've split them into very different roles. One is just a glorified DVR and file backup. The other is where I host the applications I've built, including my experiments and side projects. This setup, while convenient most of the time, poses a big problem when it comes to updating services. For most services, I don't have a "fallback" server ready to go, and I can't just "spin up" a new one. That means I have to either set up a complicated orchestration platform like Kubernetes, deal with downtime, or use a language that allows for updates while running.

There are two core programming languages I know of that allow updates while running: PHP and Erlang (Elixir and Gleam fall under the Erlang umbrella). I've used PHP a lot, and I'm really familiar with how it works, its pitfalls, and the issues the newer object-oriented style is causing for deploys. It's quickly becoming a language that needs the same blue/green deploy systems other languages do. Erlang, however, has kept "update while running" at its core, and that's fascinated me. So, I gave it a try.

A Crude Understanding of Erlang's VM

Let me preface this with the fact that I'm not an Erlang expert. I've read lots of books (Erlang in Anger, Designing for Scalability with Erlang/OTP, Build It with Nitrogen, etc.). I've made some websites with Erlang. I got a CI/CD pipeline working. But I'm not an expert. So, this is not exactly how Erlang works, just my current understanding in a way that makes sense to me.

Erlang's VM has the ability to store code in memory, and it has essentially two buffers: current version and new version. Code that's currently running is put in the current version memory, and code that we're upgrading/downgrading to is in the new version memory. When we tell the VM to load new code, it will first put that code in the new version memory. Then, it will start swapping function pointers to point to the new code once current processes hit an injection point. These injection points are fully-qualified function calls (my_module:my_func rather than just my_func).
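For example, here's a minimal sketch (a made-up module, not from any real project) of a long-running process that recurses through a fully-qualified call. Once new code for the module has been loaded, the next iteration jumps to the new loop/1; a plain local call to loop(Count + 1) would keep running the old version instead.

-module(counter).
-export([start/0, loop/1]).

start() ->
    spawn(fun() -> counter:loop(0) end).

loop(Count) ->
    receive
        {incr, From} ->
            From ! {count, Count + 1},
            %% Fully-qualified call: this is an injection point, so after new
            %% code is loaded, the next iteration runs the new loop/1.
            counter:loop(Count + 1);
        stop ->
            ok
    end.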

Additionally, there are hooks and message handlers that can be called to "upgrade" or "downgrade" process state. Essentially, the developer writes a code_change callback that receives the old version (or the target version, on a downgrade), the current process state, and extra "upgrade"/"downgrade" options. The callback then returns the new state, and we've "upgraded" (or "downgraded") our state. It's not automatic, but it is pretty powerful, and it lets developers dictate how process state should be updated. I personally prefer this approach over JavaScript land's "hot code reload" systems, since I've found that state changes cause a lot of bugs with hot code reload precisely because developers never specify what should happen. It's a nice touch from Erlang.
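As a rough sketch of what that looks like in a gen_server (the state shapes here are made up for illustration, not from any real app):

-module(my_server).
-behaviour(gen_server).
-export([init/1, handle_call/3, handle_cast/2, code_change/3]).

%% Pretend version "1" kept a bare integer count; version "2" keeps a map.
init(_Args) -> {ok, #{count => 0}}.

handle_call(get_count, _From, State = #{count := Count}) ->
    {reply, Count, State}.

handle_cast(_Msg, State) -> {noreply, State}.

%% Invoked when this process's code is swapped during a release upgrade/downgrade.
code_change({down, _ToVsn}, #{count := Count}, _Extra) ->
    {ok, Count};               % downgrading: collapse back to the old bare count
code_change(_OldVsn, Count, _Extra) when is_integer(Count) ->
    {ok, #{count => Count}}.   % upgrading: wrap the old state in the new map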

Prepping a Release

Right, so we've gotten our Erlang application running, we've made a code change, and we want to release it. But how do we tell the Erlang VM to load the new code? Well, this is where it gets tricky.

In development mode, you typically have a REPL open and can issue commands like c(module_name). to recompile a specific module. Alternatively, you can use a hot-code reload system like sync:go() (from the sync package), which will watch for changes and reload them as you make them. In production, we typically don't do that. Instead, there's the concept of "releases".

A release is essentially a bundle of the Erlang runtime, the application code, and release meta-information. By including the Erlang runtime in the release, we remove a lot of complexity when it comes to updating servers: we simply create a new release and get the new Erlang VM with it, no separate server update needed. Bundling the application code is what lets us actually update the application, and that includes our code_change handlers. Pretty normal stuff. However, by themselves these aren't enough to tell the VM how to update the application - at least not in a "production friendly" way. This is where release meta-information comes in.

The Release Meta

Erlang releases don't follow a "load whichever files changed" model. That's PHP's model, and it's currently breaking down (especially with the "one class per file" style of OOP). Erlang also doesn't try to infer what to load from a source or dependency tree, mostly because it doesn't know how you plan to update. Sure, it might be able to check whether there's a code_change method, but will that method work when moving from version "1.2.0" to version "25.3.2"? Probably not.

We're also not releasing by stepping through every version in between the two. For one, there's only one memory buffer. For another, what happens if there's a buggy in-between version that was patched in a later one? Including that in-between version in the update path could lead to data corruption. Also, it would be silly if updating from version "1.0.0" to version "1.0.25" required uploading and running 25 different upgrades.

So, Erlang doesn't try to provide an auto-upgrade system. Instead, it provides a manually configurable one. Developers choose how to upgrade each component of the application for different versions. They can do a simple file reload, run the code_change callback, wipe all the in-memory data for a segment, or even restart the entire application on a new VM. It's a lot of power and control.

But it also means we need a lot more information to do a release. Hence the release meta-information. We need steps that tell Erlang how to do the upgrade (or downgrade). Think of it like a CI/CD pipeline provided by the programming language, one that updates a server in place instead of spinning up a new VM. Perfect for bare-metal deployments.

Bringing in tools

Erlang provides this basic CI/CD pipeline, but it doesn't provide a lot of tooling to make the pipeline run smoothly. If you want the low-level details of how to use raw Erlang releases, I recommend reading Designing for Scalability with Erlang/OTP by Francesco Cesarini and Steve Vinoski. Their book is an excellent read on Erlang, and it covers the details of deployments really well. I'm just glossing over them here.

For this post, I'm going to be using "rebar3," so everything I discuss will be from the perspective of rebar3. It handles creating projects from templates, building projects, dependency management, and creating releases.

Here's a basic rebar configuration file:


{deps, []}. % list of dependencies

{plugins, [rebar3_appup_plugin]}. % List of build plugins, this is one that I use

{relx, [ % Release information
    {release,
        {my_release, "0.1.0"}, % Release name + version (I use the atom 'git' instead of a string to pull the version from git tags)
        [sasl, my_app]},       % List of Erlang applications to bundle

    {mode, prod},              % Build profile

    % Erlang configuration files
    {sys_config, "./config/sys.config"},
    {vm_args, "./config/vm.args"}
]}.

Creating the AppUp files

"Applications" aren't just a term used to describe a program in Erlang, they're actually a core idea in how Erlang organizes code. Each "application" is a group of functionality, with it's own supervision trees. Applications can talk and depend on other applications (similar to microservices), but they can all be local running in the same VM instance.

Each application also gets its own deployment rules, and these rules are stored in an ".appup" file. The appup file is essentially the application's current version, plus a list of rules on how to upgrade from each previous version and a list of rules on how to downgrade to each previous version.

Here's an example:


{"2.5.3", % Current version
    %% Upgrade from these versions
    [
        {<<"2\\.5(\\.[0-9]+)*">>, %% Upgrades for anything in our patch range
         [ %% Commands for updating code
            {update, my_module, {advanced, {}}}, %% reloads the file "my_module" and calls "code_change"
            {update, my_stateless_module} %% reloads the file "my_stateless_module", does not call "code_change"
         ]},
        {<<"2\\.[0-9]+(\\.[0-9]+)*">>, %% Upgrades for anything in our major range
         [
            {update, my_module, supervisor} %% reloads the supervisor module "my_module" and applies its new init/1 spec
         ]},
        {<<"1\\.[0-9]+(\\.[0-9]+)*">>, %% Upgrades for anything from our previous major version
         [
            {restart_application, my_app} %% restarts our application, but keeps the VM
         ]},
        {<<".*">>, %% Catch-all
         [
            restart_new_emulator %% restarts the Erlang VM, possibly with a new emulator
         ]}
    ],
    %% Downgrade to these versions
    [
        {<<"2\\.5(\\.[0-9]+)*">>, %% Downgrades for anything in our patch range
         [ %% Commands for updating code
            {update, my_module, {advanced, {}}}, %% reloads the file "my_module" and calls "code_change"
            {update, my_stateless_module} %% reloads the file "my_stateless_module", does not call "code_change"
         ]},
        {<<"[1-2]\\.[0-9]+(\\.[0-9]+)*">>, %% Downgrades for anything from our previous major version or this one
         [
            {restart_application, my_app} %% restarts our application, but keeps the VM
         ]},
        {<<".*">>, %% Catch-all
         [
            restart_new_emulator %% restarts the Erlang VM, possibly with a new emulator
         ]}
    ]}.

Here we have our current version declared, and then we're using regular expressions to match other versions. We have several update strategies, ranging from just swapping out code on a patch, to restarting the entire Erlang VM if it's a big enough difference.

Overall, it's a nice system with one big shortcoming: we have to keep updating the version tag by hand. Fortunately, we only have to update it when the application itself changes, and with multiple applications that means not every one changes on every release. The issue is that each application also declares its version in its .app.src file, so now the version lives in two places.

This is where the rebar3_appup_plugin I referenced in my rebar configuration comes into play. I can simply change the current version string to "{{vsn}}", and the plugin will fill it in automatically from the application's own version, so I only have one place to update instead of two.
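The head of the appup then looks something like this (a trimmed-down sketch; the upgrade/downgrade instructions here are just placeholders):

{"{{vsn}}", %% replaced at build time with the application's version
    [{<<".*">>, [{restart_application, my_app}]}], %% upgrade instructions
    [{<<".*">>, [{restart_application, my_app}]}]}. %% downgrade instructions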

The Relup file

Great, so we've documented how to upgrade our applications, so we're done, right? Well, no. We've described how to update each application, but not how to perform an actual release upgrade. We need another file for that, called a "relup".

Fortunately, we can generate relup files from our appup files. This can be done with rebar3 relup -n my_release -v 1.2.5 -u 1.2.4, which generates a release upgrade file with instructions on how to move from "1.2.4" to "1.2.5". Nice!

Except there's one little gotcha: the command only works if the previous release was built on the same machine. Without that previous release present, it will fail. The solution? Either always use the same build machine and directory, or pull down the previous release before building. Fortunately, the previous release doesn't need a relup file of its own, it just needs to be present, so rebuilding it is an option too. Annoying, but manageable.

Packaging the Release

Once we've gotten our relup file, we can run rebar3 tar to package the release. We then get a tar.gz file which we can SFTP onto a server, extract, and then run with bin/my_release foreground (just replace my_release with your actual release name).

Release Upgrade quirk

Upgrading has a little quirk that took me forever to figure out. After the first release is extracted, it creates a "releases" directory where it stores all release versions. Every future release's ".tar.gz" file needs to be uploaded into that generated "releases" directory before bin/my_release install <version> will work. It's a poorly documented quirk.
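As I understand it, the release script's install/upgrade commands drive SASL's release_handler under the hood; from a remote console on the running node, the equivalent steps look roughly like this (release name and version are placeholders):

%% Assumes my_release-1.2.5.tar.gz is already in the node's "releases" directory.
{ok, Vsn} = release_handler:unpack_release("my_release-1.2.5"),
{ok, _FromVsn, _Descr} = release_handler:install_release(Vsn),
ok = release_handler:make_permanent(Vsn).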

Shared Library Linking Nightmares

One other "quirk" is that Erlang (and linux in general) does rely on shared libraries, so they do need to be present for things to work properly. And, all of them needs to match/be compatible with what was on your build server. This was fun to figure out since I had setup one of my servers to be the build machine, and the other to be running the code. It took a while, but I got there. It's just something to keep in mind if you try to do this yourself. It's also one of the main reasons I'm starting to eye Alpine and NixOS and leave behind Debian (especially since I'm tired of manually building/installing things because Debian repositories are very outdated).

More automation please

The above process is the general, high-level view. There is some room for improvement, though, especially if you want Git to manage your versions and you want application and release versions kept in lockstep (which I do, for debugging purposes).

I made a little Python script which sets versions based on git tags. I also updated my rebar.config so the release version line reads {my_release, git} rather than {my_release, "1.2.4"}. The git atom grabs the current tag as the version, or, if there is no tag on the current commit, it grabs the last tag and adds a build suffix with a hash. It's really nice.
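In rebar.config, that's just a small change to the release tuple shown earlier:

{relx, [
    {release,
        {my_release, git}, % version resolved from git tags at build time
        [sasl, my_app]}
]}.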

My Python script updates all of the application versions to match the release version, which suits my taste. I may get rid of that in the future, mostly because I have a feeling it'll bite me at some point. But until it does, I'm not going to worry about it, and when it does, I'll have a blog post about why it was a bad idea.

My script also copies the resulting tarball to build_uploads, and it clears that directory before each build. This just helps me keep track of what I actually want to upload instead of sifting through every development build. I do keep the original tarballs around in rebar3's usual build directory; this is just an easy-to-find copy.

Here's my script:


"""
Copyright 2025 Matthew Tolman

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

"""

import subprocess
import re
from glob import glob
import os
import shutil

# Stash any uncommitted local changes before building
subprocess.run(['git', 'stash'], stdout=subprocess.DEVNULL).check_returncode()

buildPath = 'build_uploads'

if os.path.exists(buildPath):
    shutil.rmtree(buildPath)
os.makedirs(buildPath, exist_ok=True)

print("Finding version...")
result = subprocess.run(['rebar3', 'release'], stdout=subprocess.PIPE)
result.check_returncode()
lines = result.stdout.splitlines()
version = '0.0.0'
appname = ''
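# rebar3 prints an "Assembling release <name>-<version>..." line; only that line
# matches below, so the rest of the build runs inside this loop body.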
for line in lines:
    line = line.decode('utf-8')
    if "Assembling release" not in line:
        continue
    info = line.split(' ')[3].removesuffix("...").split('-', 1)
    appname = info[0]
    version = info[1]

    print("Found version " + version + " for app " + appname)

    print("Finding previous version...")

    prevVsn = None
    try:
        result = subprocess.run(['git', 'rev-list', '--tags', '--max-count=2'], stdout=subprocess.PIPE)
        result.check_returncode()
        # Map the two most recent tagged commits to their tag names, then drop
        # the current version so we're left with the previous release tag
        possibleTags = [
            t for t in [subprocess.run(['git', 'describe', '--abbrev=0', '--tags', x], stdout=subprocess.PIPE).stdout.decode('utf-8').removesuffix('\n')
                for x in result.stdout.decode('utf-8').splitlines()]
            if t != version
        ]
        prevVsn = possibleTags[0]
        print("Found previous full release version: " + prevVsn)
    except:
        print("No previous version found in git tag, assuming no previous version. Will skip relup")

    print("Updating app files...")
    for filename in glob('apps/**/*.app.src', recursive=True):
        if "_build" in filename:
            continue
        print(filename)
        with open(filename, 'r') as file:
            content = file.read()
        content = re.sub(r'\{vsn,\s*"[^"]+"\}', '{vsn, "' + version + '"}', content)
        content = re.sub(r'\{licenses,\s*"[^"]+"\}', '{licenses, "Proprietary"}', content)
        with open(filename, 'w') as file:
            file.write(content)

    print("Building Release")
    result = subprocess.run(['rebar3', 'as', 'prod', 'release'], stdout=subprocess.PIPE)
    if result.returncode != 0:
        print(result.stdout.decode('utf-8'))
    result.check_returncode()

    if prevVsn != None:
        print("Building Relup")
        result = subprocess.run([
            'rebar3', 'as', 'prod', 'relup',
            '-n', appname,
            '-v', version,
            '-u', prevVsn
        ],
            stdout=subprocess.PIPE
        )
        if result.returncode != 0:
            if b'not found' in result.stdout:
                print("Previous release not found, going to build it")
                curCommit = subprocess.run(['git', 'rev-parse', 'HEAD'], stdout=subprocess.PIPE)
                curCommit.check_returncode()
                print("Checking out previous version")
                subprocess.run(['git', 'stash'], stdout=subprocess.DEVNULL).check_returncode()
                subprocess.run(['git', 'checkout', prevVsn], stdout=subprocess.DEVNULL).check_returncode()
                
                print("Building previous version")
                build = subprocess.run(['rebar3', 'as', 'prod', 'release'], stdout=subprocess.PIPE)
                if build.returncode != 0:
                    print(build.stdout)
                build.check_returncode()
                print("Previous version built!")
                
                print("Checking out original hash (will be in detached mode)")
                subprocess.run(['git', 'checkout', curCommit.stdout.decode('utf-8').removesuffix('\n')], stdout=subprocess.DEVNULL).check_returncode()
                subprocess.run(['git', 'stash', 'pop'], stdout=subprocess.DEVNULL).check_returncode()

                print("Retrying relup command...")
                result = subprocess.run([
                            'rebar3', 'as', 'prod', 'relup',
                            '-n', appname,
                            '-v', version,
                            '-u', prevVsn
                        ],
                    stdout=subprocess.PIPE
                )
            
            if result.returncode != 0:
                print(result.stdout.decode('utf-8'))
        
        result.check_returncode()

    print("Making tar")
    result = subprocess.run(['rebar3', 'as', 'prod', 'tar'], stdout=subprocess.PIPE)
    result.check_returncode()
    lines = result.stdout.splitlines()
    tarballPath = None
    tarballName = None
    for line in lines:
        line = line.decode('utf-8')
        if "Tarball successfully created" not in line:
            continue

        print('Tarball created!')
        info = line.split(' ')
        tarballPath = info[4]
        tarballName = os.path.basename(tarballPath)

        print('Copying tarball to upload dir...')
        outFile = os.path.join(buildPath, tarballName)
        shutil.copyfile(tarballPath, outFile)
        print('Tarball path: ' + outFile)

    if "build" in version:
        if prevVsn != None:
            with open('prodrun.txt', 'w') as file:
                file.write(prevVsn)
    else:
        with open('prodrun.txt', 'w') as file:
            file.write(version)

So far, deploying Erlang on bare metal has been a good experience. I'm going to keep experimenting to see what I can get done. I haven't tried distributed deployments yet, but that's on my to-do list.