Nix build on a diet

At Niteo, we use nix-shell to build isolated development and production environments. We deploy production as a Docker image on Heroku. Recently, I noticed that deployments of the Pareto Security dashboard app had become very slow: almost 10 minutes, and even more on a busy afternoon. Most of that time was spent copying files around, as Heroku seems to have terrible disk I/O dedicated to building Docker images.

I like my CI/CD pipelines in the range of 3 to 7 minutes, so I took some time to put our production env on a diet. This post is the story of how I did just that, reduced the resulting image size by 60%, and consequently halved the time it takes to deploy the latest commit to production.

The journey

Our Dockerfile is quite simple:

  • Pull a minimal base image with preinstalled Nix and Cachix, along with a pre-fetched commit of nixpkgs, to speed things up.
  • Run nix-build --attr herokuEnv to build the production runtime environment.
  • Create the second stage image from scratch and copy over the result of herokuEnv.

The first step was to run nix-build -A herokuEnv locally to get the derivation path:

$ nix-build --attr herokuEnv
these derivations will be built:
...
created 431 symlinks in user environment
/nix/store/rpy6why69q13snjv2byzm90qpcbqnffy-pareto

Then I fed this path to nix-store to give me a list of runtime dependencies:

$ nix-store --query --graph /nix/store/4ld5jzxrgibzkvcq3kqv5cffs5mlim38-pareto
"/nix/store/nqz4h9cqfcvcn08nq80bzddkd9h6wq05-pareto";
"/nix/store/1cxrpmfwwvncncsp1hnmkapijsx927zj-bash-interactive-5.1-p8" -> "/nix/store/nqz4h9cqfcvcn08nq80bzddkd9h6wq05-pareto";
"/nix/store/5wch96kji9zlffxjqpjdrszjzp4i7m3a-coreutils-9.0" -> "/nix/store/nqz4h9cqfcvcn08nq80bzddkd9h6wq05-pareto";
...

I kept scrolling until I found this line:

"/nix/store/vq7r6jvhn3mffzvi0x7w478llls7h2jv-gcc-10.3.0-lib" -> "/nix/store/7j16w13sd90k2jfh1p37r2im2p1aw12b-icu4c-70.1"

We don’t need gcc in a runtime environment! It’s a build-time dependency, sure, but once things are compiled, we don’t need it anymore. And as such, it shouldn’t be listed as a runtime dependency for our herokuEnv production environment.

The line above shows me that gcc is pulled in by icu4c. To find out what pulls in icu4c, I pressed Shift + Page Up to go to the beginning of the nix-store output and then searched for icu4c by typing /icu4c.

"/nix/store/7j16w13sd90k2jfh1p37r2im2p1aw12b-icu4c-70.1" -> "/nix/store/zwy33l1hvnc109r6n5mw2waamdrl3mlj-nodejs-14.18.3";

We also don’t need nodejs in our production runtime. The backend app is Python based, and the frontend app is a bunch of static JS files. We need nodejs at build time to compile these static JS files, but after that we don’t need it anymore. I kept on searching: what pulls in nodejs? Shift + Page Up followed by /nodejs:

"/nix/store/zwy33l1hvnc109r6n5mw2waamdrl3mlj-nodejs-14.18.3" -> "/nix/store/0dkzmdj6j6xx0jzw0j98zqhnyyxvff95-pareto-node-packages"

Keep going, Shift + Page Up followed by /pareto-node-packages:

"/nix/store/0dkzmdj6j6xx0jzw0j98zqhnyyxvff95-pareto-node-packages" -> "/nix/store/gz33l393f2pmja4fvjs703mkjj1fjbpc-pareto-frontend-dist";
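The repeated scroll-and-search above can also be scripted. Here is a minimal sketch, assuming the graph output keeps the "dependency" -> "dependent" edge format shown in the excerpts above. The helper name pulled_in_by is hypothetical and not part of Nix.

```shell
# Hypothetical helper: read `nix-store --query --graph` output on stdin and
# print every store path that directly references something matching $1.
# Edges are assumed to look like: "dependency" -> "dependent"
pulled_in_by() {
  awk -v pat="$1" -F'"' '
    # Splitting an edge line on double quotes gives:
    #   $2 = dependency, $3 = " -> ", $4 = dependent
    NF >= 5 && $3 ~ /->/ && index($2, pat) { print $4 }
  ' | sort -u
}

# Example (store path elided):
#   nix-store --query --graph /nix/store/...-pareto | pulled_in_by icu4c
```

Depending on your Nix version, nix-store --query --referrers and nix why-depends may answer the same question more directly.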

Our frontend app is based on Elm, a delightful language for reliable web applications. The .elm files of our app compile to static .js files. Hence the result, pareto-frontend-dist, should not depend on nodejs or pareto-node-packages as runtime dependencies. But the nix-store query above shows me that it does. Do any files in pareto-frontend-dist contain pareto-node-packages?

$ grep --files-with-matches --recursive pareto-node-packages /nix/store/gz33l393f2pmja4fvjs703mkjj1fjbpc-pareto-frontend-dist
/nix/store/gz33l393f2pmja4fvjs703mkjj1fjbpc-pareto-frontend-dist/index.ef118269.js.map

Ahhhhh! Besides our Elm app compiled to the index.js file, we apparently also ship a source map file. These are useful if you are using a JS framework: when you View Source in your browser, you see the original source code instead of an unreadable minified JS file.

But we are using Elm, and we have no use for source maps. Elm has great protection against runtime errors, and even if one does crop up, source mapping doesn’t work for .elm files.

It should be safe to remove any .js.map files from our production frontend dist. I appended rm -rf $out/frontend/*.js.map to the commands that generate the frontend dist and rebuilt herokuEnv to see if it made a difference.
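If source maps could also end up in subdirectories of the dist folder, a flat glob would miss them. As a minimal sketch (the helper name strip_source_maps is hypothetical, and a nested layout is an assumption rather than something our build is known to produce), the cleanup could be done recursively:

```shell
# Hypothetical helper: delete every source map under a build output
# directory, including ones in subdirectories that a flat *.js.map glob
# would not reach.
strip_source_maps() {
  find "$1" -type f -name '*.js.map' -delete
}

# Example (inside the frontend build step, where Nix sets $out):
#   strip_source_maps "$out/frontend"
```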

# nix-build --attr herokuEnv
these derivations will be built:
/nix/store/qda5qawm4f245vclk5jy0g3ip6rr0s6b-pareto.drv
...
created 427 symlinks in user environment
/nix/store/k3m3vvf0kj17p25kaqlr5s5gq2k1xray-pareto

Four fewer symlinks than at the start. But is there a difference in the total size of the runtime dependency graph?

# du -shc $(nix-store -qR /nix/store/rpy6why69q13snjv2byzm90qpcbqnffy-pareto) | tail -n 1 
1.3G	total

# du -shc $(nix-store -qR /nix/store/k3m3vvf0kj17p25kaqlr5s5gq2k1xray-pareto) | tail -n 1
527M	total

Huge difference in size! And sure enough, a bunch of large dependencies such as gcc, nodejs, systemd etc. are no longer listed as runtime dependencies of herokuEnv! Great success!
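To sanity-check the 60% figure, here is a small worked calculation. It is only approximate, because du -h rounds its output; the helper name percent_saved is hypothetical.

```shell
# Hypothetical helper: integer percentage saved going from $1 to $2
# (both given in the same unit, e.g. MiB).
percent_saved() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# 1.3G is roughly 1331M (1.3 * 1024), and the new closure is 527M.
percent_saved 1331 527   # prints 60
```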

Making the dependency graph smaller makes our production Docker image smaller, and that makes our deployment process faster. It’s now down to 5 minutes or so and I can go back to building features for the Teams Dashboard.

And the result of the optimization can be clearly seen.

Many thanks go out to Domen Kožar of Cachix for helping me spell out the nix incantations I needed on my journey.