You may not need Nx Cloud

Tags:
  • CI/CD
  • CircleCI
  • Nx

Published

Some of the things I learned while migrating the frontend monorepo at FLYR from Lerna to Nx and how to make the most out of this setup without Nx Cloud.


I'm going to quote my friend Alessandro Cifani from his talk "Subito's journey to a scalable monorepo" at reactjsday 2022: "I hate waiting."

In the past few months, the thing that has made me wait the most has been the frontend pipeline at FLYR. When I joined the company, a full build took 22-23 minutes, even though the workload was distributed across 8 XL parallel instances in CircleCI, each with 8 CPUs and 16 GB of RAM.

The reason why it took so long is that every frontend application in our monorepo was getting linted, tested, and built every time, regardless of whether the changes affected that application or not.

Since then, I've changed the pipelines to only test and lint the applications and libraries affected by the changes. In this post, I'll focus instead on the build process.
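
Concretely, that change boils down to relying on Nx's affected commands instead of running every target on every project. As a rough illustration (the exact base ref depends on your defaultBase and branch setup):

# lint and test only the projects affected by the changes
nx affected --target=lint --base=origin/main --head=HEAD
nx affected --target=test --base=origin/main --head=HEAD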

For historical reasons, despite having a setup based around micro frontends and module federation, we still need to build all applications when deploying. In this post, I'm going to share how we can leverage Nx to produce the build outputs of all applications while only actually rebuilding the ones that changed.

There are many resources on how to set up a JavaScript monorepo with Nx, but this is not one of them; refer to the official documentation for that. What I'm going to focus on is how to make the most out of Nx in CI without Nx Cloud.

In the rest of this post, I'm going to use the terms task and build somewhat interchangeably, but these ideas apply equally to any Nx command, whether you're building an app, running tests, or linting your code.

Nx Cloud

Nx Cloud is a paid service that lets you share the computation cache of your tasks across your team and across CI runs. It's also smart enough to distribute your tasks across multiple machines, reducing the wall time of your commands.

I haven't actually tried the service, partly because it would require approval and budget from my company, but also because my first approach to things is usually to try to self-host them.

Let's see how we can achieve similar results without Nx Cloud.

Understanding Nx computation cache

Given that Nx is able to cache results without requiring any cloud service when running tasks locally, there must be a way to hack something together to share the cache across CI runs.

The first thing to understand is how the cache works. By default, Nx stores its cache in node_modules/.cache/nx. As a side note, you can change this by setting the cacheDirectory property in nx.json.
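
For reference, with the default tasks runner in v16 that option lives under tasksRunnerOptions; something along these lines (shown here with the default value):

{
  "tasksRunnerOptions": {
    "default": {
      "runner": "nx/tasks-runners/default",
      "options": {
        "cacheableOperations": ["build", "lint", "test"],
        "cacheDirectory": "node_modules/.cache/nx"
      }
    }
  }
}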

When opening this directory, you'll see something like this:

.
├── 1535800501287718917
│  ├── code
│  ├── outputs
│  │  └── packages
│  │     └── your-app
│  │        └── build
│  │           ├── favicon.ico
│  │           ├── index.html
│  │           ├── remoteEntry.js
│  │           └── static
│  │              ├── css
│  │              │  ├── 985.9719a4e8.chunk.css
│  │              │  └── 985.9719a4e8.chunk.css.map
│  │              └── js
│  │                 ├── 57.2db274e7.chunk.js
│  │                 └── 57.2db274e7.chunk.js.map
│  └── terminalOutput
├── 1535800501287718917.commit
├── d
│  └── daemon.log
├── file-map.json
├── lockfile.hash
├── parsed-lock-file.json
├── project-graph.json
├── run.json
└── terminalOutputs
   └── 1535800501287718917

Every time you change your code and run a task, Nx will calculate the hash of the inputs and do one of two things.

If the hash is already present in the cache, it will reuse the stored outputs and replay the terminal output.

If the hash is not present in the cache, it will run the task and store the outputs in a new directory named after the hash. In the example above, 1535800501287718917 is the hash of a build.

Sharing the cache in CI

If you simply, and naively, restore the contents of the cache directory before building your apps on every CI run, and save them again afterwards, you'll have your own bootleg Nx Cloud.

In CircleCI this can be done by adding the following steps to your workflow:

# you may want to restore both caches depending
# on whether you're using origin/main or main as
# your defaultBase
- restore_cache:
    name: restore nx build cache
    keys:
      # attempt to load cache from branch specific cache
      - nx-cache-{{ .Branch }}
      # attempt to load cache from main branch if there is
      # no branch specific cache
      - nx-cache-main

- run: nx run-many --target=build

- save_cache:
    name: save nx build cache
    # we include the current epoch so every run will
    # save a new cache
    key: nx-cache-{{ .Branch }}-{{ epoch }}
    paths:
      - node_modules/.cache/nx

With this setup, every time you rebuild an app that hasn't changed, Nx will hit the cache and skip the build.

Done, right? Not quite.

By default, Nx keeps the results of a task in its local cache for a week, and every time your inputs change, a new cache entry is created. This means the cache grows on every commit, which can get out of hand pretty quickly.
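
If you want to see this for yourself, keeping an eye on the size of the cache directory after a few runs is enough (the path assumes the default cacheDirectory):

du -sh node_modules/.cache/nx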

Purging the cache before saving it

Instead of naively saving the entire cache every time, we should purge it so that only the latest result of each task is kept. This keeps the cache at a roughly constant size while leaving the hit rate unchanged.

To do this, we need to know which cache entries are the latest, but thankfully Nx stores the results of the latest run in a run.json file in the cache directory.

The contents of the file in v16 look like this:

{
  "run": {
    "command": "nx affected --target=build",
    "startTime": "2023-07-06T12:54:30.772Z",
    "endTime": "2023-07-06T12:58:55.701Z",
    "inner": false
  },
  "tasks": [
    {
      "taskId": "@your-org/your-package:build",
      "target": "build",
      "projectName": "@your-org/your-package",
      "hash": "8153617374777772515",
      "startTime": "2023-07-06T12:54:30.774Z",
      "endTime": "2023-07-06T12:54:30.788Z",
      "params": "",
      "cacheStatus": "local-cache-hit",
      "status": 0
    },
    {
      "taskId": "@your-org/your-other-package:build",
      "target": "build",
      "projectName": "@your-org/your-other-package",
      "hash": "13783462303774178395",
      "startTime": "2023-07-06T12:54:30.774Z",
      "endTime": "2023-07-06T12:54:30.789Z",
      "params": "",
      "cacheStatus": "cache-miss",
      "status": 0
    }
  ]
}

Regardless of whether the task hit the cache or not, the hash of the task is always present in the run.json file and it points to the latest outputs of the task.
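
For example, if you have jq installed, you can list the hashes referenced by the latest run with:

jq -r '.tasks[].hash' node_modules/.cache/nx/run.json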

Using the information in this file, you can thus write a small script to purge all the other entries and run it before storing the cache in CI.

Extending the example above, in CircleCI this would look something like this:

  - restore_cache:
      name: restore nx build cache
      keys:
        - nx-cache-{{ .Branch }}
        - nx-cache-main

  - run: nx run-many --target=build

  - run: python purge_nx_cache.py

  - save_cache:
      name: save nx build cache
      key: nx-cache-{{ .Branch }}-{{ epoch }}
      paths:
        - node_modules/.cache/nx

I'll leave a production-ready implementation of the script as an exercise to the reader.
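
That said, here's a minimal sketch to get you started. It assumes the default cache location and the v16 layout shown above (hash-named directories, <hash>.commit markers, and terminalOutputs entries), so treat it as a starting point rather than a drop-in solution:

#!/usr/bin/env python3
"""Purge the local Nx cache, keeping only the entries referenced
by the latest run (run.json). Assumes the default v16 layout under
node_modules/.cache/nx; adjust CACHE_DIR for a custom cacheDirectory."""

import json
import re
import shutil
from pathlib import Path

CACHE_DIR = Path("node_modules/.cache/nx")
# hash-named directories ("1535800501287718917") and "<hash>.commit" markers
HASH_RE = re.compile(r"^\d+(\.commit)?$")


def main() -> None:
    run_file = CACHE_DIR / "run.json"
    if not run_file.exists():
        print("No run.json found, nothing to purge")
        return

    run = json.loads(run_file.read_text())
    keep = {task["hash"] for task in run.get("tasks", [])}

    removed = 0
    # top-level hash entries (output directories and .commit markers)
    for entry in CACHE_DIR.iterdir():
        if not HASH_RE.match(entry.name):
            continue  # leave run.json, project-graph.json, d/, etc. alone
        if entry.name.split(".")[0] in keep:
            continue
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1

    # stale terminal outputs, also named after the task hash
    terminal_outputs = CACHE_DIR / "terminalOutputs"
    if terminal_outputs.is_dir():
        for entry in terminal_outputs.iterdir():
            if entry.name not in keep:
                entry.unlink()
                removed += 1

    print(f"Purged {removed} stale cache entries, kept {len(keep)}")


if __name__ == "__main__":
    main()

The idea is simply to collect the hashes referenced by run.json and delete every hash-named entry that isn't in that set, leaving the bookkeeping files (run.json, project-graph.json, and so on) untouched.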

Conclusion

In this article, we've seen how to share the Nx computation cache across CI runs without using Nx Cloud. This is extremely helpful for speeding up your CI builds and reducing the cost of running them, since trading CPU time for storage is usually a good deal.