
Automating the Monorepo

The monorepo solves a lot of problems for new teams, but it also introduces issues of its own. At Windsor, we've tamed the beast, and it's made our team a lot more productive. Here's how we did it.

Let's start by laying out the constraints we're working with:

  1. Most of our codebase is written in JavaScript
  2. Each directory in the repo contains a single module that can be deployed
  3. The modules have interdependencies

    This isn't always through an npm dependency. For instance, our frontend depends on the schema.graphql file from our API module, and the API depends on a prisma module.

  4. We have a variety of deployment targets: some modules ship to Vercel, some to Kubernetes on GKE, and some are published to npm
  5. We use GitHub Actions to run our CI

Structure

  • repo
    • _repo
    • .github
    • module1
    • module2
    • .gitignore
    • .pnp.js
    • .prettierignore
    • .yarnrc.yml
    • package.json
    • yarn.lock

Every module is a top-level directory

This is purely for aesthetic reasons. Viewing the repo on GitHub is much nicer when you can see the modules and the README on the main page rather than having to click into a packages folder.

Yarn Workspaces

At the heart of our monorepo setup is Yarn 2 with workspaces. In addition to all the benefits of workspaces, this easily solves the issue of npm-based interdependencies.

Further, the root package.json is a useful place to house repo-wide dependencies like prettier or husky.
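
As a rough sketch (using the placeholder module names from the structure above; version ranges are illustrative), the root package.json declares the workspaces and houses the shared tooling:

    {
      "name": "repo",
      "private": true,
      "workspaces": ["module1", "module2"],
      "devDependencies": {
        "husky": "^7.0.4",
        "prettier": "^2.5.1"
      }
    }

A module can then depend on a sibling through the workspace: protocol:

    {
      "name": "module1",
      "private": true,
      "dependencies": {
        "module2": "workspace:*"
      }
    }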

However, Yarn 2's pièce de résistance is the ability to auto-update dependencies across the monorepo by just running yarn version <major|minor|patch>.
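
Bumping a version is then a one-liner. For example (assuming the version plugin is installed via yarn plugin import version):

    cd module1
    yarn version minor   # e.g. 1.2.3 -> 1.3.0; dependent workspaces pick up the new version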

Global Scripts

Here's where the automation happens. The root package.json can have scripts, just like any other package.json, so this is where we throw in the commands that maintain our monorepo.

  • proper

    Running yarn proper at the root simply checks that all our modules are configured correctly and meet the "specification". For example, we enforce that every module has a package.json with a build, lint, and test script, and that private is set to true.

    This script isn't limited to checking the package.json, though. For instance, we also assert that each module has a shipit.sh script in its folder (the purpose of which is explained below). A sketch of this check appears after this list.

  • build

    Running yarn build <module/path> builds a DAG of all the dependencies for the given module. It then runs yarn build for each dependency in the right order.

    Without this script it's hard to use workspace: references or yarn version. We want to guarantee that we're using the latest code across all our dependencies when we build. A sketch of the DAG walk also appears after this list.

  • workflows

    Running yarn workflows generates all our GitHub workflow files. Yup, these are completely generated, since they're very formulaic and setting up a new manifest for each project can be a pain.

    Here's also where the yarn proper script comes in handy: since every module is guaranteed to have build, lint, and test commands, the GitHub action can just execute those commands.

    The script additionally requires each module to maintain a _repo object in its package.json that lists deps and a type (an example follows this list).

    • deps references non-npm dependencies of the module. This list is appended to the in-repo npm dependencies (extracted from dependencies in package.json) and is used to generate the on.pull_request.paths list in the CI manifest. As a result, any change to a module also triggers CI for everything that depends on it.
    • type lets us declare an alternate "runtime". We use this to generate build steps that run something besides yarn test and yarn build (e.g. docker build when the type is docker)
  • repo

    Running yarn repo <cmd> adds _repo/scripts to the PATH before running the cmd. This way we can house any useful bash or Deno scripts inside that folder and reference them easily.

  • format

    yarn format simply formats the entire codebase using prettier (via pretty-quick) and shfmt.
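
Here's a minimal sketch of what the yarn proper check could look like as a Node script, with the rules taken from the description above (the file location under _repo/scripts is an assumption):

    // _repo/scripts/proper.js - a hypothetical sketch of the `proper` check
    const fs = require("fs");
    const path = require("path");

    const root = process.cwd();
    const failures = [];

    // Every top-level directory with a package.json is treated as a module.
    for (const entry of fs.readdirSync(root)) {
      const dir = path.join(root, entry);
      const pkgPath = path.join(dir, "package.json");
      if (!fs.existsSync(pkgPath)) continue;

      const pkg = JSON.parse(fs.readFileSync(pkgPath, "utf8"));
      for (const script of ["build", "lint", "test"]) {
        if (!pkg.scripts || !pkg.scripts[script]) {
          failures.push(`${entry}: missing "${script}" script`);
        }
      }
      if (pkg.private !== true) failures.push(`${entry}: "private" must be true`);
      if (!fs.existsSync(path.join(dir, "shipit.sh"))) {
        failures.push(`${entry}: missing shipit.sh`);
      }
    }

    if (failures.length > 0) {
      console.error(failures.join("\n"));
      process.exit(1);
    }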
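
The build walk is essentially a depth-first, post-order traversal of that DAG. A minimal sketch, assuming package names match their directory names and that non-npm dependencies live in the _repo object described under workflows:

    // _repo/scripts/build.js - a hypothetical sketch of the dependency-ordered build
    const { execSync } = require("child_process");
    const fs = require("fs");
    const path = require("path");

    // A module's in-repo dependencies: npm deps that use the workspace:
    // protocol, plus the non-npm deps listed in _repo.deps.
    function moduleDeps(name) {
      const pkg = JSON.parse(fs.readFileSync(path.join(name, "package.json"), "utf8"));
      const npmDeps = Object.entries(pkg.dependencies || {})
        .filter(([, range]) => range.startsWith("workspace:"))
        .map(([dep]) => dep);
      return [...npmDeps, ...((pkg._repo || {}).deps || [])];
    }

    // Depth-first, post-order walk: dependencies build before dependents.
    function buildInOrder(name, seen = new Set()) {
      if (seen.has(name)) return;
      seen.add(name);
      for (const dep of moduleDeps(name)) buildInOrder(dep, seen);
      execSync("yarn build", { cwd: name, stdio: "inherit" });
    }

    buildInOrder(process.argv[2]);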
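
And for reference, a module's package.json with its _repo object might look something like this (the module names, like ui-kit, and the type value are illustrative):

    {
      "name": "frontend",
      "private": true,
      "scripts": {
        "build": "...",
        "lint": "...",
        "test": "..."
      },
      "dependencies": {
        "ui-kit": "workspace:*"
      },
      "_repo": {
        "type": "vercel",
        "deps": ["api"]
      }
    }

From a manifest like this, the workflows generator can emit on.pull_request.paths entries for frontend/**, ui-kit/**, and api/**.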

shipit

Each module has a shipit.sh script (ensured through yarn proper). This script defines the steps needed to ship that module to the right target: for example, running vercel for a Vercel deployment, or skaffold run for a Kubernetes deployment. This is primarily useful when shipping code to dev and staging environments during development, but it's also used in CI to ship to production.

Having a standard interface like shipit.sh means anyone can jump into a project for the first time and immediately run ./shipit.sh <username> to ship the code to their dev instance without needing to read any docs.
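
As an example, a shipit.sh for a module that ships to Vercel could be as small as this (a sketch; how a username maps to a dev instance is assumed):

    #!/usr/bin/env bash
    set -euo pipefail

    TARGET="${1:?usage: ./shipit.sh <username|production>}"

    if [ "$TARGET" = "production" ]; then
      vercel --prod
    else
      # Each engineer ships to their own preview deployment.
      echo "Shipping a preview deployment for $TARGET"
      vercel
    fi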

Automating the automation

For everything to work smoothly, we need to run these global scripts often. That's a nuisance. So, we added one more piece to yarn workflows. It generates a CI manifest file called repo.yml that simply runs yarn proper and yarn workflows -c (a mode which checks that running yarn workflows won't cause any changes - similar to yarn --immutable).

This meta script helps make sure the repo as a whole is following the specification. It makes it impossible to, for instance, create a new module without generating a CI manifest, or add a dependency without updating the GitHub workflow file to reflect it.
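
The check mode can be as simple as regenerating every manifest in memory and diffing against what's on disk. A sketch, assuming a hypothetical generateWorkflows() that returns a map of file paths to YAML contents:

    // _repo/scripts/workflows-check.js - a hypothetical sketch of `yarn workflows -c`
    const fs = require("fs");
    const { generateWorkflows } = require("./workflows"); // hypothetical generator

    let dirty = false;
    for (const [file, contents] of Object.entries(generateWorkflows())) {
      const onDisk = fs.existsSync(file) ? fs.readFileSync(file, "utf8") : "";
      if (onDisk !== contents) {
        console.error(`${file} is stale; rerun yarn workflows`);
        dirty = true;
      }
    }
    process.exit(dirty ? 1 : 0);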

Simple Wins

As a startup with a small team, navigating through code quickly matters just as much to us as it does at a large tech company, if not more. A simple system built on a few scripts and conventions lets our engineers move extremely fast.

We can create new modules easily, or update existing ones - trusting automation to make sure everything works as expected. There's no need to write additional CI steps or a README for deployment instructions, and the monorepo just becomes a natural part of our dev workflow instead of being a nuisance.
