title | pubDate | description | author
---|---|---|---
Unlighthouse - Easy Website Testing | 2025-04-12 | How I am automatically testing my site deployments using unlighthouse | Firq
Since one of the first iterations of this site, I have been using Google Lighthouse to benchmark my results - and it has really paid off at times. Missing ARIA roles, bad color contrast, the fact that the default YouTube embed is actually not that great - all of this was exposed by Lighthouse over time.
How this whole thing evolved and how I am currently using Unlighthouse to make all of this run even smoother will be covered in this article.
Unlighthouse
First of all, I want to give a big shout-out to Harlan Wilton (harlan-zw) for making unlighthouse - it makes analyzing large sites so much easier. The package can be installed from npm and enables you to easily run Lighthouse for each of your subpages - even from a CI environment. This is a huge improvement given that the best alternative is the Lighthouse docker container that is provided by GitLab.
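If you just want to try it out locally, the CLI can be run straight from npm; the snippet below is a minimal example (the site URL is a placeholder):
# scan a site and open the interactive report (example.com is a placeholder)
npx unlighthouse --site https://example.com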
For ease of use, I also built my own version of that GitLab docker container, but with unlighthouse instead. This allows me to easily use it in other CI jobs, with pinned dependencies and resistance against random upstream changes. However, it also requires me to regularly check for updated Chromium versions, as that is usually what causes issues when updating the container.
The container is available from here and can be pulled with the following command:
docker pull forgejo.neshweb.net/ci-docker-images/unlighthouse:latest
Dockerfile for the unlighthouse container
FROM node:22.14-bookworm
LABEL authorname="firq"
LABEL description="unlighthouse container for ci-based lighthouse testing"
WORKDIR /unlighthouse
ENV CHROMIUM_VERSION="135.0.7049.84-1~deb12u1"
ENV UNLIGHTHOUSE_VERSION="0.16.3"
ENV NODE_ENV='production'
# Update path so executable can be run globally
ENV PATH="/unlighthouse/node_modules/.bin:${PATH}"
RUN apt-get update && apt-get -y install --no-install-recommends chromium=${CHROMIUM_VERSION} procps && rm -rf /var/lib/apt/lists/*
RUN npm install @unlighthouse/cli@${UNLIGHTHOUSE_VERSION} puppeteer
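Outside of CI, the container can also be used for a one-off scan; the following is only a rough sketch (the target URL is a placeholder, and depending on the setup you will usually still need a config with the --no-sandbox Puppeteer flags shown later):
# sketch: run unlighthouse-ci from the prebuilt container against a placeholder site
docker run --rm forgejo.neshweb.net/ci-docker-images/unlighthouse:latest \
  unlighthouse-ci --site "https://example.com"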
Automation with Forgejo
However, running unlighthouse manually after each page build is too much of a hassle - so it was time to automate it. For that reason, I created a corresponding actions job that would run the checks each time a preview version of the site was being built.
In the beginning, I would run the test against the actual deployed instance, but over time it became increasingly obvious that this was inefficient, mainly because I had to use two different tags (x.x.x-pre.x and x.x.x-ulh.x) to trigger the corresponding CI runs.
Because of this, I decided to instead run everything using a service container. This means that the unlighthouse job runs in the main job task, while the site runs as a service container in parallel. This allows access to the container using an alias (website) instead of a URL or IP address. (Note: the YAML is slightly reduced; lines that only provide visual context in the UI have been removed.)
jobs:
  unlighthouse:
    runs-on: docker
    container:
      image: forgejo.neshweb.net/ci-docker-images/unlighthouse:0.16.3
    services:
      website:
        image: forgejo.neshweb.net/firq/firq-dev-website:${{ inputs.containertag }}
    steps:
      - uses: https://code.forgejo.org/actions/checkout@v3
      - run: |
          while [ "$(curl -o /dev/null -s -w '%{http_code}' http://website:8081)" -ne 200 ];
          do echo "Waiting...";
          sleep 5;
          done;
      - run: unlighthouse-ci --site "http://website:8081"
      - run: find ./unlighthouse-reports -type f | xargs sed -i "s|http://website:8081|https://preview.firq.dev|g";
So what is happening here? The first few lines just configure the runner and start up the service container. The container tag is passed from the upstream pipeline, making this setup a lot more flexible. Afterwards, two things happen: the code in the git repo gets checked out, and the job is put on hold until the service container is reachable. This is necessary, as the service container sometimes takes a few seconds to start up.
Afterwards, unlighthouse runs against the site inside the service container, using the local config for any other settings. After the run concludes, the artifacts inside the unlighthouse-reports folder get search-and-replaced once to change the URL from the service container to the preview site. This is mainly for visual consistency, but it also has the benefit of making the "Check with PageSpeedInsights" button work.
To make this work more smoothly, I decided to run the unlighthouse CI as a downstream pipeline, triggered by one of the CI jobs after the build-and-push step concludes. This also allows me to easily pass the new container tag to the downstream pipeline.
run-unlighthouse:
  needs: [ build-site ]
  if: success()
  runs-on: docker
  steps:
    - name: Launch workflow
      run: |
        payload="{\"ref\": \"${GITHUB_REF_NAME}\", \"inputs\": { \"containertag\": \"${GITHUB_REF_NAME}\" }}"
        curl -X "POST" \
          -H "accept: application/json" \
          -H "Content-Type: application/json" \
          -H "Authorization: token ${GITHUB_TOKEN}" \
          -d "${payload}" \
          "${GITHUB_API_URL}/repos/${GITHUB_REPOSITORY}/actions/workflows/unlighthouse.yml/dispatches" -v
And with that, the general CI setup concludes. I won't go into too much detail about how the deployment of the reports works, as it is similar to the main site. If you are interested, see the repository linked here for further details.
Fine-tuning the config
After setting everything up, it took quite some time until I managed to fine-tune the configuration that unlighthouse uses. Don't get me wrong, running with the defaults would probably also work, but in my experience the defaults are so strict that a perfect score is unobtainable - even if, for example, PageSpeedInsights doesn't see any issues.
My strategy for the config was to copy the settings that PageSpeedInsights uses, as this seemed like a good baseline for testing. This means that the Lighthouse settings look like this:
lighthouseOptions: {
  throttlingMethod: 'devtools',
  throttling: {
    cpuSlowdownMultiplier: 4,
    requestLatencyMs: 150,
    downloadThroughputKbps: 1638.4,
    uploadThroughputKbps: 1638.4,
  },
  screenEmulation: { width: 412, height: 823, deviceScaleFactor: 1.75 },
  skipAudits: [ 'is-on-https', 'redirects-http', 'uses-http2' ],
}
As mentioned previously, the site runs inside a docker network, so I can't easily use HTTPS and HTTP/2. Given that, I decided to disable the corresponding audits, as I know for a fact that they always pass on the real deployment.
The other configuration options are for Puppeteer, which drives the headless Chromium in the CI environment. This is really straightforward:
puppeteerOptions: {
  args: [ '--no-sandbox', '--disable-setuid-sandbox' ]
},
puppeteerClusterOptions: {
  maxConcurrency: 1
},
maxConcurrency ensures that only one inspection task runs at a time, as having multiple run in parallel results in some weird situations where the report fails to generate correctly.
The rest of the configuration is for unlighthouse itself, covering both the scanner and the CI environment, as well as the outputs and the sites to check:
ci: {
  budget: 50,
  buildStatic: true,
},
scanner: {
  sitemap: true,
  dynamicSampling: false,
  samples: 3,
},
outputPath: 'unlighthouse-reports',
cache: true,
urls
However, there is one interesting entry there: urls. This is a dynamic list of URLs generated from the sitemap, which happens earlier in the config (what a blessing export default async config files are). This is necessary, as scraping the site from the sitemap directly does not work in my environment (the crawler finds https://preview.firq.dev, while I need http://website:8081 for it to run). After some digging, I raised that use case with Harlan, who proceeded to enable me to do exactly that in no time (thank you again for doing this - see issue 248 here).
This is where the magic snippet comes in, which 1. fetches the sitemap, 2. replaces the URLs and 3. fetches each of the URLs once to warm up the serve webserver, ensuring that the server-side caching works correctly (which improves performance by a lot).
const sitemap = await (await fetch('http://website:8081/sitemap-0.xml')).text();
const urls = sitemap.match(/<loc>(.*?)<\/loc>/g)!.map(
  (loc) => loc.replace(/<\/?loc>/g, '').replace(/https:\/\/firq.dev/g, 'http://website:8081')
);
for (const url of urls) { await fetch(url) };
Afterwards, urls can be used in the config to provide unlighthouse with a list of URLs to check - already replaced and modified to work with the docker network.
View the whole config here
(It's also really nice that unlighthouse provides type hints for the config, which makes figuring out what goes where a lot easier.)
import type { UserConfig } from 'unlighthouse'

export default async (): Promise<UserConfig> => {
  /* fetch sitemap from debug container */
  const sitemap = await (await fetch('http://website:8081/sitemap-0.xml')).text();
  /* format URLs to work with debug container */
  const urls = sitemap.match(/<loc>(.*?)<\/loc>/g)!.map(
    (loc) => loc.replace(/<\/?loc>/g, '').replace(/https:\/\/firq.dev/g, 'http://website:8081')
  );
  /* ensure serve is already "warm", preventing startup lag that reduces performance */
  for (const url of urls) { await fetch(url) };
  /* actual config */
  return {
    lighthouseOptions: {
      throttlingMethod: 'devtools',
      throttling: {
        cpuSlowdownMultiplier: 4,
        requestLatencyMs: 150,
        downloadThroughputKbps: 1638.4,
        uploadThroughputKbps: 1638.4,
      },
      screenEmulation: {
        width: 412,
        height: 823,
        deviceScaleFactor: 1.75,
      },
      skipAudits: [ 'is-on-https', 'redirects-http', 'uses-http2' ],
    },
    puppeteerOptions: {
      args: [ '--no-sandbox', '--disable-setuid-sandbox' ],
    },
    puppeteerClusterOptions: {
      maxConcurrency: 1
    },
    ci: {
      budget: 50,
      buildStatic: true,
    },
    scanner: {
      sitemap: true,
      dynamicSampling: false,
      samples: 3,
    },
    outputPath: 'unlighthouse-reports',
    cache: true,
    urls
  }
}
What's next?
After setting this whole thing up over the course of multiple months, with a variety of issues and shortcomings along the way, I hope that it is now done for good. I will, however, write a Forgejo Action to reuse in the future, as this would enable me to easily test other sites with the same concept.
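That action doesn't exist yet, but a rough sketch of a composite action wrapping the unlighthouse-ci call could look like the following (the action name, input, and description are hypothetical):
# action.yml - hypothetical composite Forgejo Action wrapping the unlighthouse-ci call
name: unlighthouse-scan
description: Run unlighthouse-ci against a given site
inputs:
  site:
    description: URL of the site to scan
    required: true
runs:
  using: composite
  steps:
    - run: unlighthouse-ci --site "${{ inputs.site }}"
      shell: bash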
If you want to see the whole thing in action, check out the website repository here. In addition, you can find the generated reports at unlighthouse.firq.dev.
Anyway, I hope I gave you an interesting insight into how Unlighthouse ensures good site quality - and how YOU can also profit from website testing in your CI.