WCET and CI/CD: Integrating Timing Analysis into Embedded Software Pipelines

simpler

2026-02-02

Integrate WCET checks into CI/CD to prevent timing regressions — practical templates, baselining, and a 2026 roadmap after Vector's RocqStat acquisition.

Stop timing regressions from breaking your real-time guarantees

If your embedded team ships software that must meet hard deadlines, a single unnoticed change can push a task past its budget and create a latent safety or performance failure. In 2026, with Vector's acquisition of RocqStat and the push to unify timing analysis into mainstream verification toolchains, you can — and should — automate worst-case execution time (WCET) checks in CI/CD. This article gives a practical recipe to integrate timing analysis into embedded pipelines, prevent regressions, and maintain real-time guarantees.

The bottom line (most important first)

Integrate WCET checks into CI/CD pipelines as a gate: build, analyze, compare to a baseline, and fail the merge on regressions. Use a hybrid approach: static timing analysis (RocqStat-style) as the fast, repeatable first line, and measurement-based verification for critical paths. Automate baselining and thresholds, publish machine-readable timing reports, and tie results to your issue tracker and code reviews.

Why now (2026 trends)

Vector's January 2026 acquisition of StatInf's RocqStat brought mature static WCET tech closer to mainstream toolchains via VectorCAST — expect tighter toolchain integration and more CI-friendly interfaces through 2026.
Regulatory momentum (ISO 26262:202x updates, DO-178C addenda, and automotive safety guidance) is driving explicit timing evidence in verification artifacts.
Toolchain consolidation and cloud-native test infrastructure enable running timing analysis on CI runners and ephemeral HIL resources at scale.

How WCET fits into embedded CI/CD — the architecture

Embed WCET checks as part of the verification stage in your CI pipeline. A practical pipeline includes:

Compile and instrument (if needed) for measurement-based tests or produce ELF for static analyzers.
Run static WCET analysis (RocqStat or equivalent) to produce a conservative estimate for functions, tasks, and end-to-end scenarios.
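Execute targeted measurement tests on a hardware runner or a cycle-accurate simulator to validate assumptions for critical paths. QEMU is a functional emulator rather than a cycle-accurate one, so treat it as a way to exercise paths, not as timing evidence.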
Compare results to a stored baseline and timing budget; if exceeded beyond a configurable tolerance, fail the CI job.
Publish a timing report (JSON/HTML) and attach it to the merge request and traceability artifacts.

Key design principles

Fail early, fail loudly: Detect regressions in merge checks, not in production.
Baseline and drift control: Keep a tracked golden baseline per release branch and compare PR runs to that baseline.
Actionable outputs: Machine-readable reports with per-function deltas, stack usage, and path contributors.
Performance budget as a living artifact: Store budgets in repo (YAML/JSON) and reference them in CI.
Hybrid verification: Static for scale and coverage; measurement for validating critical assumptions (caches, timing anomalies).
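
As a minimal sketch of the "budget as a living artifact" principle, the check below reads budgets from a JSON document and reports every scenario that exceeds its limit. The file layout, the `budget_us` and `owner` fields, and the `control_loop` scenario are illustrative assumptions, not a RocqStat or VectorCAST format.

```python
import json

# Hypothetical budget file content. In a real repo this might live at
# budgets/timing.json and be reviewed like any other source file.
BUDGETS_JSON = """
{
  "sensor_read": {"budget_us": 3500, "owner": "platform-team"},
  "control_loop": {"budget_us": 1000, "owner": "controls-team"}
}
"""


def check_budgets(budgets: dict, report: dict) -> list:
    """Return one message per scenario that exceeds or lacks a budget."""
    problems = []
    for scenario in report["scenarios"]:
        name = scenario["name"]
        entry = budgets.get(name)
        if entry is None:
            # An unbudgeted scenario is reported so it cannot slip through.
            problems.append(f"{name}: no budget defined")
        elif scenario["wcet_us"] > entry["budget_us"]:
            problems.append(
                f"{name}: WCET {scenario['wcet_us']} us exceeds budget "
                f"{entry['budget_us']} us (owner: {entry['owner']})"
            )
    return problems


budgets = json.loads(BUDGETS_JSON)
report = {"scenarios": [{"name": "sensor_read", "wcet_us": 3200},
                        {"name": "control_loop", "wcet_us": 1100}]}
for line in check_budgets(budgets, report):
    print(line)
```

Keeping the owner next to the budget means a failing check can name who decides whether the budget or the code should change.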

Practical CI templates and patterns

The following templates target two common CI platforms. Tailor the commands to your actual toolchain/API (VectorCAST + RocqStat integration is expected to expose CLI/CI hooks through 2026).

GitLab CI example

stages:
  - build
  - wcet
  - test

build:
  stage: build
  script:
    - make all TARGET=hw
  artifacts:
    paths:
      - build/app.elf  # input for static analysis
      - build/app.bin  # image deployed by hw_measurements

wcet_static:
  stage: wcet
  image: ghcr.io/yourorg/wcet-runner:2026
  script:
    - mkdir -p reports
    - ./tools/rocqstat-cli analyze --input build/app.elf --output reports/wcet.json
    - ./tools/wcet-compare --baseline baselines/wcet.json --current reports/wcet.json --threshold 1.05 --fail-on-regression
  dependencies:
    - build
  artifacts:
    when: always  # keep the report even when the comparison fails the job
    paths:
      - reports/wcet.json

hw_measurements:
  stage: test
  script:
    - mkdir -p reports
    - ./tools/deploy-to-runner --target runner-1 build/app.bin
    # --output is an assumed flag on this placeholder tool; the compare step needs the file it names
    - ./tools/measure-wcet --runner runner-1 --scenario sensor_read --output reports/sensor_read.json
    - ./tools/measure-compare --baseline baselines/sensor_read.json --current reports/sensor_read.json --fail-on-regression
  when: manual  # optional: make hardware measurements manual for cost control
  dependencies:
    - build

Notes:

rocqstat-cli is a placeholder for whichever CLI is provided by RocqStat/VectorCAST integration; it will likely provide project-level or file-level analysis outputs in JSON. Adjust tooling names to your installed versions.
Thresholds (1.05 = 5% tolerance) should be set by platform owners and traceable to requirements.

GitHub Actions snippet

name: CI with WCET
on: [pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: make all TARGET=hw
      - name: Upload ELF
        uses: actions/upload-artifact@v4
        with:
          name: app-elf
          path: build/app.elf

  wcet:
    needs: build
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # lets the report step comment on the PR
    steps:
      - uses: actions/checkout@v4
      - name: Download ELF
        uses: actions/download-artifact@v4
        with:
          name: app-elf
          path: build  # restore the file to build/app.elf
      - name: Run static WCET
        run: |
          mkdir -p reports
          ./tools/rocqstat-cli analyze --input build/app.elf --output reports/wcet.json
          ./tools/wcet-report-to-comment reports/wcet.json --token ${{ secrets.GITHUB_TOKEN }}
      - name: Compare baseline
        run: ./tools/wcet-compare --baseline baselines/wcet.json --current reports/wcet.json --threshold 1.05 --fail-on-regression

Baselining and regression policy

Baseline management is the single highest-impact practice for preventing false positives and maintaining trust in automated timing checks.

Baseline strategy

Keep a per-branch baseline file (JSON) checked into a /baselines folder or managed by a release artifact registry.
When a planned performance change is approved (e.g., algorithmic improvement), update the baseline via a controlled change request that includes the timing evidence.
Use semantic versioning for baseline artifacts tied to releases.

Comparator rules

Set absolute and relative thresholds (e.g., absolute: 2 ms, relative: 5%).
Fail CI on violations unless a reviewer marks the regression as acceptable and a baseline update is merged.
Provide a “soft failure” mode: the pipeline opens an automatic issue and comments on the MR with the delta; this mode is useful for non-critical branches.
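
The comparator rules above can be sketched as follows. This is an illustration under stated assumptions, not the behavior of any shipped `wcet-compare` tool: a scenario counts as a regression when it grows by more than either limit, a scenario with no baseline is flagged, and `soft_fail` only changes the exit code. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    scenario: str
    delta_us: float
    delta_pct: float
    regression: bool


def compare(baseline_us: dict, current_us: dict,
            abs_limit_us: int = 2000, rel_limit_pct: float = 5.0) -> list:
    """Compare per-scenario WCET values (microseconds) against a baseline.

    Assumes baseline values are positive. Defaults mirror the example
    thresholds in the text: 2 ms absolute, 5% relative.
    """
    verdicts = []
    for name in sorted(current_us):
        current = current_us[name]
        base = baseline_us.get(name)
        if base is None:
            # No baseline yet: flag it so a new scenario cannot slip through.
            verdicts.append(Verdict(name, current, float("inf"), True))
            continue
        delta_us = current - base
        delta_pct = 100.0 * delta_us / base
        regression = delta_us > abs_limit_us or delta_pct > rel_limit_pct
        verdicts.append(Verdict(name, delta_us, round(delta_pct, 2), regression))
    return verdicts


def exit_code(verdicts: list, soft_fail: bool = False) -> int:
    """Return 0 to let the pipeline pass, 1 to block the merge."""
    regressed = any(v.regression for v in verdicts)
    return 1 if regressed and not soft_fail else 0


verdicts = compare({"sensor_read": 3100, "control_loop": 1000},
                   {"sensor_read": 3200, "control_loop": 1100})
for v in verdicts:
    print(v)
```

In soft-fail mode the same verdict list would feed the automatic issue and merge request comment while the job itself stays green.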

Measurements, models, and why static analysis helps

In 2026, teams will use a mix of verification techniques. Static timing analysis like RocqStat provides:

Path coverage: Coverage of all feasible control-flow paths without exhaustive execution.
Conservative estimates: Safe upper bounds that account for hardware features modeled (caches, pipelines, multicore interference).
Scalability: Fast enough to run on CI for every PR.

Measurement-based testing validates static assumptions, especially on real hardware where microarchitectural effects or compiler codegen patterns matter. Use both — static WCET as the gatekeeper and spot-check measurements for high-risk updates.
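
One way to use measurements as a check on the static model: a sound static bound can never be lower than an observed execution time, so any sample above the bound points to a wrong assumption (cache configuration, loop bound, clock setting) rather than a slow run. The sketch below assumes per-scenario samples in microseconds; the function name and result fields are hypothetical.

```python
def cross_check(static_bound_us: int, samples_us: list) -> dict:
    """Compare a static WCET bound with measured execution times."""
    high_water_us = max(samples_us)
    return {
        "high_water_us": high_water_us,
        # False means the static model is unsound for this scenario.
        "sound": high_water_us <= static_bound_us,
        # How far the bound sits above the worst observed run; a very large
        # ratio suggests the model is needlessly pessimistic.
        "pessimism_ratio": round(static_bound_us / high_water_us, 2),
    }


result = cross_check(3200, [2400, 2510, 2475])
print(result)
```

A ratio close to 1 with `sound` still true is the comfortable case; a ratio below 1 should block the release until the model is corrected.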

Instrumentation, observability, and reports

WCET checks are only useful if results are traceable and actionable. Publish these artifacts in CI:

JSON timing reports: per-function and per-scenario upper bounds and deltas vs baseline.
HTML human-readable summaries for reviewers.
Diffs highlighting which code paths increased and by how much (stack + path contributors).
Links to reproduction scripts and HIL run IDs for measurements.

Example JSON result schema (minimal)

{
  "version": "1.0",
  "binary": "app.elf",
  "timestamp": "2026-01-18T12:00:00Z",
  "scenarios": [
    {
      "name": "sensor_read",
      "wcet_us": 3200,
      "baseline_us": 3100,
      "delta_pct": 3.2,
      "paths": [
        {"id": "p1","wcet_us": 1800},
        {"id": "p2","wcet_us": 1400}
      ]
    }
  ]
}
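
A reviewer-facing summary can be derived from a report in this shape. The sketch below recomputes the delta from `wcet_us` and `baseline_us` instead of trusting the stored `delta_pct`, and lists the paths from largest to smallest; the `summarize` helper and its output format are illustrative assumptions.

```python
import json

REPORT_JSON = """
{
  "version": "1.0",
  "binary": "app.elf",
  "timestamp": "2026-01-18T12:00:00Z",
  "scenarios": [
    {
      "name": "sensor_read",
      "wcet_us": 3200,
      "baseline_us": 3100,
      "delta_pct": 3.2,
      "paths": [
        {"id": "p1", "wcet_us": 1800},
        {"id": "p2", "wcet_us": 1400}
      ]
    }
  ]
}
"""


def summarize(report: dict) -> list:
    """Return plain-text summary lines, one scenario followed by its paths."""
    lines = []
    for s in report["scenarios"]:
        delta_us = s["wcet_us"] - s["baseline_us"]
        delta_pct = 100.0 * delta_us / s["baseline_us"]
        lines.append(
            f"{s['name']}: {s['wcet_us']} us "
            f"({delta_us:+d} us, {delta_pct:+.1f}% vs baseline)"
        )
        # Largest contributors first, so reviewers see hotspots at the top.
        for p in sorted(s.get("paths", []), key=lambda p: p["wcet_us"], reverse=True):
            lines.append(f"  path {p['id']}: {p['wcet_us']} us")
    return lines


summary = summarize(json.loads(REPORT_JSON))
print("\n".join(summary))
```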

Infrastructure as Code (IaC) for timing labs

Bring reproducible test hardware and runners under IaC. You don't need to provision expensive physical HIL for every run — use emulator-based tests where possible and reserve HIL for nightly/regression runs.

Terraform example (provision ephemeral test runner)

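variable "branch" {
  type        = string
  description = "Branch name used to tag the ephemeral runner"
}

provider "aws" {
  region = "us-west-2"
}

resource "aws_instance" "wcet_runner" {
  ami           = "ami-0abcdef1234567890"  # placeholder; use your runner image
  instance_type = "c5d.4xlarge"
  tags = {
    Name = "wcet-runner-${var.branch}"
  }
}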

Attach unique tags, install required cross toolchains and RocqStat/VectorCAST agent via cloud-init, and destroy runners after runs. For secure regulated environments, integrate with on-prem HIL orchestrators and use Terraform providers for VLANs and access control — or consider a community-cloud/co‑op model for shared governed infrastructure.

Handling multicore and interference in CI

Multicore WCET and interference analysis remain hard problems. Practical steps for CI:

Start with single-core analysis, and pin critical tasks to dedicated cores where possible.
Model and analyze interference explicitly for shared buses, caches, and memory contention in static analysis; include the models in your version control.
Where interference modeling is impractical, use measurement-based stress tests on representative load and include worst-case observed values as separate evidence files.
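
For the measurement-based fallback, a separate evidence file can record the worst observed value per scenario together with the load it was measured under. The sketch below is one possible shape; the field names and the load-profile label are assumptions, and the `kind` field exists to keep observed maxima from being mistaken for proven bounds.

```python
import json


def build_evidence(runs: dict, load_profile: str) -> dict:
    """Summarize stress-test samples (scenario name -> list of times in us)."""
    return {
        "kind": "measured-high-water-mark",  # observed maximum, not a proven bound
        "load_profile": load_profile,
        "scenarios": {
            name: {"max_observed_us": max(samples), "samples": len(samples)}
            for name, samples in sorted(runs.items())
        },
    }


evidence = build_evidence({"sensor_read": [2900, 3050, 2980]}, "dma-and-cache-stress")
print(json.dumps(evidence, indent=2))
```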

Real-world example: Automotive ECU team (fictionalized case study)

Background: A mid-size automotive company had sporadic timing spikes after code refactors. They needed a CI-side guardrail without slowing down their dev loop.

What they did:

Added a RocqStat-based static analysis step to PR pipelines to compute WCET for critical tasks.
Stored a per-branch baseline and set a 3% relative tolerance for non-safety-critical paths, and 0% tolerance for safety-critical tasks tied to ISO 26262 requirements.
Automated comment generation on PRs showing per-function deltas and recommended remediation links.
Kept a nightly job that runs hardware measurements across variations and updates the baseline after an approved performance change.

Results within three months:

Timing regressions were caught in PRs rather than late in integration tests.
Average time-to-merge improved because less rework was required later.
Traceability artifacts helped during audits and reduced manual effort for compliance reports.

Automation recipes and tips for maintainability

Automate model updates: When hardware or compiler changes occur, schedule re-analysis and baseline validation jobs and gate the rollout.
Limit scope for PR runs: For speed, analyze only changed modules; run full-system WCET on nightly builds.
Cache results smartly: Store previous analysis outputs as artifacts keyed by commit hash to speed comparisons.
On-call alerting: Automatically create issues with full artifacts for traceability and assign to platform owners on regressions — tie this into your incident response workflows.
Use feature flags: For large refactors that might cause temporary regressions, use feature-flagged merges with performance reviews and a plan to bring timing back under budget.
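
A note on the caching tip: a commit hash alone may not identify an analysis result, because the same commit analyzed with a different compiler or hardware model can produce a different bound. The sketch below derives a cache key from all three; the function name, the `wcet-` prefix, and the example version strings are hypothetical.

```python
import hashlib


def cache_key(commit: str, compiler: str, hw_model_version: str) -> str:
    """Build a stable artifact key from everything the result depends on."""
    material = "\n".join([commit, compiler, hw_model_version]).encode("utf-8")
    # 16 hex characters keep the key short while making collisions unlikely.
    return "wcet-" + hashlib.sha256(material).hexdigest()[:16]


key = cache_key("3f2a9c1", "arm-none-eabi-gcc 13.2", "cortex-m7-model v4")
print(key)
```

With this scheme a compiler or model upgrade changes the key, so stale results are never reused after a toolchain change.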

The human side: processes and governance

CI automation is powerful, but it must be backed by governance:

Define owners for timing budgets per module and scenario.
Document acceptable tolerances and how to request baseline updates.
Include WCET evidence in your change control and release notes.
Train reviewers to read timing diffs: make the output digestible with highlighted hotspots and suggested fixes (e.g., inline loop bounds, algorithmic alternatives).

Future-proofing: what to expect in 2026–2027

With Vector integrating RocqStat into VectorCAST in 2026, expect:

Tighter IDE/CI plugins with standardized JSON/Protobuf timing artifacts that are CI-friendly.
More hybrid analysis capabilities that combine static proofs with measured annotations.
Better support for multicore interference models and automated model calibration using telemetry from fleet devices and HIL labs.
More regulated tool qualification guidance for timing analysis to ease inclusion in safety cases (ISO 26262/DO-178C workflows).

Advanced strategies for large codebases

For teams with thousands of modules:

Prioritize: classify code into critical/non-critical and apply strict CI gates only to critical modules.
Incremental analysis: use change impact analysis to only compute WCET for affected paths.
Partitioning: enforce temporal partitioning through RTOS configs and validate partitions with separate analysis runs.
Model-based requirements: link timing budgets directly to higher-level requirements in your requirements management tool and automate traceability reports.
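
Incremental analysis depends on knowing which tasks a change can affect. A minimal sketch, assuming a call graph is already available as a mapping from caller to callees (for example, exported by the compiler or analyzer): walk the graph backwards from the changed functions and keep the task entry points that are reached. The function and task names are hypothetical, and indirect calls through function pointers would need to be present in the graph for the result to be complete.

```python
from collections import deque


def affected_entry_points(call_graph: dict, entry_points: set, changed: set) -> set:
    """Return the task entry points that can reach any changed function."""
    # Invert the graph so we can walk from callee to callers.
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    seen = set(changed)
    queue = deque(changed)
    while queue:
        fn = queue.popleft()
        for caller in callers.get(fn, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen & entry_points


graph = {
    "task_sensor": ["read_adc", "filter"],
    "task_comm": ["encode", "crc16"],
    "filter": ["saturate"],
}
tasks = {"task_sensor", "task_comm"}
print(sorted(affected_entry_points(graph, tasks, {"saturate"})))
```

Only the returned tasks need a PR-time WCET run; the nightly full-system analysis remains the safety net for anything the graph misses.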

Mistakes to avoid

Relying solely on measurement-based tests in CI — they miss unexecuted worst-case paths.
Setting thresholds too tight and creating noisy CI failures; this erodes trust in automation.
Not versioning timing models — changes to cache or CPU models must be auditable.
Failing to integrate artifacts into your change control and audit trails — timing evidence must be part of your release artifacts.

Actionable checklist — get this into your CI in 8 steps

Add a static WCET analysis step (rocqstat or equivalent) to PR pipelines with a machine-readable report output.
Create and commit a per-branch baseline file; define absolute and relative thresholds.
Publish timing reports as pipeline artifacts and attach to merge requests automatically.
Fail the merge when critical tasks exceed their budgets; provide a soft-fail mode for non-critical branches.
Run nightly full-system static and measurement-based verification and update baselines via change requests.
Provision ephemeral runners via IaC for reproducible measurements, and destroy after runs to control cost. Start with cloud VMs and scale to on-prem HIL as needed.
Implement automated issue creation and owner assignment on regression detection.
Train reviewers and add timing evidence to your release and audit artifacts.

"Timing safety is becoming a critical requirement" — the industry signal from Vector’s acquisition of RocqStat is clear: timing analysis must move into mainstream CI practices. Make it part of your pipelines now.

Final thoughts: why integrating WCET into CI is non-negotiable

In 2026, timing analysis tools are becoming CI-first. Teams that adopt integrated WCET checks reduce late-stage surprises, accelerate safe delivery, and improve auditability for safety cases. The Vector + RocqStat move signals a maturation: expect richer CI integrations, standardized artifacts, and more automation-friendly workflows. If your product depends on real-time guarantees, integrating WCET into your CI/CD pipeline is no longer optional — it's engineering hygiene.

Call to action

Ready to add WCET checks to your pipeline? Start with the 8-step checklist above. If you want a jump-start: clone our sample CI repo (includes GitLab and GitHub Actions templates, baseline management scripts, and Terraform examples) or contact the simpler.cloud team for a hands-on workshop to tailor timing automation for your stack and compliance needs.
