Release 6: Agentic Deploy Pipeline and Module Health Guard
Status: Complete | Sprints: 8 | GitHub Release tag: v1.6.0
Release 6 builds the development infrastructure that makes AI-collaborative development practical at scale: a reproducible uv-based toolchain, a fast PC inner loop, compact artifacts, USB-hub parallel flash, and an MCP server so Claude can drive the full build/flash/test cycle from within a conversation. The second half adds runtime safety and observability: a proactive heap guard, graceful allocation refusal, and per-module health dots in the UI.
Release Overview
| Area | Highlights |
|---|---|
| Toolchain | pyproject.toml + uv.lock; pre-commit (clang-format + ruff); core_only CMake PAL boundary check |
| Serial hygiene | TIMING gated on 20% change or 60 s silence; MemLive edge detector; GET /api/log ring buffer |
| Fast inner loop | all_pc.py (build + unittest + PC livetest, ~73 s); all_devices.py (flash + ESP32 livetest) |
| Compact artifacts | live-results-*.json 85% smaller; live-results-all.json merged across devices |
| Status docs | docs/status/index.md (per-device table); live-results.md cross-device matrix |
| USB hub scaling | "group" field; parallel flash + test (--workers N); device-type split in live-results.md |
| MCP server | deploy/mcp_server.py with 8 tools; .mcp.json; Claude drives build/flash/test without leaving the chat |
| Memory guard | pal::check_alloc(); disableSelf()/setupOk_; green/red dot with health tooltip in UI |
Sprint 1: Foundation Tooling
Goal: reproducible Python environment and compile-enforced PAL boundary so every subsequent sprint inherits clean tooling.
- `pyproject.toml` + `uv.lock`: pins all transitive deps; `uv sync --extra dev` is the complete setup step.
- `.pre-commit-config.yaml`: clang-format (C++) + ruff (Python); `uv run pre-commit install` wires into git hooks.
- `core_only` CMake OBJECT target: compiles 6 PAL-free core headers with error-triggering stubs; PAL violations become compile errors.
- CI lint job updated to `uv sync` + `uv run pre-commit run --all-files`.
- `timeMicros()` removed from `Timing.h`; 9 callers migrated to `pal::micros()`.
- PAL discovery: `StatefulModule.h` had hidden `pal::` calls via transitive include; explicit `#include "pal/Pal.h"` added to surface the dependency.
- `Scheduler.h/cpp`, `ModuleManager.cpp`, `PhysMap.h`: backlogged violations documented.
| Metric | Value |
|---|---|
| Tests | 357/357 (unchanged) |
| core_only PAL-clean headers | 6 (Module.h, Timing.h, Logger.h, KvStore.h, Coord3D.h, BuildInfo.h) |
| uv.lock | 11 packages pinned |
Sprint 2: Serial Output Discipline
Goal: remove periodic serial noise; log state changes, not heartbeats.
- `Scheduler.cpp`: TIMING logged only when `totalMsPerTick` changes by more than 20% or after 60 s silence (`lastTimingTotalMs_` / `lastTimingLogUs_` fields).
- MemLive edge detector: `memWarnActive_` bool; logs on rising edge (WARNING) and falling edge (OK) only.
- `printNetworkInfo()` change-gated via `lastNetReport_` cache.
- `GET /api/log`: 64-entry ring buffer (8 KB static); `logPush()` feeds both ring and WebSocket; `logClearRing()` for test isolation.
- Ring reduced from planned 256 to 64 entries: 32 KB would take ~50% of classic ESP32 free heap.
| Metric | Value |
|---|---|
| Tests | 361/361 (+4 ring buffer cases in test_logger.cpp) |
| Static RAM cost | +8192 B (ring) + ~104 B (4 Scheduler fields) |
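The two gating rules from this sprint can be modelled in a few lines. The sketch below is a Python model of the logic only, not the firmware code (which is C++ in `Scheduler.cpp`); the class and method names are illustrative assumptions, and the handling of the very first sample is a guess.

```python
from dataclasses import dataclass
from typing import Optional

CHANGE_THRESHOLD = 0.20        # log when the total moves by more than 20%
SILENCE_US = 60 * 1_000_000    # ...or when nothing was logged for 60 s


@dataclass
class TimingGate:
    """Model of the change-gated TIMING log (names are hypothetical)."""
    last_total_ms: float = 0.0
    last_log_us: int = 0

    def should_log(self, total_ms: float, now_us: int) -> bool:
        silent_too_long = now_us - self.last_log_us >= SILENCE_US
        if self.last_total_ms == 0.0:
            # Assumption: the first non-zero sample always counts as a change.
            changed = total_ms != 0.0
        else:
            delta = abs(total_ms - self.last_total_ms)
            changed = delta / self.last_total_ms > CHANGE_THRESHOLD
        if changed or silent_too_long:
            self.last_total_ms = total_ms
            self.last_log_us = now_us
            return True
        return False


@dataclass
class EdgeDetector:
    """Model of the MemLive detector: report transitions, not steady state."""
    active: bool = False

    def update(self, low_memory: bool) -> Optional[str]:
        if low_memory and not self.active:
            self.active = True
            return "WARNING"   # rising edge
        if not low_memory and self.active:
            self.active = False
            return "OK"        # falling edge
        return None            # no change, stay silent
```

In this model a steady 10 ms tick logs once and then stays quiet until the total drifts past 20% or a minute passes, which is the behaviour the sprint goal describes.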
Sprint 3: Fast Inner Loop
Goal: split all.py so the PC development loop completes without touching hardware.
- `deploy/all_pc.py`: `build.py -target pc` + `unittest.py` + `livetest.py -type pc`; ~73 s warm build.
- `deploy/all_devices.py`: build esp32 envs + flash + optional `flashfs` + mem capture + livetest esp32 + summarise.
- `deploy/all.py`: thin delegation wrapper; `--flashfs` forwarded.
- `deploy/_lib.py`: `run_step()` and `wait_for_esp32s()` shared helpers.
- `py = ["uv", "run"]` in all three scripts: project venv always active, CI and local invocations byte-for-byte identical. MCP server (Sprint 7) inherits the same isolation for free.
- CI PC job: 4 manual steps replaced by `uv run deploy/all_pc.py`.
| Metric | Value |
|---|---|
| Tests | 361/361 (unchanged) |
| all_pc.py wall clock | 73 s (3/3 steps passed) |
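A shared step runner of the kind this sprint describes might look roughly like the sketch below. The real signature of `run_step()` in `deploy/_lib.py` is not shown in these notes, so the parameters, the PASS/FAIL output, and the stop-at-first-failure policy are all assumptions. The interpreter prefix is swapped for `sys.executable` so the example runs without `uv` installed.

```python
import subprocess
import sys

# The release notes use py = ["uv", "run"] as the command prefix.
# sys.executable stands in here only so the sketch runs anywhere.
PY = [sys.executable]


def run_step(name: str, args: list[str]) -> bool:
    """Run one pipeline step and report pass/fail (hypothetical signature)."""
    result = subprocess.run(PY + args)
    ok = result.returncode == 0
    print(f"[{'PASS' if ok else 'FAIL'}] {name}")
    return ok


def run_pipeline(steps: list[tuple[str, list[str]]]) -> bool:
    """Run steps in order; assumed policy: stop at the first failure."""
    for name, args in steps:
        if not run_step(name, args):
            return False
    return True
```

Because every step goes through the same prefix, switching `PY` back to `["uv", "run"]` would make each script run inside the locked project environment, which is the isolation property the sprint relies on.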
Sprint 4: Compact Log Files
Goal: reduce artifact footprint so deploy/test/ stays readable as device count grows.
- `deploy/live_suite.py` `R.to_dict()`: `assertions` key omitted for all-pass tests; only failures included when a test fails.
- `deploy/livetest.py` merge step: `_merge_results()` writes `deploy/live/live-results-all.json` (array of device entries, `"current": true/false`); copies to `live-results-all-last-good.json` on all-pass.
- `deploy/summarise.py`: reads `live-results-all.json`; backward-compatible fallback to per-device files.
| Metric | Value |
|---|---|
| Tests | 361/361 (unchanged) |
| live-results-pc.json | 810 lines / 21 KB before; 119 lines / 3 KB after (85% reduction) |
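The compaction and merge rules can be illustrated with a small sketch. Everything here other than the `assertions` and `current` keys is an assumption: the class names, the `device` key, and in particular the meaning of `"current"` (taken here to mark entries refreshed in the latest run).

```python
from dataclasses import dataclass, field


@dataclass
class Assertion:
    name: str
    passed: bool


@dataclass
class LiveResult:
    """Hypothetical stand-in for the result record in live_suite.py."""
    name: str
    assertions: list = field(default_factory=list)

    def to_dict(self) -> dict:
        failures = [a for a in self.assertions if not a.passed]
        out = {"name": self.name, "passed": not failures}
        if failures:
            # Only failing assertions are written; passing ones add no signal.
            out["assertions"] = [{"name": a.name, "passed": False} for a in failures]
        return out


def merge_results(previous: list, fresh: list) -> list:
    """Merge device entries; fresh entries replace stale ones by device name."""
    fresh_names = {e["device"] for e in fresh}
    merged = [dict(e, current=False) for e in previous if e["device"] not in fresh_names]
    merged += [dict(e, current=True) for e in fresh]
    return merged
```

An all-pass test serialises to two keys regardless of how many assertions it ran, which is where most of the size reduction in a mostly green suite would come from.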
Sprint 5: Consolidated Status Docs
Goal: single cross-device test matrix that stays readable at 16 devices.
- `docs/status/index.md`: one row per device, columns for unit tests, live tests, heap_free, fps; 12 lines at 3 devices.
- `docs/status/live-results.md`: all devices in one matrix (rows = tests, columns = devices); device-type groups.
- `deploy-summary.md`: compressed from 132 to 10 lines (header block + pipeline table only).
- `test-results.md`: unchanged; full per-file test list preserved for discoverability and module doc anchor links.
- `heap_free_kb` and `fps` added to `device_info` in `live_suite.py`; stored in `live-results-all.json`.
- Also fixed: two ESP32 CI regressions from Sprint 3's `lib_ldf_mode` change: switched to `lib_compat_mode = soft` + explicit `lib_ignore = RPAsyncTCP, ESPAsyncTCP`; `esp32_footprint.py` graceful return on missing size lines.
| Metric | Value |
|---|---|
| Tests | 361/361 (unchanged) |
| docs/status/index.md | 12 lines for 3 devices (target < 80) |
Sprint 6: USB Hub Scaling
Goal: parallel flash and test across N devices; group filter for targeting a subset.
- `devicelist.json`: `"group"` field added to all `test:true` devices.
- `select()` bug fixed: `default_test_true` and `filters` now ANDed independently (was silently OR-ing).
- `flash.py`: `--workers N` (default 4); `_flash_one(d, esptool) -> bool`; `ThreadPoolExecutor` replaces sequential loop.
- `livetest.py`: `--workers N` (default 4); `_run_esp32_test(d) -> bool`; `ThreadPoolExecutor` for both ESP32 paths.
- `all_devices.py`: `--group <name>` + `--workers N`; `esp32_envs` derived from filtered device set (build and flash always in sync).
- `all_pc.py`: `summarise.py` added as step 4.
- `summarise.py`: `live-results.md` split by device type; `_short_chip()` strips Rev suffix.
| Metric | Value |
|---|---|
| Tests | 361/361 (unchanged) |
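The two behaviours introduced in this sprint, AND-combined device filters and a worker pool for flashing, can be sketched as follows. The device dictionaries, the `broken` flag, and the function signatures are illustrative assumptions; the real `_flash_one()` drives esptool rather than returning a canned result.

```python
from concurrent.futures import ThreadPoolExecutor


def select(devices, group=None, default_test_true=True):
    """Keep devices that satisfy every active filter (filters are ANDed)."""
    selected = []
    for d in devices:
        if default_test_true and not d.get("test", False):
            continue                      # fails the test:true filter
        if group is not None and d.get("group") != group:
            continue                      # fails the group filter
        selected.append(d)
    return selected


def flash_one(device) -> bool:
    """Stand-in for the per-device flash; fails only if marked broken."""
    return not device.get("broken", False)


def flash_all(devices, workers=4) -> bool:
    """Flash devices concurrently; True only if every device succeeded."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(flash_one, devices))
    return all(results)
```

Threads suit this workload because each flash mostly waits on a serial port, so the pool overlaps I/O across devices rather than competing for CPU.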
Sprint 7: MCP Server
Goal: wrap the stable script set as MCP tools so an AI agent can trigger builds, flash firmware, and read test results without leaving the conversation.
- `pyproject.toml`: `mcp = ["mcp>=1.0"]` optional dependency; `uv run --extra mcp` handles install automatically.
- `deploy/mcp_server.py`: FastMCP server; 8 tools (`run_all_pc`, `run_all_devices`, `run_build`, `run_flash`, `run_livetest`, `run_summarise`, `read_status`, `list_devices`). Each spawns `uv run deploy/<script>.py` and returns combined stdout+stderr; non-zero exit appends `[exit N]`.
- `.mcp.json`: project-level config; Claude Code picks it up without manual setup.
- Tool schema derived automatically from type hints and docstrings by FastMCP.
- `--merge-ports` deferred: device-management utility, not a pipeline step for an AI agent.
| Metric | Value |
|---|---|
| Tests | 361/361 (unchanged) |
| Tools | 8 tools; list_devices smoke test passes |
What this enables: the full development loop (propose change, implement, build, flash two devices in parallel, run live test suite, read results, diagnose failures, iterate) runs without leaving the chat window.
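The subprocess wrapper behind each tool can be sketched without the MCP dependency. The helper name `run_script` and the exact placement of the `[exit N]` marker are assumptions; only the behaviour (combined stdout+stderr, marker on non-zero exit) comes from the sprint notes. In the actual server, a function like `run_all_pc` would additionally be registered as a tool on a FastMCP instance, which derives the schema from its type hints and docstring.

```python
import subprocess
import sys


def run_script(cmd: list[str]) -> str:
    """Run a command and return combined stdout+stderr, tagging failures."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same stream
        text=True,
    )
    output = proc.stdout
    if proc.returncode != 0:
        output += f"\n[exit {proc.returncode}]"
    return output


def run_all_pc() -> str:
    """Build, unit-test and live-test the PC target."""
    # Requires uv and the project checkout; not exercised in the demo below.
    return run_script(["uv", "run", "deploy/all_pc.py"])


if __name__ == "__main__":
    # Demo with the current interpreter so the sketch runs without uv.
    print(run_script([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

Returning the exit code inside the text keeps the tool's return type a plain string, so the agent can see a failure without a separate error channel.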
Sprint 8: Memory Guard
Goal: prevent a large module allocation from silently exhausting the heap and leaving the HTTP server unreachable.
- `pal::check_alloc(bytes, reserve_bytes) -> bool`: returns `false` if `free_heap_bytes() - bytes < reserve_bytes`; `PAL_HEAP_RESERVE_BYTES = 90 KB` (calibrated from live device failure: covers `server.begin` ~28 KB + WiFi buffers ~24 KB + headroom + fragmentation dead space); always `true` on PC.
- `StatefulModule`: `setupOk_` bool + `disableSelf()` helper + `setupOk()` accessor; `runSetup()` resets to `true`; `runLoop()` skips `loop()` when `!setupOk_`.
- `EffectsLayer::allocate_()`: guard before `psram_malloc`; with fallback buffer keeps previous size silently; `disableSelf()` only when no fallback exists.
- `DriverLayer::allocate_()`: allocate-before-free; null check; refused alloc keeps previous buffer.
- `DriverLayer::onChildrenReady()`: proactive `check_alloc` on new geometry; on failure marks the most recently added layout child `disableSelf()` (correct red dot attribution: layout that caused the problem shows red, not EffectsLayer).
- `ArtNetOutModule`: `disableSelf()` + early return on null `pkt_`.
- `GET /api/modules`: `"health"` and `"setup_ok"` fields added to every module entry.
- Frontend: green/red dot (●) next to each module name; tooltip shows `healthReport()` string (the first place `healthReport()` output is visible in the UI).
- Unified `[MemBoot]`/`[MemLive]` log format: `DeltaKB = FreeKB (frag=X%, largest=YKB)` across all emit sites.
- `uptime_s` control uses `"time"` uiType; `fmtTime()` helper in `app.js` formats as `Nd Nh Nm Ns`.
| Metric | Value |
|---|---|
| Tests | 364/364 (+3: disableSelf, setupOk reset, DriverLayer refusal) |
| Reserve calibration | 90 KB: 60 KB proved insufficient on MM-C1BC (server.begin costs 28 KB after guard runs) |
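The `setupOk_` lifecycle is easier to follow as a small state model. The sketch below is a Python model of the behaviour described in this sprint, not the C++ class itself; the snake_case names, the `loop_calls` counter, and `OversizedModule` are illustrative assumptions.

```python
class StatefulModule:
    """Python model of the setupOk_ gate; the firmware class is C++."""

    def __init__(self):
        self.setup_ok = True
        self.loop_calls = 0

    def setup(self):
        """Overridden by concrete modules."""

    def loop(self):
        self.loop_calls += 1

    def disable_self(self):
        self.setup_ok = False

    def run_setup(self):
        self.setup_ok = True   # every setup attempt starts healthy
        self.setup()

    def run_loop(self):
        if not self.setup_ok:
            return             # disabled modules are skipped
        self.loop()


class OversizedModule(StatefulModule):
    """Hypothetical module whose allocation is refused during setup."""

    def setup(self):
        self.disable_self()    # models a refused allocation
```

Resetting the flag in `run_setup()` means a module that was disabled by an oversized configuration gets a fresh chance once setup runs again with a smaller one.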
Post-sprint esp32dev (MM-C1BC, no PSRAM) memory floor:
| Stage | Free | Largest | Frag |
|---|---|---|---|
| After module setup | ~235 KB | 108 KB | 55% |
| After server.begin | ~149 KB | 80 KB | 47% |
| Live steady-state | 109-126 KB | 60-72 KB | 41-48% |
Live floor of ~109 KB sits only ~19 KB above the 90 KB reserve — enough to run but no room for additional module allocations. Dual check_alloc guard (total free + largest block) and WiFi buffer tuning tracked in Backlog.
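The headroom arithmetic can be checked with a small model of the guard. This is a Python restatement of the rule `free_heap_bytes() - bytes < reserve_bytes`, with free heap passed in explicitly for illustration (an assumption; the firmware reads it internally) and the approximate table values treated as exact.

```python
KB = 1024
PAL_HEAP_RESERVE_BYTES = 90 * KB


def check_alloc(free_heap_bytes: int, request_bytes: int,
                reserve_bytes: int = PAL_HEAP_RESERVE_BYTES) -> bool:
    """Allow the allocation only if at least the reserve stays free."""
    return free_heap_bytes - request_bytes >= reserve_bytes


live_floor = 109 * KB
print(check_alloc(live_floor, 19 * KB))   # True: exactly 90 KB would remain
print(check_alloc(live_floor, 20 * KB))   # False: only 89 KB would remain
print(check_alloc(235 * KB, 100 * KB))    # True: 135 KB would remain
```

The model only looks at total free heap. With fragmentation in the 41-48% range, a request can pass this check and still fail if no single block is large enough, which is the gap the backlogged largest-block check is meant to close.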
Retrospective: Release 6 complete
What was proven:
- `uv run` + `uv.lock` makes every script invocation reproducible across contributor machines, CI, and MCP tool calls.
- The MCP server closes the agentic feedback loop: Claude can drive build, flash, test, and diagnosis without terminal context switches.
- Pre-checking heap with `check_alloc` before a large `psram_malloc` prevents the half-initialised module state that previously left HTTP unreachable.
- Red dot attribution (layout child, not EffectsLayer) required live device debugging: the `disableSelf()` placement reflects operator intuition, not code topology.
Watch points going into Release 7:
- 8 KB log ring buffer is a meaningful static cost on non-PSRAM esp32dev; consider halving to 4 KB.
- `StatefulModule.h` PAL violations (backlogged in Sprint 1) block full `core_only` coverage.
- MCP tools collect output at the end of a long flash run; streaming progress (`ctx.report_progress()`) is the next usability improvement.
- OTA flash (no USB cable) not yet implemented; `flash.py` still requires a serial port.