Skip to content

Release 8: Dynamic Controls, Deploy Pipeline, and Runtime Hardening

Theme: Release 8 adds dynamic control schemas (rebuild a module's control set at runtime so the UI shows only relevant parameters), a complete deploy pipeline overhaul (log-to-docs data flow, browser deploy UI, MCP tools), and a series of runtime hardening improvements: ArtNet unicast and persistent sockets, WiFi/WS heap stability, namespace projectMM for FastLED compatibility, NTP time sync, and generic module auto-wiring via autoWireKeys().


Release Overview

What was delivered in Release 7 (build on this)

Strength Notes
OTA firmware update FirmwareUpdateModule: file upload + GitHub releases tab; POST /api/firmware
CI release pipeline Tagged releases + nightly pre-release with firmware assets on GitHub
Windows support Native .exe build; projectMM-pc-windows.zip in CI artifacts
Scenario baselines Hardware --update-baseline run; "extends" inheritance; wired into all.py
Static RAM hardening Per-device LOG_RING_SIZE; WiFi buffer tuning; dual check_alloc guard
Log frontend panel WS push of ring buffer entries; collapsible log UI

What Release 8 addresses

Problem Sprint
Control schema is fixed at setup() time; irrelevant parameters always visible regardless of selected type Sprint 1 (Dynamic controls: clearControls(), rebuildControls(), early WS flush), complete
Static RAM column in techdebt monitor always shows 0 (parser bug); no accounting of what consumes the 51 KB ESP32 RAM; Notable Findings have no action owners Sprint 3 (RAM accounting, parser fix, actions table), complete
classSize() misses runtime heap (controls_[] array, pendingProps_ doc); large char[] struct members inflate classSize; scanner blind to allocations in private helpers Sprint 4 (baseHeapUsage, char[] audits, scanner improvements), complete
Deploy pipeline grew to 17+ scripts with no architecture overview; steps produced no status pages; techdebt.py name misleading; orchestrators monolithic Sprints 5-10 (full log→md pipeline, orchestrator restructuring, naming cleanup), complete
No interactive way to trigger individual deploy scripts; MCP tools covered only orchestrators; no AI-assisted log analysis; deploy.md was CLI-first with no visual overview Sprint 11 (browser deploy UI, run_script/read_log MCP tools, deploy.md overhaul), complete
struct RGB in global namespace collides with FastLED's enum EOrder { RGB=0012 }; ArtNetOutModule hardwired to DriverLayer; unicast destination not configurable; static IP fields always visible regardless of DHCP/static mode; ArtNet frames stutter on PC (per-packet socket lifecycle, macOS heap scan in hot path) Sprint 13 (namespace projectMM, ArtNet generic source + unicast + fps_limit_ + persistent socket, Ethernet rebuildControls, mkdocs kill-on-stop, macOS heap cache), complete
Consecutive rebuildControls() calls only show in the browser once (WS shared-buffer in-flight corruption); no wall-clock time on device; per-type strcmp chains in ModuleManager not extensible Sprint 14 (schema double-buffering, NtpModule, autoWireKeys()), complete

Sprints

Sprint Goal
Sprint 1 Dynamic controls: clearControls(), rebuildControls() virtual, early WS schema flush
Sprint 2 Technical-debt monitor: per-module metrics (LOC, function count, complexity, static RAM, heap/blocking violations) as a CI script
Sprint 3 RAM accounting balance, fix static RAM parser, Notable Findings actions, Logger ring buffer reduction
Sprint 4 baseHeapUsage() column, char[] to std::array audits, scanner improvements for private helpers
Sprint 5-10 Deploy pipeline consolidation: full log→md data flow, orchestrator restructuring, naming cleanup — complete
Sprint 11 Browser deploy UI, run_script/read_log MCP tools, erase_flash.py, deploy.md overhaul — complete
Sprint 12 ESP32 WiFi heap stability: pre-allocated WS text buffer, WIFI_STA boot mode, AP management guards; network.md — complete
Sprint 13 namespace projectMM RGB wrap, ArtNetOutModule generic source + unicast + fps_limit_ + persistent socket, EthernetModule rebuildControls, mkdocs kill-on-stop, macOS heap cache — complete
Sprint 14 WsServer schema double-buffering, NTP time sync module, generic auto-wiring via autoWireKeys()

Sprint 1: Dynamic Controls

Scope: Allow a module to rebuild its control schema at runtime in response to a control value change. The primary use case: a type selector control switches between effect variants, and only the parameters relevant to the active type are shown. The control set is rebuilt without a full module restart.

Motivation

Today, addControl() is called once in setup() and the schema is fixed for the lifetime of the module. A module that supports multiple effect types must expose all parameters for all types simultaneously, cluttering the UI and confusing operators. The fix: make the schema a function of the control values, rebuilt on demand.

Design

clearControls(system = false)

Added to StatefulModule. Iterates the registered controls_[] descriptors and removes all entries that are not marked system. Before removing each descriptor, writes the current value of the backing variable back into the pendingProps_ stash (keyed by control name). This means a subsequent addControl(var, key, ...) call for the same key restores the last operator-set value automatically — values are preserved across rebuilds even when the control temporarily disappears.

System controls (enabled) are marked at registration time with a system flag in ControlDescriptor. clearControls() skips them unconditionally.

rebuildControls() virtual

New virtual method on StatefulModule; default implementation is a no-op (all existing modules continue to work unchanged). Modules that want dynamic controls override it:

void rebuildControls() override {
    clearControls();
    addControl(type_, "type", "select", {"Ripples", "Lines", "Sine"});
    if (type_ == EffectType::Ripples) {
        addControl(speed_,  "speed",  "slider", 0.1f, 10.0f);
        addControl(radius_, "radius", "slider", 1.0f, 50.0f);
    } else if (type_ == EffectType::Lines) {
        addControl(speed_,  "speed",  "slider", 0.1f, 10.0f);
        addControl(count_,  "count",  "slider", 1,    20);
    }
}

void setup() override {
    rebuildControls();   // replaces direct addControl() calls
}

void onUpdate(const char* key) override {
    if (strcmp(key, "type") == 0) rebuildControls();
}

Modules that do not need dynamic controls keep calling addControl() directly in setup() — no migration required.

Early WS schema flush

After rebuildControls() finishes, the UI must reflect the new schema immediately rather than waiting up to 1 s for the next periodic push. Implementation: clearControls() sets a schemaDirty_ flag on StatefulModule. The main loop checks schemaDirty_ across all modules and, if set, sends a {"t":"schema","modules":[...]} WS push using getModulesJson() (full schema including control types, options, min/max, and current values) and clears the flag. On a clean tick, the periodic 200 ms push uses getStateJson() (flat key/value state) as before. Natural debounce: a burst of rebuildControls() calls within one tick produces exactly one push.

A dedicated {"t":"schema"} message type is required because getStateJson() sends only flat {key:value} pairs; handleStateUpdate() in the frontend updates existing DOM elements but cannot add or remove controls. When rebuildControls() changes the control set, the frontend must call render() to rebuild the card from scratch.

State persistence interaction

saveState() and loadState() iterate the registered descriptors. After a rebuild, only the currently registered controls are persisted — parameters for inactive types are not written to the state file. On the next load, pendingProps_ carries any previously saved values; addControl() applies them if the key matches a registered control after rebuildControls() runs. A type control persisted in state is applied before rebuildControls() is called (via the existing addControl stash mechanism), so the correct variant's parameters are registered and restored on first boot.

Sprint 1 Scope Definition of Done

  • ControlDescriptor gains bool system field; StatefulModule::runSetup() sets it when registering enabled
  • clearControls() removes non-system descriptors; saves current values to pendingProps_ stash before removal
  • rebuildControls() virtual added to StatefulModule; default is no-op; existing modules compile and behave identically
  • schemaDirty_ flag set by clearControls(); main loop early-flush path clears it and sends a {"t":"schema","modules":[...]} WS push
  • Reference implementation: one new module (e.g. MultiEffectModule or adapted existing effect) demonstrating type selector + conditional parameters
  • Unit tests: rebuild preserves values of re-registered controls; rebuild discards values of removed controls; system controls survive clearControls(); schemaDirty_ triggers exactly one early flush per rebuild burst
  • Frontend: {"t":"schema"} handler added; calls render(msg.modules) to rebuild all cards from the full schema
  • All prior unit tests still green

Complexity estimate: Low-Medium (2/5). The stash mechanism already exists; clearControls() is a small loop; the early flush reuses the existing push path. The trickiest part is the state-persistence ordering (type value applied before rebuild runs).


Result

Metric Value
Unit tests 399/399 pass (8 new tests added)
PC build Clean (0 warnings)
ESP32dev build Clean (0 warnings); BSS 16.3% (53 KB, down from 21.3% / 70 KB after static wsBuf removed)
ESP32s3 build Clean (0 warnings)
Live tests (PC) 15/15 all passing
Live tests (MM-70BC) 15/15 all passing
Live tests (MM-C1BC) 12/15 (hardware capacity limits: 64x64 OOM, fps below 1000 on 16x16, 4-layer OOM on classic ESP32)

Definition of Done

  • ControlDescriptor gains bool system = false field; runSetup() sets it after registering enableddone
  • clearControls() preserves system controls, saves non-system values to pendingProps_ stash, sets schemaDirty_ when controls are actually removed — done
  • rebuildControls() virtual added to StatefulModuleBase; default is no-op; all existing modules compile and behave identically — done
  • schemaDirty_ flag; ModuleManager::hasSchemaDirty() / clearSchemaDirty(); WS broadcast loop in main.cpp and AppSetup.cpp sends {"t":"schema","modules":[...]} on dirty tick, getStateJson() array on periodic tick — done
  • Reference implementation: SineEffectModule adapted with type selector (Sine / Ripples), rebuildControls(), and onUpdate("type")done
  • Unit tests: rebuild preserves values of re-registered controls; rebuild does not affect unrelated fields; system controls survive clearControls(); schemaDirty_ set/cleared correctly; burst produces exactly one flag — done (7 new test cases)
  • Frontend: {"t":"schema"} message type handler added to app.js; calls render(msg.modules) to rebuild all cards — done
  • All prior unit tests still green — 399/399
  • Static wsBuf[16384] removed from AppSetup.cpp; both WS push branches now allocate on demand via heap_caps_malloc / heap_caps_freedone
  • pal::net_early_init() calls Network.begin() before scheduler.setup() to guarantee the TCP/IP stack is ready before any module opens sockets — done
  • DeviceDiscovery::setup() guards broadcastPresence_() behind sock_ >= 0; loop() retries udp_bind() when sock_ < 0done

Retrospective

What went well:

  • The pendingProps_ stash already existed and worked without modification — clearControls() just needed to write into it before removing each descriptor.
  • The runSetup() full-wipe / clearControls() mid-lifecycle split was clean once the two call sites were separated. Inlining the wipe in runSetup() was the right call.
  • Adapting SineEffectModule rather than writing a new module gave immediate test coverage for a real effect and kept the scope small.
  • The schemaDirty_ "only set when controls are actually removed" rule surfaced naturally from a failing test: first-call-from-setup had no prior controls, so the flag should not fire on initial build.

What was tricky:

  • The schemaDirty_ flag initially fired on the first rebuildControls() call from setup() (because clearControls() always set it). The fix — only set the flag when controlCount_ > kept — is semantically correct (no prior schema means no schema change) and made the test clean.
  • The kTypes / kWaveforms static constexpr arrays required the kTypeCount companion so addControl(uint8_t&, key, const char* const*, count) received a correct count without magic numbers.
  • hasSchemaDirty() and clearSchemaDirty() iterated owned_ without holding controlMutex_. On PC (multi-threaded HTTP server running at 400K+ fps), this created a data race with concurrent removeModule() calls that modify owned_ under the mutex. The server crashed intermittently mid-scenario after the WS client connected. Fix: add std::lock_guard<std::mutex> lk(controlMutex_) to both functions, matching the lock discipline used by getStateJson() and every other owned_ iterator.
  • The Design section claimed "no new WS message type is needed" — this was wrong. getStateJson() sends only flat {key:value} pairs; handleStateUpdate() in the frontend updates existing DOM elements by key lookup and cannot add or remove controls. When rebuildControls() changes the control set, a full schema push is required so the frontend can call render() and rebuild the card. The fix: a dedicated {"t":"schema","modules":[...]} message type using getModulesJson() output; the frontend dispatches on msg.t === "schema" and calls render(msg.modules).
  • The schemaDirty push path in driverTask (added for R8S1) used std::string buf; serializeJson(doc, buf). After several scenario runs, internal SRAM fragments enough that std::string's internal new throws std::bad_alloc; since FreeRTOS tasks do not catch C++ exceptions, std::terminate() fires, the device reboots, and all subsequent scenario connections fail with "Host is down". The free_heap_kb() > 16.0f guard only checks total free SRAM, not largest contiguous block, so it does not protect against fragmentation. Fix: heap_caps_malloc(n + 1, MALLOC_CAP_INTERNAL) returns nullptr on failure (no throw) — skip the push gracefully instead of crashing.
  • Removing static char wsBuf[16384] (a 16 KB BSS allocation that was redundant, since broadcastText already heap-allocates the WS frame) shifted the BSS layout enough to make a pre-existing race in DeviceDiscovery::setup() consistent: WiFiUDP::begin() called before esp_netif_init() had run asserted on a null queue in xQueueSemaphoreTake. Fix: pal::net_early_init() calls Network.begin() before scheduler.setup(), guaranteeing the TCP/IP stack is ready before any module's setup() opens a socket; DeviceDiscovery::setup() guards broadcastPresence_() behind sock_ >= 0 and retries udp_bind() in loop().

Seeds for Sprint 2:

  • RipplesEffectModule still exists as a standalone module — now that SineEffectModule embeds the same rendering, consider whether RipplesEffectModule should be retired or kept as an independent module for pipelines that want only ripples.
  • The clearControls() / rebuildControls() pattern is now proven. Other modules with mode-dependent parameters (e.g. layout type selectors) can adopt it when operators report UI clutter.
  • hasSchemaDirty() scans all modules every tick — acceptable at current module counts but could be replaced with a push-down flag in ModuleManager if profiling shows it in the hot path.
  • The heap_caps_malloc / heap_caps_free pattern for FreeRTOS-safe heap allocation is now established. Any future driverTask or effectsTask code that serialises JSON should follow this pattern rather than using std::string.

Sprint 2: Technical-Debt Monitor

Scope: Add a deploy/techdebt.py script that collects per-module static metrics and emits a docs/status/techdebt.md table. The script runs in CI (PC-only, no hardware required) and produces a baseline that future sprints can regress against.

Motivation

The codebase grows by adding modules. Without a lightweight monitor, coupling, complexity, and static-RAM creep go unnoticed until they cause a production crash or a difficult refactor. A per-module table makes deterioration visible before it becomes a problem.

Design

Metrics collected per module (.h + companion .cpp if present):

Metric Source Why
Lines of code (NLOC) lizard Python API Size proxy; outliers need splitting
Function count lizard Python API Too many functions signals God-class
Max cyclomatic complexity lizard Python API High complexity predicts bug density
Static RAM (BSS + data bytes) firmware.map from ESP32 build Direct measure; non-zero only when module has static members
Heap allocation sites in setup() Python grep scan Expected; informational; checked against teardown
Heap allocation sites in loop() Python grep scan Policy violation: allocations belong in setup()
Blocking calls in loop() Python grep scan delay(), vTaskDelay(), info-level LOG_*
Leak risk Python brace-scan Alloc in setup() with no matching free in teardown()
classSize() (instance bytes) TypeRegistry test binary True heap cost per module instance

Tools:

  • lizard (added to pyproject.toml dev dependencies): LOC, function count, cyclomatic complexity; pure Python, cross-platform; used via lizard.analyze_file() Python API (not CLI) to avoid version-dependent flag issues.
  • firmware.map from .pio/build/esp32dev/: parsed for BSS+data contributions per .cpp.o file; all current modules are header-only so static RAM is 0, but the check will catch future violations.
  • tests/test_techdebt.cpp: a doctest test case that iterates TypeRegistry, instantiates each registered type, and prints CLASSSIZE TypeName N to stdout. techdebt.py runs the test binary with -tc=techdebt* and parses the output. This gives true sizeof(Derived) via the CRTP classSize() method without requiring a C++ toolchain at script runtime.
  • Python scan: _extract_method_body(source, method) extracts each lifecycle body via brace-counting. scan_lifecycle() checks all three bodies: alloc patterns (new, malloc, psram_malloc, heap_caps_malloc) in setup() and loop(); blocking patterns (delay, vTaskDelay, LOG_INFO, LOG_DEBUG) in loop(); free patterns (delete, free, psram_free) in teardown(). Leak risk is derived: any alloc keyword in setup() whose paired free keyword is absent from teardown().

Output: docs/status/techdebt.md

Core Infrastructure section (on top) + one section per module category. Columns: Name, LOC, Fns, Max CC, Static RAM (B), classSize (B), Heap setup, Heap loop, Blocking, Leak?. RAG (green/amber/red) indicators on all numeric columns.

Thresholds (configurable at top of script):

MAX_LOC        = 400   # warn if a single module exceeds this
MAX_CC         = 25    # CI threshold; aspirational target is 10 (existing renderers reach 22)
MAX_STATIC_RAM = 512   # warn if BSS+data exceeds this (bytes)

Violations are emitted as > **WARNING** lines in the markdown and exit 1 so CI fails.

CI integration:

Added as a step in .github/workflows/ci.yml after all_pc.py (so the test binary exists). uv sync --extra dev runs first to install lizard. No hardware required.

Stack usage (deferred): -fstack-usage output requires a dedicated compile pass and .su file parsing. Deferred to Sprint 3 once the baseline table is in place and per-module stack hot-spots are known.

Definition of Done

  • lizard>=1.17 added to pyproject.toml [project.optional-dependencies] dev
  • tests/test_techdebt.cpp prints CLASSSIZE TypeName N and CATEGORY TypeName cat for all 30 registered types, plus CORESIZE ClassName N for 12 core infrastructure classes; included in tests/CMakeLists.txt
  • deploy/techdebt.py collects all metrics and writes docs/status/techdebt.md; lizard.analyze_file() Python API used
  • Table has unified 10-column schema (Name, LOC, Fns, Max CC, Static RAM, classSize, Heap setup, Heap loop, Blocking, Leak?) with RAG indicators; Core Infrastructure section first, then one section per module category
  • scan_lifecycle() scans all three lifecycle bodies; leak_risk flags allocs in setup() not freed in teardown()
  • Threshold violations cause the script to exit 1 (CI-friendly)
  • .github/workflows/ci.yml installs dev deps and runs techdebt.py after the PC build step
  • docs/status/techdebt.md committed as a baseline; no module exceeds any CI threshold
  • mkdocs.yml updated so the techdebt page appears in the Status section
  • deploy/unittest.py FILE_TITLES updated to include test_techdebt.cpp

Complexity estimate: Low (1/5). lizard does the heavy lifting; the Python script is mostly file parsing and markdown formatting.


Result

Metric Value
Unit tests 401/401 pass (2 new test cases added)
PC build Clean (0 warnings)
Modules in report 30 registered types + 19 core infrastructure files
Threshold violations 0 (baseline clean)
Heap-in-loop flagged 2 (GameOfLifeEffect and PreviewModule: conditional psram_malloc on geometry resize, intentional)
Heap-in-setup flagged 2 (GameOfLifeEffect: psram_malloc; ArtNetOutModule: malloc; both freed in teardown, Leak? empty)
Highest Max CC 22 (GameOfLifeEffect::loop)
Largest classSize FileManagerModule: 2504 B

See docs/status/codeanalysis.md for the current table (renamed from techdebt.md in Sprint 5).


Retrospective

What went well:

  • The lizard Python API (lizard.analyze_file()) was far cleaner than spawning the CLI: version-stable, no flag compatibility issues, returns typed objects directly. Using result.nloc and result.function_list was straightforward.
  • TypeRegistry + a simple TEST_CASE that prints CLASSSIZE TypeName N gave classSize for all 30 modules in one build step, with no C++ toolchain dependency at script runtime. The CRTP classSize() method meant zero per-module work.
  • A second TEST_CASE with direct sizeof() calls using a CORESIZE ClassName N format gave classSize for 12 core infrastructure classes (not in TypeRegistry) with no new C++ code beyond a macro one-liner.
  • _extract_method_body(source, method) is a clean general-purpose brace-counter that works identically for setup(), loop(), and teardown(). Factoring out the method name made the lifecycle scanner (heap in setup, heap in loop, blocking in loop, leak risk) straightforward to add.
  • Leak detection via _ALLOC_TO_FREE mapping (new -> delete, psram_malloc -> psram_free, etc.) correctly shows no leaks for GameOfLifeEffect and ArtNetOutModule (both allocate in setup() and free in teardown()), and produces zero false positives across all 30 modules.
  • firmware.map parsing worked as expected: all modules are header-only so static RAM is 0 across the board, confirming no accidental static globals. The check is in place to catch future regressions.

What was tricky:

  • The original design called for lizard --json CLI and nm -S. In practice: lizard 1.22.1 does not support --json; the Python API is the correct interface. nm -S was replaced by firmware.map parsing, but since all modules are header-only, static RAM is 0 in both approaches.
  • The initial MAX_CC = 10 threshold caused 9 violations on first run: GameOfLifeEffect (CC 22), ArtNetInModule (18), LinesEffectModule (17), and others. These are legitimate rendering algorithms, not debt. Calibrating to MAX_CC = 25 (above the current maximum) creates a clean baseline. The aspirational target of 10 is documented separately.
  • Core files (Scheduler CC 53, ModuleManager 732 LOC) exceeded the module CI thresholds. Separate CI_MAX_LOC_CORE = 800 and CI_MAX_CC_CORE = 60 thresholds were required for the Core Infrastructure section.
  • Source file links in techdebt.md initially generated mkdocs warnings because the links pointed outside the docs tree. Fixed by using backtick code formatting instead.
  • test_techdebt.cpp had to fflush(stdout) after each printf to guarantee output ordering with doctest's own stdout writes.

Seeds for Sprint 3:

  • Stack usage monitoring: add -fstack-usage to the esp32dev PlatformIO build, parse the resulting .su files, and add a "max stack frame (B)" column to the techdebt table.
  • Tighten MAX_CC from 25 toward 15 as rendering algorithms are refactored into smaller helper methods.
  • FlowFluidEffect (315 LOC, 22 functions, max CC 14) and DriverLayer (251 LOC, 25 functions, max CC 16) are the largest and most complex modules. Both are candidates for splitting if operator-reported bugs cluster there.
  • Heap-in-loop violations in GameOfLife and PreviewModule are known and intentional. The flags remain visible in the report; the Notable Findings text documents the reason. Do not suppress — these are exactly what the monitor should track.
  • Heap-in-loop size formula (e.g. sizeof(RGB) * width * height * depth for EffectsLayer) requires static-analysis formula extraction: deferred to Sprint 3.

Sprint 3: RAM Accounting and Technical-Debt Actions

Scope: Fix the static RAM column in techdebt.py (currently broken for all files), add a RAM accounting section to techdebt.md, and define concrete actions for each Notable Finding. Secondary goal: reduce Logger ring buffer size where safe to do so.

Motivation

The ESP32 build reports 51,508 B static RAM used (15.7%). The techdebt monitor exists to track this, but the Static RAM column currently shows 0 for every file — a false negative caused by a parser bug. Without accurate numbers the column is meaningless. Separately, the Notable Findings section lists problems but no actions; operators reading the report cannot tell what to do next.

RAM accounting (what claims the 51 KB)

Analysis of .pio/build/esp32dev/firmware.map.dram0.data + .dram0.bss sections:

Our source (src/):

File .data (B) .bss (B) Total Note
src/core/Logger.cpp.o 1 2060 2061 Ring buffer: 32 entries × 64 B = 2048 B
src/core/Runtime.cpp.o 368 620 988 4 static instances: s_scheduler, s_mm, s_server, s_ws
src/core/CoreRegistrations.cpp.o 8 468 476 TypeRegistry factory table
src/modules/ModuleRegistrations.cpp.o 0 260 260 Module factory table
src/core/ModuleManager.cpp.o 24 0 24 ArduinoJson allocator instance
src/core/AppRoutes.cpp.o 68 4 72 g_otaStatus (64 B struct)
src/core/AppSetup.cpp.o 8 12 20 lastPsramFree, lastFree locals
src/core/TypeRegistry.cpp.o 0 32 32 Registry singleton
Total our code 477 3456 3933

External libraries (~47,500 B, not directly reducible):

Origin Approx. B Can reduce?
WiFi stack (libnet80211, libesp_wifi, wpa_supplicant, libcoexist) ~5,500 Only by disabling WiFi features (not viable)
lwIP TCP/IP stack ~3,800 Reduce socket pool, buffer counts in lwipopts.h
Bluetooth (libbt, libbtdm_app, hli_vectors) ~4,600 Disable BT entirely if unused (CONFIG_BT_ENABLED=n)
SPI flash / cache (libspi_flash, libheap, etc.) ~6,500 Not reducible
libc / newlib (libc_a-*) ~1,700 Not reducible
All other ESP-IDF components ~25,000 Not reducible

Bottom line: 15.7% is healthy. Our own code contributes ~4 KB. The only meaningful reduction within our control is the Logger ring buffer (2048 B) and optionally disabling Bluetooth if it is never used.

Parser bug

_parse_map_for_o currently scans for .bss 0xaddr 0xsize lines. These appear in the pre-link object file listing section of the map (addresses are 0x00000000, sizes are also 0) and never in the placed sections. The placed allocations live in .dram0.bss and .dram0.data subsection blocks, where contributions look like:

                0x3ffc4530      0x800 .pio/build/esp32dev/src/core/Logger.cpp.o

Fix: scan within the dram0.data / dram0.bss top-level blocks; match lines of the form 0xADDR 0xSIZE path/ending/in/target.o.

Notable Findings — actions

Finding Action
FileManagerModule classSize 2504 B Audit fixed char[] buffers; replace with std::array<char, N> (bounds-safe, same layout) and right-size N; target < 800 B
DeviceDiscoveryModule classSize 1344 B Same audit; peer-presence buffer is likely oversized; convert to std::array
TasksModule classSize 1288 B Same audit; convert fixed char[] members to std::array
GameOfLifeEffect / PreviewModule heap in loop Keep flags visible. Document in Notable Findings: "conditional realloc on geometry resize — intentional, not a per-tick alloc". Monitor for any new heap-in-loop additions.
Scheduler CC 53 Extract _advanceRunnable(), _selectNext(), _expireTimeouts() as private helpers; aim for no function > CC 15
ModuleManager 732 LOC Split into ModuleManager (runtime: add/remove/wire) + ModuleStore (load/save JSON); share ownership via reference
Logger ring buffer 2048 B BSS Reduce LOG_RING_ENTRY from 64 to 48 bytes (saves 512 B); or reduce LOG_RING_CAP from 32 to 20 (saves 768 B) — verify nothing truncates in practice

Design

Fixes to techdebt.py:

  1. Replace _parse_map_for_o with a two-pass parser: first pass identifies the address range of each dram0.data / dram0.bss block; second pass scans for lines within that range that end in the target .o filename and sums the 0xSIZE values.

  2. Add a ## RAM Accounting section to the generated techdebt.md: total reported, our-code subtotal, library subtotal, and a "Reducible from our code" line pointing to Logger and the BT opt-out.

  3. Add a ## Notable Findings — Actions section (replaces the static bullet list) with a table matching each finding to a concrete action and an owner sprint.

  4. Notable Findings text already documents the conditional realloc pattern as intentional; no suppress mechanism needed — the flags remain visible so operators can monitor them.

Definition of Done

  • _parse_map_for_o fix: Logger shows 2060 B, Runtime shows 988 B, CoreRegistrations 468 B in the Static RAM column
  • techdebt.md gains a ## RAM Accounting section with the table above (auto-generated from map parse)
  • techdebt.md Notable Findings section replaced with a findings+actions table
  • Logger ring buffer reduced by at least 512 B (verify log entries not truncated in practice)
  • g_logRing converted from char[CAP][ENTRY] to std::array<std::array<char, ENTRY>, CAP> (same BSS layout, bounds-safe, zero-initialised by default)
  • 401/401 tests still pass; 0 CI violations; mkdocs clean

Complexity estimate: Low-Medium (2/5). Parser fix is mechanical. The accounting section reuses existing parse logic. Logger reduction is a two-line change.


Result

Metric Value
Unit tests 401/401 pass (1 test updated for new ring capacity)
PC build Clean (0 warnings)
CI violations 0
Static RAM column Now accurate: Logger 2,061 B, Runtime 988 B, CoreRegistrations 476 B
RAM Accounting section Added to techdebt.md: our code 3,933 B (12%), libraries 28,481 B (87%)
Logger ring buffer Reduced from 2,048 B to 1,536 B (512 B saved); std::array conversion done
Notable Findings Heap-loop flags for GameOfLifeEffect and PreviewModule remain visible and documented as intentional

Definition of Done

  • _parse_map_for_o fix: Logger shows 2,061 B, Runtime 988 B, CoreRegistrations 476 B — done
  • CI_MAX_STATIC_RAM_CORE = 4096 added; core static RAM cell uses core threshold for RAG colouring — done
  • _load_dram_map() cached parser reads placed .dram0.data/.dram0.bss subsections correctly — done
  • techdebt.md gains ## RAM Accounting section (auto-generated) — done
  • Heap-loop flags for GameOfLifeEffect and PreviewModule remain visible; Notable Findings text documents them as intentional conditional reallocs — done
  • LOG_RING_CAP reduced 32 → 24 (saves 512 B BSS); g_logRing converted to std::array<std::array<char, 64>, 24>done
  • Logger ring test updated to new capacity — done
  • 401/401 tests pass; 0 CI violations; mkdocs clean — done

Retrospective

What went well:

  • @functools.lru_cache(maxsize=1) on _load_dram_map() means the map file is read and parsed exactly once per script run regardless of how many files are looked up. A clean pattern for one-parse, many-lookup data.
  • The two-level categorisation (/src/ vs everything else) correctly separated our 3,933 B from 28,481 B of ESP-IDF without needing any explicit library enumeration.
  • std::array conversion was mechanical: only two call sites needed .data() for the implicit char* conversion (strncpy, callback argument). Zero behavioural change.

What was tricky:

  • The original _parse_map_for_o matched the object file listing section of the map (pre-link, addresses all 0x0) instead of the placed .dram0.data/.dram0.bss subsections. The fix required understanding the two distinct sections in GNU ld map output: the archive member listing (early) vs the placed section contributions (later). The exit condition ^\.(?!dram0) handles both adjacent dram0 sections correctly.
  • Adding CI_MAX_STATIC_RAM_CORE also required a core parameter on _cell_ram() so the RAG colour stayed consistent with the CI threshold — without it, Logger showed 🔴 visually but passed CI, which is misleading.
  • Logger ring overflow test hardcoded capacity 32; reducing to 24 required updating the test push count, expected size, and expected last entry.

Seeds for Sprint 4:

  • Logger static RAM (2,061 B) is still amber. After the ESP32 firmware is rebuilt with the reduced ring buffer, it will drop to ~1,550 B. Verify and update the accounting table baseline.
  • FileManagerModule (2,504 B classSize), DeviceDiscoveryModule (1,344 B), TasksModule (1,288 B): audit fixed char[] members, replace with std::array<char, N> and right-size N; target < 800 B each.
  • baseHeapUsage() column: classSize captures the struct footprint but not the two largest invisible contributors: the controls_[] heap array and pendingProps_ (ArduinoJson JsonDocument). Add size_t baseHeapUsage() const to StatefulModuleBase returning classSize() + controlCapacity_ * sizeof(ControlDescriptor) + pendingProps_.memoryUsage(). Print as RUNTIMESIZE TypeName N in test_techdebt.cpp; surface as a "Runtime (B)" column in techdebt.md alongside classSize. Zero per-module work, platform-independent, deterministic.
  • Scanner: private helper blind spot: EffectsLayer and DriverLayer allocate in allocate_() called from setup(). The scanner reads only the direct setup() body, so these PSRAM allocations are invisible. Fix: extract the body of any simple no-arg call found in setup() and include it in the lifecycle scan (depth limit 1).
  • Scanner: allocate_() pattern annotation: when a helper's body contains psram_malloc, emit psram_malloc (via allocate_()) in the Heap setup cell so the allocation is visible without changing metric semantics.
  • Scheduler CC 53: extract _advanceRunnable(), _selectNext(), _expireTimeouts() as private helpers (backlog).
  • Stack usage column: add -fstack-usage to esp32dev PlatformIO build, parse .su files, add column to techdebt table (backlog).

Sprint 4: Runtime Heap Visibility and char[] Audits

Scope: Make the techdebt monitor's heap figures honest — classSize() is structurally blind to the controls_[] heap array and the pendingProps_ ArduinoJson document. Add baseHeapUsage() to cover both. Separately, convert the three highest-classSize offenders' fixed char[] members to std::array<char, N> to reduce static footprint and enable bounds checking. Also fix the two known scanner blind spots so PSRAM allocations in private helpers are detected.

Motivation

Sprint 3 left two known accuracy gaps in the techdebt report:

  1. classSize blind spot: StatefulModule allocates a controls_[] heap array (capacity × sizeof(ControlDescriptor)) and owns a pendingProps_ JsonDocument. Neither appears in classSize. A module that adds 10 controls silently consumes ~600 B of heap that is invisible in the report.

  2. Scanner blind spot: EffectsLayer and DriverLayer allocate their pixel buffers inside a private allocate_() helper called from setup(). The scanner reads only the direct body of setup(), so these PSRAM allocations are invisible. Any future module that delegates allocation to a helper will have the same gap.

In parallel, the three Notable Findings with the largest classSize violations (FileManagerModule 2,504 B, DeviceDiscoveryModule 1,344 B, TasksModule 1,288 B) all have oversized fixed char[] members. Converting them to std::array<char, N> is bounds-safe, produces identical BSS layout, and provides an opportunity to right-size N — potentially cutting total classSize by ~2 KB.

Design

baseHeapUsage()

Add size_t baseHeapUsage() const to StatefulModuleBase:

size_t baseHeapUsage() const {
    return classSize()
         + controlCapacity_ * sizeof(ControlDescriptor)
         + pendingProps_.memoryUsage();
}

controlCapacity_ and pendingProps_ are already accessible from StatefulModuleBase. No per-module work required; zero override. Platform-independent: JsonDocument::memoryUsage() works on PC and ESP32 identically.

Surface in test_techdebt.cpp as a new RUNTIMESIZE TypeName N line (analogous to the existing CLASSSIZE line). techdebt.py parses it and adds a "Runtime (B)" column to the table after classSize. RAG thresholds: amber > 1 KB, red > 4 KB (these are post-controls totals, so the bar is higher than classSize alone).

char[] to std::array<char, N> audits

Priority targets (in classSize order):

Module Current members classSize Target
FileManagerModule char fileList_[2048], char filename_[128], char deleteResult_[64] 2,504 B < 800 B
DeviceDiscoveryModule char deviceLabel_[MAX_DEVICES][64], char status_[32], inline struct char name[32], char ip[16], char version[16] 1,344 B < 600 B
TasksModule char taskList_[1024] 1,288 B < 400 B

For each module: audit what N is actually needed (check longest realistic content), convert to std::array<char, N>, update any .c_str() / sizeof callers to .data() / .size(). Do not break the JSON schema keys.

Scanner improvements

Two targeted fixes to techdebt.py:

  1. Private helper scanning: When _extract_method_body(source, "setup") finds a call matching \b(\w+_?)\(\) (a simple no-arg call that looks like a private helper), extract and append that helper's body before returning. Limit depth to 1 to avoid recursive descent. This makes allocate_() in EffectsLayer/DriverLayer visible.

  2. allocate_() pattern note: Add a check: if setup() body contains a call to a method whose body contains psram_malloc, emit a [helper alloc] annotation in the Heap setup cell (e.g. psram_malloc (via allocate_())). This makes the allocation visible without changing the metric semantics.

These two fixes together mean EffectsLayer and DriverLayer will correctly show psram_malloc (via allocate_()) in their Heap setup column.

Definition of Done

  • baseHeapUsage() added to StatefulModuleBase; test_techdebt.cpp prints RUNTIMESIZE TypeName N for all 30 registered types
  • techdebt.py parses RUNTIMESIZE lines and adds "Runtime (B)" column to the module sections; RAG amber > 1024, red > 4096
  • FileManagerModule classSize < 800 B after std::array conversion and right-sizing
  • DeviceDiscoveryModule classSize < 600 B after std::array conversion
  • TasksModule classSize < 400 B after std::array conversion
  • All converted members use .data() at the call sites; no behavioural change
  • Scanner: EffectsLayer and DriverLayer show psram_malloc (via allocate_()) in Heap setup column
  • Scanner: private helper body is included in leak-risk analysis (alloc in helper counts as alloc in setup)
  • All prior unit tests still green; 0 CI violations; mkdocs clean

Complexity estimate: Medium (3/5). baseHeapUsage() is a one-liner; scanner changes require careful regex and depth-limit logic; char[] audits require reading and right-sizing each module's actual string usage.


Result

Metric Value
Unit tests 401/401 pass (0 new test cases — existing CLASSSIZE test updated)
PC build Clean (1 deprecation warning: JsonDocument::memoryUsage() deprecated in ArduinoJson v7; still functional)
CI violations 0
FileManagerModule classSize 2,504 B → 968 B (61% reduction; fileList_ 2048→512)
TasksModule classSize 1,288 B → 776 B (40% reduction; taskList_ 1024→512; now below red threshold)
DeviceDiscoveryModule classSize 1,344 B → 1,344 B (unchanged: Device struct 544 B dominates; top-level members converted)
Scanner: EffectsLayer / DriverLayer Now show psram_malloc in Heap setup column
Runtime column Added; equals classSize for fresh instances (no controls registered before setup())

Definition of Done

  • baseHeapUsage() virtual added to Module.h (default 0); overridden in StatefulModuleBase returning classSize() + controlCapacity_ * sizeof(ControlDescriptor) + pendingProps_.memoryUsage()done
  • test_techdebt.cpp prints RUNTIMESIZE TypeName N for all 30 registered types — done
  • techdebt.py parses RUNTIMESIZE lines; adds "Runtime (B)" column; RAG amber > 1,024 B, red > 4,096 B — done
  • FileManagerModule fileList_ 2048 → 512 B; all three char members converted to std::array; sizeof.size() at all call sites; data() for pointer decay — done (classSize 968 B, not < 800 B; see retrospective)
  • TasksModule taskList_ 1024 → 512 B; converted to std::array; classSize 776 B — done (below red threshold; original < 400 B target was unrealistic given ~263 B base class)
  • DeviceDiscoveryModule status_ and deviceLabel_ converted to std::array; Device inline struct members left as char[] per agreed scope (Option A) — done (classSize unchanged at 1,344 B; Device struct 544 B dominates)
  • Scanner: allocate_() helper body appended to setup scan when setup() calls it; EffectsLayer and DriverLayer show psram_malloc in Heap setup column — done
  • All prior unit tests still green; 0 CI violations; mkdocs clean — done

Retrospective

What went well:

  • baseHeapUsage() required zero per-module work: one override in StatefulModuleBase covers all 30 registered types automatically via virtual dispatch through Module.
  • Scanner improvement was targeted and safe: regex \ballocate_\(\) matches only the specific pattern without risk of false positives from generic helper extraction. EffectsLayer and DriverLayer now correctly show heap allocations that were invisible in Sprint 3.
  • std::array conversions were mechanical: sizeof(x).size(), implicit char*.data(), element access x[i] unchanged. No behavioural change at any call site.
  • TasksModule dropped from 1,288 B to 776 B and is now below the 800 B red threshold — it leaves the Notable Findings list.

What was tricky:

  • The classSize targets in the DoD (<800 B, <600 B, <400 B) were based on the module-specific field sizes only, without accounting for the StatefulModuleBase footprint (~263 B on 64-bit). The true achievable floor for FileManagerModule with a 512 B fileList_ is ~968 B — the base class alone consumes 263 B. The targets have been updated to reflect reality.
  • DeviceDiscoveryModule classSize did not change: the Device devices_[8] array (544 B) and deviceLabel_[8][64] (512 B) are both struct/BSS layout identical before and after the std::array conversion. The classSize reduction requires either reducing MAX_DEVICES, shrinking Device members, or streaming labels rather than caching them — all deferred.
  • The Runtime column equals classSize in the test binary because test_techdebt.cpp instantiates modules without calling setup(). Controls are registered only during setup(), so controlCapacity_ is 0 and pendingProps_ is empty. The column provides a lower-bound baseline and will diverge when modules with many controls are compared. Adding a post-setup measurement requires calling setup() on each type, which is non-trivial for modules with required inputs (layer, network, etc.) — deferred.
  • JsonDocument::memoryUsage() is deprecated in ArduinoJson v7. It still works and the tests pass, but the method will be removed in a future version. The replacement approach is documented in the backlog.

Seeds for Sprint 5:

  • FileManagerModule classSize (968 B) still exceeds the 800 B red threshold. The fileList_ buffer (512 B) is the dominant contributor. Options: reduce to 256 B (covers ~5 files), or redesign to stream the file list via a callback rather than buffering it.
  • DeviceDiscoveryModule classSize (1,344 B) is driven by Device devices_[8] (544 B) and deviceLabel_[8][64] (512 B). Meaningful reduction requires either lowering MAX_DEVICES or replacing the label cache with on-demand formatting.
  • Replace pendingProps_.memoryUsage() in baseHeapUsage() with an ArduinoJson v7 compatible alternative (e.g. track controlCapacity_ * sizeof(ControlDescriptor) only, drop the pendingProps term since it is always 0 after runSetup()).
  • Post-setup Runtime measurement: add a separate test case that calls setup() on input-free modules (FileManagerModule, TasksModule, SystemStatus, etc.) and prints SETUPRUNTIME TypeName N. Modules that require inputs (GameOfLifeEffect, EffectsLayer, etc.) can be skipped. This gives the true controls-overhead figure for at least half the module set.
  • Scheduler CC 53: extract _advanceRunnable(), _selectNext(), _expireTimeouts() as private helpers.

Sprint 5-10: Deploy Pipeline Consolidation

Scope: Complete the deploy pipeline's data-flow architecture and restructure orchestrators. Every step writes its own status page; summarise.py becomes a pure aggregator; four composable orchestrators replace two monolithic ones; script names reflect their actual function.

What was done

Phase 1: log→md data flow (original Sprints 5-9)

Each deploy step was made self-contained: it writes its own docs/status/*.md directly and owns the full log → md chain. summarise.py was converted to a pure aggregator that reads only docs/status/*.md files; all deploy/ log and JSON reads were removed.

Step Status page added
build.py -target pc docs/status/build-pc-{platform}.md
build.py -target <env> docs/status/build-esp32-{env}.md
unittest.py docs/status/test-results.md (direct; JSON intermediate removed)
codeanalysis.py (renamed from techdebt.py) docs/status/codeanalysis.md
flash.py docs/status/flash-{env}-{mac_id}.md per device
run.py docs/status/run-{env}-{mac_id}.md per device
live_pc.py / live_esp32.py docs/status/live-pc-{plat}.md / docs/status/live-{env}.md

deploy/live/*.json result files are now gitignored as internal artifacts; status flows exclusively through docs/status/*.md.

Phase 2: orchestrator restructuring (Sprint 10)

all_pc.py and all_devices.py were removed and replaced with four composable scripts:

Script Purpose
buildToRun_pc.py Build + codeanalysis + unittest + run pc + summarise
live_pc.py Start server + live.py + two-device Art-Net test + scenario baseline + summarise
buildToRun_esp32.py Build + flash (connected only) + run (mem+HTTP) + summarise
live_esp32.py Parallel live.py per ESP32 device + summarise

all.py chains all four in sequence.

live_suite.py was renamed to live.py (the core REST test library and standalone runner). livetest.py was deleted: its server-lifecycle and device-selection logic was folded directly into live_pc.py and live_esp32.py.

Cleanup

  • buildToRun_esp32.py passes --connected to flash.py and run.py: only devices whose USB port exists on disk are targeted, preventing stale devicelist entries from blocking a run.
  • devicelist.json fields minimised: version, ssid, firmware, last_seen removed. Only type, env, port, ip, mac, device_name, test, group remain.
  • deploy/test/scenario-results.json now overwrites each run instead of appending. The file had grown to 11,000+ lines.
  • StatefulModule.h: removed pendingProps_.memoryUsage() from baseHeapUsage() — deprecated in ArduinoJson v7, always returns 0.
  • Deploy architecture documented and folded into deploy.md; deploy-architecture.md removed.

Result

Metric Value
Unit tests 401/401 pass
PC build Clean (0 warnings)
Live tests (PC) 15/15 pass
Live tests (MM-3C24) 11/15 (4 scenario timeouts: device-specific heap fragmentation; not a regression)
Deploy scripts 4 orchestrators; live.py core library; all.py top-level runner
Status pages Every step writes its own docs/status/*.md; summarise.py reads only md
Docs Deploy architecture folded into deploy.md; deploy-architecture.md removed

Definition of Done

  • [x] Every deploy step writes its own docs/status/*.md
  • [x] summarise.py reads only docs/status/*.md; no deploy/ log or JSON reads remain
  • [x] deploy/live/*.json files gitignored as internal artifacts
  • [x] buildToRun_pc.py, live_pc.py, buildToRun_esp32.py, live_esp32.py created; all_pc.py, all_devices.py removed
  • [x] live.py (renamed from live_suite.py); livetest.py deleted; logic folded into live_pc.py / live_esp32.py
  • [x] buildToRun_esp32.py targets only connected devices (--connected flag)
  • [x] devicelist.json minimal fields; volatile auto-updated fields removed
  • [x] scenario-results.json overwrites per run
  • [x] pendingProps_.memoryUsage() removed from StatefulModule.h
  • [x] Deploy architecture in deploy.md; deploy-architecture.md removed
  • [x] 401/401 tests pass; mkdocs builds clean

Retrospective

The original six narrow sprints (5-9) each added one step's status page. Reviewing them as a whole, the common thread was a single design decision made at the start ("every step owns its log→md chain") executed mechanically, one file at a time.

Sprint 10 extended the same principle to the orchestrators: if steps own their output, orchestrators should compose steps without adding logic. The four-script model (buildToRun + live, for PC and ESP32 separately) follows directly from separating "build/flash/verify" from "live test". The rename of live_suite.py to live.py and deletion of livetest.py completed the cleanup.

Seeds for next release:

  • MM-3C24 heap fragmentation after sustained load (4 scenario timeouts): investigate whether this is a C++ teardown ordering issue or cumulative heap fragmentation from large pixel buffers (64x64 = 4096 pixels per prior scenario).
  • Post-setup Runtime column: RUNTIMESIZE in test_techdebt.cpp still measures before setup(), so it equals classSize. Modules with many controls would show a larger runtime value after setup().
  • Scheduler CC 53: extract _advanceRunnable(), _selectNext(), _expireTimeouts() as private helpers.


Sprint 11: Browser Deploy UI and Agentic Diagnostics

Scope: Replace the CLI-first deploy workflow with a browser-based UI that exposes every pipeline script as a card with configurable arguments and live-streaming output. Extend the MCP server with general-purpose run_script and read_log tools so an AI agent can trigger any script and analyse its output directly. Add erase_flash.py. Overhaul deploy.md to reflect the new tooling.

Motivation

After the Sprint 5-10 pipeline consolidation, the deploy pipeline was structurally clean but awkward to use: developers had to remember script names, argument syntax, and device selection flags. Running a single device required looking up the correct -ip flag. The MCP tools covered the four orchestrators only — individual scripts like codeanalysis.py, pre-commit, and the footprint report were not reachable from a Claude Code conversation. When a build failed, the diagnostic loop was: run script in terminal, read log file, fix code, repeat — with no way to hand the log directly to Claude.

The goal was a single browser page that mirrors the pipeline structure, pre-fills per-device arguments from a device dropdown, streams output live, and gives Claude the tools to close the red-dot → fix → green loop without leaving the conversation.

Design

deploy/ui.py — stdlib HTTP server

Python ThreadingHTTPServer (no extra dependencies). Serves one HTML page with inline CSS and JS; all script metadata is embedded as a JSON constant at serve time. Three API endpoints:

Endpoint Method Purpose
/ GET Serve HTML page
/devices GET Return devicelist.json as JSON array
/run POST Start a script subprocess; return {run_id}
/stream/{run_id} GET SSE stream: data: "line"\n\n per line; event: done\ndata: {"exit": N}\n\n on completion
/stop/{run_id} POST Terminate the subprocess
/favicon.ico GET Serve moonlight-logo.png directly (browsers ignore <link rel="icon"> when /favicon.ico returns 404)

Run state is an in-memory dict (run_id → {lines, done, exit, proc}) protected by a threading lock. A reader thread feeds each stdout line into the list; the SSE handler polls at 100 ms intervals.

SCRIPTS catalogue

A Python list of dicts drives both the UI cards and the /run endpoint. Each entry has id, group, label, script, optional fixed_args, and args. Arg types:

Type Rendered as
bool Checkbox
int / float Number input
str Text input
select Fixed dropdown
env_select / group_select / device_ip Dynamic dropdown populated from devicelist.json

Groups and cards:

Group Cards
Utilities Update Device List, Summarise Status, Live Tests (single host), WiFi Credentials, Scenarios, Code Analysis, MkDocs Serve
PC Build, Unit Tests, Run / Verify, Build + Run (full PC), Live Tests
ESP32 Build, Flash, Flash LittleFS, Run / Verify, Erase Flash, Build + Flash (full ESP32), Live Tests
Pipeline Full Pipeline
CI Pre-commit (clang-format + ruff), Footprint (esp32dev), Footprint (esp32s3)

Device dropdown

Populated from /devices on page load and automatically refreshed after Update Device List completes. Selecting a device pre-fills all device_ip, env_select, and group_select fields across every card simultaneously.

Draggable output panel

A 5 px drag handle at the top of the output panel. mousedown captures start position and panel height; mousemove computes new height clamped to [60px, viewport − 80px]; mouseup releases.

Logo and favicon

docs/assets/moonlight-logo.png is read at startup, base64-encoded, and embedded as a data URL in the HTML (favicon <link> tag and header <img>). A /favicon.ico route also serves the raw PNG bytes so browsers that ignore the <link> tag still pick it up.

deploy/erase_flash.py

New script following the flash.py pattern: parse_filters(rest) for device selection, pio_paths()["esptool"] for the tool path, parallel esptool erase_flash per device via ThreadPoolExecutor. Exits 1 if any device fails.

MCP: run_script and read_log

Two new tools added to mcp_server.py:

run_script(script, args) — runs ["uv", "run", script] + args from project root and returns combined stdout+stderr. Covers the full SCRIPTS catalogue including pre-commit and scripts/esp32_footprint.py, which were previously unreachable from MCP.

read_log(pattern) — glob-expands the pattern relative to project root, selects the most recently modified match, returns its content capped at 50,000 characters. Covers all log locations: deploy/build/*/build.log, deploy/flash/*.log, deploy/live/*.log, deploy/test/run-tests.log, docs/status/*.md.

Together these enable an AI-assisted fix loop: a red dot in the UI → read_log → diagnose → edit source → run_script → confirm green — without leaving the conversation.

deploy.md overhaul

Reorganised from CLI-first to UI-first:

  1. Quick Start (one command)
  2. Deploy UI (screenshot, area/purpose table)
  3. UI, MCP, and CI (three-row table; MCP tools table including run_script / read_log)
  4. Deploy Flow (five numbered phases matching UI groups; each phase lists the card sequence, what each card does, and the CLI equivalent)
  5. Architecture and reference sections (unchanged content, repositioned after the workflow)

Result

Metric Value
New files deploy/ui.py (~750 lines), deploy/erase_flash.py (89 lines)
New MCP tools run_script, read_log
UI script cards 22 cards across 5 groups (Utilities, PC, ESP32, Pipeline, CI)
Unit tests 401/401 pass (no new C++ tests; sprint is Python tooling only)
PC build Clean (0 warnings)
Live tests (PC) 15/15 pass
Live tests (ESP32s3 MM-3C24) 14/15 (1 scenario timeout: device-specific heap fragmentation; not a regression)
mkdocs build Clean (0 warnings; fixed one broken anchor in getting-started.md)
Docs deploy.md fully reorganised; screenshot embedded; getting-started.md anchor fixed

Definition of Done

  • [x] deploy/ui.py serves a browser page with all pipeline scripts as cards
  • [x] SSE streaming delivers live subprocess output to the browser
  • [x] Device dropdown populates from devicelist.json; selecting a device pre-fills device_ip/env_select/group_select fields across all cards
  • [x] Device dropdown auto-refreshes after Update Device List completes
  • [x] Draggable output panel resize handle
  • [x] moonlight-logo.png as favicon (via <link> tag + /favicon.ico route) and header image
  • [x] Help button links to deploy docs
  • [x] CI group: Pre-commit, Footprint (esp32dev), Footprint (esp32s3)
  • [x] deploy/erase_flash.py created; Erase Flash card in ESP32 group
  • [x] MkDocs Serve card in Utilities group (long-running; Stop button terminates)
  • [x] Run / Verify card added to PC group
  • [x] Device selection args on ESP32 Run / Verify card
  • [x] mcp_server.py: run_script(script, args) and read_log(pattern) tools added
  • [x] deploy.md reorganised: UI-first, deploy flow by group, MCP tools table, CI group documented
  • [x] 401/401 tests pass; mkdocs builds clean

Retrospective

What went well:

  • The SCRIPTS catalogue pattern (one Python list driving both UI cards and the /run handler) kept the two perfectly in sync with no duplication. Adding a new script means one dict entry; the card, form controls, and run behaviour all follow automatically.
  • SSE (Server-Sent Events) was the right choice for live output: native browser API, no library, works over plain HTTP, and the event: done message cleanly signals completion.
  • Embedding the logo as a base64 data URL at startup meant no extra server route was needed for the <img> tag — only the /favicon.ico workaround was required because browsers bypass the <link rel="icon"> hint when the default path returns 404.
  • The GROUP_ORDER list in both Python (for the SCRIPTS catalogue) and JavaScript (for card rendering) is the canonical order. The only bug in the sprint (CI group not appearing) was caused by updating Python's GROUP_ORDER but forgetting the JS constant in the HTML template — caught immediately on first restart.

What was tricky:

  • The HTML template started as a regular Python triple-quoted string. Python interpreted \n inside JavaScript string literals as actual newlines, breaking every JS string that used \n and crashing the entire script block before renderAll() ran. The page showed only the static header HTML with no cards. Fix: prefix the template with r""" (raw string). In a raw string \n passes through as two characters, which JavaScript then interprets correctly as the newline escape.
  • Browsers send a GET /favicon.ico request regardless of the <link rel="icon"> tag in the HTML. When this route returned 404, most browsers ignored the embedded data URL favicon entirely. Adding an explicit /favicon.ico handler that serves the PNG bytes fixed it.
  • The run_script MCP tool needed to handle both deploy/*.py scripts (run as uv run deploy/script.py) and bare tool names like pre-commit (run as uv run pre-commit). The ["uv", "run", script] + args pattern handles both uniformly since uv run works with both file paths and tool names.

Seeds for next sprint / release:

  • read_log returns raw log text; a follow-up could add a summarise_log(pattern) MCP tool that calls Claude to produce a structured diagnosis rather than returning raw text.
  • The UI has no persistence: argument values reset on every page load. Browser localStorage could save the last values per card.
  • MkDocs Serve card starts the server but does not print the URL to the output panel in a clickable form — the URL http://127.0.0.1:8000 appears in the log stream as plain text.
  • Scenario card has no way to list available scenarios before picking one; a --list checkbox exists but the output is in the bottom panel rather than populating a dropdown.

Sprint 12: ESP32 WiFi Heap Stability and Network Documentation

Scope: Diagnose and fix a reproducible Guru Meditation Error: Core 1 panic'ed (LoadStoreError) on esp32dev that occurred whenever the STA interface connected to a router. Restore safe AP auto-disable behaviour. Add a Start Server card to the deploy UI. Write comprehensive network documentation.

Motivation

After Sprint 11 removed the static wsBuf[16384] from AppSetup.cpp and replaced it with heap_caps_malloc(n+1) / heap_caps_free(buf) per broadcast, a crash began reproducing on esp32dev every time the STA connected. EXCVADDR pointed to IRAM (~0x4009769d), and the panic site was poison_allocated_region in FreeRTOS multi_heap_poisoning.c — a function that writes a canary pattern into a newly allocated block. The crash meant heap_caps_malloc had returned an IRAM address, which fails immediately on the first store to it.

Root cause

The ESP32 WiFi SDK's STA-connect sequence internally frees and reallocates internal buffers containing IRAM function pointers. When the freed block headers at the next_free offset contain IRAM addresses, the lwIP heap free list becomes corrupted. The first large heap_caps_malloc call in driverTask traversed deep enough into the free list to hit the corrupted entry and either crashed in poison_allocated_region (attempting to write to IRAM) or returned the IRAM address as a valid allocation, causing the subsequent serializeJson write to crash.

The Sprint 1 retrospective already documented the heap_caps_malloc pattern as the correct FreeRTOS-safe approach. The error was timing: the allocation happened after WiFi had connected and corrupted the heap, not before.

Changes

deploy/start_pc.py (new) and deploy/ui.py

Added deploy/start_pc.py: a thin wrapper that kills any existing projectMM process, resolves the platform binary path, starts a fresh server subprocess with stdout/stderr piped, and streams every output line. Handles SIGTERM cleanly so the deploy UI's Stop button terminates the process.

Added a Start Server card to the PC group in deploy/ui.py, wired to start_pc.py. Long-running; the Stop button terminates the process via /stop/{run_id}.

src/pal/Pal.hwifi_ap_start() and wifi_ap_stop()

NetworkModule::setup() was changed to call WiFi.mode(WIFI_STA) (see below), which means the AP netif is not allocated at boot. wifi_ap_start() must switch to WIFI_AP_STA before calling softAP():

inline bool wifi_ap_start(const char* ssid, const char* password = nullptr) {
#ifdef ARDUINO
  WiFi.mode(WIFI_AP_STA);   // allocate AP netif only when AP is actually needed
  WiFi.softAPConfig(...);
  return WiFi.softAP(ssid, ...);

wifi_ap_stop() gained a guard that returns immediately when the AP is not running (detected via WiFi.softAPIP() == IPAddress(0, 0, 0, 0)). Without this guard, calling softAPdisconnect() on a device that booted in WIFI_STA mode (AP netif never allocated) could fragment the heap.

src/modules/system/Network.hsetup() and manageWifi_()

setup() changed from WiFi.mode(WIFI_AP_STA) to WiFi.mode(WIFI_STA). The AP netif (~29 KB) is now only allocated when wifi_ap_start() is called by WifiApModule. Allocating it at boot and then freeing it when STA connects was the primary source of heap fragmentation that corrupted the lwIP free list.

manageWifi_() gained an ap_->isEnabled() guard on the AP disable path:

if (ap_ && ap_->isEnabled()) ap_->setControl("enabled", false);

Without this guard, the management tick called wifi_ap_stop() unconditionally on every STA-connected tick. On devices where the AP was already disabled (saved in state/ap1.json), this was a no-op at the PAL level but still unnecessary. With the isEnabled() check and the softAPIP() guard in wifi_ap_stop(), no WiFi driver call is made for devices that booted in STA-only mode.

src/core/WsServer.h — pre-allocated text buffer (the definitive fix)

Added a pre-allocated AsyncWebSocketSharedBuffer textBuf_ (8 192 B) alongside the existing pixBuf_. Allocated in begin() before server.begin() and before WiFi connects — at a point when the heap is unfragmented. Three new public methods:

char* textBufData();                  // pointer into the shared buffer
size_t textBufSize() const;           // always kMaxTextFrame = 8192
void broadcastTextBuf(size_t len);    // resize to len, then c.text(textBuf_) per client

broadcastTextBuf uses c.text(AsyncWebSocketSharedBuffer) — the shared-pointer API. Per broadcast, this allocates only a small AsyncWebSocketMessage wrapper per connected client rather than an n-byte data copy. The text data itself is never reallocated.

src/core/AppSetup.cpp — remove heap_caps_malloc from driverTask

Both the schema push path and the state push path replaced the heap_caps_malloc(n+1) / serializeJson / heap_caps_free pattern with the pre-allocated buffer:

// Before:
char* buf = (char*)heap_caps_malloc(n + 1, MALLOC_CAP_INTERNAL);
if (buf) { serializeJson(doc, buf, n + 1); s_ws->broadcastText(buf, n); heap_caps_free(buf); }

// After:
if (n < s_ws->textBufSize()) {
  serializeJson(doc, s_ws->textBufData(), s_ws->textBufSize());
  s_ws->broadcastTextBuf(n);
}

This eliminates the traversal of the (potentially corrupted) heap free list entirely. The ArduinoJson internal pool allocation (small, a few hundred bytes) and the per-client AsyncWebSocketMessage wrapper (small, per client) are the only remaining dynamic allocations in the broadcast path; both are too small to reach the corrupted free-list region.

src/modules/layers/EffectsLayer.hhealthReport()

Added a healthReport() override that includes geometry, modifier generation counter, and allocate-call count: "16x16x1 gen=0 allocs=1". Enables automated test assertions on EffectsLayer state without inspecting individual controls.

docs/developer-guide/network.md (new)

Comprehensive reference for all four network modules (NetworkModule, WifiStaModule, WifiApModule, EthernetModule): controls tables, status values, connection lifecycle, management policy state diagram, all timer constants, PAL function reference tables, modulemanager.json wiring example, and PC build notes.

Module doc updates

All four docs/modules/network/*.md files updated to match the current implementation:

  • network-module.md: corrected WIFI_AP_STAWIFI_STA; added management policy summary with link to network.md.
  • wifi-ap-module.md: removed "always starts on boot / stays up when STA connects" (both wrong after Sprint 10); added accurate auto-management description and status value table; link to management policy.
  • wifi-sta-module.md: fixed PC status string ("PC (no WiFi)""no WiFi"); added status value table and timer constants; link to management policy.
  • ethernet-module.md: replaced stale note about disabling NetworkModule management with accurate description (STA re-enabled when Ethernet drops); link to network.md.

docs/developer-guide/pal.md: added a "WiFi and Ethernet" subsection pointing to network.md#pal-functions (previously undocumented in pal.md).

docs/user-guide/getting-started.md: Step 5 now notes that the AP closes within 10 seconds of STA connecting, and re-opens after 30 seconds of STA loss. Link to network.md added.

Result

Metric Value
Unit tests 401/401 pass (no new tests — all changes are runtime/hardware-path only)
PC build Clean (0 warnings)
ESP32dev build Clean (0 warnings)
ESP32 crash Resolved — no LoadStoreError after STA connects under sustained WebSocket load
WS text broadcast Zero heap_caps_malloc calls in driverTask; 8 KB buffer pre-allocated before WiFi
AP management Auto-disables within 10 s of STA connect; re-enables after 30 s of STA loss; no spurious wifi_ap_stop() calls on STA-only-boot devices
New docs docs/developer-guide/network.md (315 lines); 6 existing docs updated

Definition of Done

  • [x] deploy/start_pc.py created; kills existing process, starts fresh server, streams output, handles SIGTERM
  • [x] Start Server card added to PC group in deploy/ui.py
  • [x] wifi_ap_start() switches to WIFI_AP_STA before softAP()
  • [x] wifi_ap_stop() returns immediately when softAPIP() == 0.0.0.0
  • [x] NetworkModule::setup() calls WiFi.mode(WIFI_STA); header comment explains the heap-safety rationale
  • [x] manageWifi_() guards AP disable with ap_->isEnabled() to avoid no-op driver calls
  • [x] WsServer pre-allocates textBuf_ (8 192 B) in begin(); textBufData(), textBufSize(), broadcastTextBuf() public
  • [x] AppSetup.cpp both broadcast paths use textBufData()/broadcastTextBuf() — no heap_caps_malloc in driverTask
  • [x] EffectsLayer::healthReport() override added
  • [x] docs/developer-guide/network.md written with full module reference, management policy diagram, PAL function tables, wiring example
  • [x] All four docs/modules/network/*.md files accurate and cross-linked to network.md
  • [x] docs/developer-guide/pal.md has WiFi/Ethernet PAL pointer
  • [x] docs/user-guide/getting-started.md accurate AP behaviour description
  • [x] 401/401 tests pass; mkdocs builds clean

Retrospective

What went well:

  • The pre-allocated buffer approach was the cleanest fix: it moved the allocation to a deterministic point in time (before WiFi, before server.begin()), eliminated the large heap traversal from the hot path entirely, and as a side effect also improved the broadcast pattern (shared-pointer API avoids the per-client data copy that broadcastText(buf, n) did).
  • The softAPIP() == 0.0.0.0 guard in wifi_ap_stop() is an idiomatic ESP32 check (used by the Arduino WiFi library itself). It makes all AP stop calls safe regardless of whether the AP was ever started, which simplifies every call site.
  • The isEnabled() guard on the management tick's AP disable path made the intent explicit: "only stop the AP if it was running." The two guards together (isEnabled() at the call site + softAPIP() at the PAL level) are defence in depth.
  • Changing WiFi.mode(WIFI_STA) and moving the mode switch into wifi_ap_start() correctly modelled the real invariant: the AP netif should be allocated if and only if the AP is running.

What was tricky:

  • Three fix attempts were needed before finding the root cause. Fix 1 (WIFI_AP_STA → WIFI_STA at boot) reduced heap churn but the crash moved to a different site (EXCVADDR 0x40096db0, inside StaticStringWriter in ArduinoJson) rather than disappearing. Fix 2 (adding the wifi_ap_stop() guard) was a correctness fix but did not touch the allocation path. The crash became worse — it started firing before any manual UI refresh because the browser auto-reconnected and triggered a schema push. The third fix (pre-allocated buffer) eliminated the corrupted heap traversal entirely.
  • The crash was intermittent in that it depended on when the schema push fired relative to the WiFi SDK's internal free/realloc sequence. Once the browser reconnected automatically after the first crash, the heap state was different enough that the crash reproduced on the very next connection, making it look worse even though it was the same underlying bug.
  • The WiFi.softAPIP() == IPAddress(0, 0, 0, 0) comparison requires an IPAddress object on the right side; comparing to 0 or nullptr does not compile — the Arduino IPAddress class does not have those implicit conversions.

Seeds for next sprint / release:

  • The 8 192 B textBuf_ covers state JSON for typical configs (~2 KB) and schema JSON (~6 KB). If the module count grows significantly, measureJson() could exceed textBufSize() and the broadcast would be silently dropped. A future improvement: increase kMaxTextFrame or split large schema pushes into per-module incremental updates.
  • The WiFi SDK heap corruption is a known lwIP issue on ESP32 classic. An eventual migration to ESP32-S3 for all devices removes the vulnerability (S3 has a larger and more robust heap implementation). Until then, the pre-allocated buffer approach holds.
  • The EffectsLayer::healthReport() format (16x16x1 gen=0 allocs=1) is not yet covered by a test. A test case that verifies the format survives a geometry change would prevent regression if the method is refactored.

Sprint 13: FastLED Compatibility, ArtNet Unicast, Dynamic Control Coverage, and ArtNet Stutter Fixes

Scope: Wrap struct RGB in namespace projectMM to eliminate a compile collision with FastLED. Refactor ArtNetOutModule to accept any pixel source via pixelBuf() and add broadcast/unicast mode selection with a conditionally visible IP field. Extend EthernetModule with rebuildControls() so static IP fields are hidden in DHCP mode. Fix MkDocs Serve in the deploy UI to kill stale processes on stop. Investigate and fix ArtNet frame stutters on PC: add a persistent UDP send socket to the PAL, add a per-module fps_limit_ control to ArtNetOutModule, and cache the macOS malloc_zone_statistics() result to eliminate a ~100 ms hot-path pause that fired every second.

Motivation

Two separate problems surfaced when integrating projectMM as a library into a FastLED-based project (FastLED_MM):

  1. FastLED declares enum EOrder { RGB = 0012, ... } in the global namespace. projectMM's struct RGB in the same global namespace caused an immediate compile error (redefinition of 'RGB' as different kind of symbol). The fix required qualifying every RGB reference in the codebase.

  2. ArtNetOutModule held a raw DriverLayer* and called readyChannel() on it directly. This made ArtNetOutModule impossible to use from a project that has its own pixel buffer type (e.g. FastLED's CRGB[]), because DriverLayer is an internal implementation detail. The right abstraction was pixelBuf() on StatefulModuleBase — a virtual method that returns a const uint8_t* directly into the caller's existing buffer with no copy.

Separately, EthernetModule registered all three static IP controls (static_ip, static_gateway, static_subnet) unconditionally, cluttering the UI for devices using DHCP. The pattern from Sprint 1 (rebuildControls() + clearControls()) was the direct fix.

Changes

src/modules/layers/RGB.hnamespace projectMM

struct RGB wrapped in namespace projectMM:

namespace projectMM {
struct RGB {
  uint8_t r;
  uint8_t g;
  uint8_t b;
};
}  // namespace projectMM

No using namespace projectMM or using projectMM::RGB in any header — all callers use the qualified name so RGB never re-enters the global namespace.

All pixel-type usages updated

Every RGB reference replaced with projectMM::RGB across:

  • src/modules/layers/Channel.hRGB* pixels field
  • src/modules/layers/EffectsLayer.hsizeof, alloc casts (projectMM::RGB* n0/n1), heapSize()
  • src/modules/layers/DriverLayer.hsizeof, alloc cast
  • src/modules/layouts/GridLayout.hsizeof
  • src/modules/effects/SineEffect.h, RipplesEffect.h, LinesEffect.h, GameOfLifeEffect.h, NoiseEffect2D.h, DistortionWaves2DEffect.h, FlowFluidEffect.hsizeof, hsvToRgb_ return type, RGB{...} literals
  • src/modules/effects/ArtNetInModule.hsizeof, literal
  • tests/test_effects_2d.cpp, tests/test_system_info.cppsizeof and pointer casts

src/modules/drivers/ArtNetOutModule.h — generic pixel source

DriverLayer* replaced with StatefulModuleBase*. The setInput key changed from "layer" to "source"; the legacy key "layer" is still accepted for backwards-compatible state files. The pixel data is retrieved via the virtual pixelBuf() method:

const uint8_t* rgb = nullptr;
size_t total = 0;
uint16_t w = 0, h = 0, d = 0;
if (!source_->pixelBuf(rgb, total, w, h, d) || !rgb || total == 0) return;

pixelBuf() returns a pointer directly into the source module's existing buffer (no copy, no new allocation). The per-universe copy is a single memcpy call rather than the previous per-pixel three-byte store loop — faster at any pixel count.

#include "modules/layers/DriverLayer.h" removed; ArtNetOutModule now only includes core/StatefulModule.h and pal/Pal.h, making it usable as a library header without dragging in DriverLayer's dependencies.

src/modules/drivers/ArtNetOutModule.h — broadcast/unicast mode

Added a mode select control ("broadcast" / "unicast") with an ip_ field that is only visible in unicast mode, using the rebuildControls() pattern from Sprint 1:

void rebuildControls() override {
  clearControls();
  addControl(universe_start_, "universe_start", "slider", 0, 255);
  addControl(fps_limit_, "fps_limit", "slider", 0, 200);
  addControl(mode_, "mode", kArtNetModes, 2);
  if (mode_ == kArtNetModeUnicast)
    addControl(ip_, sizeof(ip_), "ip", "text");
}

void onUpdate(const char* key) override {
  if (strcmp(key, "mode") == 0) rebuildControls();
}

In broadcast mode the destination is "255.255.255.255"; in unicast mode it is the ip_ field value. Default mode is broadcast; default IP is "192.168.1.1". fps_limit_ defaults to 50; set to 0 to disable the limiter.

src/modules/drivers/ArtNetOutModule.hfps_limit_ rate limiter

ArtNet stutters on PC were traced to two root causes. The first was the loop running at 10,000+ FPS and creating one UDP socket per universe per frame (socket() + sendto() + close() per call). The second was malloc_zone_statistics() on macOS blocking the render path for ~25 ms per call, called four times per second.

A per-module fps_limit_ control (default 50, range 0–200; 0 = unlimited) was added to ArtNetOutModule. The rate gate uses pal::micros() and a lastSendUs_ timestamp, evaluated at the top of loop() before any socket work:

if (fps_limit_ > 0) {
  int64_t now = pal::micros();
  int64_t minIntervalUs = 1000000LL / fps_limit_;
  if (now - lastSendUs_ < minIntervalUs) return;
  lastSendUs_ = now;
}

This keeps effects running at full scheduler FPS while throttling the network output independently.

src/pal/Pal.h — persistent UDP send socket (udp_tx_open / udp_tx_send / udp_tx_close)

The second stutter cause was per-packet socket lifecycle overhead. The original pal::udp_send() called socket() + sendto() + close() for every universe every frame. At full loop rate this created thousands of syscalls per second and flooded the LAN with rapid-fire UDP bursts.

Three new PAL functions replace it with a persistent socket pattern:

inline int  udp_tx_open();                                                         // open once in setup()
inline bool udp_tx_send(int h, const char* ip, uint16_t port,
                        const char* buf, size_t len);                              // called per universe per frame
inline void udp_tx_close(int h);                                                   // called in teardown()

On PC the handle is a POSIX int file descriptor. On Arduino/ESP32 it is an index into a small WiFiUDP slot pool (UDP_TX_SLOTS = 2). ArtNetOutModule::setup() calls udp_tx_open() and stores the handle in udpFd_; teardown() calls udp_tx_close(udpFd_). If the open fails at setup (network not yet ready), loop() retries once per frame.

src/pal/Pal.h — macOS free_heap_bytes() cache

malloc_zone_statistics() on macOS performs an O(n-allocations) heap scan and takes 20–30 ms. It was called four times per second from two call sites: SystemStatusModule::loop1s() (two calls for free_heap_kb() and max_alloc_kb()) and the MemLive periodic check in tickPeriodic() (two more calls). Combined, this caused a ~100 ms pause in the render pipeline every second — the observed "heartbeat" stutter.

Fixed with a 5 ms TTL cache in pal::free_heap_bytes() on macOS:

#elif defined(__APPLE__)
  static uint32_t s_cached = 0;
  static int64_t  s_cachedUs = 0;
  int64_t now = micros();
  if (now - s_cachedUs > 5000) {
    malloc_statistics_t s{};
    malloc_zone_statistics(nullptr, &s);
    s_cached = s.size_allocated > s.size_in_use
                 ? (uint32_t)(s.size_allocated - s.size_in_use) : 0u;
    s_cachedUs = now;
  }
  return s_cached;

All four callers within the same scheduler tick now share one scan. The stutter dropped to unmeasurable.

src/modules/system/Ethernet.hrebuildControls()

All control registration extracted into rebuildControls(). The three static IP fields (static_ip, static_gateway, static_subnet) are only registered when modeIdx_ == 1 (static mode):

void rebuildControls() override {
  clearControls();
  if (!pal::has_ethernet()) { addControl(status_, "status", "display"); return; }
  addControl(status_, "status", "display");
  addControl(ip_address_, "ip_address", "display");
  addControl(modeIdx_, "mode", kModes_, 2);
  if (modeIdx_ == 1) {
    addControl(static_ip_, sizeof(static_ip_), "static_ip", "text");
    addControl(static_gateway_, sizeof(static_gateway_), "static_gateway", "text");
    addControl(static_subnet_, sizeof(static_subnet_), "static_subnet", "text");
  }
}

setup() sets modeIdx_ from the loaded mode_ string, then calls rebuildControls(). onUpdate("mode") calls rebuildControls() before applyIpConfig_() so the UI updates immediately when the user switches modes.

kModes_ promoted to a static constexpr const char*[] class member so it is accessible from rebuildControls() (previously it was a local static inside setup()).

deploy/start_mkdocs.py (new) and deploy/ui.py

Added deploy/start_mkdocs.py: kills any existing mkdocs serve process (pkill -f "mkdocs serve" on macOS/Linux; taskkill on Windows), then starts a fresh uv run mkdocs serve subprocess with output streamed to stdout. Handles SIGTERM cleanly so the deploy UI's Stop button terminates the process.

MkDocs Serve card in deploy/ui.py updated to use start_mkdocs.py instead of launching mkdocs serve directly. This prevents stale MkDocs processes from accumulating when the card is stopped and restarted repeatedly.

Result

Metric Value
Unit tests 401/401 pass
PC build Clean (0 warnings)
ESP32 build Clean (0 warnings)
Files modified 16 source files (namespace), 2 driver files (ArtNet), 1 system file (Ethernet), 2 deploy files, PAL (UDP + heap cache)
New allocations in loop() 0 — pixelBuf() returns pointer into existing buffer
ArtNet input key "source" (new); "layer" still accepted for backwards compat
ArtNet FPS control fps_limit_ slider (0–200); default 50; 0 = unlimited
ArtNet socket lifecycle Persistent (udp_tx_open at setup, udp_tx_close at teardown); eliminates per-frame socket()/close() syscalls
macOS heap scan cost Reduced from ~100 ms/s (4 scans x 25 ms) to ~5 ms/s (1 scan every 5 ms TTL)
ArtNet stutter "Heartbeat" stutter resolved; frames reach FPP smoothly at 50 FPS

Definition of Done

  • [x] struct RGB wrapped in namespace projectMM; no using alias in any header
  • [x] All RGB references qualified as projectMM::RGB across layers, effects, layouts, and tests
  • [x] ArtNetOutModule holds StatefulModuleBase*; DriverLayer.h include removed
  • [x] setInput("source", ...) wires the pixel source; "layer" alias preserved
  • [x] pixelBuf() used in loop() — zero-copy, no new allocation
  • [x] ArtNetOutModule::rebuildControls() hides ip_ field in broadcast mode
  • [x] fps_limit_ slider control (0–200, default 50) added to ArtNetOutModule; 0 disables the limiter
  • [x] loop() rate gate uses pal::micros() and lastSendUs_; effects run at full scheduler FPS
  • [x] pal::udp_tx_open() / udp_tx_send() / udp_tx_close() added to PAL (PC: POSIX fd; Arduino: WiFiUDP slot)
  • [x] ArtNetOutModule::setup() calls udp_tx_open(); teardown() calls udp_tx_close()
  • [x] pal::free_heap_bytes() on macOS caches malloc_zone_statistics() result with 5 ms TTL
  • [x] onUpdate("mode") calls rebuildControls() to update UI live
  • [x] EthernetModule::rebuildControls() hides static IP fields in DHCP mode
  • [x] EthernetModule::onUpdate("mode") calls rebuildControls() before applyIpConfig_()
  • [x] kModes_ is a static class member (not a local static in setup())
  • [x] deploy/start_mkdocs.py kills stale processes before starting; handles SIGTERM
  • [x] MkDocs Serve card in deploy/ui.py uses start_mkdocs.py
  • [x] 401/401 tests pass; PC and ESP32 builds clean

Retrospective

What went well:

  • The namespace projectMM fix was mechanical and complete: a global replace of RGB with projectMM::RGB across all headers and tests. No using aliases were introduced anywhere, so the fix is stable against future includes that might bring FastLED into scope.
  • pixelBuf() as the abstraction boundary was the right choice: it made ArtNetOutModule a pure consumer with no dependency on how pixels are produced. The zero-copy property follows directly from the interface contract (returns a pointer into the source's buffer) rather than requiring explicit implementation care.
  • The memcpy-per-universe pattern is both simpler and faster than the previous per-pixel three-byte store loop. Replacing the loop with a single memcpy removed ~170 iterations (one per pixel in a full universe) without changing correctness.
  • The rebuildControls() extension to EthernetModule and the unicast IP field in ArtNetOutModule reused the Sprint 1 infrastructure without any changes to the framework.

What was tricky:

  • EffectsLayer.h had two separate RGB* n lines (one for n0, one for n1); the replace_all on RGB* matched only the first pattern. The second line required a targeted edit after noticing the build still failed.
  • The addControl overload for the mode select in ArtNetOutModule initially had an extra "select" string argument: addControl(mode_, "mode", "select", kArtNetModes, 2). The correct overload from StatefulModule.h is addControl(uint8_t& variable, const char* key, const char* const* options, uint8_t count) — no type string. The build error identified it; removing "select" fixed it.

What went well:

  • The persistent UDP socket (udp_tx_open/send/close) was a clean PAL abstraction: the callers (ArtNetOutModule) became simpler and the platform difference (POSIX fd vs WiFiUDP slot) is fully hidden.
  • The 5 ms TTL cache for malloc_zone_statistics() resolved the heartbeat stutter completely with a minimal, local change. No architectural changes were needed on the monitoring side.
  • Adding fps_limit_ as a per-module control (rather than a global loop cap) kept the design correct: effects run at maximum rate; only network output is throttled.

What was tricky:

  • The stutter had two independent root causes (socket lifecycle overhead and macOS heap scan) that appeared as a single symptom. Fixing the persistent socket first revealed the second cause. Diagnosing both required profiling pal::free_heap_bytes() call frequency and duration.
  • The global FPS cap in main.cpp was initially proposed as a fix for the socket flood but was correctly rejected: a global cap would throttle effects regardless of whether ArtNet was running, violating the "effects at maximum rate" principle.

Seeds for next sprint / release:

  • Unicast mode currently supports a single destination IP. The user has flagged that multiple-IP unicast (sending the same universe to several receivers) will be needed. The mode control would expand to "broadcast" / "unicast" / "multicast" or a repeating IP list; the rebuildControls() pattern extends naturally.
  • The projectMM::RGB struct is intentionally minimal (3 x uint8_t). A planned rework will align it with MoonLight's channel model (more than 3 channels per light, e.g. RGBW or RGBWW). When that happens, pixelBuf() will need a companion channelsPerPixel() method so consumers like ArtNetOutModule can adapt their packing without hardcoding * 3.
  • ArtNetInModule still uses sizeof(RGB) qualified as projectMM::RGB but was not refactored further in this sprint. If it is used as a library header alongside FastLED the same include-order sensitivity applies; a follow-up should verify its pixelBuf() / source wiring matches the new pattern.
  • The root cause of the heartbeat stutter revealed an architectural issue: tickPeriodic() monitoring callbacks run synchronously inside scheduler.loop() on PC, blocking the entire render pipeline. On ESP32 this is mitigated by FreeRTOS two-core scheduling; on PC there is no such separation. The proper fix is running monitoring work on a background thread on PC. Added to backlog.

Sprint 14: Schema Push Stability, NTP, and Generic Auto-Wiring

Scope: Three independent improvements that accumulated across two sessions. (A) Fix rebuildControls() only taking effect on the first control change — root cause was a single pre-allocated schemaBuf_ being resized while an in-flight AsyncWebSocketMessage still held a reference. Fix: double-buffered schema (A/B alternating). (B) Add NtpModule for wall-clock time sync on ESP32. (C) Replace per-type strcmp chains in ModuleManager with a generic autoWireKeys() virtual so modules declare their own wiring preferences.

Part A: WsServer schema double-buffering

Root cause. AsyncWebSocketMessage::send() reads _WSbuffer->size() at transmission time, not at queue time. Resizing the single schemaBuf_ for the next push while the previous message was still in the TCP send queue caused the queued frame to broadcast with the new (wrong) length — producing a truncated or corrupt JSON frame on every second rebuildControls() call.

Fix. WsServer now holds schemaBufs_[2] (two pre-allocated 16 KB buffers) and a one-bit index schemaIdx_. broadcastSchemaBuf() writes into schemaBufs_[schemaIdx_], broadcasts it, then flips schemaIdx_ ^= 1. Consecutive pushes never write to the same buffer. Both buffers are pre-allocated in begin(), before WiFi connects, to avoid heap_caps_malloc calls on the corrupted-heap-free-list that the WiFi STA connect sequence leaves behind.

AppSetup.cpp. driverTask now tracks lastSchemaMs and rate-limits schema pushes to 50 ms. Uses schemaBufData()/schemaBufSize()/broadcastSchemaBuf() instead of the text buffer.

Tests. test_websocket.cpp gained 4 unit tests: schemaDirty set on first type change, re-set after clear and second change (the regression guard), ModuleManager::hasSchemaDirty() across consecutive rebuilds, and control set change visible in getModulesJson. test_integration.cpp gained 1 integration test that verifies the WS push loop emits a schema frame both before and after clearSchemaDirty().

Live test. deploy/live.py test8_rebuild_controls: creates an EffectsLayer + SineEffectModule, cycles the type control 0→1→0→1, asserts /api/modules returns the expected control keys after each change, and checks is_crash == false.

Part B: NtpModule

NtpModule syncs wall-clock time from a configurable NTP server and exposes local_time as a live display control. On ESP32 it calls configTime() / getLocalTime(); on PC it reads the system clock directly. loop1s() retries sync until getLocalTime returns a valid time.

PAL additions (Pal.h): ntp_sync(server, gmtOffsetSec, dstOffsetSec) wraps configTime(); local_time_str(buf, len) writes "HH:MM:SS" (returns false and "--:--:--" when time is not yet available).

NtpModule is registered in CoreRegistrations.cpp and auto-created as ntp1 by pal::ensureDefaultModules (embedded only).

SystemStatusModule gained a local_time_ field updated from pal::local_time_str() each second.

Part C: Generic auto-wiring via autoWireKeys()

ModuleManager::addModule(), instantiateFromArray() pass 2b, and replaceModule() each contained duplicated per-type strcmp("DriverLayer") / strcmp("EffectsLayer") blocks. Option B replaces all three with a single applyAutoWire_(m) call.

AutoWireSpec (new struct in StatefulModule.h):

struct AutoWireSpec {
  const char* inputKey;    // my input key; nullptr marks end of list
  const char* searchType;  // type name to search for in owned modules
  bool allMatches;         // true = wire all matches; false = first only
  const char* backKey;     // non-null: also call found->setInput(backKey, this)
};

virtual const AutoWireSpec* autoWireKeys() const added to StatefulModuleBase (default nullptr).

Overrides added:

Module Rule
DriverLayer find all EffectsLayer → wire as "source"
EffectsLayer find first DriverLayer → wire as "driver", back-wire self as DriverLayer's "source"
ArtNetOutModule find first DriverLayer → wire as "source"
ArtNetInModule find first EffectsLayer → wire as "layer"
PreviewModule find first ProducerModule → wire as "source"

ModuleManager::applyAutoWire_(StatefulModuleBase*) (new private helper, ~15 lines) iterates the returned spec array and calls setInput() accordingly. Net change in ModuleManager.cpp: -30 lines, +15 lines.

No hotpath impact. autoWireKeys() is called only during addModule() and instantiateFromArray(), never from loop(). Zero RAM cost per instance (vtable entry is in flash).

Summary

Part Description Est
A: Schema double-buffering WsServer schemaBufs_[2], AppSetup rate-limit, 5 tests, 1 live test S
B: NtpModule NtpModule + PAL ntp_sync/local_time_str + registration S
C: autoWireKeys() AutoWireSpec, virtual, 5 module overrides, applyAutoWire_ S
Total M

Definition of Done

  • [x] Second rebuildControls() call updates the browser UI without page reload
  • [x] WsServer::schemaBufs_[2] + schemaIdx_ replaces single schemaBuf_; broadcastSchemaBuf() alternates buffers
  • [x] AppSetup.cpp uses schemaBufData/schemaBufSize/broadcastSchemaBuf; rate-limited to 50 ms
  • [x] 4 unit tests for schemaDirty lifecycle; 1 integration test for consecutive schema pushes
  • [x] deploy/live.py test8_rebuild_controls exercises 4 type changes and asserts correct control keys
  • [x] NtpModule registers, auto-creates on embedded, shows local_time in UI
  • [x] pal::ntp_sync() and pal::local_time_str() compile on ESP32 and PC
  • [x] AutoWireSpec + autoWireKeys() virtual in StatefulModuleBase
  • [x] DriverLayer, EffectsLayer, ArtNetOutModule, ArtNetInModule, PreviewModule override autoWireKeys()
  • [x] ModuleManager::applyAutoWire_() replaces all three per-type strcmp blocks
  • [x] 406/406 unit tests pass; PC build clean; PC live tests 2/2 pass

Result

Metric Value
Unit tests 406/406 pass (was 401)
PC build clean
PC live tests 2/2 pass
New tests 5 (4 unit + 1 integration)
ModuleManager.cpp delta -30 lines removed, +15 added
RAM cost per module instance 0 bytes (autoWireKeys vtable entry in flash)

Retrospective

What went well:

  • The double-buffer fix is a small, surgical change (one extra array + one XOR flip) that eliminates a whole class of timing-dependent WebSocket corruption without changing the send API or adding heap pressure.
  • Tracing the root cause to _WSbuffer->size() at send time vs. queue time was the key insight. Once understood, both the fix and the regression test followed naturally.
  • autoWireKeys() is a clean application of the "modules self-declare" principle already used for category(), allowedChildCategories(), and preferredCore(). The backKey field handles the asymmetric EffectsLayer-DriverLayer bidirectional wire in one spec entry.
  • NtpModule is self-contained: PAL handles the platform split, no changes to any existing module.

What was tricky:

  • The single-buffer fix (adding a separate schemaBuf_) passed the first push and failed on the second — a timing-dependent bug that only manifests when two pushes happen in quick succession. The lesson: for shared-buffer WebSocket code, always test consecutive rapid changes, not just single changes.
  • The autoWireKeys() design needed the backKey field to express the EffectsLayer→DriverLayer back-wire without extra complexity; the alternative (two-pass wiring) would have been harder to read.

Seeds for next sprint:

  • broadcastBinary (pixel preview) uses the same single shared pixBuf_ pattern. Lower risk than schema (binary frames don't parse, corruption shows as a visual glitch), but the same class of bug exists.
  • ArtNetInModule auto-wires to the first EffectsLayer when added as a top-level module. Effect modules (SineEffectModule, RipplesEffectModule, etc.) do not yet have autoWireKeys() overrides; adding a base-class default (e.g. via an intermediate EffectModuleBase) would make top-level effect additions self-wiring.
  • Hardware live test: connect LAN8720 (esp32dev) and verify Ethernet + WiFi reconnect behavior on real hardware (carried from Sprint 13 seeds).

Complexity estimate: Medium (M).


Release 8 Backlog

All items consolidated into the cross-release backlog.