Development History

v1 is frozen — active development is in projectMM v2

This repository is in maintenance-only mode. The restart decision is documented in Release 9; the v2 plan to parity lives in v2's Release 1. Tag v1.8.0-pre-restart marks v1's final pre-restart state.

This section records how projectMM is built: the process, the decisions, and the history of each release. Vision, principles, and system overview live in the home page; architecture, design, and implementation details live in their respective documents.


How to Read This Section

projectMM follows an Agile development process adapted for AI-collaborative development. The terminology:

| Term | Meaning |
|---|---|
| Release | A major deliverable milestone. Each release has a theme (e.g. "Proof of Concept") and corresponds to a GitHub Release tag when complete. |
| Sprint | A scope-boxed work cycle within a release. Each sprint has a goal, a definition of done, a result, and a retrospective. |
| Backlog | Items raised but explicitly deferred. Each entry notes when it should be picked up; a backlog item is not a forgotten one. |
| Retrospective | End-of-sprint reflection: what was learned, what questions were raised, and what seeds the next sprint. |

For new contributors: start with the latest release document to understand where the project is. Read the Backlog for contribution opportunities — items marked with a "pick up when" condition that is now met are good candidates.


Development Process

  • Agile sprints. Each sprint begins by reviewing the previous sprint's retrospective and closes by writing a new one. Sprints are scope-boxed (not time-boxed): a sprint ends when its definition of done is met.
  • Top-down refinement. Ideas flow from vision to architecture to design to implementation, with each layer free to push back on the one above.
  • Version control. GitHub is the primary host. The repository is ewowi/projectMM (will be moved/renamed to projectMM under MoonModules when appropriate).
  • CI/CD. Automated build, test, and (where possible) deploy across PC, rPi, and ESP32. Every PR gets a footprint delta report.
  • Automated testing. Unit, integration, and (future) hardware-in-the-loop. Coverage is part of the definition of done.
  • AI in the workflow. Code generation and review are AI-assisted. We stay tool-agnostic: currently Claude Code for generation and CodeRabbit for PR review, with active experiments on others (e.g. Mistral Devstral).
  • Open to contributors. The project welcomes pull requests from day one. The sprint structure, backlog, and documentation are designed so a new contributor (human or AI) can find meaningful work and understand why decisions were made.

Sprint Lifecycle

This is the standard sequence for running a sprint — followed by the maintainer and by AI coding assistants alike. Each step has a prompt or action associated with it.

Step 1 — Pre-sprint analysis

Before the implementation prompt, raise any observations, findings, or new requirements that have come up since the last sprint. These are analysed, documented in the relevant docs (architecture.md, design.md, implementation.md, backlog), and linked from the sprint section.

Prompt pattern: "Before starting the next sprint, I have a few things to analyse — add them to the relevant docs and link from this sprint: [list]"

Step 2 — Implementation prompt

Kick off the sprint with a standard prompt that includes:

  • Which sprint to implement
  • Check previous retrospectives and backlog for relevant deferred items
  • Ask clarifying questions before starting; update the scope or architecture/design/implementation docs based on answers
  • When done: ask to build and run on PC and ESP32, run the tests, capture and analyse logs, update the sprint's Result section, and prove that the issues targeted by the sprint are resolved
  • After build-and-run: add Result, DoD verification, and Retrospective to the sprint section
  • If new modules were created: add a doc page in docs/modules/ and register it in mkdocs.yml

Prompt pattern: "Implement the next sprint. Check previous retrospectives and the backlog for anything relevant. Ask me any questions (provide options, propose the simplest, and backlog the rest by default) and update the scope (or architecture/design/implementation docs, with links). When implemented, ask if you can build and run for PC and ESP32 and run the tests, capture and analyse the logs, update relevant documents and the sprint's Result section (refer to other pages where possible; follow the AI-Specific Writing Rules for all generated docs and comments), and prove the sprint issues are solved. Then add Result, DoD, and Retrospective. Add a complexity estimate."

Step 3 — Q&A

The AI asks scoping questions; the maintainer answers. Answers narrow the scope, defer to backlog, or confirm. This keeps sprints small and reviewable. Aim for small steps: if alternatives are presented, answer with "do the simplest alternative first and add the more complex items to future sprints or to the backlog".

Step 4 — Watch the run

While the AI builds and runs, monitor progress in the git client (e.g. GitKraken): stage files as they appear to get a clearer view of what has changed. Use the diff view to verify that only the expected files are touched.

Step 5 — Approve or redirect tool calls

When the AI requests permission for a tool call (shell command, file write, etc.):

  • Yes: approve if the action looks correct.
  • Other: redirect with a reason if the approach is wrong or unsafe.
  • Never approve git commit: always evaluate the staged changes yourself first and do the commit manually after review.

Step 6 — Sprint close

Before committing, verify the sprint is complete. If it is, ask an AI agent to do the commit with a descriptive commit message.

Prompt (maintainer will ask AI!): "Did we finish this sprint? Did you check the Architecture review checklist? Did you build, run, and capture logs, including live tests on PC and ESP32 devices? If not, please run them. Did you update DoD, Results, and Retrospective (what went well / what was tricky / seeds for next sprint)? Is mkdocs serve running fine? Anything else before commit? If not, run the commit command with a descriptive message and leave out Co-Authored-By: Claude ... as the last sentence."

Architecture review checklist — answer these before closing every sprint:

| Question | Why it matters |
|---|---|
| Does this change fit the existing architecture, or does it work around a limitation? | Workarounds compound; a root-cause fix is cheaper long-term |
| Did any new heap allocation get added? Is it bounded, and is it freed in teardown()? | Unbounded growth or missing teardown causes OOM on long-running devices |
| Did any new member get added to a base class or widely-shared struct? | Base-class bloat multiplies across every instance |
| Does the solution add a new abstraction, or reuse an existing one? | New abstractions add cognitive load; reuse keeps the design learnable |
| If something was patched rather than fixed, is the root cause recorded in the backlog? | Patches are fine short-term if the debt is visible |

Standards and Guidelines

The dedicated Standards and Guidelines document covers coding standards for C/C++ and JS, the anti-debt checklist, the human/AI readability rule, non-trivial testing, AI workflow rules, and the contribution workflow.


Releases

| Release | Theme | Status | Tag |
|---|---|---|---|
| Release 1 | Proof of Concept | Complete | v1.0.0 |
| Release 2 | Observable, Hierarchical, and Network-Ready | Complete | v1.1.0 |
| Release 3 | Pixel Pipeline, MoonLight Effects, and Continuous Integration | In progress | |
| Release 4 | Platform Reach | In progress | |
| Release 5 | Virtual/Physical Layer Architecture | Complete | v1.5.0 |
| Release 6 | Deploy Pipeline Hardening and Agentic Harness | Planned | |
| Release 7 | Firmware Delivery, New Platforms, and Developer Tooling | Planned | |
| Release 8 | Dynamic Controls, Deploy Pipeline, and Runtime Hardening | Complete | v1.8.0 |
| Release 9 | Evaluation: restart accepted | Complete | |

Release 9 is v1's final release. The restart begins as Release 1 of a new sibling repo projectMM-v2/; see Release 9 § v2 plan to parity.

The original Release 3–10 roadmap (pixel pipeline, 2D effects, modifiers, 3D, drivers, network protocols) is preserved in the MoonLight Scope subfolder for reference.


Backlog

Items raised but deferred. Each entry notes when it should be picked up. Sprint-specific and release-specific items live in the relevant sprint/release doc; this is the cross-release long-term list.


Runtime and infrastructure

  • Effects animate slowly in preview/WebGL on PC-Windows (fine on PC-macOS). The server FPS is high but the WebGL preview window shows the effect moving slowly. Observed on Windows only; macOS is unaffected. Possible causes to investigate: (a) PreviewModule WS push rate is throttled independently of server FPS, so the browser only receives frames at the push cadence; (b) the effect uses frame count rather than elapsed time, so visual speed is coupled to loop rate, which differs between platforms; (c) the browser's requestAnimationFrame cap at 60 Hz causes a growing WS message queue that lags behind the server push rate; (d) pal::millis() / pal::micros() unit mismatch on Windows: a value in milliseconds is used where microseconds are expected, which would run the animation 1000x too slowly (the reverse mismatch would run it too fast). Art-Net output not yet tested; the issue may be preview-specific. When: Windows is used as a primary development or demo platform. (From R7S3.)

  • PhysMap index-width optimisation. PhysMap uses uint32_t always (R5S2 Q1:A). On boards without PSRAM, uint16_t would halve the map's memory footprint (20 KB to 10 KB at 5000 LEDs). Requires a PAL-gated allocation or a template parameter. When: large installations on non-PSRAM boards make the 4-byte-per-LED cost visible in RAM budget.

  • EffectsLayer dim wiring to DriverLayer. EffectsLayer retains its own width/height/depth controls as a fallback (R5S2 Q2:A). Removing these props and having EffectsLayer size from setInput("driver", …) eliminates operator redundancy and is consistent with the virtual/physical split. When: the virtual/physical pipeline is proven in production and at least one non-rectangular layout (Sprint 3) is shipping.

  • DriverLayer identity-skip optimisation. DriverLayer now uses a single physical_ buffer (Sprint 3); virtBlend_ was removed. For an identity mapping the blend loop still walks all physical LEDs. The skip would alias the single-source channel directly to readyChannel(). Saves one O(n) pass per frame. When: profiling shows the blend loop is a meaningful fraction of frame time at the target LED count.

  • PreviewModule virtual-side wiring. PreviewModule is wired to DriverLayer output (physical pixels) by default (R5S2 Q4:A). Operators who want to preview virtual-space effects independently of the physical mapping need a preview.source = effectsLayer wiring option. When: Sprint 3 introduces the 0x03 coordinate-per-LED wire format, which requires virtual and physical coordinates to be distinct.

  • Supersync. Multi-instance synchronization (passive mirror → active control → group control → shared processing with synchronized clocks). Includes discovery (mDNS / MQTT / explicit config) and clock sync. See architecture.md. When: single-instance runtime is stable on all three platforms and real users run more than one device together.

  • REST and MQTT endpoints. External control surface. See architecture.md. When: WiFiModule is stable (done Release 2 Sprint 8); REST first, MQTT when Home Assistant integration is on the table.

  • State pub/sub. Lightweight publish/subscribe to replace direct reads. Candidates: callback list, observer pattern, event queue. See architecture.md. When: a Module needs to react to state changes from two independent sources.

  • FreeRTOS Core 0 dispatch. cores_ vector is populated but xTaskCreatePinnedToCore dispatch is not yet wired. Start with the MoonLight producer/consumer split; generalise once stable. See architecture.md. When: WiFiModule is stable (Core 0 is used by WiFi; priorities need tuning).

  • Pending-writes queue. Replace controlMutex_ in the HTTP→scheduler hot path with a lock-free queue drained at the top of each tick. When: profiling shows mutex contention at realistic request rates.

  • Graceful teardown / dirty-flag save on ESP32. teardown() never runs in Arduino; saveAllState() at end of setup() covers first boot. Wire a dirty-flag + debounce timer (2 s after last control change) for runtime saves. When: live control mutation frequency makes per-change saves a flash-wear concern.

  • Multiple WiFi configs + scan-to-select. WifiStaModule currently holds a single SSID/password pair. Add support for a priority-ordered list of networks (e.g. home, office, field) and a WiFi scan endpoint (GET /api/wifi/scan) that returns visible SSIDs so the UI can present a picker instead of requiring manual SSID entry. deploy/wifi.py would gain a --profile <name> flag; deploy/flashfs.py would bake all profiles. When: a device needs to move between networks regularly, or a user-friendly onboarding flow is needed.

  • Password / credential encryption at rest. Sprint 8 marks passwords sensitive (never broadcast); encrypting the LittleFS state files requires a key management story. When: security requirements are clearer (multi-user / cloud deployment).

  • KvStore hash table. Linear scan is fine for ≤16 entries. When: entry count exceeds 16 and profiling shows KvStore in the hot path.

  • Static RAM cost of log ring buffer and Scheduler fields on non-PSRAM devices. LOG_RING_SIZE set to 2 KB (32 entries x 64 bytes) on all devices (R7S6); ~104 B of Scheduler fields still live in .bss on all platforms. On esp32dev (no PSRAM, ~320 KB total RAM) these are permanent costs that compete with heap. Moving the ring to a PSRAM-backed allocation in setup() would free 2 KB of .bss on PSRAM boards but requires a PAL extension. Increase ring size when longer history is needed (edit LOG_RING_CAP/LOG_RING_ENTRY in Logger.cpp). When: R7/R8 device-class memory budget review, or if esp32dev free heap becomes a reliability concern.

  • WiFi RX buffer tuning on esp32dev. CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM is baked into the pre-compiled Arduino framework sdkconfig.h and cannot be overridden via -D in build_flags (causes redefinition warning; no runtime effect). Reducing from 32 to 16 would save ~16 KB heap on classic ESP32 but requires either a custom framework build or sdkconfig.defaults with the pioarduino custom-framework build path. (R7S6) When: esp32dev heap margin is tight enough to justify a custom framework build.

  • Scenario baseline system_fps threshold too tight for hardware. The 20% regression threshold causes false positives for system_fps on ESP32: WiFi scheduling and background tasks can swing fps 50%+ between runs even on a warm device. Options: (a) exclude system_fps from baseline checks and only check heap_free/max_alloc; (b) widen the fps threshold to 50%; (c) add a "volatile": true flag per metric to skip baseline comparison. (From R7S5.) When: scenario baseline is actively used to catch regressions.

  • Minimize floats in the hot path. Picked up in Release 2 Sprint 9 Part G. Integer ControlDescriptor variant with min/max/step; float controls remain available for sub-integer precision. Full DMX/Art-Net 0–255 / 0–65535 alignment is a Release 3 driver concern.

  • Consolidating ModuleManager and Scheduler — or letting one implement the other. When: experience with both makes the right abstraction boundary obvious.

  • Module footprint optimization — non-hot-path metadata to JSON only. isPermanent(), platform constraints, display label, category — strip from compiled class, store in JSON manifest. Hot path stays in C++. When: Release 3 module count and footprint budgets make this worthwhile.

  • tickPeriodic() monitoring work blocks the PC render pipeline. tickPeriodic() fires 1-second callbacks (health reports, MemLive checks) synchronously inside scheduler.loop(). On macOS, pal::free_heap_bytes() calls malloc_zone_statistics() which does an O(n-allocations) heap scan (~25 ms); four calls per second produced a ~100 ms pause in the render loop, visible as a "heartbeat" stutter in ArtNet output to a remote receiver. The immediate fix (Sprint 13) was a 5 ms TTL cache in free_heap_bytes(). The root cause remains: monitoring work couples with the render pipeline on PC. On ESP32, FreeRTOS two-core scheduling separates these concerns naturally; on PC it does not. Proper fix: run tickPeriodic() callbacks on a background thread on PC, keeping the render loop free of any blocking work. When: PC is used as a primary output platform and monitoring or observability work grows beyond what a 5 ms cache can absorb.
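
Option (c) from the system_fps baseline item above (a per-metric "volatile" flag) could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in deploy/scenario.py: the function name compare_to_baseline, the shape of the baseline dictionary, the sample numbers, and the reading of the 20% threshold as a relative deviation are all hypothetical.

```python
def compare_to_baseline(baseline, measured, threshold=0.20):
    """Return a list of regression messages; an empty list means the run passes.

    Assumed (hypothetical) baseline shape:
        {"metric_name": {"value": 123.0, "volatile": False}}
    """
    failures = []
    for name, entry in baseline.items():
        if entry.get("volatile", False):
            continue  # noisy metric (e.g. system_fps on ESP32): never fails the check
        if name not in measured:
            failures.append(f"{name}: missing from measured results")
            continue
        expected = entry["value"]
        if expected == 0:
            continue  # a relative deviation is undefined for a zero baseline
        deviation = abs(measured[name] - expected) / abs(expected)
        if deviation > threshold:
            failures.append(
                f"{name}: {deviation:.0%} off baseline (limit {threshold:.0%})"
            )
    return failures


# Illustrative numbers only, not measurements from a real device.
baseline = {
    "heap_free": {"value": 150_000},
    "max_alloc": {"value": 110_000},
    "system_fps": {"value": 400, "volatile": True},
}
measured = {"heap_free": 148_000, "max_alloc": 80_000, "system_fps": 190}
failures = compare_to_baseline(baseline, measured)
```

In this sketch the large system_fps swing is ignored because the metric is flagged volatile, while the max_alloc drop is still reported. The check is symmetric, so it would also flag a large improvement; a one-sided check is an equally plausible design.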


Linting and code quality

  • cppcheck — C++ static analysis. Active as of Sprint 7. Configured with --enable=warning, POSIX socket cast and ESP32 macro suppressions in .cppcheck-suppressions, inline suppression for intentional sizeof(expr) assertions in tests. CI blocks PR on any finding.

  • Heap delta column in techdebt monitor (ESP32 measurement). baseHeapUsage() (Sprint 4) covers controls array and pendingProps, but misses module-specific setup() allocations (PSRAM buffers, etc.). Complete the picture by measuring free_heap() before and after setup() in the test binary and printing HEAPDELTA TypeName N; surface as a "Heap delta setup (B)" column in techdebt.md. Requires an ESP32 test harness and setInput() wiring for modules that need a layer before setup() is valid. On PC pal::free_heap_bytes() is a stub so this column is ESP32-only. When: a dedicated ESP32 test harness is in place and pal::free_heap_bytes() returns real values on device.

  • Virtual setupHeapUsage() override for module-specific reporting. For modules with large PSRAM buffers (e.g. GameOfLifeEffect, PreviewModule), a virtual size_t setupHeapUsage() const { return 0; } override lets each module self-report its known allocation without platform-dependent heap measurement. Lower accuracy risk than the heap-delta approach but requires per-module maintenance. When: heap-delta (above) proves impractical and the PSRAM budget becomes a CI concern.

  • clang-tidy — deep C++ analysis. Understands the full call graph and type system. Good candidates for projectMM: modernize-* (C++17 idioms), readability-* (naming), bugprone-* (common error patterns). Requires a compile_commands.json from CMake and a platform-split suppression strategy for ESP32-only headers. When: cppcheck is in place and a sprint is dedicated to code quality; not before because the setup cost is high.

  • ruff stricter rules. Currently enforcing E/W/F/I. Next tier: add N (naming conventions), B (bugbear — common Python gotchas), UP (pyupgrade — modern Python idioms). All auto-fixable. Add to pyproject.toml select list. When: the deploy/ Python scripts grow significantly or a naming inconsistency causes a real bug.

  • JavaScript linting (ESLint). app.js is currently excluded from linting. ESLint would require a package.json and Node.js in CI. When: app.js is refactored or split into modules.

  • clang-format include ordering. Add SortIncludes: true and IncludeBlocks: Regroup to .clang-format to enforce include ordering across the project. Currently omitted because the initial reformat pass showed several include-order changes that needed manual review. When: a sprint dedicates time to auditing and fixing include order across all files.

  • C++ standard upgrade to C++20. The most useful C++20 additions for this codebase are std::span (buffer views in the PAL layer, avoids raw pointer/length pairs) and concepts (cleaner template constraints for module wiring). The PlatformIO espressif32 platform must ship GCC 13 by default before this is safe; earlier GCC versions have incomplete Xtensa C++20 support. When: espressif32 platform ships GCC 13 as default and the ESP32 CI build passes with -std=c++20 without modification.


Hardware and CI

  • WS2812Driver. GPIO LED driver module using FastLED (or ESP-IDF RMT directly). Reads DriverLayer output and GridLayout mapping to write physical pixel data. Stub on PC. Art-Net out is the higher-priority output path; this is deferred until the Art-Net driver story is clearer. When: Art-Net out is shipping and a GPIO hardware test rig is available.

  • FastLED-MM worked example. Full step-by-step walkthrough in library.md showing how to connect FastLED output to the projectMM pipeline (DriverLayer → WS2812Driver → physical LEDs). When: WS2812Driver is shipping and a hardware test rig is available. (From R5 backlog.)

  • Art-Net output driver. ArtNetDriver module that sends DriverLayer pixels as Art-Net UDP frames. Requires GridLayout.requestMappings() (Sprint 6) and a UDP send PAL call. When: Sprint 7 PAL is in place (UDP send is a PAL function).

  • FastLED integration — required by WS2812Driver (see above). ESP-IDF vs Arduino framework question affects this. When: WS2812Driver is picked up.

  • ESP32 hardware-in-the-loop live test in CI. Flash device, monitor serial for N seconds, assert no [LiveTest] FAIL lines appear. Requires device management in CI. (Deferred from Sprint 8 Part E.) When: a self-hosted CI runner with a USB-connected board is available.

  • Persistent crash log in RTC RAM. Store the last 16 log events in RTC_NOINIT_ATTR memory; on boot, dump to serial and expose via GET /api/crashlog. Pair with an ESP32 panic handler hook that writes the last known module name, heap state, and fragmentation before reset. (Deferred from Sprint 8.) When: a production device crashes without a serial monitor attached.

  • ESP32 CI build: WiFiUDP.h / NetworkInterface.h not found. Fixed. lib_ldf_mode = chain+ resolved WiFiUDP.h. A follow-on issue (NetworkInterface.h, Network.h) was fixed by changing lib_compat_mode = strict to soft and adding lib_ignore = RPAsyncTCP, ESPAsyncTCP; soft mode allows framework-internal headers to resolve while explicit ignores prevent the Raspberry Pi TCP backend from being compiled on ESP32.

  • rPi and Linux/Windows builds in CI. When: a rPi CI runner is available.

  • Hardware-in-the-loop testing. Automated flash + boot-output assertion on real devices. When: PC/simulator tests stop catching hardware regressions.

  • Multi-board CI test rig. USB hub (16 ports) with classic, S3, P4 boards. deploy/all.py already supports multi-device runs; CI hardware rig requires a self-hosted runner. See deploy.md. When: USB hub assembled and at least two board variants available. Done (software side): Release 6 Sprint 6 delivers parallel flash + test (--workers N) and devicelist.json group field. Self-hosted CI runner remains deferred.

  • ESP32-S3 N16R8 board support. New [env:esp32s3_n16r8] PlatformIO environment; verify PSRAM init, adjust flash/RAM budgets. When: physical N16R8 board available.

  • ESP32-P4 target and Pixel Processing Accelerator (PPA) evaluation. Add [env:esp32p4] to platformio.ini with the correct board, flash, and PSRAM flags. Add an ESP32-P4 compile check to CI (no flash); pair with scripts/list_pio_envs.py + deploy/build.py --all-envs so the ESP32 CI block stays single-sourced. Benchmark DriverLayer blend with and without PPA hardware acceleration. If the speedup is significant, add a pal::ppa_blend() abstraction that falls back to software blend on non-P4 targets. Deferred from R5 original scope, R7S4, R8S1, and the pre-restart R10 plan; the v2 restart leapfrogs this work and v2 will pick it up after parity. When: v2 reaches v1 parity (see Release 9 § v2 plan to parity) and an ESP32-P4 board is available. Carries the "PhysMap projection telemetry" baseline (listed under Scenarios and benchmarking) as its before/after measurement.

  • Board farm port management. deploy/devicelist.py enumerates boards by USB serial port → deploy/devicelist.json; all deploy scripts select by any field (-env, -ip, -device_name, etc.). When: second board added to the farm. Done: Release 6 Sprint 6 added "group" field to devicelist.json; all scripts support -group <name> filter.

  • devicelist.py port-change merging. When a known device (identified by device_name from the hello line) reappears on a different USB port, the script currently creates a duplicate bare entry instead of updating the port on the existing rich entry. Should merge by device_name and update port and last_seen in place. When: devices are regularly moved between USB ports or machines. Scheduled: Release 6 Sprint 7 (devicelist.py --merge-ports command).

  • ESP-IDF vs Arduino framework. Affects FastLED compatibility. See implementation.md. When: first ESP32 LED driver Module is needed.
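
The merge-by-device_name behaviour described in the devicelist.py port-change item above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name merge_sighting is hypothetical, only the device_name, port, and last_seen fields come from the item itself, and the port names and timestamps are made up.

```python
def merge_sighting(devices, sighting):
    """Merge one freshly detected device into the list, keyed by device_name.

    devices:  list of dicts as loaded from the device list (schema assumed)
    sighting: dict with at least "device_name", "port" and "last_seen"
    Returns the entry that was updated or appended.
    """
    for entry in devices:
        if entry.get("device_name") == sighting["device_name"]:
            # Known device on a new port: update in place, keep the rich fields.
            entry["port"] = sighting["port"]
            entry["last_seen"] = sighting["last_seen"]
            return entry
    # Unknown device: append a new bare entry.
    new_entry = dict(sighting)
    devices.append(new_entry)
    return new_entry


# Illustrative data: one rich entry that then reappears on another port.
devices = [
    {"device_name": "MM-70BC", "port": "/dev/ttyUSB0", "env": "esp32dev",
     "last_seen": "2025-01-01T10:00:00"},
]
merge_sighting(devices, {"device_name": "MM-70BC", "port": "/dev/ttyUSB3",
                         "last_seen": "2025-01-02T09:30:00"})
```

After the call the list still holds a single MM-70BC entry, now pointing at the new port and keeping its env field, which is the difference from the duplicate bare entry the item describes.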


Frontend

  • Navigation beyond flat list. Picked up in Release 2 Sprint 9 Part H: hamburger menu with category-based scroll navigation, responsive collapse on mobile.

  • Human-readable control labels. Separate "label" field in ControlDescriptor (e.g. "heap_free_kb" → "Heap Free (KB)"). Schema change; all addControl() call sites need updating. When: raw key names become usability friction.

  • Alternative frontend technologies — Svelte 5 (requires Node.js build pipeline) or CDN widgets (requires internet; not suitable for standalone ESP32). Evaluate with flash-cost measurements. When: plain HTML frontend becomes a scaling bottleneck.

  • Live control values on page load. GET /api/modules returns last-saved state file values. Truly live values require a scheduler-hot-path read or routing through the WS push path. When: stale-on-load is reported as a user-visible bug.

  • ESP32 WebSocket intermittency. Occasional multi-second delay on first F5 after flash — likely a race between browser fast-reconnect and ESPAsyncWebServer tear-down. Sprint 8 improved the JS reconnect path; root cause needs a packet capture or hardware-in-the-loop F5 test. When: reliable repro sequence found.

  • Slider hang browser automation test. Sprint 3 dragTs suppression prevents mid-drag WS state overwriting the slider, but no automated test covers it. Needs Playwright/Puppeteer drag-while-broadcasting test. When: browser automation harness is in place.

  • Board presets. Named fixture configurations (16x16 matrix, xmas tree, ring) as one-click UI presets. Stores a named modulemanager.json snapshot that can be applied to a fresh device without manual wiring. When: at least three common fixture types are in regular use and onboarding friction is reported. (From R7 backlog.)

  • 3D previewer camera presets. Presets per fixture type (top-down for rings, side-on for matrix walls). Pick up after the 3D preview is proven with real users. When: user feedback identifies preset positions as a usability gap.

  • Per-module WebGL preview pane. Miniature canvas per module card showing that module's pixel output. When: frontend is refactored or split into components.

  • Modifier: palette remapping. Map virtual RGB through a palette LUT before driving physical pixels. When: modifier library is mature and an artist use case drives the requirement.

  • Non-rectangular modifiers. FoldModifier, arbitrary permutation via LUT. When: the modifier library has at least 6 shipping modifiers and a concrete use case exists.

  • Table control type. "table" control in frontend + addControl(char*, size_t, …, "table") in StatefulModule; migrate TasksModule, FileManagerModule, DeviceDiscoveryModule to use it. When: a sprint is dedicated to frontend / control type work.

  • CMake auto-rebuild of frontend bundle. frontend_bundle.h as a proper CMake DEPENDS target. When: frontend edits become frequent enough that manual rebuilds are friction.

  • Frontend source maps. DEBUG_FRONTEND CMake option embeds original JS for readable DevTools traces. When: debugging the frontend is recurring friction.


Scenarios and benchmarking

  • Performance telemetry for PhysMap projection. Measure the requestMappings() projection loop's share of frame time before PPA acceleration is added (R7 Sprint 3). Baseline on PC and ESP32-S3; identify whether per-LED coordinate transform is the bottleneck or negligible vs the blend loop. When: ESP32-P4 PPA evaluation (R7 Sprint 3) is in progress; provides the before/after comparison baseline. (From R5 backlog.)

  • Scenario "extends" inheritance. Resolve "extends" recursively before executing steps to eliminate copy-paste of base pipeline steps in every variant. Straightforward to implement. When: scenario file count grows or copy-paste becomes maintenance friction.

  • Scenario baseline populated from ESP32. Run deploy/scenario.py --update-baseline against a real device to populate deploy/test/scenario-baseline.json with heap and timing numbers. Wire --compare-baseline check into deploy/all.py afterward. When: MM-C1BC or MM-70BC can run all scenarios end-to-end without crashing.

  • Timing baseline regression file. tests/timing_baseline.json stores expected ms-per-tick per module; --count N fails if any module deviates more than 20%. Useful once timing is stable post-Sprint 9. When: timing is stable and a sprint is dedicated to regression hardening. (Deferred from Sprint 8 Part E.)

  • Live scenario fps metric from Scheduler window. The _collect_scenario_metrics() PC fps is always n/a (sub-µs pipeline, below float precision). Replace per-module ms_per_tick sum with the system_fps field from /api/system for a meaningful PC number. When: scenario reporting quality matters to end users.
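
The recursive "extends" resolution from the scenario inheritance item above might be sketched like this. The scenario file layout is an assumption: each scenario is taken to be a dictionary with a "steps" list and an optional "extends" key naming its base, and base steps are assumed to run before the variant's own. The function name resolve_steps and the step names are hypothetical.

```python
def resolve_steps(name, scenarios, _seen=()):
    """Return the full step list for a scenario, base steps first."""
    if name in _seen:
        chain = " -> ".join(_seen + (name,))
        raise ValueError(f"circular extends: {chain}")
    scenario = scenarios[name]
    parent = scenario.get("extends")
    # Resolve the base first so its steps come before the variant's own steps.
    inherited = resolve_steps(parent, scenarios, _seen + (name,)) if parent else []
    return inherited + list(scenario.get("steps", []))


# Illustrative scenarios; the step names are placeholders.
scenarios = {
    "base": {"steps": ["add_effects_layer", "add_driver_layer"]},
    "with_preview": {"extends": "base", "steps": ["add_preview"]},
    "with_preview_big": {"extends": "with_preview", "steps": ["set_size_64x64"]},
}
steps = resolve_steps("with_preview_big", scenarios)
```

Tracking the chain of visited names turns an accidental cycle in the scenario files into a clear error instead of unbounded recursion.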


Observability

  • PC/rPi native system info. SystemInfoModule fields cpu_freq_mhz, flash_size_mb, sdk_version, chip_model are 0/empty on PC. Needs sysctl (macOS) / /proc/cpuinfo (Linux) implementations. When: permanent rPi deployment needs accurate hardware metrics.

  • Health check RAM footprint. Each module's healthReport() currently uses a Scheduler-owned scratch buffer (already shared). No per-module char[] cost. ✅ Resolved Sprint 7.

  • Remove periodic serial logging; verify REST captures everything. Serial logs currently contain periodic background writes (module state, health snapshots). Now that deploy/run.py captures device state via GET /api/system and live tests assert on REST responses, the periodic serial writes are redundant and add noise. Audit what is still only visible via serial and migrate it to REST-accessible state before removing the periodic writes. When: a sprint adds REST endpoints or logging improvements. Scheduled: Release 6 Sprint 2 (TIMING gating, MemLive once-on-crossing, [INFO] heartbeat removal).

  • Revisit the runtime logging system. The current approach writes info-level entries periodically in the hot path, which inflates loop() timing measurements. Design a log system that: (a) writes nothing in loop() by default, (b) buffers log entries in a ring buffer drained by a background task or on REST request, (c) exposes entries via GET /api/log. Structured entries (timestamp, module id, level, message) enable dashboard display and post-run analysis. When: serial logging is no longer the primary observability path (see item above). Scheduled: Release 6 Sprint 2 (256-entry ring buffer + GET /api/log).

  • Extract StatefulModule inline PAL calls to a .cpp file. runSetup() and runLoop() are defined inline in StatefulModule.h and call pal::micros() and pal::free_heap_bytes(). Moving them to StatefulModule.cpp would remove the direct pal/Pal.h include from the header, making StatefulModule.h, TypeRegistry.h, ProducerModule.h, ConsumerModule.h, and ModuleManager.h all PAL-free. The core_only boundary check target would then verify all five. Identified and documented in Release 6 Sprint 1. When: a sprint focuses on PAL boundary hardening or module system refactoring.

  • Windows MemBoot/MemLive frag percentage accuracy. On Windows, free_heap_bytes() and max_alloc_bytes() each do a separate VirtualQueryEx scan; background OS activity between the two scans causes largNow != now, so frag shows ~97% instead of 0%. Additionally, largNow * 100u overflows uint32_t at ~500 MB values in StatefulModule.h (module setup lines) and Scheduler.cpp (MemLive periodic/warning lines), producing a wrong frag% via the same overflow already fixed in pal::memEvent(). Fix: (a) apply the (uint64_t) cast to all remaining largNow * 100u sites, and (b) use the s_freeHeapCache_() pattern so max_alloc_bytes() returns the cached value from the most recent free_heap_bytes() call (no second scan). On embedded targets both functions hit the same heap_caps call so neither issue arises there. When: Windows is used as a primary development platform and frag noise in MemBoot output becomes misleading. (From R7S3.)

  • GET /api/log WebSocket streaming. The ring buffer (R6S2, 256 entries) is accessible via REST but not pushed to the frontend. Streaming new entries over the existing WebSocket connection and rendering them in a frontend log panel would make runtime diagnostics visible without a serial monitor. When: serial-only diagnostics become friction for field debugging or demos. (From R6 backlog.)

  • HEALTH module= serial suppression. printHealthReports() currently emits via LOG_HEALTH which appears on serial even in normal runs. Moving it to LOG_DEBUG would suppress it by default while keeping it accessible via --log-level debug; GET /api/log and GET /api/system already expose the same data. When: serial noise from health lines is reported as friction in deploy/run.py captures. (From R6 backlog.)

  • Architecture Decision Records. A docs/adr/ directory with numbered ADRs (ADR-0001, ...) makes individual design decisions citable and searchable. Currently rationale lives in architecture.md and sprint retrospectives. When: the project has external contributors who need to understand why specific architectural choices were made. (From R6 backlog.)
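
The largNow * 100u overflow from the Windows frag percentage item above can be checked with a short arithmetic sketch. Python integers do not overflow, so 32-bit wraparound is emulated with a mask. The formula frag% = 100 - largest * 100 / free is an assumption inferred from the item's description, not copied from StatefulModule.h or Scheduler.cpp, and the function names are hypothetical.

```python
UINT32_MASK = 0xFFFFFFFF  # emulate uint32_t wraparound


def frag_percent_u32(free_bytes, largest_block):
    """Frag % with the multiplication truncated to 32 bits (the overflowing path)."""
    scaled = (largest_block * 100) & UINT32_MASK
    return 100 - scaled // free_bytes


def frag_percent_u64(free_bytes, largest_block):
    """Frag % with a wide intermediate, as a (uint64_t) cast would give."""
    return 100 - (largest_block * 100) // free_bytes


MB = 1024 * 1024
free = largest = 500 * MB  # unfragmented heap: largest block equals free heap
buggy = frag_percent_u32(free, largest)
fixed = frag_percent_u64(free, largest)
# Largest block size (in MB) whose product with 100 still fits in 32 bits.
overflow_limit_mb = (UINT32_MASK // 100) / MB
```

Under these assumptions the truncated path reports 99% fragmentation for a completely unfragmented 500 MB heap, while the wide path reports 0%. The product wraps for any block larger than roughly 41 MB, which is why the problem shows up on PC-sized heaps and not on embedded targets.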


Project identity

  • Installer / updater. Easy first install and in-place upgrade across platforms. When: first demo runs end-to-end and external contributors want to try it without a build toolchain.

  • Domain / site hosting. moonmodules.org/projectMM as initial home. When: project name is final and docs ready to publish under MoonModules.

  • Final project name. projectMM is a working title. When: proof of concept runs on all three platforms.

  • Repository move. ewowi/projectMM → MoonModules/projectMM. When: name is final and CI/CD is in place on all platforms.

  • Copier template for multi-device products. If projectMM is used as a base for multiple independent device products, Copier enables scaffolding and upstream-sync from the projectMM template without manual merges. When: a second product using projectMM as a base template is in development. (From R6 backlog.)


Resolved (kept for reference)

  • controls_[] embedded cost → resolved Release 2 Sprint 1c (lazy heap allocation)
  • WiFiModule as StatefulModule → done Release 2 Sprint 8
  • Sensitive control flag → done Release 2 Sprint 8
  • Layer hierarchy (parent-child modules) → done Release 2 Sprint 6
  • Runtime add/remove of modules from UI → done Release 2 Sprint 5 + 7
  • WebSocket pause when tab hidden → done Release 2 Sprint 8 (close on hide, reconnect on show)