Development History¶
v1 is frozen — active development is in projectMM v2
This repository is in maintenance-only mode. The restart decision is documented in Release 9; the v2 plan to parity lives in v2's Release 1. Tag v1.8.0-pre-restart marks v1's final pre-restart state.
This section records how projectMM is built: the process, the decisions, and the history of each release. Vision, principles, and system overview live in the home page; architecture, design, and implementation details live in their respective documents.
How to Read This Section¶
projectMM follows an Agile development process adapted for AI-collaborative development. The terminology:
| Term | Meaning |
|---|---|
| Release | A major deliverable milestone. Each release has a theme (e.g. "Proof of Concept") and corresponds to a GitHub Release tag when complete. |
| Sprint | A scope-boxed work cycle within a release. Each sprint has a goal, a definition of done, a result, and a retrospective. |
| Backlog | Items raised but explicitly deferred. Each entry notes when it should be picked up — a backlog item is not a forgotten one. |
| Retrospective | End-of-sprint reflection: what was learned, what questions were raised, and what seeds the next sprint. |
For new contributors: start with the latest release document to understand where the project is. Read the Backlog for contribution opportunities — items marked with a "pick up when" condition that is now met are good candidates.
Development Process¶
- Agile sprints. Each sprint begins by reviewing the previous sprint's retrospective and closes by writing a new one. Sprints are scope-boxed (not time-boxed): a sprint ends when its definition of done is met.
- Top-down refinement. Ideas flow from vision to architecture to design to implementation, with each layer free to push back on the one above.
- Version control. GitHub is the primary host. The repository is ewowi/projectMM (will be moved/renamed to projectMM under MoonModules when appropriate).
- CI/CD. Automated build, test, and (where possible) deploy across PC, rPi, and ESP32. Every PR gets a footprint delta report.
- Automated testing. Unit, integration, and (future) hardware-in-the-loop. Coverage is part of the definition of done.
- AI in the workflow. Code generation and review are AI-assisted. We stay tool-agnostic: currently Claude Code for generation and CodeRabbit for PR review, with active experiments on others (e.g. Mistral Devstral).
- Open to contributors. The project welcomes pull requests from day one. The sprint structure, backlog, and documentation are designed so a new contributor (human or AI) can find meaningful work and understand why decisions were made.
Sprint Lifecycle¶
This is the standard sequence for running a sprint — followed by the maintainer and by AI coding assistants alike. Each step has a prompt or action associated with it.
Step 1 — Pre-sprint analysis¶
Before the implementation prompt, raise any observations, findings, or new requirements that have come up since the last sprint. These are analysed, documented in the relevant docs (architecture.md, design.md, implementation.md, backlog), and linked from the sprint section.
Prompt pattern: "Before starting the next sprint, I have a few things to analyse — add them to the relevant docs and link from this sprint: [list]"
Step 2 — Implementation prompt¶
Kick off the sprint with a standard prompt that includes:
- Which sprint to implement
- Check previous retrospectives and backlog for relevant deferred items
- Ask clarifying questions before starting; update the scope or architecture/design/implementation docs based on answers
- When done: ask to build and run on PC and ESP32, run the tests, capture and analyse logs, update the sprint's Result section, and prove that the issues targeted by the sprint are resolved
- After build-and-run: add Result, DoD verification, and Retrospective to the sprint section
- If new modules were created: add a doc page in docs/modules/ and register it in mkdocs.yml
Prompt pattern: "Implement the next sprint. Check previous retrospectives and the backlog for anything relevant. Ask me any questions (provide options and propose the simplest and backlog the rest by default ) and update the scope (or architecture/design/implementation docs, with links). When implemented, ask if you can build and run for PC and ESP32 and run the tests, capture and analyse the logs, update relevant documents and the sprint's Result section (refer to other pages where possible), (follow AI-Specific Writing Rules for all generated docs and comments) and prove the sprint issues are solved. Then add Result, DoD, and Retrospective. Add a complexity estimate"
Step 3 — Q&A¶
The AI asks scoping questions; the maintainer answers. Answers narrow the scope, defer to backlog, or confirm. This keeps sprints small and reviewable. Aim for small steps: if alternatives are presented, answer with "do the simplest alternative first and add the more complex items to future sprints or to the backlog."
Step 4 — Watch the run¶
While the AI builds and runs, monitor progress in the git client (e.g. GitKraken): stage files as they appear to get a clearer view of what has changed. Use the diff view to verify that only the expected files are touched.
Step 5 — Approve or redirect tool calls¶
When the AI requests permission for a tool call (shell command, file write, etc.):
- Yes — approve if the action looks correct.
- Other — redirect with a reason if the approach is wrong or unsafe.
- Never approve git commit — always evaluate the staged changes yourself first and do the commit manually after review.
Step 6 — Sprint close¶
Before committing, verify the sprint is complete. If it is, ask an AI agent to do the commit with a descriptive commit message.
Prompt (maintainer will ask AI!): "Did we finish this sprint? Did you check the Architecture review checklist? Did you run build and run and log including live tests on PC and ESP32 devices. If not please run it. Did you update DoD, Results, Retrospective (what went well / what was tricky / seeds for next sprint) ? Is mkdocs serve running fine? Anything else before commit? if not run the commit command with a descriptive message, leave out Co-Authored-By: Claude ... as last sentence"
Architecture review checklist — answer these before closing every sprint:
| Question | Why it matters |
|---|---|
| Does this change fit the existing architecture, or does it work around a limitation? | Workarounds compound; a root-cause fix is cheaper long-term |
| Did any new heap allocation get added? Is it bounded, and is it freed in teardown()? | Unbounded growth or missing teardown causes OOM on long-running devices |
| Did any new member get added to a base class or widely-shared struct? | Base-class bloat multiplies across every instance |
| Does the solution add a new abstraction, or reuse an existing one? | New abstractions add cognitive load; reuse keeps the design learnable |
| If something was patched rather than fixed, is the root cause recorded in the backlog? | Patches are fine short-term if the debt is visible |
Standards and Guidelines¶
The dedicated Standards and Guidelines document covers coding standards for C/C++ and JS, the anti-debt checklist, the human/AI readability rule, non-trivial testing, AI workflow rules, and the contribution workflow.
Releases¶
| Release | Theme | Status | Tag |
|---|---|---|---|
| Release 1 | Proof of Concept | Complete | v1.0.0 |
| Release 2 | Observable, Hierarchical, and Network-Ready | Complete | v1.1.0 |
| Release 3 | Pixel Pipeline, MoonLight Effects, and Continuous Integration | In progress | — |
| Release 4 | Platform Reach | In progress | — |
| Release 5 | Virtual/Physical Layer Architecture | Complete | v1.5.0 |
| Release 6 | Deploy Pipeline Hardening and Agentic Harness | Planned | — |
| Release 7 | Firmware Delivery, New Platforms, and Developer Tooling | Planned | — |
| Release 8 | Dynamic Controls, Deploy Pipeline, and Runtime Hardening | Complete | v1.8.0 |
| Release 9 | Evaluation: restart accepted | Complete | — |
Release 9 is v1's final release. The restart begins as Release 1 of a new sibling repo projectMM-v2/; see Release 9 § v2 plan to parity.
The original Release 3–10 roadmap (pixel pipeline, 2D effects, modifiers, 3D, drivers, network protocols) is preserved in the MoonLight Scope subfolder for reference.
Backlog¶
Items raised but deferred. Each entry notes when it should be picked up. Sprint-specific and release-specific items live in the relevant sprint/release doc; this is the cross-release long-term list.
Runtime and infrastructure¶
- Effects animate slowly in preview/WebGL on PC-Windows (fine on PC-macOS). The server FPS is high but the WebGL preview window shows the effect moving slowly. Observed on Windows only; macOS is unaffected. Possible causes to investigate: (a) `PreviewModule` WS push rate is throttled independently of server FPS, so the browser only receives frames at the push cadence; (b) effect uses frame-count rather than elapsed time, so visual speed is coupled to loop rate which differs between platforms; (c) browser `requestAnimationFrame` at 60 Hz cap causes a growing WS message queue that lags behind the server push rate; (d) `pal::millis()`/`pal::micros()` unit mismatch on Windows: effect passes microseconds where milliseconds are expected (or vice versa), running the animation 1000x too slow. Art-NET output not yet tested; issue may be preview-specific. When: Windows is used as a primary development or demo platform. (From R7S3.)
- PhysMap index-width optimisation. `PhysMap` uses `uint32_t` always (R5S2 Q1:A). On boards without PSRAM, `uint16_t` would halve the map's memory footprint (20 KB to 10 KB at 5000 LEDs). Requires a PAL-gated allocation or a template parameter. When: large installations on non-PSRAM boards make the 4-byte-per-LED cost visible in RAM budget.
- EffectsLayer dim wiring to DriverLayer. `EffectsLayer` retains its own `width`/`height`/`depth` controls as a fallback (R5S2 Q2:A). Removing these props and having `EffectsLayer` size from `setInput("driver", …)` eliminates operator redundancy and is consistent with the virtual/physical split. When: the virtual/physical pipeline is proven in production and at least one non-rectangular layout (Sprint 3) is shipping.
- DriverLayer identity-skip optimisation. `DriverLayer` now uses a single `physical_` buffer (Sprint 3); `virtBlend_` was removed. For an identity mapping the blend loop still walks all physical LEDs. The skip would alias the single-source channel directly to `readyChannel()`. Saves one O(n) pass per frame. When: profiling shows the blend loop is a meaningful fraction of frame time at the target LED count.
- PreviewModule virtual-side wiring. `PreviewModule` is wired to `DriverLayer` output (physical pixels) by default (R5S2 Q4:A). Operators who want to preview virtual-space effects independently of the physical mapping need a `preview.source = effectsLayer` wiring option. When: Sprint 3 introduces the `0x03` coordinate-per-LED wire format, which requires virtual and physical coordinates to be distinct.
- Supersync. Multi-instance synchronization (passive mirror → active control → group control → shared processing with synchronized clocks). Includes discovery (mDNS / MQTT / explicit config) and clock sync. See architecture.md. When: single-instance runtime is stable on all three platforms and real users run more than one device together.
- REST and MQTT endpoints. External control surface. See architecture.md. When: WiFiModule is stable (done Release 2 Sprint 8); REST first, MQTT when Home Assistant integration is on the table.
- State pub/sub. Lightweight publish/subscribe to replace direct reads. Candidates: callback list, observer pattern, event queue. See architecture.md. When: a Module needs to react to state changes from two independent sources.
- FreeRTOS Core 0 dispatch. `cores_` vector is populated but `xTaskCreatePinnedToCore` dispatch is not yet wired. Start with the MoonLight producer/consumer split; generalise once stable. See architecture.md. When: WiFiModule is stable (Core 0 is used by WiFi; priorities need tuning).
- Pending-writes queue. Replace `controlMutex_` in the HTTP→scheduler hot path with a lock-free queue drained at the top of each tick. When: profiling shows mutex contention at realistic request rates.
- Graceful teardown / dirty-flag save on ESP32. `teardown()` never runs in Arduino; `saveAllState()` at end of `setup()` covers first boot. Wire a dirty-flag + debounce timer (2 s after last control change) for runtime saves. When: live control mutation frequency makes per-change saves a flash-wear concern.
- Multiple WiFi configs + scan-to-select. `WifiStaModule` currently holds a single SSID/password pair. Add support for a priority-ordered list of networks (e.g. home, office, field) and a WiFi scan endpoint (`GET /api/wifi/scan`) that returns visible SSIDs so the UI can present a picker instead of requiring manual SSID entry. `deploy/wifi.py` would gain a `--profile <name>` flag; `deploy/flashfs.py` would bake all profiles. When: a device needs to move between networks regularly, or a user-friendly onboarding flow is needed.
- Password / credential encryption at rest. Sprint 8 marks passwords `sensitive` (never broadcast); encrypting the LittleFS state files requires a key management story. When: security requirements are clearer (multi-user / cloud deployment).
- KvStore hash table. Linear scan is fine for ≤16 entries. When: entry count exceeds 16 and profiling shows KvStore in the hot path.
- Static RAM cost of log ring buffer and Scheduler fields on non-PSRAM devices. `LOG_RING_SIZE` set to 2 KB (32 entries x 64 bytes) on all devices (R7S6); ~104 B of `Scheduler` fields still live in `.bss` on all platforms. On esp32dev (no PSRAM, ~320 KB total RAM) these are permanent costs that compete with heap. Moving the ring to a PSRAM-backed allocation in `setup()` would free 2 KB of `.bss` on PSRAM boards but requires a PAL extension. Increase ring size when longer history is needed (edit `LOG_RING_CAP`/`LOG_RING_ENTRY` in `Logger.cpp`). When: R7/R8 device-class memory budget review, or if esp32dev free heap becomes a reliability concern.
- WiFi RX buffer tuning on esp32dev. `CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM` is baked into the pre-compiled Arduino framework `sdkconfig.h` and cannot be overridden via `-D` in `build_flags` (causes redefinition warning; no runtime effect). Reducing from 32 to 16 would save ~16 KB heap on classic ESP32 but requires either a custom framework build or `sdkconfig.defaults` with the pioarduino custom-framework build path. (R7S6) When: esp32dev heap margin is tight enough to justify a custom framework build.
- Scenario baseline `system_fps` threshold too tight for hardware. The 20% regression threshold causes false positives for `system_fps` on ESP32: WiFi scheduling and background tasks can swing fps 50%+ between runs even on a warm device. Options: (a) exclude `system_fps` from baseline checks and only check `heap_free`/`max_alloc`; (b) widen the fps threshold to 50%; (c) add a `"volatile": true` flag per metric to skip baseline comparison. (From R7S5.) When: scenario baseline is actively used to catch regressions.
- Minimize floats in the hot path. Picked up in Release 2 Sprint 9 Part G. Integer `ControlDescriptor` variant with `min`/`max`/`step`; float controls remain available for sub-integer precision. Full DMX/Art-Net 0–255 / 0–65535 alignment is a Release 3 driver concern.
- Consolidating ModuleManager and Scheduler, or letting one implement the other. When: experience with both makes the right abstraction boundary obvious.
- Module footprint optimization: non-hot-path metadata to JSON only. `isPermanent()`, platform constraints, display label, category: strip from compiled class, store in JSON manifest. Hot path stays in C++. When: Release 3 module count and footprint budgets make this worthwhile.
- `tickPeriodic()` monitoring work blocks the PC render pipeline. `tickPeriodic()` fires 1-second callbacks (health reports, MemLive checks) synchronously inside `scheduler.loop()`. On macOS, `pal::free_heap_bytes()` calls `malloc_zone_statistics()` which does an O(n-allocations) heap scan (~25 ms); four calls per second produced a ~100 ms pause in the render loop, visible as a "heartbeat" stutter in ArtNet output to a remote receiver. The immediate fix (Sprint 13) was a 5 ms TTL cache in `free_heap_bytes()`. The root cause remains: monitoring work couples with the render pipeline on PC. On ESP32, FreeRTOS two-core scheduling separates these concerns naturally; on PC it does not. Proper fix: run `tickPeriodic()` callbacks on a background thread on PC, keeping the render loop free of any blocking work. When: PC is used as a primary output platform and monitoring or observability work grows beyond what a 5 ms cache can absorb.
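The baseline-threshold item in this list proposes a per-metric `"volatile": true` flag (option c). A minimal sketch of how such a check could behave is shown here, assuming a hypothetical baseline layout of `{metric: {"value": ..., "volatile": ...}}` and a hypothetical helper name `check_baseline`; neither is taken from `deploy/scenario.py`, and the numbers are illustrative only.

```python
def check_baseline(baseline, current, threshold=0.20):
    """Return (metric, expected, actual) tuples that deviate more than threshold.

    Assumed (hypothetical) schema: baseline maps a metric name to a dict with
    a "value" key and an optional "volatile" flag.
    """
    failures = []
    for name, spec in baseline.items():
        if spec.get("volatile", False):
            continue  # option (c): noisy metrics are skipped entirely
        expected = spec["value"]
        if expected == 0:
            continue  # a relative deviation is undefined for a zero baseline
        actual = current[name]
        deviation = abs(actual - expected) / abs(expected)
        if deviation > threshold:
            failures.append((name, expected, actual))
    return failures


# Illustrative numbers only, not measurements from a real device.
baseline = {
    "system_fps": {"value": 200.0, "volatile": True},
    "heap_free": {"value": 150000},
    "max_alloc": {"value": 110000},
}
current = {"system_fps": 95.0, "heap_free": 148000, "max_alloc": 80000}

failures = check_baseline(baseline, current)
for name, expected, actual in failures:
    print(f"REGRESSION {name}: expected {expected}, got {actual}")
```

With the flag set, the large fps swing is ignored while the `max_alloc` drop is still reported. Option (b) would instead need a per-metric threshold, which this sketch does not model.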
Linting and code quality¶
- cppcheck: C++ static analysis. Active as of Sprint 7. Configured with `--enable=warning`, POSIX socket cast and ESP32 macro suppressions in `.cppcheck-suppressions`, inline suppression for intentional `sizeof(expr)` assertions in tests. CI blocks PR on any finding.
- Heap delta column in techdebt monitor (ESP32 measurement). `baseHeapUsage()` (Sprint 4) covers controls array and pendingProps, but misses module-specific `setup()` allocations (PSRAM buffers, etc.). Complete the picture by measuring `free_heap()` before and after `setup()` in the test binary and printing `HEAPDELTA TypeName N`; surface as a "Heap delta setup (B)" column in `techdebt.md`. Requires an ESP32 test harness and `setInput()` wiring for modules that need a layer before `setup()` is valid. On PC `pal::free_heap_bytes()` is a stub so this column is ESP32-only. When: a dedicated ESP32 test harness is in place and `pal::free_heap_bytes()` returns real values on device.
- Virtual `setupHeapUsage()` override for module-specific reporting. For modules with large PSRAM buffers (e.g. `GameOfLifeEffect`, `PreviewModule`), a `virtual size_t setupHeapUsage() const { return 0; }` override lets each module self-report its known allocation without platform-dependent heap measurement. Lower accuracy risk than the heap-delta approach but requires per-module maintenance. When: heap-delta (above) proves impractical and the PSRAM budget becomes a CI concern.
- clang-tidy: deep C++ analysis. Understands the full call graph and type system. Good candidates for projectMM: `modernize-*` (C++17 idioms), `readability-*` (naming), `bugprone-*` (common error patterns). Requires a `compile_commands.json` from CMake and a platform-split suppression strategy for ESP32-only headers. When: cppcheck is in place and a sprint is dedicated to code quality; not before because the setup cost is high.
- ruff stricter rules. Currently enforcing E/W/F/I. Next tier: add `N` (naming conventions), `B` (bugbear: common Python gotchas), `UP` (pyupgrade: modern Python idioms). Many are auto-fixable. Add to `pyproject.toml` `select` list. When: the `deploy/` Python scripts grow significantly or a naming inconsistency causes a real bug.
- JavaScript linting (ESLint). `app.js` is currently excluded from linting. ESLint would require a `package.json` and Node.js in CI. When: `app.js` is refactored or split into modules.
- clang-format include ordering. Add `SortIncludes: true` and `IncludeBlocks: Regroup` to `.clang-format` to enforce include ordering across the project. Currently omitted because the initial reformat pass showed several include-order changes that needed manual review. When: a sprint dedicates time to auditing and fixing include order across all files.
- C++ standard upgrade to C++20. The most useful C++20 additions for this codebase are `std::span` (buffer views in the PAL layer, avoids raw pointer/length pairs) and concepts (cleaner template constraints for module wiring). The PlatformIO `espressif32` platform must ship GCC 13 by default before this is safe; earlier GCC versions have incomplete Xtensa C++20 support. When: `espressif32` platform ships GCC 13 as default and the ESP32 CI build passes with `-std=c++20` without modification.
Hardware and CI¶
- WS2812Driver: GPIO LED driver module using FastLED (or ESP-IDF RMT directly). Reads `DriverLayer` output and `GridLayout` mapping to write physical pixel data. Stub on PC. Art-NET out is the higher-priority output path; this is deferred until the Art-NET driver story is clearer. When: Art-NET out is shipping and a GPIO hardware test rig is available.
- FastLED-MM worked example. Full step-by-step walkthrough in `library.md` showing how to connect FastLED output to the projectMM pipeline (DriverLayer → WS2812Driver → physical LEDs). When: WS2812Driver is shipping and a hardware test rig is available. (From R5 backlog.)
- Art-NET output driver: `ArtNetDriver` module that sends `DriverLayer` pixels as Art-NET UDP frames. Requires `GridLayout.requestMappings()` (Sprint 6) and a UDP send PAL call. When: Sprint 7 PAL is in place (UDP send is a PAL function).
- FastLED integration: required by WS2812Driver (see above). ESP-IDF vs Arduino framework question affects this. When: WS2812Driver is picked up.
- ESP32 hardware-in-the-loop live test in CI. Flash device, monitor serial for N seconds, assert no `[LiveTest] FAIL` lines appear. Requires device management in CI. (Deferred from Sprint 8 Part E.) When: a self-hosted CI runner with a USB-connected board is available.
- Persistent crash log in RTC RAM. Store the last 16 log events in `RTC_NOINIT_ATTR` memory; on boot, dump to serial and expose via `GET /api/crashlog`. Pair with an ESP32 panic handler hook that writes the last known module name, heap state, and fragmentation before reset. (Deferred from Sprint 8.) When: a production device crashes without a serial monitor attached.
- ESP32 CI build: WiFiUDP.h / NetworkInterface.h not found. Fixed. `lib_ldf_mode = chain+` resolved `WiFiUDP.h`. A follow-on issue (`NetworkInterface.h`, `Network.h`) was fixed by changing `lib_compat_mode = strict` to `soft` and adding `lib_ignore = RPAsyncTCP, ESPAsyncTCP`; `soft` mode allows framework-internal headers to resolve while explicit ignores prevent the Raspberry Pi TCP backend from being compiled on ESP32.
- rPi and Linux/Windows builds in CI. When: a rPi CI runner is available.
- Hardware-in-the-loop testing. Automated flash + boot-output assertion on real devices. When: PC/simulator tests stop catching hardware regressions.
- Multi-board CI test rig. USB hub (16 ports) with classic, S3, P4 boards. `deploy/all.py` already supports multi-device runs; CI hardware rig requires a self-hosted runner. See deploy.md. When: USB hub assembled and at least two board variants available. Done (software side): Release 6 Sprint 6 delivers parallel flash + test (`--workers N`) and `devicelist.json` `group` field. Self-hosted CI runner remains deferred.
- ESP32-S3 N16R8 board support. New `[env:esp32s3_n16r8]` PlatformIO environment; verify PSRAM init, adjust flash/RAM budgets. When: physical N16R8 board available.
- ESP32-P4 target and Pixel Processing Accelerator (PPA) evaluation. Add `[env:esp32p4]` to `platformio.ini` with the correct board, flash, and PSRAM flags. Add an ESP32-P4 compile check to CI (no flash); pair with `scripts/list_pio_envs.py` + `deploy/build.py --all-envs` so the ESP32 CI block stays single-sourced. Benchmark `DriverLayer` blend with and without PPA hardware acceleration. If the speedup is significant, add a `pal::ppa_blend()` abstraction that falls back to software blend on non-P4 targets. Deferred from R5 original scope, R7S4, R8S1, and the pre-restart R10 plan; the v2 restart leapfrogs this work and v2 will pick it up after parity. When: v2 reaches v1 parity (see Release 9 § v2 plan to parity) and an ESP32-P4 board is available. Carries the prior "PhysMap projection telemetry" baseline (listed under Scenarios and benchmarking) as its before/after measurement.
- Board farm port management. `deploy/devicelist.py` enumerates boards by USB serial port → `deploy/devicelist.json`; all deploy scripts select by any field (`-env`, `-ip`, `-device_name`, etc.). When: second board added to the farm. Done: Release 6 Sprint 6 added `"group"` field to `devicelist.json`; all scripts support `-group <name>` filter.
- `devicelist.py` port-change merging. When a known device (identified by `device_name` from the hello line) reappears on a different USB port, the script currently creates a duplicate bare entry instead of updating the port on the existing rich entry. Should merge by `device_name` and update `port` and `last_seen` in place. When: devices are regularly moved between USB ports or machines. Scheduled: Release 6 Sprint 7 (`devicelist.py --merge-ports` command).
- ESP-IDF vs Arduino framework. Affects FastLED compatibility. See implementation.md. When: first ESP32 LED driver Module is needed.
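The port-change merging item in this list describes merging by `device_name` and updating `port` and `last_seen` in place. A minimal sketch of that behaviour follows, assuming the device list is a plain list of dicts. The function name `merge_device`, the timestamp format, and the example values are hypothetical and are not taken from `deploy/devicelist.py`.

```python
def merge_device(devices, seen):
    """Merge a freshly seen device into the list, keyed by device_name.

    If an entry with the same device_name exists, only port and last_seen
    are updated so the other ("rich") fields survive. Otherwise a new bare
    entry is appended. Returns the entry that was updated or created.
    """
    for entry in devices:
        if entry.get("device_name") == seen["device_name"]:
            entry["port"] = seen["port"]
            entry["last_seen"] = seen["last_seen"]
            return entry
    devices.append(dict(seen))
    return devices[-1]


# Illustrative data: one known device that reappears on another port.
devices = [
    {
        "device_name": "MM-70BC",
        "port": "/dev/ttyUSB0",
        "env": "esp32dev",
        "last_seen": "2025-01-01T10:00:00",
    }
]
merge_device(
    devices,
    {"device_name": "MM-70BC", "port": "/dev/ttyUSB1", "last_seen": "2025-01-02T09:30:00"},
)
print(devices)
```

Keying on `device_name` rather than the port means a board can move between USB ports without losing fields such as `env`. A device that has never sent a hello line would still need the port-based path.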
Frontend¶
- Navigation beyond flat list. Picked up in Release 2 Sprint 9 Part H: hamburger menu with category-based scroll navigation, responsive collapse on mobile.
- Human-readable control labels. Separate `"label"` field in `ControlDescriptor` (e.g. `"heap_free_kb"` → `"Heap Free (KB)"`). Schema change; all `addControl()` call sites need updating. When: raw key names become usability friction.
- Alternative frontend technologies: Svelte 5 (requires Node.js build pipeline) or CDN widgets (requires internet; not suitable for standalone ESP32). Evaluate with flash-cost measurements. When: plain HTML frontend becomes a scaling bottleneck.
- Live control values on page load. `GET /api/modules` returns last-saved state file values. Truly live values require a scheduler-hot-path read or routing through the WS push path. When: stale-on-load is reported as a user-visible bug.
- ESP32 WebSocket intermittency. Occasional multi-second delay on first F5 after flash, likely a race between browser fast-reconnect and ESPAsyncWebServer tear-down. Sprint 8 improved the JS reconnect path; root cause needs a packet capture or hardware-in-the-loop F5 test. When: reliable repro sequence found.
- Slider hang browser automation test. Sprint 3 `dragTs` suppression prevents mid-drag WS state overwriting the slider, but no automated test covers it. Needs Playwright/Puppeteer drag-while-broadcasting test. When: browser automation harness is in place.
- Board presets. Named fixture configurations (16x16 matrix, xmas tree, ring) as one-click UI presets. Stores a named `modulemanager.json` snapshot that can be applied to a fresh device without manual wiring. When: at least three common fixture types are in regular use and onboarding friction is reported. (From R7 backlog.)
- 3D previewer camera presets. Presets per fixture type (top-down for rings, side-on for matrix walls). Pick up after the 3D preview is proven with real users. When: user feedback identifies preset positions as a usability gap.
- Per-module WebGL preview pane. Miniature canvas per module card showing that module's pixel output. When: frontend is refactored or split into components.
- Modifier: palette remapping. Map virtual RGB through a palette LUT before driving physical pixels. When: modifier library is mature and an artist use case drives the requirement.
- Non-rectangular modifiers. `FoldModifier`, arbitrary permutation via LUT. When: the modifier library has at least 6 shipping modifiers and a concrete use case exists.
- Table control type. `"table"` control in frontend + `addControl(char*, size_t, …, "table")` in `StatefulModule`; migrate `TasksModule`, `FileManagerModule`, `DeviceDiscoveryModule` to use it. When: a sprint is dedicated to frontend / control type work.
- CMake auto-rebuild of frontend bundle: `frontend_bundle.h` as a proper CMake `DEPENDS` target. When: frontend edits become frequent enough that manual rebuilds are friction.
- Frontend source maps: `DEBUG_FRONTEND` CMake option embeds original JS for readable DevTools traces. When: debugging the frontend is recurring friction.
Scenarios and benchmarking¶
- Performance telemetry for PhysMap projection. Measure the `requestMappings()` projection loop's share of frame time before PPA acceleration is added (R7 Sprint 3). Baseline on PC and ESP32-S3; identify whether per-LED coordinate transform is the bottleneck or negligible vs the blend loop. When: ESP32-P4 PPA evaluation (R7 Sprint 3) is in progress; provides the before/after comparison baseline. (From R5 backlog.)
- Scenario `"extends"` inheritance. Resolve `"extends"` recursively before executing steps to eliminate copy-paste of base pipeline steps in every variant. Straightforward to implement. When: scenario file count grows or copy-paste becomes maintenance friction.
- Scenario baseline populated from ESP32. Run `deploy/scenario.py --update-baseline` against a real device to populate `deploy/test/scenario-baseline.json` with heap and timing numbers. Wire `--compare-baseline` check into `deploy/all.py` afterward. When: MM-C1BC or MM-70BC can run all scenarios end-to-end without crashing.
- Timing baseline regression file. `tests/timing_baseline.json` stores expected ms-per-tick per module; `--count N` fails if any module deviates more than 20%. Useful once timing is stable post-Sprint 9. When: timing is stable and a sprint is dedicated to regression hardening. (Deferred from Sprint 8 Part E.)
- Live scenario fps metric from Scheduler window. The `_collect_scenario_metrics()` PC fps is always `n/a` (sub-µs pipeline, below float precision). Replace per-module `ms_per_tick` sum with the `system_fps` field from `/api/system` for a meaningful PC number. When: scenario reporting quality matters to end users.
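The `"extends"` item in this list calls for recursive resolution before steps are executed. One way this could look is sketched here, assuming scenarios are dicts with an optional `"extends"` name and a `"steps"` list, and assuming a child's steps run after its parent's. The schema, the step names, and the function name `resolve_extends` are hypothetical.

```python
def resolve_extends(name, scenarios, _stack=()):
    """Return the flattened step list for a scenario, parents first.

    Follows "extends" recursively and raises ValueError on a cycle so a
    misconfigured file fails loudly instead of recursing forever.
    """
    if name in _stack:
        chain = " -> ".join(_stack + (name,))
        raise ValueError(f"cyclic extends: {chain}")
    scenario = scenarios[name]
    parent = scenario.get("extends")
    base_steps = resolve_extends(parent, scenarios, _stack + (name,)) if parent else []
    return base_steps + list(scenario.get("steps", []))


# Illustrative scenario set with a two-level inheritance chain.
scenarios = {
    "base": {"steps": ["add_layout", "add_driver"]},
    "ripple": {"extends": "base", "steps": ["add_effect"]},
    "ripple_big": {"extends": "ripple", "steps": ["resize"]},
}

print(resolve_extends("ripple_big", scenarios))
```

Appending child steps after parent steps is the simplest merge rule. Overriding or removing inherited steps would need a richer schema than this sketch assumes.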
Observability¶
- PC/rPi native system info. `SystemInfoModule` fields `cpu_freq_mhz`, `flash_size_mb`, `sdk_version`, `chip_model` are 0/empty on PC. Needs `sysctl` (macOS) / `/proc/cpuinfo` (Linux) implementations. When: permanent rPi deployment needs accurate hardware metrics.
- Health check RAM footprint. Each module's `healthReport()` currently uses a Scheduler-owned scratch buffer (already shared). No per-module `char[]` cost. ✅ Resolved Sprint 7.
- Remove periodic serial logging; verify REST captures everything. Serial logs currently contain periodic background writes (module state, health snapshots). Now that `deploy/run.py` captures device state via `GET /api/system` and live tests assert on REST responses, the periodic serial writes are redundant and add noise. Audit what is still only visible via serial and migrate it to REST-accessible state before removing the periodic writes. When: a sprint adds REST endpoints or logging improvements. Scheduled: Release 6 Sprint 2 (TIMING gating, MemLive once-on-crossing, `[INFO]` heartbeat removal).
- Revisit the runtime logging system. The current approach writes info-level entries periodically in the hot path, which inflates `loop()` timing measurements. Design a log system that: (a) writes nothing in `loop()` by default, (b) buffers log entries in a ring buffer drained by a background task or on REST request, (c) exposes entries via `GET /api/log`. Structured entries (timestamp, module id, level, message) enable dashboard display and post-run analysis. When: serial logging is no longer the primary observability path (see item above). Scheduled: Release 6 Sprint 2 (256-entry ring buffer + `GET /api/log`).
- Extract `StatefulModule` inline PAL calls to a `.cpp` file. `runSetup()` and `runLoop()` are defined inline in `StatefulModule.h` and call `pal::micros()` and `pal::free_heap_bytes()`. Moving them to `StatefulModule.cpp` would remove the direct `pal/Pal.h` include from the header, making `StatefulModule.h`, `TypeRegistry.h`, `ProducerModule.h`, `ConsumerModule.h`, and `ModuleManager.h` all PAL-free. The `core_only` boundary check target would then verify all five. Identified and documented in Release 6 Sprint 1. When: a sprint focuses on PAL boundary hardening or module system refactoring.
- Windows MemBoot/MemLive frag percentage accuracy. On Windows, `free_heap_bytes()` and `max_alloc_bytes()` each do a separate `VirtualQueryEx` scan; background OS activity between the two scans causes `largNow != now`, so `frag` shows ~97% instead of 0%. Additionally, `largNow * 100u` overflows `uint32_t` at ~500 MB values in `StatefulModule.h` (module setup lines) and `Scheduler.cpp` (MemLive periodic/warning lines), producing a wrong frag% via the same overflow already fixed in `pal::memEvent()`. Fix: (a) apply the `(uint64_t)` cast to all remaining `largNow * 100u` sites, and (b) use the `s_freeHeapCache_()` pattern so `max_alloc_bytes()` returns the cached value from the most recent `free_heap_bytes()` call (no second scan). On embedded targets both functions hit the same `heap_caps` call so neither issue arises there. When: Windows is used as a primary development platform and frag noise in MemBoot output becomes misleading. (From R7S3.)
- `GET /api/log` WebSocket streaming. The ring buffer (R6S2, 256 entries) is accessible via REST but not pushed to the frontend. Streaming new entries over the existing WebSocket connection and rendering them in a frontend log panel would make runtime diagnostics visible without a serial monitor. When: serial-only diagnostics become friction for field debugging or demos. (From R6 backlog.)
- `HEALTH module=` serial suppression. `printHealthReports()` currently emits via `LOG_HEALTH` which appears on serial even in normal runs. Moving it to `LOG_DEBUG` would suppress it by default while keeping it accessible via `--log-level debug`; `GET /api/log` and `GET /api/system` already expose the same data. When: serial noise from health lines is reported as friction in `deploy/run.py` captures. (From R6 backlog.)
- Architecture Decision Records. A `docs/adr/` directory with numbered ADRs (ADR-0001, ...) makes individual design decisions citable and searchable. Currently rationale lives in `architecture.md` and sprint retrospectives. When: the project has external contributors who need to understand why specific architectural choices were made. (From R6 backlog.)
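The Windows frag-percentage item in this list attributes a wrong value to `largNow * 100u` wrapping in 32-bit arithmetic. The arithmetic can be reproduced in a short sketch that emulates `uint32_t` wraparound with a mask. The formula `100 - largest * 100 / free` is an assumption about how frag% is derived, and both function names are hypothetical.

```python
U32_MASK = 0xFFFFFFFF  # emulate uint32_t wraparound


def frag_percent_u32(largest, free):
    """Frag% with the multiplication truncated to 32 bits (the buggy form)."""
    product = (largest * 100) & U32_MASK
    return 100 - product // free


def frag_percent_u64(largest, free):
    """Frag% with a wide multiplication, as a (uint64_t) cast would give."""
    return 100 - (largest * 100) // free


# Assumed example: 500 MiB free, and the largest block equals the free total,
# so the true fragmentation is 0%.
FREE = 500 * 1024 * 1024
LARGEST = FREE

print("32-bit:", frag_percent_u32(LARGEST, FREE))
print("64-bit:", frag_percent_u64(LARGEST, FREE))
```

Under these assumptions the truncated product yields a frag value near 100% even though the heap is unfragmented. A 32-bit product first wraps once the operand exceeds 42,949,672 bytes (about 43 MB), so values in the hundreds of megabytes are well past that point.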
Project identity¶
- Installer / updater. Easy first install and in-place upgrade across platforms. When: first demo runs end-to-end and external contributors want to try it without a build toolchain.
- Domain / site hosting. `moonmodules.org/projectMM` as initial home. When: project name is final and docs ready to publish under MoonModules.
- Final project name. projectMM is a working title. When: proof of concept runs on all three platforms.
- Repository move. `ewowi/projectMM` → `MoonModules/projectMM`. When: name is final and CI/CD is in place on all platforms.
- Copier template for multi-device products. If projectMM is used as a base for multiple independent device products, Copier enables scaffolding and upstream-sync from the projectMM template without manual merges. When: a second product using projectMM as a base template is in development. (From R6 backlog.)
Resolved (kept for reference)¶
- `controls_[]` embedded cost → resolved Release 2 Sprint 1c (lazy heap allocation)
- WiFiModule as StatefulModule → done Release 2 Sprint 8
- Sensitive control flag → done Release 2 Sprint 8
- Layer hierarchy (parent-child modules) → done Release 2 Sprint 6
- Runtime add/remove of modules from UI → done Release 2 Sprint 5 + 7
- WebSocket pause when tab hidden → done Release 2 Sprint 8 (close on hide, reconnect on show)