docs(spec): rewrite programming-model.md as Draft skeleton #76

Merged
navigator merged 1 commit from auto/issue-72-20260525T224929Z_issue72 into main 2026-05-25 19:56:09 -03:00
Owner

Closes #72.

Summary

Rewrites docs/spec/programming-model.md as a Draft skeleton per Issue #72.
The previous draft committed worked examples carrying fabricated PopLink
hop latencies, per-link bandwidths, and barrier/atomic timings. That
conflicts with the deliverable's vendor- and frequency-neutral intent
(PLAN.md Section 10 #4) and with ADR-011 still being Proposed.

Skeleton contents

Eight sections, each a TODO with 2-3 lines of intent:

  1. Overview — 8 SoCs, 2x4 torus, designated master (ADR-008), non-coherent shared address space.
  2. Memory model — ADR-011 contract: no hardware coherence between chips, explicit copy semantics.
  3. Address spaces — per-chip DRAM via fluidpop_alloc, PopLink-mapped peer windows via fluidpop_copy_d2d, host PCIe MMIO via fluidpop_copy_h2d/d2h.
  4. Synchronization primitives — fluidpop_barrier, events, fluidpop_broadcast, fluidpop_all_gather, fluidpop_all_reduce.
  5. Tensor partitioning patterns — TP / PP / EP per PLAN 14.4.
  6. Failure model — link errors, BER recovery hooks per PLAN 12.4 / ADR-009.
  7. Programmer-visible telemetry — fluidpop_query_telemetry contract vs. sysfs.
  8. Open questions — ordering, fence semantics, RoCC cache ops vs. PopLink, partial collectives.
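
As a rough illustration of the 2x4 torus named in the Overview item, the sketch below enumerates each chip's distinct torus neighbours and the hop distance between two chips. The row-major chip numbering (0-7) and the function names are assumptions made for illustration only; the PR does not specify chip IDs or how links are wired. With only two rows, the "up" and "down" wrap-around land on the same chip, so each chip has three distinct neighbours in this model.

```python
ROWS, COLS = 2, 4  # 2x4 torus of eight SoCs, per the Overview item


def torus_neighbors(chip: int) -> list[int]:
    """Distinct neighbour chip IDs, assuming row-major numbering (hypothetical)."""
    r, c = divmod(chip, COLS)
    candidates = [
        ((r - 1) % ROWS, c),  # up (wraps)
        ((r + 1) % ROWS, c),  # down (wraps; same chip as "up" when ROWS == 2)
        (r, (c - 1) % COLS),  # left (wraps)
        (r, (c + 1) % COLS),  # right (wraps)
    ]
    return sorted({rr * COLS + cc for rr, cc in candidates})


def hop_distance(src: int, dst: int) -> int:
    """Minimum number of torus hops between two chips."""
    r1, c1 = divmod(src, COLS)
    r2, c2 = divmod(dst, COLS)
    dr, dc = abs(r1 - r2), abs(c1 - c2)
    # On a torus, each axis can be traversed in either direction.
    return min(dr, ROWS - dr) + min(dc, COLS - dc)
```

This model deliberately carries only hop counts, not latency or bandwidth, in keeping with the skeleton's no-numbers rule.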

Acceptance checklist

  • File exists with the required sections
  • Status: Draft skeleton + Owner: TBD header
  • Each section has TODO + 2-3 line intent
  • References the API names from Section 14.3 (fluidpop_open, fluidpop_alloc, fluidpop_copy_*, fluidpop_barrier, fluidpop_broadcast, fluidpop_all_gather, fluidpop_query_telemetry)
  • Explicit non-coherence statement (ADR-011) in the header callout and Section 2
  • No fabricated timing/latency numbers (grep for (ns|us|µs|ms|Gb/s|MB/s|GiB|MiB|KiB) returns empty)
  • Cross-references PLAN.md Sections 10, 12.4, 14.1-14.4, ADR-011, runtime API issue (#7 / PR #16), and companion driver-design.md
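
The "no fabricated numbers" item could be checked along the following lines. This is a minimal sketch, not the command that was actually run for this PR: a bare alternation such as the one quoted would also match inside ordinary words (for example the "ns" in "transitions"), so the sketch assumes a figure only counts when a number directly precedes the unit. The pattern and the function name are hypothetical.

```python
import re

# Assumption: a "figure" is a number, optional whitespace, then one of the
# units listed in the checklist, ending at a word boundary.
UNIT_FIGURE = re.compile(
    r"\d+(?:\.\d+)?\s*(?:ns|us|µs|ms|Gb/s|MB/s|GiB|MiB|KiB)\b"
)


def find_unit_figures(text: str) -> list[str]:
    """Return every number-plus-unit figure found in the text."""
    return [m.group(0) for m in UNIT_FIGURE.finditer(text)]
```

Run against the spec file's contents, an empty result would correspond to the checklist item passing; architectural counts such as "8 SoCs" or "2x4 torus" are not flagged because no listed unit follows the number.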

Why now

The runtime API issue and ADR-011 both need a stable file to point to as
the software-facing contract. The previous draft would have had to be
unwound once real PHY/BER numbers arrived; replacing it with a skeleton now
keeps the spec honest while leaving the structure intact.

docs(spec): rewrite programming-model.md as Draft skeleton
All checks were successful
build / scalafmt-check (pull_request) Successful in 2s
build / sbt-compile (pull_request) Successful in 3s
build / shell-lint (pull_request) Successful in 18s
1156e29545
The earlier scaffold of docs/spec/programming-model.md committed worked
examples with fabricated PopLink hop latencies, per-link bandwidths, and
barrier/atomic timings. Per Issue #72 and PLAN.md Section 10's intent for
deliverable #4, the programming model spec must stay vendor- and
frequency-neutral until ADR-011 transitions from Proposed to Accepted and
the PHY survey closes.

Replace the content with a skeleton that:

- carries Status: Draft skeleton + Owner: TBD
- enumerates the eight required sections (Overview, Memory model, Address
  spaces, Synchronization primitives, Tensor partitioning patterns,
  Failure model, Programmer-visible telemetry, Open questions)
- each section has a TODO marker and 2-3 lines of intent
- explicitly states that hardware coherence between chips is NOT provided
- references the runtime API names from PLAN.md Section 14.3 (fluidpop_open,
  fluidpop_alloc, fluidpop_copy_*, fluidpop_barrier, fluidpop_broadcast,
  fluidpop_all_gather, fluidpop_query_telemetry)
- cross-references PLAN.md Sections 10, 12.4 (BER), 14.1-14.4, ADR-011,
  the runtime API design / driver scaffolding (Issue #7 / PR #16), and
  the companion driver-design.md
- contains no fabricated timing or bandwidth numbers
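
The structural requirements in the list above (eight named sections, each with a TODO marker) lend themselves to a small automated check. The sketch below is one possible shape for it and rests on assumptions the commit message does not confirm: that sections are Markdown headings starting with `#`, that headings may carry a leading "1. " style number, and that the marker is the literal string `TODO`. The function name is hypothetical.

```python
import re

REQUIRED_SECTIONS = [
    "Overview", "Memory model", "Address spaces",
    "Synchronization primitives", "Tensor partitioning patterns",
    "Failure model", "Programmer-visible telemetry", "Open questions",
]


def missing_or_unmarked(markdown: str) -> list[str]:
    """Return required sections that are absent or whose body lacks a TODO."""
    # Splitting on heading lines yields [preamble, title1, body1, title2, ...].
    parts = re.split(r"^#+[ \t]*(.+?)[ \t]*$", markdown, flags=re.MULTILINE)
    bodies = {}
    for heading, body in zip(parts[1::2], parts[2::2]):
        title = re.sub(r"^\d+\.\s*", "", heading)  # drop "1. " numbering
        bodies[title.lower()] = body
    return [s for s in REQUIRED_SECTIONS
            if "TODO" not in bodies.get(s.lower(), "")]
```

An empty return value would mean every required section is present and carries a marker; it does not check the 2-3 lines of intent, which still needs a human read.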
Author
Owner

VERDICT: PASS

This PR walks back the previous draft of docs/spec/programming-model.md and replaces it with an honest skeleton, which is exactly the fix Issue #72 was filed for. The old text was carrying fabricated numbers (200 ns / 350 ns PopLink hop latency, 64 / 512 Gb/s per-link bandwidth, ~400 ns barrier, ~600 ns cross-chip atomic, "alert at BER < 1e-12") that pre-committed to PHY and ADR-011 outcomes still unresolved; the diff removes every one of those and replaces them with TODO blocks gated on ADR-011 acceptance and the §12.4 PHY/BER work.

Spot-checking the new content for residual fabrications: the only numerics that survive are architectural facts (eight SoCs, 2x4 torus, ADR-008 designated master, three address spaces, TP/PP/EP partitioning idioms). There are no ns, µs, Gb/s, MB/s, or GiB figures, matching the PR body's grep claim. Section 4 explicitly states "Do NOT commit any latency or bandwidth values here" as a guardrail for future edits.

Scope is clean: one file touched, matching Issue #72; SPDX CC-BY-SA-4.0 header at lines 1–2 preserved; path docs/spec/** is outside the ADR-017 off-limits list; no AI/Anthropic attribution anywhere; no Chisel module added so rules 4–6 do not apply. The status line correctly degrades from "Draft" to "Draft skeleton" and pins the document to ADR-011 Proposed with the right forward-link.

Findings

None.

fluidpop-bot approved these changes 2026-05-25 19:55:43 -03:00
Dismissed
fluidpop-bot left a comment
Collaborator

CI green (head 1156e29545e459107936ac7964ad61b3a89ad334), auto-approving
navigator force-pushed auto/issue-72-20260525T224929Z_issue72 from 1156e29545
All checks were successful
build / scalafmt-check (pull_request) Successful in 2s
build / sbt-compile (pull_request) Successful in 3s
build / shell-lint (pull_request) Successful in 18s
to 167b1af0da
All checks were successful
build / scalafmt-check (pull_request) Successful in 2s
build / sbt-compile (pull_request) Successful in 3s
build / shell-lint (pull_request) Successful in 10s
2026-05-25 19:55:48 -03:00
Compare
fluidpop-bot left a comment
Collaborator

CI green (head 167b1af0da226df36f96f495c8c298a0505c9479), auto-approving
Reference
Fluid/fluidpop-v1!76