feat(runtime): driver UAPI + design specs + mock C scaffold (Issue #7) #16

Merged
navigator merged 1 commit from feature/runtime-driver-spec into main 2026-05-24 11:38:56 -03:00
Owner

Closes Issue #7 partially (Phase 1 specs + scaffolding; full Linux kernel module implementation is Phase 6 per PLAN.md Section 14.7).

What this PR adds:

  • sw/runtime/include/fluidpop_driver.h (AGPL v3): kernel UAPI header. IOCTL magic 0xFD, 15 numbered ioctls covering version/discovery, alloc/free, three DMA directions (H2D/D2H/D2D), kernel launch, barrier, events, telemetry, link info. Packed structs use kernel uapi __u32/__u64/__s32. ABI version macro.

  • docs/spec/driver-design.md (CC BY-SA 4.0): 10-section design spec. Linux 6.6 LTS baseline, 6.10+ upstream target under drivers/accel/. PCI vendor-ID placeholder strategy until PCI-SIG application. Hugepage DMA pools, ADR-011 non-coherence enforcement, MSI-X + eventfd completion, 10 Hz telemetry, capability + cgroup security, KUnit + mock test tiers, upstream submission roadmap.

  • docs/spec/programming-model.md (CC BY-SA 4.0): 12-section guide. 2x4 torus topology, non-coherent memory semantics, streams, broadcast tree, ring all-gather, ring all-reduce, worked Llama-7B 4-bit TP+PP layout, MoE expert parallelism, sync primitives, performance pitfalls, debugging workflow.

  • sw/runtime/src/fluidpop_mock.c (AGPL v3): in-process mock scaffold. C11 + pthread only, no external deps. 8-chip model with per-chip buffer pool, per-stream worker queues, 2x4 torus hop sleep emulation for D2D. Framework backends (ONNX RT, llama.cpp) compile and link-test against it on commodity x86 CI runners without silicon.

Tests: docs+C scaffold only. CI runs shell-lint + scalafmt-skip (no .scalafmt yet) + sbt-skip (no build.sbt yet); future C compile job added when full mock implementation lands. Per GitOps workflow, will be auto-merged via infra/forgejo/auto-merge.sh once CI green.

Closes Issue #7 partially (Phase 1 specs + scaffolding; full Linux kernel module implementation is Phase 6 per PLAN.md Section 14.7). What this PR adds: - sw/runtime/include/fluidpop_driver.h (AGPL v3): kernel UAPI header. IOCTL magic 0xFD, 15 numbered ioctls covering version/discovery, alloc/free, three DMA directions (H2D/D2H/D2D), kernel launch, barrier, events, telemetry, link info. Packed structs use kernel uapi __u32/__u64/__s32. ABI version macro. - docs/spec/driver-design.md (CC BY-SA 4.0): 10-section design spec. Linux 6.6 LTS baseline, 6.10+ upstream target under drivers/accel/. PCI vendor-ID placeholder strategy until PCI-SIG application. Hugepage DMA pools, ADR-011 non-coherence enforcement, MSI-X + eventfd completion, 10 Hz telemetry, capability + cgroup security, KUnit + mock test tiers, upstream submission roadmap. - docs/spec/programming-model.md (CC BY-SA 4.0): 12-section guide. 2x4 torus topology, non-coherent memory semantics, streams, broadcast tree, ring all-gather, ring all-reduce, worked Llama-7B 4-bit TP+PP layout, MoE expert parallelism, sync primitives, performance pitfalls, debugging workflow. - sw/runtime/src/fluidpop_mock.c (AGPL v3): in-process mock scaffold. C11 + pthread only, no external deps. 8-chip model with per-chip buffer pool, per-stream worker queues, 2x4 torus hop sleep emulation for D2D. Framework backends (ONNX RT, llama.cpp) compile and link-test against it on commodity x86 CI runners without silicon. Tests: docs+C scaffold only. CI runs shell-lint + scalafmt-skip (no .scalafmt yet) + sbt-skip (no build.sbt yet); future C compile job added when full mock implementation lands. Per GitOps workflow, will be auto-merged via infra/forgejo/auto-merge.sh once CI green.
feat(runtime): driver UAPI + design specs + mock C scaffold (Issue #7)
All checks were successful
build / scalafmt-check (push) Successful in 3s
build / sbt-compile (push) Successful in 3s
build / shell-lint (push) Successful in 8s
build / scalafmt-check (pull_request) Successful in 3s
build / sbt-compile (pull_request) Successful in 3s
build / shell-lint (pull_request) Successful in 8s
6c02276a2b
Phase 1 software deliverables (specs + scaffolding only; full Linux
kernel module implementation is Phase 6 per PLAN.md Section 14.7).

- sw/runtime/include/fluidpop_driver.h (AGPL v3): kernel UAPI header.
  IOCTL magic 0xFD, 15 numbered ioctls covering version/discovery,
  alloc/free, three DMA directions (H2D/D2H/D2D), kernel launch,
  barrier, events, telemetry, link info. Packed structs use kernel
  uapi __u32/__u64/__s32. ABI version macro.

- docs/spec/driver-design.md (CC BY-SA 4.0): 10-section design spec.
  Linux 6.6 LTS baseline, 6.10+ upstream target under drivers/accel/.
  PCI vendor-ID placeholder strategy until PCI-SIG application.
  Hugepage DMA pools, ADR-011 non-coherence enforcement, MSI-X +
  eventfd completion, 10 Hz telemetry, capability + cgroup security,
  KUnit + mock test tiers, upstream submission roadmap.

- docs/spec/programming-model.md (CC BY-SA 4.0): 12-section guide.
  2x4 torus topology, non-coherent memory semantics, streams,
  broadcast tree, ring all-gather, ring all-reduce, worked Llama-7B
  4-bit TP+PP layout, MoE expert parallelism, sync primitives,
  performance pitfalls, debugging workflow.

- sw/runtime/src/fluidpop_mock.c (AGPL v3): in-process mock scaffold.
  C11 + pthread, no external deps. 8-chip model with per-chip buffer
  pool, per-stream worker queues, 2x4 torus hop sleep emulation for
  D2D. Framework backends (ONNX RT, llama.cpp) compile and link-test
  against it on commodity x86 CI runners without silicon.

Cross-links fluidpop.h, ADR-011, PLAN.md Sections 3 and 14.
fluidpop-bot left a comment
Collaborator

CI green (head 6c02276a2b), auto-approving

CI green (head 6c02276a2b79daf5707cf132047b48ca7c3a9237), auto-approving
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
Fluid/fluidpop-v1!16
No description provided.