feat(runtime): driver UAPI + design specs + mock C scaffold (Issue #7) #16
Reference
Fluid/fluidpop-v1!16
Partially closes Issue #7: this PR covers the Phase 1 specs and scaffolding; the full Linux kernel module implementation is Phase 6 per PLAN.md Section 14.7.
What this PR adds:
sw/runtime/include/fluidpop_driver.h (AGPL v3): kernel UAPI header. IOCTL magic 0xFD, 15 numbered ioctls covering version/discovery, alloc/free, three DMA directions (H2D/D2H/D2D), kernel launch, barrier, events, telemetry, link info. Packed structs use kernel uapi __u32/__u64/__s32. ABI version macro.
docs/spec/driver-design.md (CC BY-SA 4.0): 10-section design spec. Linux 6.6 LTS baseline, 6.10+ upstream target under drivers/accel/. PCI vendor-ID placeholder strategy until PCI-SIG application. Hugepage DMA pools, ADR-011 non-coherence enforcement, MSI-X + eventfd completion, 10 Hz telemetry, capability + cgroup security, KUnit + mock test tiers, upstream submission roadmap.
docs/spec/programming-model.md (CC BY-SA 4.0): 12-section guide. 2x4 torus topology, non-coherent memory semantics, streams, broadcast tree, ring all-gather, ring all-reduce, worked Llama-7B 4-bit TP+PP layout, MoE expert parallelism, sync primitives, performance pitfalls, debugging workflow.
sw/runtime/src/fluidpop_mock.c (AGPL v3): in-process mock scaffold. C11 + pthread only, no external deps. 8-chip model with per-chip buffer pool, per-stream worker queues, 2x4 torus hop sleep emulation for D2D. Framework backends (ONNX RT, llama.cpp) compile and link-test against it on commodity x86 CI runners without silicon.
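For reviewers unfamiliar with the UAPI conventions named for fluidpop_driver.h, here is a minimal sketch of how one ioctl could be declared. Only the magic 0xFD and the use of kernel __u32/__u64 types in packed structs come from this PR; the struct layout, the ioctl number 0x02, and every `FLUIDPOP_SKETCH_*` / `fluidpop_sketch_*` name are hypothetical and are not the contents of the real header.

```c
#include <assert.h>
#include <linux/ioctl.h>
#include <linux/types.h>

/* Magic taken from the PR description; all other values are illustrative. */
#define FLUIDPOP_SKETCH_IOC_MAGIC   0xFD
#define FLUIDPOP_SKETCH_ABI_VERSION 1   /* hypothetical ABI version macro */

/* Hypothetical allocation request. Fixed-width kernel types and explicit
 * packing keep the layout identical for 32-bit and 64-bit userspace. */
struct fluidpop_sketch_alloc {
    __u64 size;      /* in:  bytes requested */
    __u64 dev_addr;  /* out: device address of the buffer */
    __u32 chip_id;   /* in:  target chip */
    __u32 flags;     /* in:  reserved, must be zero */
} __attribute__((packed));

/* _IOWR: userspace writes the request and reads back the result. */
#define FLUIDPOP_SKETCH_IOC_ALLOC \
    _IOWR(FLUIDPOP_SKETCH_IOC_MAGIC, 0x02, struct fluidpop_sketch_alloc)

/* Fail the build if the ABI-visible size ever changes by accident. */
static_assert(sizeof(struct fluidpop_sketch_alloc) == 24,
              "fluidpop_sketch_alloc must stay 24 bytes");
```

A compile-time size check like the one above is one cheap way to catch accidental ABI breaks before the ABI version macro has to change.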
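The "MSI-X + eventfd completion" path in driver-design.md can be pictured from the userspace side roughly as follows. This is a sketch under assumptions: it only uses the standard Linux eventfd(2) and poll(2) calls, the helper name `wait_completions` is hypothetical, and how the driver actually hands the eventfd to userspace is not specified in this PR.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: block until the eventfd is signalled or the timeout
 * expires. On success returns 0 and stores the number of completions
 * accumulated since the last read. Returns -1 on timeout (errno set to
 * ETIMEDOUT) or on error. */
static int wait_completions(int efd, int timeout_ms, uint64_t *n_done)
{
    assert(n_done != NULL);

    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    int rc;
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);   /* retry if a signal interrupted us */

    if (rc < 0)
        return -1;
    if (rc == 0) {
        errno = ETIMEDOUT;
        return -1;
    }

    /* Reading a non-semaphore eventfd returns the counter and resets it. */
    ssize_t got = read(efd, n_done, sizeof(*n_done));
    return got == (ssize_t)sizeof(*n_done) ? 0 : -1;
}
```

In a real driver the interrupt handler would signal the eventfd; in a mock, a worker thread can do the same with a plain 8-byte write(2) of the value 1.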
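The ring all-reduce listed in programming-model.md follows the textbook two-phase pattern (reduce-scatter, then all-gather). The sketch below simulates it for 8 chips inside one process so the chunk indexing can be checked; it is not code from the spec, `ring_all_reduce_sum` and `RING_N` are hypothetical names, and on hardware each inner copy would be a D2D transfer rather than a loop.

```c
#include <assert.h>
#include <stddef.h>

#define RING_N 8   /* chips in the ring; 8 matches the mock's chip count */

/* In-process simulation of a sum all-reduce. data[c] is chip c's buffer of
 * `len` floats; `len` must be a multiple of RING_N. */
static void ring_all_reduce_sum(float *data[RING_N], size_t len)
{
    assert(len % RING_N == 0);
    size_t chunk = len / RING_N;

    /* Phase 1, reduce-scatter: at step s chip c sends chunk (c - s) mod N
     * to chip c + 1, which adds it to its own copy. After N - 1 steps,
     * chip c holds the fully reduced chunk (c + 1) mod N. */
    for (int s = 0; s < RING_N - 1; s++) {
        for (int c = 0; c < RING_N; c++) {
            int dst = (c + 1) % RING_N;
            size_t k = (size_t)((c - s + RING_N) % RING_N);
            for (size_t i = 0; i < chunk; i++)
                data[dst][k * chunk + i] += data[c][k * chunk + i];
        }
    }

    /* Phase 2, all-gather: at step s chip c forwards the finished chunk
     * (c + 1 - s) mod N to chip c + 1, which overwrites its stale copy. */
    for (int s = 0; s < RING_N - 1; s++) {
        for (int c = 0; c < RING_N; c++) {
            int dst = (c + 1) % RING_N;
            size_t k = (size_t)((c + 1 - s + RING_N) % RING_N);
            for (size_t i = 0; i < chunk; i++)
                data[dst][k * chunk + i] = data[c][k * chunk + i];
        }
    }
}
```

Within a step, the chunk a chip sends is never the chunk it receives, which is why the sequential loop over chips gives the same result as a truly parallel exchange.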
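The "2x4 torus hop sleep emulation" in the mock can be illustrated with a small hop-count helper. This is a sketch, not the code in fluidpop_mock.c: the row-major chip numbering (chip id = row * 4 + column), the per-hop latency parameter, and the names `torus_hops` and `emulate_d2d_latency` are all assumptions.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict -std=c11 */
#endif
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define TORUS_ROWS 2
#define TORUS_COLS 4
#define N_CHIPS    (TORUS_ROWS * TORUS_COLS)

/* Shortest distance between a and b on a ring of `size` nodes. */
static int ring_dist(int a, int b, int size)
{
    int d = abs(a - b);
    return d < size - d ? d : size - d;
}

/* Minimal hop count between two chips, assuming row-major chip ids and
 * wraparound links in both dimensions. */
static int torus_hops(int src, int dst)
{
    assert(src >= 0 && src < N_CHIPS && dst >= 0 && dst < N_CHIPS);
    return ring_dist(src / TORUS_COLS, dst / TORUS_COLS, TORUS_ROWS)
         + ring_dist(src % TORUS_COLS, dst % TORUS_COLS, TORUS_COLS);
}

/* Sleep for hops * ns_per_hop to stand in for link latency on a D2D copy.
 * ns_per_hop is a tunable of this sketch, not a measured figure. */
static void emulate_d2d_latency(int src, int dst, long ns_per_hop)
{
    long total = (long)torus_hops(src, dst) * ns_per_hop;
    struct timespec ts = {
        .tv_sec  = total / 1000000000L,
        .tv_nsec = total % 1000000000L,
    };
    nanosleep(&ts, NULL);
}
```

Under these assumptions the worst case on a 2x4 torus is 3 hops (one row hop plus two column hops), so even a coarse per-hop sleep keeps mock D2D timings ordered the same way as the topology.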
Tests: this PR is docs plus a C scaffold only. CI runs shell-lint, scalafmt-skip (no .scalafmt yet), and sbt-skip (no build.sbt yet); a C compile job will be added when the full mock implementation lands. Per the GitOps workflow, this PR will be auto-merged via infra/forgejo/auto-merge.sh once CI is green.
CI green (head 6c02276a2b), auto-approving