PopSolutions Sails — software stack (Linux driver, userspace runtime, GGML inference backend, distributed multi-card scheduler). The helmsman of the fleet. Mirrored to github.com/popsolutions/Spanker.
- Rust 90.4%
- C 8.4%
- Makefile 1.2%
Add `HOST_LINK_BW_BYTES_PER_SEC = 100_000_000` (100 MB/s) to the bandwidth model, capturing the rev-A GbE host link as the third tier of the bandwidth hierarchy:

| Tier | Bandwidth | Constant |
|---|---|---|
| Local DDR (per card) | ~2.0 GB/s | LOCAL_DDR_BW |
| Inter-card (per direction) | ~500 MB/s | INTERCARD_BW |
| Host link (GbE) | ~100 MB/s | HOST_LINK_BW (NEW) |

Source-of-truth: Stays `docs/upstream-contributions/2026-05-06-liteeth-ecp5-sgmii.md` (Stays PR #34, merged 2026-05-06). Community measurements on Versa-ECP5 and ECPIX-5 land at 800-940 Mbps UDP iperf3, i.e. 80-94 % of GbE line rate. The 100 MB/s number is the realistic post-IP/UDP/Ethernet-header steady-state ceiling.

The host link is 5x slower than inter-card and 20x slower than local DDR. It is the dominant cost when collective ops must reach the host (model load, gradient checkpoint to host RAM, dataset streaming, prompt-embedding upload).

## Scope

Minimal, per the issue spec's "if pick_strategy already handles this" branch:

- `pick_strategy` is the per-token TP/MP decision and most decode tokens stay on-card; host-link cost is small per-token and only matters at session boundaries.
- No callers exist today for a session-level cost-budget API, so introducing `bytes_per_second_per_token_estimate` would be speculative generality (YAGNI). Defer until the runtime needs it.
- This PR keeps the public surface to a constant + module-level doctest update + tests.

## Tests

3 new unit tests in `bandwidth.rs`:

- `host_link_bw_constant_matches_recon_doc`: pins value to 100_000_000 (guards against silent "round up to 125 MB/s line rate" drift).
- `host_link_bw_is_slowest_hop`: pins the three-tier ordering HOST_LINK < INTERCARD < LOCAL_DDR.
- `host_link_bw_is_inside_observed_range`: pins 80-125 MB/s envelope (community recon range, with line-rate ceiling).

Plus the existing `constants_are_positive` test extended to cover the new constant.

Module-level doctest in `bandwidth.rs` updated to demonstrate all three constants. Crate-root doctest in `lib.rs` updated to assert the three-tier ordering.

## Cargo gates

- `cargo build -p spanker-scheduler`: green
- `cargo test -p spanker-scheduler`: 27 unit + 9 integration + 6 doctests, all green (delta: +3 unit tests vs PR #19 baseline)
- `cargo clippy -p spanker-scheduler --all-targets -- -D warnings`: green
- `cargo fmt -p spanker-scheduler -- --check`: clean

Refs:

- popsolutions/Spanker#21 (this issue)
- popsolutions/Stays#34 (LiteEth ECP5 SGMII recon, source-of-truth)
- popsolutions/Spanker#17 (PR that landed initial 2-tier model)
- popsolutions/Spanker#19 (PR that landed pick_strategy)

Authored by Agent 3 (Software Stack, Spanker).

Signed-off-by: Marcos <m@pop.coop>
Co-authored-by: Marcos <m@pop.coop>

- .github/workflows
- docs/adr
- external
- src
- .gitignore
- .gitmodules
- Cargo.lock
- Cargo.toml
- CONTRIBUTING.md
- LICENSE
- README.md
- rust-toolchain.toml
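The three-tier bandwidth hierarchy from the commit message above can be sketched in Rust as follows. This is a minimal illustration, not the contents of `bandwidth.rs`: only `HOST_LINK_BW_BYTES_PER_SEC` and its value come from the commit. The other two constant names and their exact values (rounded from "~2.0 GB/s" and "~500 MB/s") are assumptions, and `transfer_secs` and `check_bandwidth_invariants` are hypothetical helpers introduced here for illustration.

```rust
/// Host link (GbE) bandwidth in bytes per second.
/// Name and value are taken from the commit message.
pub const HOST_LINK_BW_BYTES_PER_SEC: u64 = 100_000_000;

/// Inter-card bandwidth per direction, in bytes per second.
/// ASSUMPTION: name and exact value; the commit only says "~500 MB/s".
pub const INTERCARD_BW_BYTES_PER_SEC: u64 = 500_000_000;

/// Local DDR bandwidth per card, in bytes per second.
/// ASSUMPTION: name and exact value; the commit only says "~2.0 GB/s".
pub const LOCAL_DDR_BW_BYTES_PER_SEC: u64 = 2_000_000_000;

/// Hypothetical helper: ideal time in seconds to move `bytes` over a
/// link sustaining `bw_bytes_per_sec`. Ignores latency and protocol setup.
pub fn transfer_secs(bytes: u64, bw_bytes_per_sec: u64) -> f64 {
    bytes as f64 / bw_bytes_per_sec as f64
}

/// Hypothetical helper mirroring the intent of the three unit tests
/// named in the commit message. Panics if an invariant is violated.
pub fn check_bandwidth_invariants() {
    // Pin the value so it cannot drift to the 125 MB/s line rate.
    assert_eq!(HOST_LINK_BW_BYTES_PER_SEC, 100_000_000);

    // Three-tier ordering: HOST_LINK < INTERCARD < LOCAL_DDR.
    assert!(HOST_LINK_BW_BYTES_PER_SEC < INTERCARD_BW_BYTES_PER_SEC);
    assert!(INTERCARD_BW_BYTES_PER_SEC < LOCAL_DDR_BW_BYTES_PER_SEC);

    // 80-125 MB/s envelope, with the GbE line rate as the ceiling.
    assert!((80_000_000..=125_000_000).contains(&HOST_LINK_BW_BYTES_PER_SEC));

    // The 5x and 20x ratios quoted in the commit message.
    assert_eq!(INTERCARD_BW_BYTES_PER_SEC / HOST_LINK_BW_BYTES_PER_SEC, 5);
    assert_eq!(LOCAL_DDR_BW_BYTES_PER_SEC / HOST_LINK_BW_BYTES_PER_SEC, 20);
}
```

Under these assumed values, `transfer_secs(1_000_000_000, ...)` gives 10 s over the host link, 2 s between cards, and 0.5 s from local DDR for a 1 GB payload, which illustrates why host-bound transfers dominate at session boundaries.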
Spanker — PopSolutions Sails software stack
In a tall ship, the spanker sail is the aftmost driving sail — the one that holds the ship's direction at the helm. The Spanker is the helmsman of the fleet.
Spanker is the software counterpart to the PopSolutions Sails hardware. It contains:
- Linux kernel driver — PCIe enumerator, ioctl interface, DMA management for the Sails accelerator boards
- Userspace runtime — process for orchestrating workloads, memory management, kernel binary loading
- GGML inference backend — port of GGML kernel ops to RVV + custom matrix extension (Xpop_matmul)
- Distributed scheduler — TP/MP/PP across N Sails (multi-card parallelism is first-class per project mission)
The first target Sail is InnerJib7EA (POPC_16A) — embedded entry
SBC. Spanker integrates with the AXI4 boundary defined in
popsolutions/MAST and validated by popsolutions/InnerJib7EA.
Status
Bootstrap (2026-05-05). Skeleton landed; real work begins with the first PR by Agent 3 (Software Stack) per the 4+1 operating model.
License
- Software contributions: Apache 2.0 (the kernel driver may be GPL-2.0-only if linking against Linux kernel headers requires it; the license is documented per file)
- Documentation: CC-BY-SA 4.0
See LICENSE for the full text.
Layout
Spanker/
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── src/
│   ├── driver/            (Linux kernel module, C)
│   ├── runtime/           (userspace orchestrator, Rust; per ADR-001)
│   └── backends/
│       └── ggml/          (GGML kernel ports for Sails)
├── tests/                 (pytest harness; integration tests against MAST cocotb sim)
├── docs/
│   ├── adr/               (architectural decisions)
│   └── api/               (versioned interface contracts: ioctl, runtime API, GGML)
└── .github/workflows/     (CI)
Related repos
- popsolutions/MAST: RTL trunk
- popsolutions/InnerJib7EA: first product target
- popsolutions/Stays: FPGA hardware
Contributing
Per popsolutions/MAST/GOVERNANCE.md, code contributions are accepted from cooperative affiliates only. Every commit requires a DCO sign-off.