PopSolutions Sails — software stack (Linux driver, userspace runtime, GGML inference backend, distributed multi-card scheduler). The helmsman of the fleet. Mirrored to github.com/popsolutions/Spanker.
  • Rust 90.4%
  • C 8.4%
  • Makefile 1.2%
Marcos Méndez Quintero 2f57196fea
feat(scheduler): add HOST_LINK_BW constant + 3-way bandwidth model (closes #21) (#22)
Add `HOST_LINK_BW_BYTES_PER_SEC = 100_000_000` (100 MB/s) to the
bandwidth model, capturing the rev-A GbE host link as the third
tier of the bandwidth hierarchy:

  Local DDR  (per card)      ~2.0 GB/s   LOCAL_DDR_BW
  Inter-card (per direction) ~500 MB/s   INTERCARD_BW
  Host link  (GbE)           ~100 MB/s   HOST_LINK_BW (NEW)

Source-of-truth: Stays
`docs/upstream-contributions/2026-05-06-liteeth-ecp5-sgmii.md`
(Stays PR #34, merged 2026-05-06). Community measurements on
Versa-ECP5 and ECPIX-5 land at 800-940 Mbps UDP iperf3, i.e.
80-94 % of GbE line rate. The 100 MB/s number is the realistic
post-IP/UDP/Ethernet-header steady-state ceiling.

The host link is 5x slower than inter-card and 20x slower than
local DDR — it is the dominant cost when collective ops must
reach the host (model load, gradient checkpoint to host RAM,
dataset streaming, prompt-embedding upload).
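
To make the ratios above concrete, here is a minimal Rust sketch of the three-tier model. Only `HOST_LINK_BW_BYTES_PER_SEC` and its value come from this commit. The other two constant names, their exact values (the table gives only approximate figures), the `u64` type, and the `transfer_time` helper are assumptions for illustration, not the crate's actual API.

```rust
use std::time::Duration;

/// Host link (rev-A GbE): value stated in this commit message.
pub const HOST_LINK_BW_BYTES_PER_SEC: u64 = 100_000_000;
/// Inter-card link, per direction. Name and exact value are assumed (~500 MB/s).
pub const INTERCARD_BW_BYTES_PER_SEC: u64 = 500_000_000;
/// Local DDR, per card. Name and exact value are assumed (~2.0 GB/s).
pub const LOCAL_DDR_BW_BYTES_PER_SEC: u64 = 2_000_000_000;

/// Hypothetical helper: ideal time to move `bytes` at a given bandwidth.
/// Ignores latency, protocol overhead, and contention.
pub fn transfer_time(bytes: u64, bw_bytes_per_sec: u64) -> Duration {
    Duration::from_secs_f64(bytes as f64 / bw_bytes_per_sec as f64)
}

/// Hypothetical helper: how many times slower the host link is than `other_bw`.
pub fn host_link_slowdown(other_bw: u64) -> u64 {
    other_bw / HOST_LINK_BW_BYTES_PER_SEC
}
```

Under these assumed values, moving a 1 GB payload takes roughly 10 s over the host link, 2 s between cards, and 0.5 s from local DDR, which is why transfers that must reach the host dominate the cost.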

## Scope

Minimal — per the issue spec's "if pick_strategy already handles
this" branch:

- `pick_strategy` is the per-token TP/MP decision and most decode
  tokens stay on-card; host-link cost is small per-token and only
  matters at session boundaries.
- No callers exist today for a session-level cost-budget API, so
  introducing `bytes_per_second_per_token_estimate` would be
  speculative generality (YAGNI). Defer until the runtime needs
  it.
- This PR keeps the public surface to a constant + module-level
  doctest update + tests.

## Tests

3 new unit tests in `bandwidth.rs`:

- `host_link_bw_constant_matches_recon_doc` — pins value to
  100_000_000 (guards against silent "round up to 125 MB/s line
  rate" drift).
- `host_link_bw_is_slowest_hop` — pins the three-tier ordering
  HOST_LINK < INTERCARD < LOCAL_DDR.
- `host_link_bw_is_inside_observed_range` — pins 80-125 MB/s
  envelope (community recon range, with line-rate ceiling).

Plus the existing `constants_are_positive` test extended to cover
the new constant.

Module-level doctest in `bandwidth.rs` updated to demonstrate all
three constants. Crate-root doctest in `lib.rs` updated to assert
the three-tier ordering.
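
As a rough sketch of what the three new tests might look like: the test names below are taken from the list above, but the bodies, the `u64` type, and the names and exact values of the two older constants are assumptions, since the actual `bandwidth.rs` is not shown here.

```rust
/// Host link (rev-A GbE): value stated in this commit message.
pub const HOST_LINK_BW_BYTES_PER_SEC: u64 = 100_000_000;
/// Assumed name and exact value (~500 MB/s per direction).
pub const INTERCARD_BW_BYTES_PER_SEC: u64 = 500_000_000;
/// Assumed name and exact value (~2.0 GB/s per card).
pub const LOCAL_DDR_BW_BYTES_PER_SEC: u64 = 2_000_000_000;

#[cfg(test)]
mod tests {
    use super::*;

    /// Pins the exact value so it cannot drift to the 125 MB/s line rate.
    #[test]
    fn host_link_bw_constant_matches_recon_doc() {
        assert_eq!(HOST_LINK_BW_BYTES_PER_SEC, 100_000_000);
    }

    /// Pins the ordering HOST_LINK < INTERCARD < LOCAL_DDR.
    #[test]
    fn host_link_bw_is_slowest_hop() {
        assert!(HOST_LINK_BW_BYTES_PER_SEC < INTERCARD_BW_BYTES_PER_SEC);
        assert!(INTERCARD_BW_BYTES_PER_SEC < LOCAL_DDR_BW_BYTES_PER_SEC);
    }

    /// Pins the value inside the 80-125 MB/s envelope.
    #[test]
    fn host_link_bw_is_inside_observed_range() {
        let envelope = 80_000_000u64..=125_000_000u64;
        assert!(envelope.contains(&HOST_LINK_BW_BYTES_PER_SEC));
    }
}
```

Pinning the exact value and the envelope separately means a deliberate retune within the observed range only has to update the first test, while the ordering and range checks continue to guard the model's shape.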

## Cargo gates

- `cargo build -p spanker-scheduler`: green
- `cargo test -p spanker-scheduler`: 27 unit + 9 integration + 6
  doctests, all green (delta: +3 unit tests vs PR #19 baseline)
- `cargo clippy -p spanker-scheduler --all-targets -- -D warnings`:
  green
- `cargo fmt -p spanker-scheduler -- --check`: clean

Refs:
- popsolutions/Spanker#21 (this issue)
- popsolutions/Stays#34 (LiteEth ECP5 SGMII recon, source-of-truth)
- popsolutions/Spanker#17 (PR that landed initial 2-tier model)
- popsolutions/Spanker#19 (PR that landed pick_strategy)

Authored by Agent 3 (Software Stack — Spanker).

Signed-off-by: Marcos <m@pop.coop>
Co-authored-by: Marcos <m@pop.coop>
2026-05-06 13:18:02 -03:00
.github/workflows feat(ggml): bindgen over spanker UAPI + addresses #7 review nits (#15) 2026-05-06 11:34:09 -03:00
docs/adr docs(adr): ADR-002 — out-of-tree kernel module driver model (#2) 2026-05-05 23:29:51 -03:00
external feat(ggml): ggml-spanker crate with MatmulInt4 trait + MockSail (#5) 2026-05-06 00:24:38 -03:00
src feat(scheduler): add HOST_LINK_BW constant + 3-way bandwidth model (closes #21) (#22) 2026-05-06 13:18:02 -03:00
.gitignore feat(driver): PCIe driver skeleton with /dev/spankerctl ioctl stub (#3) 2026-05-05 23:51:06 -03:00
.gitmodules feat(ggml): bindgen over spanker UAPI + addresses #7 review nits (#15) 2026-05-06 11:34:09 -03:00
Cargo.lock feat(ggml): bindgen over spanker UAPI + addresses #7 review nits (#15) 2026-05-06 11:34:09 -03:00
Cargo.toml feat(scheduler): spanker-scheduler crate with Topology + collective ops (#6) 2026-05-06 01:15:56 -03:00
CONTRIBUTING.md feat: initial Spanker repo bootstrap 2026-05-05 23:05:31 -03:00
LICENSE feat: initial Spanker repo bootstrap 2026-05-05 23:05:31 -03:00
README.md docs(adr): ADR-001 — choose Rust for userspace runtime (#1) 2026-05-05 23:22:47 -03:00
rust-toolchain.toml feat(runtime): spanker-runtime crate with v0 ioctl wrappers (#4) 2026-05-05 23:52:53 -03:00

Spanker — PopSolutions Sails software stack

On a tall ship, the spanker is the aftmost driving sail, the one that holds the ship's direction at the helm. Spanker is the helmsman of the fleet.

Spanker is the software counterpart to the PopSolutions Sails hardware. It contains:

  • Linux kernel driver — PCIe enumerator, ioctl interface, DMA management for the Sails accelerator boards
  • Userspace runtime — process for orchestrating workloads, memory management, kernel binary loading
  • GGML inference backend — port of GGML kernel ops to RVV + custom matrix extension (Xpop_matmul)
  • Distributed scheduler — TP/MP/PP across N Sails (multi-card parallelism is first-class per project mission)

The first target Sail is InnerJib7EA (POPC_16A) — embedded entry SBC. Spanker integrates with the AXI4 boundary defined in popsolutions/MAST and validated by popsolutions/InnerJib7EA.

Status

Bootstrap (2026-05-05). Skeleton landed; real work begins with the first PR by Agent 3 (Software Stack) per the 4+1 operating model.

License

  • Software contributions: Apache 2.0 (the kernel driver may be GPL-2.0-only if linking against Linux kernel headers requires it; the license is documented per file)
  • Documentation: CC-BY-SA 4.0

See LICENSE for the full text.

Layout

Spanker/
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── src/
│   ├── driver/        (Linux kernel module — C)
│   ├── runtime/       (userspace orchestrator — Rust; per ADR-001)
│   └── backends/
│       └── ggml/      (GGML kernel ports for Sails)
├── tests/             (pytest harness; integration tests against MAST cocotb sim)
├── docs/
│   ├── adr/           (architectural decisions)
│   └── api/           (versioned interface contracts: ioctl, runtime API, GGML)
└── .github/workflows/ (CI)

Contributing

Per popsolutions/MAST/GOVERNANCE.md, code contributions are accepted from cooperative affiliates only. Every commit must carry a DCO sign-off.