Pending Design Topics
Captured decisions and open questions that need guide/prescriptive pages once the first physical node is running. These are aligned in principle but not yet written up.
Audit Trail
Append-only Merkle tree of signed operations. Each entry:
{actor (signing pubkey), operation, timestamp, vector_clock, signature}
The chain: Human (IdP identity) -> auth method -> signing key -> CRDT actor -> operation. Every link is cryptographically signed. The audit log replicates via shard storage (same as everything else). Tamper-evident: modifying an entry breaks the Merkle root.
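To make the tamper-evidence property concrete, here is a minimal sketch of a Merkle root computed over audit entries. It is an illustration under assumptions, not the FortrOS format: the canonical-JSON encoding, the SHA-256 hash, the leaf/node prefix bytes, and the function names `leaf_hash` and `merkle_root` are all hypothetical, and signatures are carried as opaque strings rather than verified.

```python
import copy
import hashlib
import json


def leaf_hash(entry):
    # Assumption: entries are serialized as canonical JSON (sorted keys)
    # so the same entry always produces the same bytes.
    data = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # 0x00 marks a leaf, 0x01 an interior node (domain separation).
    return hashlib.sha256(b"\x00" + data).digest()


def merkle_root(entries):
    if not entries:
        return hashlib.sha256(b"").digest()
    level = [leaf_hash(e) for e in entries]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = b"\x01" + level[i] + level[i + 1]
                nxt.append(hashlib.sha256(pair).digest())
            else:
                # An odd node out is promoted unchanged to the next level.
                nxt.append(level[i])
        level = nxt
    return level[0]


# Entries use the fields listed above; all values are made up.
log = [
    {"actor": "pubkey-a", "operation": "set quota=10", "timestamp": 1700000000,
     "vector_clock": {"a": 1}, "signature": "sig-1"},
    {"actor": "pubkey-b", "operation": "add node n2", "timestamp": 1700000005,
     "vector_clock": {"a": 1, "b": 1}, "signature": "sig-2"},
    {"actor": "pubkey-a", "operation": "set quota=20", "timestamp": 1700000009,
     "vector_clock": {"a": 2, "b": 1}, "signature": "sig-3"},
]
root = merkle_root(log)

# Modify one field of one entry and recompute: the root no longer matches.
tampered = copy.deepcopy(log)
tampered[1]["operation"] = "add node evil"
tampered_root = merkle_root(tampered)
print(root != tampered_root)
```

Any node holding the last agreed root can detect the modification by recomputing the root from its replica of the log.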
Government compliance (FedRAMP, NIST 800-53, CMMC) requires: WHO did WHAT, WHEN, from WHERE, with WHAT authority. We have all the ingredients:
- CRDTs: WHAT and WHEN
- conn_auth: WHERE and with WHAT signing key
- IdP: WHO (human identity linked to signing key)
Open: queryable log vs raw Merkle walk. Probably both -- raw tree for integrity, indexed view for search.
Auth Method Linking
Multiple external auth methods map to one internal org identity:
- "I sign in with Outlook and Gmail, both go to my one account"
- Org policy dictates 2FA requirements per auth method
- Can revoke whole auth chains (disallow Yahoo if Yahoo breached)
- Provider revocation = CRDT config change, gossip-propagated
The IdP section in 03-networking-security already covers federation. What's missing: the internal identity model (how multiple methods link), the revocation granularity (per-method, per-provider, per-user), and how this interacts with the audit trail (which auth method was used for which operation).
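One possible shape for the missing identity model, sketched under assumptions: the class names `AuthMethod`, `OrgIdentity`, and `RevocationState`, the `resolve` function, and the example addresses are all hypothetical. The sketch only shows how several methods can link to one identity and how the three revocation granularities (per-method, per-provider, per-user) could be checked. CRDT merging and gossip propagation are not modeled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AuthMethod:
    provider: str  # e.g. "outlook", "gmail"
    subject: str   # account identifier on the provider side


@dataclass
class OrgIdentity:
    user_id: str
    methods: set = field(default_factory=set)


@dataclass
class RevocationState:
    providers: set = field(default_factory=set)  # whole auth chains
    methods: set = field(default_factory=set)    # single linked methods
    users: set = field(default_factory=set)      # whole org identities


def resolve(identities, revoked, method):
    """Return the org user_id for a method, or None if unlinked or revoked."""
    if method.provider in revoked.providers or method in revoked.methods:
        return None
    for ident in identities:
        if method in ident.methods:
            if ident.user_id in revoked.users:
                return None
            return ident.user_id
    return None


outlook = AuthMethod("outlook", "alice@example.com")
gmail = AuthMethod("gmail", "alice@example.org")
yahoo = AuthMethod("yahoo", "alice@example.net")
alice = OrgIdentity("alice", {outlook, gmail, yahoo})

revoked = RevocationState()
revoked.providers.add("yahoo")  # disallow Yahoo for everyone

print(resolve([alice], revoked, gmail))  # alice
print(resolve([alice], revoked, yahoo))  # None
```

Returning the matched `AuthMethod` alongside the user id would give the audit trail the "which auth method was used" link for each operation.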
Multisig Quorum Scaling
How does K-of-N multisig adapt as nodes join/leave?
- 1 node: K=1 (no protection, but single-node org has no adversary)
- 2 nodes: K=2 (both must agree)
- 5 nodes: K=3 (majority)
- Node leaves: K recalculates based on current membership
Percentage-based (majority of current members) fits the self-organizing philosophy. The security guarantee scales with the cluster.
Open: what about transient membership changes (node rebooting during an upgrade)? Probably: nodes in MEMBER_DOWN state don't count toward N for the recalculation, so the required K stays reachable by the nodes that are still up during rolling upgrades.
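The majority rule and the proposed MEMBER_DOWN handling can be sketched as follows. This assumes a strict majority, K = floor(N/2) + 1, which matches the 1, 2, and 5 node examples above. The function name `quorum`, the state string "MEMBER_UP", and the dict-of-states input are hypothetical.

```python
def quorum(member_states):
    """Return K for K-of-N, where N counts only members not in MEMBER_DOWN."""
    n = sum(1 for state in member_states.values() if state != "MEMBER_DOWN")
    if n == 0:
        raise ValueError("no live members to form a quorum")
    return n // 2 + 1  # strict majority of counted members


def cluster(up, down=0):
    # Helper for the examples: build a membership map with made-up node names.
    states = {f"node{i}": "MEMBER_UP" for i in range(up)}
    states.update({f"down{i}": "MEMBER_DOWN" for i in range(down)})
    return states


print(quorum(cluster(1)))          # 1
print(quorum(cluster(2)))          # 2
print(quorum(cluster(5)))          # 3
print(quorum(cluster(4, down=1)))  # 3 (N=4 while one node reboots)
print(quorum(cluster(3, down=2)))  # 2 (N=3 with two nodes down)
```

Note that under this rule K drops as more nodes go down, so whether a floor on K is needed is part of the same open question.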
Admin Tool Implementation
fortros-admin is designed (guide + prescriptive pages written) but not implemented. Key pieces:
- Rust binary with axum HTTP server
- htmx frontend (vendored, no npm)
- Setup wizard (org create, domain config, DNS verify)
- Invite manager (bootstrapper image creation/revocation)
- Provisioner endpoint (TLS on 7443)
- Cloudflare Tunnel integration for WAN provisioning
- Migration to org-hosted admin UI
Disk Partitioning Implementation
GPT GUIDs and partitioning strategy are designed (prescriptive 02 + 06) but not implemented. Key pieces:
- disk-probe: scan all disks for FortrOS GUIDs, build map
- Preboot auto-partitioning: ESP + hibernate + persist on fastest disk
- Main OS pool creation: dm-thin for shards + scratch
- Disk trust: UEFI var storage of provisioned disk serials
- Org-controlled layout via CRDT desired state
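As a rough illustration of the disk-probe step, the sketch below builds a role-to-device map from partition records that have already been read from each disk's GPT. Everything specific here is an assumption: the type GUIDs are placeholders (the real FortrOS GUIDs are defined in the prescriptive pages), the role names are taken from the preboot bullet above, and `build_disk_map` is a hypothetical name. Reading the GPT itself is out of scope.

```python
from collections import defaultdict

# Placeholder GUIDs for illustration only, NOT the real FortrOS values.
HIBERNATE_GUID = "00000000-0000-0000-0000-000000000001"
PERSIST_GUID = "00000000-0000-0000-0000-000000000002"

ROLE_BY_TYPE_GUID = {
    HIBERNATE_GUID: "hibernate",
    PERSIST_GUID: "persist",
}


def build_disk_map(partitions):
    """Map each known role to the devices carrying its partition type GUID.

    partitions: iterable of (device_path, type_guid) pairs.
    """
    disk_map = defaultdict(list)
    for device, type_guid in partitions:
        # GUIDs are compared case-insensitively.
        role = ROLE_BY_TYPE_GUID.get(type_guid.lower())
        if role is not None:
            disk_map[role].append(device)
    return dict(disk_map)


scanned = [
    ("/dev/nvme0n1p2", HIBERNATE_GUID),
    ("/dev/nvme0n1p3", PERSIST_GUID),
    ("/dev/sda1", "ffffffff-ffff-ffff-ffff-ffffffffffff"),  # unrelated partition
    ("/dev/sdb1", PERSIST_GUID.upper()),
]
disk_map = build_disk_map(scanned)
print(disk_map)
```

Partitions with unknown type GUIDs are ignored, so foreign disks never appear in the map.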
WAN Provisioning (Cloudflare Tunnel)
Designed in guide but not implemented:
- cloudflared on admin workstation
- DNS: $FORTROS_GATEWAY (org-specific) -> tunnel
- Provisioner serves through tunnel (raw TCP, FortrOS handles TLS)
- Bootstrapper conn_auth (Ed25519 invite key) through tunnel
- Random subdomain system for customer org DNS
Secure Boot Key Management
For most orgs: add org signing key to UEFI DB alongside Microsoft/OEM keys. Sign preboot UKI with org key. UEFI verifies chain. Microsoft keys stay because firmware drivers (NVMe, network, GOP) are signed with them.
For high-security orgs that want to drop default keys entirely:
- Need to sign or bundle all required UEFI drivers ourselves
- NVMe driver (essential -- can't read boot disk without it)
- Network driver (needed for PXE, though could skip if USB-only provisioning)
- GOP/video driver (needed for display output during boot)
- Option: extract drivers from existing firmware, re-sign with org key
- Option: use open-source UEFI drivers where available
- This is a deep rabbit hole -- only pursue for government/air-gapped orgs that require full key sovereignty
Open: do we build a driver bundle as part of the org image, or is this a per-hardware-model effort? Probably per-model with a community-maintained driver database.
KHO Usage in Preboot
KHO (Kexec HandOver) is enabled in the kernel configs but not yet used by the preboot agent. Future work:
- Preserve LUKS key in KHO memory region (replace appended initramfs)
- Preserve zram state for instant hibernate
- Preserve swap device keys across kexec
- Eventually: LIVEUPDATE for device handover (needs per-driver support)