04 Disk Encryption

The Problem

The preboot has authenticated to the org and received key material (03 Trust and Identity). Now it needs to access the local disk -- but the disk is encrypted. The node's persistent data (identity keys, boot state, cached org state) lives on an encrypted partition that won't open without the right key.

This sounds like standard disk encryption (BitLocker, FileVault), but FortrOS uses encryption for a purpose beyond confidentiality:

Encryption is an authorization gate. The org controls who can boot. Revoke a machine's key material, and it can no longer unlock its own disk. A stolen device is a paperweight -- not because the thief can't read your files (though they can't), but because the device can't start.

What Is Disk Encryption?

Disk encryption passes every read and write through a cryptographic layer. Without the key, the disk contents are indistinguishable from random noise.

The dominant standard on Linux is LUKS (Linux Unified Key Setup), which uses dm-crypt (device-mapper crypt) in the kernel to handle the actual encryption. LUKS adds a standard header format on top of dm-crypt that manages multiple keyslots, allows key changes without re-encrypting, and stores all metadata in one place.

A LUKS partition has:

A header containing metadata, salt, and encrypted copies of the master key
Up to 32 keyslots (LUKS2), each holding a differently-encrypted copy of the same master key
The encrypted data using the master key (AES-256-XTS by default)

The critical insight: keyslots are independent. You can unlock the same partition with a password (keyslot 0), a key file (keyslot 1), or a YubiKey (keyslot 2). Adding or removing a keyslot doesn't touch the encrypted data -- only the keyslot's encrypted copy of the master key.
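
A toy model can make keyslot independence concrete. The sketch below is not the LUKS on-disk format (real LUKS uses a memory-hard KDF and a proper cipher for the wrapped key); it only illustrates the principle that each slot stores the same master key wrapped under a different credential. The class and method names are illustrative, and the XOR wrapping is a deliberate simplification.

```python
import hashlib
import os


def _kek(credential: bytes, salt: bytes) -> bytes:
    # Stretch a credential into a 32-byte key-encryption key.
    return hashlib.pbkdf2_hmac("sha256", credential, salt, 10_000, dklen=32)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class ToyHeader:
    """Toy stand-in for a LUKS header: a master-key digest plus wrapped copies."""

    def __init__(self, first_credential: bytes):
        master_key = os.urandom(32)
        self.digest = hashlib.sha256(master_key).digest()
        self.slots = {}
        self._wrap(0, first_credential, master_key)

    def _wrap(self, slot: int, credential: bytes, master_key: bytes) -> None:
        salt = os.urandom(16)
        self.slots[slot] = (salt, _xor(master_key, _kek(credential, salt)))

    def unlock(self, credential: bytes) -> bytes:
        # Try every slot; the digest tells us whether the unwrap was correct.
        for salt, wrapped in self.slots.values():
            candidate = _xor(wrapped, _kek(credential, salt))
            if hashlib.sha256(candidate).digest() == self.digest:
                return candidate
        raise PermissionError("no keyslot matches this credential")

    def add_slot(self, slot: int, existing: bytes, new: bytes) -> None:
        # Needs a working credential, but never touches the encrypted data.
        self._wrap(slot, new, self.unlock(existing))

    def remove_slot(self, slot: int) -> None:
        del self.slots[slot]
```

Adding slot 1 and then removing slot 0 leaves the master key unchanged, which is why no data has to be re-encrypted.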

How Others Do It

BitLocker (Windows): TPM + PIN

BitLocker seals the disk encryption key to the TPM's PCR values. On boot, if the firmware and bootloader match the expected measurements, the TPM releases the key and Windows boots automatically. Optionally, a PIN adds a second factor.

Strength: Seamless user experience (no password prompt on normal boot). Weakness: No remote revocation -- a stolen device with the same boot chain can still unlock.

FileVault (macOS): User Password + Secure Enclave

FileVault encrypts the startup volume with a key derived from the user's login password, stored in the Secure Enclave (Apple's equivalent of a TPM). The key is released when the user types their password at the login screen.

Strength: A simple model that users readily understand. Weakness: Requires a human to type a password on every boot. No remote management of encryption state.

Talos Linux: TPM-Sealed Key

Talos seals the disk encryption key to TPM PCR values (measured boot). The key is only released if the boot chain matches. Talos uses two sets of measurements: one for Secure Boot state (PCR 7) and one for the UKI contents (PCR 11, measured by the boot stub).

Strength: Fully automatic, no passwords, tied to verified boot chain. Weakness: Firmware updates change PCR values, requiring key re-sealing. Recovery requires the recovery key generated at install time.

Android: File-Based Encryption

Modern Android uses file-based encryption (FBE) where different directories have different keys, derived from the user's PIN/pattern plus hardware-bound keys. This allows "Direct Boot" -- certain apps (alarm clock, phone dialer) work before the user unlocks the device.

Strength: Granular, per-file key management. Weakness: Complex key hierarchy and a large attack surface.

The Tradeoffs

Approach	Remote revocation	Unattended boot	Firmware update handling	Stolen device protection
TPM-sealed (BitLocker, Talos)	No	Yes	Must re-seal after update	Boot chain must match
Password-based (FileVault)	No	No (human needed)	N/A	Password strength
Org-derived key (FortrOS)	Yes	Yes	Key is firmware-independent	Org revokes = bricked
No encryption	N/A	Yes	N/A	None

FortrOS wants remote revocation, unattended boot, and resilience to firmware updates. The org-derived key approach achieves all three.

How FortrOS Does It

/persist: The Encrypted Partition

Each FortrOS node has a LUKS-encrypted partition called /persist. It stores:

WireGuard keypair (main OS identity)
Ed25519 signing key (for gossip and conn_auth)
Boot state (current generation markers, rollback flags)
Cached org state snapshot (for offline operation)
Cached kernel generations

/persist is thin -- only identity, boot state, and cache. Everything else is derived at boot or fetched from the org.

Key Derivation

The LUKS key is derived using HKDF:

luks_key = HKDF-SHA256(
    input   = preboot_secret,    # 32 bytes, stored in TPM NV
    salt    = ca_pubkey,         # org CA public key (ties to this org)
    info    = generation_id      # current generation (ties to this version)
)

Each ingredient locks out a different attack:

1. preboot_secret (machine-specific, TPM-stored): Unique per machine (random 32 bytes from enrollment). An attacker who boots their own compromised OS on the hardware can't derive the key -- the preboot_secret is in the TPM, which only releases it to the genuine preboot. You need the right software on the right hardware.

2. ca_pubkey (org-specific, baked into preboot UKI): The org's CA public key. A /persist partition from one org can't be unlocked by a machine enrolled in a different org. But more importantly: the preboot has this key baked into its initramfs and uses it to verify the TLS connection to the generation authority. An attacker-in-the-middle can't replay a revoked generation_secret because the preboot won't trust a TLS connection that doesn't chain to the org's CA. You need the right org.

3. generation_id (version-specific, from generation authority): Ties the key to a specific generation. The generation authority holds a generation_secret used to derive generation_id. An attacker who intercepts a generation_secret in transit can't use it -- they'd also need the machine's preboot_secret (TPM) AND would need to present it through a TLS connection the preboot trusts (org CA). Deleting the generation_secret from the generation authority is irreversible -- no machine can derive the key for that generation ever again. You need the right version.

The three ingredients form an interlocking chain: compromise any one, and the other two block you. Compromise the OS? TPM won't release the secret. Intercept the generation_secret? Can't fake the TLS connection. Steal the hardware? The org revokes the generation and the TPM's secret derives nothing useful.
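
The derivation above can be sketched with the Python standard library alone. This is a minimal HKDF-SHA256 following RFC 5869, not FortrOS's actual code. The 32-byte output length and the byte encodings of ca_pubkey and generation_id are assumptions, since the source does not state them.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    if length > 255 * 32:
        raise ValueError("HKDF-SHA256 cannot produce more than 8160 bytes")
    # Extract: an empty salt is replaced by a block of zeros, per the RFC.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_luks_key(preboot_secret: bytes, ca_pubkey: bytes, generation_id: bytes) -> bytes:
    # Assumption: a 32-byte key; the source does not give the output length.
    return hkdf_sha256(ikm=preboot_secret, salt=ca_pubkey, info=generation_id, length=32)
```

Changing any one of the three inputs yields an unrelated key, which is what lets each ingredient act as an independent gate.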

One Partition, Multiple Keyslots

/persist is a single LUKS partition, not one partition per generation. The LUKS master key (which actually encrypts the data) never changes. What changes is which keyslots can unlock that master key.

FortrOS uses LUKS2 keyslots for different unlock paths:

Keyslot	Purpose	When Used
0	Current generation's org-derived key	Normal boot
1	Previous generation's key (during upgrade)	Rollback
2-N	Admin YubiKey keyslots	Disaster recovery (org unreachable)

With 32 keyslots available (LUKS2), there's room for the current and previous generation keys plus up to 30 admin YubiKeys. Whether to use a small set of global admin keys or regional key lists is an org policy decision, not a technical limitation.

Admin YubiKeys are enrolled via the org and propagated through gossip. Each keyslot holds the same LUKS master key encrypted with a different credential. Adding a YubiKey keyslot (cryptsetup luksAddKey) doesn't change the master key or re-encrypt data -- it just adds another way to unlock it.

What gets propagated: Only the challenge (or credential ID) needed to set up the keyslot -- what to ask the YubiKey, not what it answers. The actual LUKS key for the keyslot is derived from the YubiKey's HMAC response, which requires physical possession of the YubiKey + PIN. The LUKS keyslot itself is the verification: if the YubiKey's response derives a key that opens the keyslot, it's the right YubiKey. No separate verification material travels the network. An attacker who intercepts the gossip propagation gets a challenge -- useless without the physical YubiKey.
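
As a rough illustration of why the gossiped challenge is harmless on its own, the sketch below simulates a token that answers HMAC-SHA1 challenges. SimulatedYubiKey, gossip_record, and the SHA-256 step that turns the response into a keyslot passphrase are all assumptions made for this example; PIN handling is omitted, and the source does not describe the exact derivation.

```python
import hashlib
import hmac
import os


class SimulatedYubiKey:
    """Stand-in for the hardware token; the secret never leaves this object."""

    def __init__(self) -> None:
        self._secret = os.urandom(20)

    def challenge_response(self, challenge: bytes) -> bytes:
        return hmac.new(self._secret, challenge, hashlib.sha1).digest()


def gossip_record(keyslot: int, challenge: bytes) -> dict:
    # Only the question travels the network, never the answer.
    return {"keyslot": keyslot, "challenge": challenge.hex()}


def keyslot_passphrase(token: SimulatedYubiKey, challenge: bytes) -> bytes:
    # Assumption: the passphrase is a hash of the token's response.
    response = token.challenge_response(challenge)
    return hashlib.sha256(response).digest()
```

A node holding only the gossip record cannot compute the passphrase. Whether a given token is the right one is decided by whether its passphrase opens the keyslot.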

The Unlock Sequence

The preboot uses a "send what you have" pattern to negotiate which generation to boot:

1. Preboot sends H(preboot_secret) + list of cached generation IDs to the generation authority (proves identity, offers what it has)
2. Gen-auth authenticates the preboot, selects the best available generation from the offered list, and responds with the generation_secret + generation_id for that generation
3. Preboot derives the LUKS key from preboot_secret + ca_pubkey + generation_id
4. Preboot runs cryptsetup luksOpen with the derived key

The gen-auth is the decision maker. It knows which generations are current, which are revoked, and which the org wants this specific node to run. The preboot just offers what it has cached and follows instructions.

If the gen-auth rejects all offered generations (all revoked, or the org wants a clean slate), it provides material for a new generation only. The preboot can't unlock /persist with any cached keyslot, so it reformats from scratch. This is remote wipe without revocation -- the preboot identity (TPM) survives, but /persist gets a fresh start.

The timing of a generation change becomes a control lever:

Generation change while main OS is running (normal rolling upgrade): The node adds a new LUKS keyslot, reboots, boots the new generation with /persist intact. WireGuard identity, cached state, everything preserved.
Generation change while in preboot (nuke + fresh start): Gen-auth refuses old generations, provides only new material. /persist is reformatted. New WireGuard identity, fresh main OS. The preboot identity survives -- the node re-enrolls automatically.

The gen-auth controls whether a node gets an upgrade (preserve state) or a fresh start (nuke state) by choosing which generation materials to provide. This is useful for: suspected OS-level compromise (wipe without revoking hardware identity), fleet refresh (force clean state across nodes), or decommissioning a node's data while keeping its org membership.
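
The gen-auth's side of the negotiation described in this section might look like the following sketch. Authentication and the transfer of generation_secret are left out. The names Reply and choose_generation, and the idea of an ordered "acceptable" list, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Reply:
    generation_id: str
    preserve_persist: bool  # False: the preboot must reformat /persist


def choose_generation(offered: Sequence[str],
                      acceptable: Sequence[str],
                      newest: str) -> Reply:
    """Pick a generation for one node.

    offered:    generation IDs the preboot says it has cached
    acceptable: non-revoked generations this node may run, most preferred first
    newest:     generation to hand out when nothing offered is acceptable
    """
    offered_set = set(offered)
    for gen in acceptable:
        if gen in offered_set:
            return Reply(gen, preserve_persist=True)   # upgrade or normal boot
    return Reply(newest, preserve_persist=False)       # fresh start


# A node offering a revoked generation only gets fresh material:
print(choose_generation(["g40"], ["g42", "g41"], "g42"))
```

Emptying the acceptable list for a node is then enough to force the "nuke + fresh start" path on its next boot.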

Rolling Upgrades and Keyslots

When upgrading to a new generation, the LUKS key changes (because generation_id changes in the derivation). The upgrade process:

1. prepare-upgrade: Add a new keyslot for the new generation's derived key (the old keyslot stays active)
2. Reboot: The node boots the new generation, unlocks /persist with the new keyslot
3. cleanup-upgrade: Remove the old keyslot

During the upgrade window, both keyslots exist -- the node can boot either generation. After cleanup, only the new generation's keyslot remains. This is how FortrOS does rolling upgrades without downtime: nodes upgrade one at a time, each maintaining backward compatibility during the transition.
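
One way to express the two keyslot operations is as cryptsetup invocations. The sketch below only builds the argument lists, so it can be inspected without touching a disk. The device and key file paths are hypothetical, and this is not necessarily how FortrOS drives cryptsetup.

```python
import subprocess
from typing import List


def prepare_upgrade_cmd(device: str, current_keyfile: str, new_keyfile: str) -> List[str]:
    # Unlock with the current generation's key and add a slot for the new one.
    return ["cryptsetup", "luksAddKey", "--key-file", current_keyfile,
            device, new_keyfile]


def cleanup_upgrade_cmd(device: str, old_keyfile: str) -> List[str]:
    # Remove whichever keyslot the old generation's key opens.
    return ["cryptsetup", "luksRemoveKey", device, old_keyfile]


def apply(cmd: List[str]) -> None:
    # Raises CalledProcessError if cryptsetup reports a failure.
    subprocess.run(cmd, check=True)


# Hypothetical paths, shown for illustration only:
print(prepare_upgrade_cmd("/dev/nvme0n1p3", "/run/cur.key", "/run/new.key"))
```

Running cleanup only after the node has booted the new generation successfully keeps the rollback path open for the whole upgrade window.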

First Boot vs Subsequent Boot

First boot: preboot_secret is new (just received during enrollment). LUKS partition doesn't exist yet. The preboot runs cryptsetup luksFormat to create the LUKS header and format /persist. This is a one-time operation.

Subsequent boot: preboot_secret is in TPM NV. The preboot derives the LUKS key and runs cryptsetup luksOpen. If it fails (generation revoked, /persist corrupted), the preboot reformats -- identity and cached state are lost, but the machine can re-enroll and rejoin the org.
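
The two cases reduce to a small decision function. In this sketch the actual cryptsetup calls are injected as callables so the logic can be read and tested on its own; bring_up_persist and its parameter names are illustrative, not taken from the FortrOS code.

```python
from typing import Callable


def bring_up_persist(has_luks_header: bool,
                     try_open: Callable[[], bool],
                     format_fresh: Callable[[], None]) -> str:
    """Return what happened to /persist: formatted, opened, or reformatted."""
    if not has_luks_header:
        format_fresh()          # first boot: create the LUKS header
        return "formatted"
    if try_open():              # subsequent boot: derived key unlocks a keyslot
        return "opened"
    format_fresh()              # revoked generation or corrupted partition
    return "reformatted"
```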

The Key Across kexec

After the preboot unlocks /persist, it needs to pass the LUKS key to the main OS (which will need it for the persist-mount s6 service). The preboot builds a minimal compressed cpio archive containing the key file, appends it to the generation's initramfs, and passes the combined image to kexec. The main OS reads the key from the appended archive, opens LUKS, zeroes the key from memory, and deletes the file. No trace remains after boot.

This is covered in detail in 05 Loading the Real OS.
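
To make the appended-archive idea concrete, here is a minimal sketch that writes a single key file in cpio "newc" format, compresses it, and concatenates it onto an existing initramfs image. The file name luks.key and the function names are assumptions; the kexec invocation itself is out of scope here.

```python
import gzip


def _newc_entry(name: str, data: bytes, mode: int) -> bytes:
    """Encode one entry in cpio 'newc' format (110-byte ASCII hex header)."""
    name_bytes = name.encode() + b"\x00"          # namesize counts the NUL
    # ino, mode, uid, gid, nlink, mtime, filesize,
    # devmajor, devminor, rdevmajor, rdevminor, namesize, check
    fields = (0, mode, 0, 0, 1, 0, len(data), 0, 0, 0, 0, len(name_bytes), 0)
    out = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    out += b"\x00" * (-len(out) % 4)              # pad header + name to 4 bytes
    out += data
    out += b"\x00" * (-len(out) % 4)              # pad file data to 4 bytes
    return out


def key_archive(key: bytes, path: str = "luks.key") -> bytes:
    """Gzip-compressed cpio holding one owner-read-only key file."""
    body = _newc_entry(path, key, 0o100400) + _newc_entry("TRAILER!!!", b"", 0)
    return gzip.compress(body)


def combined_initramfs(generation_initramfs: bytes, key: bytes) -> bytes:
    # The kernel unpacks concatenated archives in order, so the key file
    # lands in the same root filesystem as the generation's own contents.
    return generation_initramfs + key_archive(key)
```

Because the key travels as a separate archive, the generation's initramfs stays byte-identical to the cached copy.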

Disk Discovery

Before the preboot can unlock /persist, it needs to FIND it. On a laptop with one NVMe, this is trivial. On a server with six drives, or a VPS where /dev/vda and /dev/sda might swap between boots, it's not.

Why Device Paths Are Unreliable

Linux assigns device names based on discovery order. The first SATA disk is /dev/sda, but if you plug in a USB drive before boot, the USB might become /dev/sda and the SATA becomes /dev/sdb. NVMe devices (/dev/nvme0n1) are more stable but still not guaranteed across kernel versions or hardware changes.

Hardcoding /dev/vdb works in a VM with a fixed disk layout. It fails on real hardware.

GPT Partition Type GUIDs

FortrOS solves this with custom GPT partition type GUIDs. Each partition role gets its own GUID:

Role	GUID	Purpose
Hibernate	`FORTROS_HIBERNATE`	RAM image for resume (LUKS, sized >= RAM)
Persist	`FORTROS_PERSIST`	Node identity + state (~2GB, LUKS)
Pool	`FORTROS_POOL`	dm-thin: shards + scratch (bulk of disk)
Swap	`FORTROS_SWAP`	Encrypted swap device(s)

The ESP uses the standard EFI System Partition GUID (already defined by UEFI spec).

On every boot, the disk-probe service scans all block devices, reads GPT headers, and builds a map: "persist is on /dev/nvme0n1p3, pool is on /dev/sda1 and /dev/nvme0n1p4." The rest of the system uses the role names, never raw device paths.
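
A probe of this kind comes down to reading the GPT header at LBA 1 and comparing each entry's type GUID against a table of roles. The sketch below works on a raw disk image held in memory. The GUID values are placeholders invented for the example (the source names the roles but not their GUIDs), and 512-byte sectors are assumed.

```python
import struct
import uuid
from typing import List, Tuple

SECTOR = 512  # assumption: 512-byte logical sectors

# Hypothetical placeholder GUIDs; the real FortrOS values are not in the source.
ROLE_BY_TYPE_GUID = {
    uuid.UUID("00000000-0000-4000-8000-000000000001"): "persist",
    uuid.UUID("00000000-0000-4000-8000-000000000002"): "pool",
}


def roles_on_disk(image: bytes) -> List[Tuple[str, int]]:
    """Return (role, partition_number) pairs found in a raw disk image."""
    header = image[SECTOR:2 * SECTOR]             # GPT header lives at LBA 1
    if header[:8] != b"EFI PART":
        return []                                 # not a GPT disk
    (entries_lba,) = struct.unpack_from("<Q", header, 72)
    count, entry_size = struct.unpack_from("<II", header, 80)
    found = []
    for index in range(count):
        offset = entries_lba * SECTOR + index * entry_size
        raw = image[offset:offset + 16]           # partition type GUID
        if len(raw) < 16:
            break                                 # truncated image
        role = ROLE_BY_TYPE_GUID.get(uuid.UUID(bytes_le=raw))
        if role is not None:
            found.append((role, index + 1))       # partitions are 1-based
    return found
```

Running this over every block device and merging the results gives the role-to-device map, independent of whatever /dev names the kernel assigned on this boot.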

Who Creates the Partitions

Partitioning happens in two phases:

Preboot (first boot): Creates the fixed-size partitions needed before the org is involved. The preboot probes the hardware, picks the fastest disk, and creates:

ESP (512MB)
Hibernate (= detected RAM size)
/persist (2GB)

These sizes are deterministic from the hardware. No org connection needed.
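
As a small worked example of "deterministic from the hardware", the function below computes start offsets and sizes for the three fixed partitions from the detected RAM size. The partition order, the 1 MiB alignment, and the first MiB reserved for the GPT are assumptions of this sketch.

```python
from typing import List, Tuple

MIB = 1024 * 1024


def plan_fixed_partitions(ram_bytes: int, align: int = MIB) -> List[Tuple[str, int, int]]:
    """Return (role, start_offset, size) in bytes for ESP, hibernate, /persist."""
    wanted = [
        ("esp", 512 * MIB),
        ("hibernate", ram_bytes),      # must hold a full RAM image
        ("persist", 2048 * MIB),
    ]
    start = align                      # leave the first MiB for the GPT
    plan = []
    for role, size in wanted:
        size = -(-size // align) * align   # round up to the alignment
        plan.append((role, start, size))
        start += size
    return plan


for role, start, size in plan_fixed_partitions(8 * 1024 * MIB):
    print(role, start // MIB, size // MIB)
```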

Main OS (after enrollment): Creates dynamic partitions based on org policy. The org's desired disk layout (stored in CRDTs) specifies where swap and pool partitions go, how large they are, and what the dm-thin limits are. The maintainer reconciles the actual layout toward the desired state. This is a level-triggered operation -- the org says "this is what I want," the node makes it so.

Partitions can be added to new disks or resized on existing ones without repartitioning. The dm-thin pool handles scratch/shard allocation dynamically. Swap can be striped or mirrored across multiple disks.
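
Level-triggered reconciliation means the action list is recomputed from the full desired and actual state on every pass, so repeating a pass after success is a no-op. The sketch below models layouts as plain role-to-size mappings. That representation, the function name, and the choice to never shrink or delete are assumptions, not the maintainer's real data model.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str, int]  # (verb, role, target size)


def reconcile(desired: Dict[str, int], actual: Dict[str, int]) -> List[Action]:
    """Compute the steps that move the actual layout toward the desired one."""
    actions: List[Action] = []
    for role, size in sorted(desired.items()):
        if role not in actual:
            actions.append(("create", role, size))
        elif actual[role] < size:
            actions.append(("grow", role, size))
        # Assumption: shrinking or deleting is never done automatically.
    return actions


desired = {"pool": 100, "swap": 8}
print(reconcile(desired, {"pool": 50}))   # one grow, one create
print(reconcile(desired, desired))        # nothing left to do
```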

Stage Boundary

What This Stage Produces

After disk encryption is handled:

/persist is unlocked and mounted (or freshly formatted on first boot)
Identity keys are accessible (WireGuard key, Ed25519 signing key)
Boot state is accessible (generation markers, rollback flags)
The LUKS key is in memory (will be passed to main OS via kexec)

What Is Handed Off

The preboot now has:

An open /persist partition with all persistent state
The current generation marker (which kernel to boot)
Key material for the main OS (LUKS key for the kexec transition)

The next stage -- 05 Loading the Real OS -- selects the kernel generation and kexecs into the main OS.

What This Stage Does NOT Do

It does not select or load a kernel (that's 05 Loading the Real OS)
It does not configure networking (that's 07 Overlay Networking)
It does not join the org (that's 08 Cluster Formation)
It does not decide on its own WHICH generation to boot (the gen-auth chooses, and this stage only derives the key for that generation -- selecting and loading the matching kernel is the next stage's job)