Using a TPM or an HSM

Sometimes it is desirable to use a Trusted Platform Module (TPM) or a Hardware Security Module (HSM) to further secure the chain of trust at a site.

Without a TPM or HSM, the site is unsealed (i.e., the seal key to the Strongbox vault is made available to the site) using the following procedure.

Prerequisite: When the site is initially set up, the following procedure is followed to enable unsealing at a later time if the entire site is restarted.

A unique key, the seal key, is created by the site. This key is never stored or communicated outside the site in clear text.
A token is obtained from the Control Tower. This token gives limited API access to the Control Tower for storing and fetching an unseal secret.
The token is encrypted using a site specific key-pair (the Site Key), where the private part never leaves the site.
The now encrypted token is broken into a number of pieces equal to the number of controllers at the site. This is done using the Shamir secret sharing algorithm. A key property of the Shamir algorithm is that the whole can be recovered from the parts if, and only if, a majority of shares are present.
Each controller host at the site stores a unique Shamir share of the token.
The seal key of the site is encrypted with the public key of the Site Key used to encrypt the token above. The result is stored at the Control Tower for later retrieval by the site, if needed. Since it is encrypted using the public key, it can only be decrypted using the private key, which is only known to the site.
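
The last step relies on ordinary public-key encryption: anyone holding the public key can encrypt, but only the holder of the private key can decrypt. The following is a minimal sketch using the openssl command line. It assumes an RSA key pair with OAEP padding and uses made-up file names; the text above does not state which key type or tooling the Site Key actually uses.

```shell
#!/usr/bin/env bash
# Sketch only: RSA/OAEP and all file names are assumptions for illustration.
workdir="$(mktemp -d)"

# Stand-in for the Site Key; the private part (site-key.pem) stays on the site.
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 \
    -out "$workdir/site-key.pem" 2>/dev/null
openssl pkey -in "$workdir/site-key.pem" -pubout -out "$workdir/site-key.pub"

# A random 32-byte stand-in for the seal key.
openssl rand -out "$workdir/seal.key" 32

# Encrypt with the public key; this blob is what could be stored off-site.
openssl pkeyutl -encrypt -pubin -inkey "$workdir/site-key.pub" \
    -pkeyopt rsa_padding_mode:oaep \
    -in "$workdir/seal.key" -out "$workdir/seal.key.enc"

# Only the private key can turn the blob back into the seal key.
openssl pkeyutl -decrypt -inkey "$workdir/site-key.pem" \
    -pkeyopt rsa_padding_mode:oaep \
    -in "$workdir/seal.key.enc" -out "$workdir/seal.key.dec"

if cmp -s "$workdir/seal.key" "$workdir/seal.key.dec"; then
    roundtrip=ok
else
    roundtrip=failed
fi
echo "round trip: $roundtrip"
rm -rf "$workdir"
```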

When a site is restarted it is initially sealed, i.e., the keys to decrypt the secret store are unknown to the site. In order to unseal the site the following procedure is used.

Each controller host at the site has a piece of a token that can be used to request unseal from the Control Tower. The full token is initially split into a number of shares using the Shamir Secret Sharing algorithm and distributed to all controller hosts, as described above. The token can later be recovered from the shares provided that a majority of shares are present.

When a site is started each host attempts to gather shares from the other controller hosts at the site. When enough shares are present the token is recovered.
The recovered token is decrypted using the Site Key and a clear text version of the token is obtained.
The clear text token is used to request the previously stored encrypted seal key from the Control Tower.
The encrypted seal key is decrypted using the private key of the Site Key.

This is a fairly secure scheme if the site consists of multiple controller hosts, since a majority of hosts need to be present in order to recover the unseal token. If a single controller host is copied, or compromised, then access cannot be obtained through that host. However, if the site consists of a single host, then it would be desirable to have some additional security mechanism.
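
To make the single-host weakness concrete, the small sketch below computes how many shares are needed for a given number of controller hosts. It assumes that "majority" means strictly more than half; the function name is made up for this illustration.

```shell
#!/usr/bin/env bash
# Smallest number of shares that forms a strict majority of n hosts.
# Assumption: "majority" means more than half of the shares.
majority_threshold() {
    local n="$1"
    echo $(( n / 2 + 1 ))
}

for hosts in 1 3 5; do
    echo "$hosts controller host(s): $(majority_threshold "$hosts") share(s) needed"
done
```

With a single host the threshold is one, so the one share stored on that host is sufficient on its own.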

The problem is even worse if a site is allowed to perform automatic unseal without contacting the Control Tower. This can be allowed by setting the allow-local-unseal setting to true. It may be desirable if a site is expected to be off-line for long stretches of time.

When allow-local-unseal is true the procedure is a bit different from above. When the system is initiated it encrypts the seal key directly instead of encrypting the access token to the Control Tower. The encrypted seal key is then split using the Shamir Secret Sharing algorithm, and each share is stored on a controller host.

This works well when a site consists of multiple controller hosts, but works less well for single host sites since the seal key is then available directly on the host.

One way of improving the situation is to use a TPM or HSM to protect the share stored by each host. On multi-host sites there will be multiple shares, each protected by a TPM/HSM. On single host sites the unseal token (or seal key in the case of allow-local-unseal) will be protected by a TPM/HSM, vastly improving the situation.

Direct TPM 2.0 Support

The Edge Enforcer has built-in support for TPM 2.0 devices accessed directly via tpm2-tools. This is the simplest way to enable TPM protection and does not require any additional services.

When the /dev/tpmrm0 device is present on the host, the Edge Enforcer automatically passes it into the container and uses tpm2-tools to protect the Shamir share. A persistent AES-256-CFB key is created under the owner hierarchy on first use and survives reboots. By default the key is stored at persistent handle 0x81000002; if that handle is already occupied by a foreign object, the next free handle in the range 0x81000002–0x81000005 is used instead.
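
The handle fallback can be pictured with the sketch below. It is not the Edge Enforcer's implementation: the function name is hypothetical, and for simplicity it treats every occupied handle as foreign, whereas the real logic would reuse a handle that already holds its own key.

```shell
#!/usr/bin/env bash
# Reads occupied persistent handles on stdin (one per line, optionally
# prefixed with "- " as in a YAML list) and prints the first free handle
# in 0x81000002..0x81000005. Returns 1 if all four are taken.
# Simplification: every occupied handle is treated as foreign.
next_free_handle() {
    local occupied handle
    occupied="$(tr 'A-F' 'a-f')"   # normalise hex digits to lower case
    for handle in 0x81000002 0x81000003 0x81000004 0x81000005; do
        if ! grep -qw -- "$handle" <<<"$occupied"; then
            echo "$handle"
            return 0
        fi
    done
    return 1
}
```

On a host with tpm2-tools installed, the list of occupied handles can typically be obtained with `tpm2_getcap handles-persistent`, so `tpm2_getcap handles-persistent | next_free_handle` would show which handle such a scheme would pick. The exact output format may differ between tpm2-tools versions.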

The TPM firmware must support the TPM2_CC_EncryptDecrypt2 command. This is checked at initialisation; if the command is not available TPM protection is skipped and the share is stored in plaintext (see protect-with-tpm below for how to enforce TPM protection).
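
An operator who wants to check this ahead of time can look for the command in the TPM's capability listing. The helper below is a hypothetical sketch that assumes the listing names the command as TPM2_CC_EncryptDecrypt2, which is how recent tpm2-tools releases are understood to print it; verify against the version installed on the host.

```shell
#!/usr/bin/env bash
# Succeeds if the capability listing on stdin mentions EncryptDecrypt2.
# Assumption: the listing uses the name TPM2_CC_EncryptDecrypt2.
supports_encryptdecrypt2() {
    grep -q 'TPM2_CC_EncryptDecrypt2'
}
```

Typical use would be `tpm2_getcap commands | supports_encryptdecrypt2 && echo supported`.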

No configuration is required for the default behaviour. When a host is added and the TPM device is accessible, an INFO level log message of the form "Using tpm2-tools to protect host (handle 0x81000002)." should appear, identifying the handle that was selected.

If the host is later restored to hardware with a different TPM (or a host with no TPM), the system detects that the stored share cannot be decrypted and automatically reinitialises the strongbox state after the reset timeout. After reinitialisation the host needs to fetch its configuration from the Control Tower. Since a site bundle can only be unwrapped once (for security reasons), an operator must first allow the host to retry by running the following command on the Control Tower:

supctl do system sites <site-name> reallow-site-unwrap

The same recovery applies when protect-with-tpm is set to require and the TPM becomes permanently broken: the TPM must be repaired (and supd typically restarted so that the TPM device is re-mounted into the container) and the site must be allowed to unwrap again with the command above.

Controlling TPM Protection per Site

Whether the Shamir share should be protected with a TPM is controlled by the per-site protect-with-tpm setting:

Value	Behaviour
`best-effort`	Default. Use the TPM (via Parsec or `tpm2-tools`) when it is present and functional; otherwise store the share in plaintext on disk.
`require`	The TPM must be usable. If it is not, the node refuses to leave the sealed state on first boot, and raises a security alert if the TPM becomes unavailable on a running site.
`never`	Never use the TPM, even when one is present. The share is always stored in plaintext.

Example: enforce TPM protection for a site.

supctl merge system sites <site-name> <<EOF
protect-with-tpm: require
EOF

A change to protect-with-tpm is propagated to the site bundle and applied at each controller host. On a brand-new host that has never become operational, require combined with an unavailable TPM leaves the node waiting until the TPM can be used, repeatedly logging "protect-with-tpm=require but TPM protection is unavailable; deferring unseal share write." On an already-operational host, switching to require while the TPM is broken falls back to plaintext storage so the node stays consistent, and raises the alert described below.
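
The combinations described above can be summarised as a small decision function. This is only a restatement of the documented behaviour in code form; the function name, argument names, and result strings are invented for the illustration.

```shell
#!/usr/bin/env bash
# share_storage POLICY TPM_USABLE OPERATIONAL
#   POLICY       best-effort | require | never
#   TPM_USABLE   yes | no
#   OPERATIONAL  yes | no   (has the host ever become operational?)
# Prints one of: tpm | plaintext | plaintext+alert | wait
share_storage() {
    local policy="$1" tpm_usable="$2" operational="$3"
    case "$policy" in
        never)
            echo plaintext ;;
        best-effort)
            if [ "$tpm_usable" = yes ]; then echo tpm; else echo plaintext; fi ;;
        require)
            if [ "$tpm_usable" = yes ]; then
                echo tpm
            elif [ "$operational" = yes ]; then
                echo plaintext+alert   # running site keeps going, alert raised
            else
                echo wait              # new host defers writing the share
            fi ;;
        *)
            echo "unknown policy: $policy" >&2
            return 2 ;;
    esac
}
```

For example, `share_storage require no no` prints `wait`, matching a brand-new host that defers the share write until the TPM can be used.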

Per-host TPM protection status

Each cluster host exposes a read-only protected-with-tpm boolean indicating whether its on-disk unseal token share is currently protected by a TPM. This is derived from the actual on-disk share format, so it reflects the state after the most recent write and not just the configured policy.

supctl show --site <site-name> system cluster hosts <host-name>

Example output (abbreviated):

hostname: h06
host-id: 17fdf47c-5b55-432a-85ec-da8d76f094ad
protected-with-tpm: true
controller: true
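
For scripted checks, the leaf can be picked out of that output. The helper below is a hypothetical sketch that assumes the plain key/value output format shown in the example above.

```shell
#!/usr/bin/env bash
# Reads host details on stdin and succeeds only if the output contains
# "protected-with-tpm: true". Assumes the key/value format shown above.
host_protected_with_tpm() {
    grep -Eq '^[[:space:]]*protected-with-tpm:[[:space:]]*true[[:space:]]*$'
}
```

It could be used as `supctl show --site <site-name> system cluster hosts <host-name> | host_protected_with_tpm || echo "share is not TPM-protected"`.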

Alert: `tpm-protection-unavailable`

When protect-with-tpm is set to require but TPM-based protection cannot be used on a node — neither Parsec nor tpm2-tools is available — a tpm-protection-unavailable alert (severity major, kind security) is published on the system:alerts Volga topic. The alert carries the affected hostname and a short reason, and indicates that the share has been written unprotected as a fallback so the site can keep running.

Restore TPM availability (for example by installing missing TPM libraries or fixing the /dev/tpmrm0 device) to resume protected storage. The next write of the unseal share will again use the TPM and the per-host protected-with-tpm leaf will flip back to true.

Accessing TPM/HSM using Parsec

The Edge Enforcer can also access a local TPM or HSM through a system service called Parsec (https://github.com/parallaxsecond/parsec). Parsec then becomes the integration point between the Edge Enforcer and any local TPM or HSM, including HSMs that are not directly accessible via tpm2-tools.

When both a Parsec service and a direct TPM device are present, Parsec takes priority over the direct tpm2-tools path.

The system assumes that a Parsec service is properly configured and running. It further assumes that the service is configured to create the Parsec socket at /run/parsec/parsec.sock and that the service is configured for auth_type = UnixPeerCredentials.
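
These assumptions can be checked before a host is added. The helpers below are a hypothetical sketch: they only look for the socket path and authenticator type named above in a Parsec TOML configuration file, and test whether a Unix socket exists at the expected location.

```shell
#!/usr/bin/env bash
# check_parsec_config FILE: succeeds if FILE sets the socket path and the
# authenticator type that the Edge Enforcer is described as expecting.
check_parsec_config() {
    local file="$1"
    grep -Eq '^[[:space:]]*socket_path[[:space:]]*=[[:space:]]*"/run/parsec/parsec\.sock"' "$file" &&
    grep -Eq '^[[:space:]]*auth_type[[:space:]]*=[[:space:]]*"UnixPeerCredentials"' "$file"
}

# parsec_socket_present [PATH]: succeeds if a Unix socket exists at PATH
# (default: /run/parsec/parsec.sock).
parsec_socket_present() {
    [ -S "${1:-/run/parsec/parsec.sock}" ]
}
```

For example, `check_parsec_config /etc/parsec/config.toml && parsec_socket_present` succeeds only when the configuration matches and the service has created its socket.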

Parsec provides a common abstraction layer for both TPMs and HSMs and allows the local unseal token to be secured by a key stored in a TPM or an HSM.

It is possible to test the setup even without an actual TPM or HSM by letting the Parsec service use a software based emulator of a TPM.

Note that the Parsec service must be running before a host is added since the root of trust configuration is part of the very early setup of a host.

Parsec Service for Testing

Setting up a Parsec service for testing is fairly straightforward. However, observe that this must be done before adding a host to the system.

Download the Parsec quickstart package from GitHub and unpack it. Note that the quickstart package depends on specific glibc versions. If they do not match your system, you may have to build locally (see Parsec Service with TPM below).

wget https://github.com/parallaxsecond/parsec/releases/download/1.3.0/quickstart-1.3.0-linux-x86_64.tar.gz
tar xfz quickstart-1.3.0-linux-x86_64.tar.gz

The package contains a config file for testing. Modify the parsec socket location to the production location.

cd quickstart-1.3.0-linux-x86_64/quickstart
sed -i -e 's|"./parsec.sock"|"/run/parsec/parsec.sock"|' config.toml

The config.toml will look like this:

[core_settings]
log_level = "info"
allow_root = true

[listener]
listener_type = "DomainSocket"
timeout = 200 # in milliseconds
socket_path = "/run/parsec/parsec.sock"

[authenticator]
auth_type = "UnixPeerCredentials"

[[key_manager]]
name = "sqlite-manager"
manager_type = "SQLite"
sqlite_db_path = "./sqlite-key-info-manager.sqlite3"

[[provider]]
provider_type = "MbedCrypto"
key_info_manager = "sqlite-manager"

Some directories also need to be created, if not present.

sudo mkdir -p /etc/parsec
sudo mkdir -p /run/parsec
sudo chown docker /run/parsec
sudo chmod 755 /run/parsec
sudo cp config.toml /etc/parsec
sudo mv ../bin/parsec /usr/bin/parsec

The Parsec service may now be started manually using the following command:

sudo /usr/bin/parsec --config /etc/parsec/config.toml

It should result in a printout like this:

[INFO  parsec] Parsec started. Configuring the service...
[INFO  parsec_service::key_info_managers::sqlite_manager] SQLiteKeyInfoManager - Found 0 key info mapping records
[INFO  parsec_service::utils::service_builder] Creating a Mbed Crypto Provider.
[WARN  parsec_service::front::domain_socket] Removing the existing socket file at /run/parsec/parsec.sock.
[INFO  parsec] Parsec is ready.

As an alternative, if systemd is used, a service specification can be created like this (as root):

cat > /etc/systemd/system/parsec.service <<EOF
[Unit]
Description=Parsec Service
Documentation=https://parallaxsecond.github.io/parsec-book/parsec_service/install_parsec_linux.html

[Service]
WorkingDirectory=/home/docker
ExecStart=/usr/bin/parsec --config /etc/parsec/config.toml
User=docker
Group=docker

[Install]
WantedBy=multi-user.target
EOF

Then enable and start it like this:

systemctl --no-pager daemon-reload
systemctl --no-pager enable parsec.service
systemctl --no-pager start parsec.service

When a host that has a Parsec service running is added, there should be an INFO level log message saying "Using Parsec service to protect host." This indicates that the Shamir share assigned to the host has been encrypted using the local Parsec service.

Parsec Service with TPM

In a production environment a real TPM or HSM should be used. In this case the Parsec service needs to be compiled with the proper drivers for the hardware that should be used. See https://parallaxsecond.github.io/parsec-book/getting_started/linux_x86.html and https://parallaxsecond.github.io/parsec-book/parsec_service/install_parsec_linux.html

For example, if an onboard TPM is used Parsec can be prepared as follows:

Install the build tools:

sudo apt install llvm-dev libclang-dev clang cmake rustc

Clone the Parsec service repo:

git clone --branch 1.3.0 https://github.com/parallaxsecond/parsec.git

Build the source code with the TPM provider enabled:

cd parsec
cargo build --release --features "tpm-provider,direct-authenticator,unix-peer-credentials-authenticator"

Create a config file where the provider type is set to Tpm and tcti points to "device:/dev/tpmrm0".

cat > config.toml <<EOF
[core_settings]
log_level = "info"
allow_root = true

[listener]
listener_type = "DomainSocket"
timeout = 3000 # in milliseconds
socket_path = "/run/parsec/parsec.sock"

[authenticator]
auth_type = "UnixPeerCredentials"

[[key_manager]]
name = "sqlite-manager"
manager_type = "SQLite"
sqlite_db_path = "/etc/parsec/sqlite-key-info-manager.sqlite3"

[[provider]]
provider_type = "Tpm"
key_info_manager = "sqlite-manager"
tcti = "device:/dev/tpmrm0"
EOF

The service can now be installed in the same way as above: create the directories if they are not present, copy the configuration, and move the parsec executable to /usr/bin/parsec.

sudo mkdir -p /etc/parsec
sudo mkdir -p /run/parsec
sudo chown docker /run/parsec
sudo chmod 755 /run/parsec
sudo cp config.toml /etc/parsec
sudo mv target/release/parsec /usr/bin/parsec

The Parsec service may now be started manually using the following command:

sudo /usr/bin/parsec --config /etc/parsec/config.toml

It should result in a printout like this:

[INFO  parsec] Parsec started. Configuring the service...
[INFO  parsec_service::key_info_managers::on_disk_manager] Found 0 mapping files
[INFO  parsec_service::utils::service_builder] Creating a TPM Provider.
[INFO  parsec_service::providers::tpm] Checking for ciphers supported by the TPM.
[INFO  tss_esapi::context] Closing context.
[INFO  tss_esapi::context] Context closed.
[INFO  parsec] Parsec is ready.

As an alternative, if systemd is used, a service specification can be created like this (as root):

cat > /etc/systemd/system/parsec.service <<EOF
[Unit]
Description=Parsec Service
Documentation=https://parallaxsecond.github.io/parsec-book/parsec_service/install_parsec_linux.html

[Service]
WorkingDirectory=/home/docker
ExecStart=/usr/bin/parsec --config /etc/parsec/config.toml
User=docker
Group=docker

[Install]
WantedBy=multi-user.target
EOF

Then enable and start it like this:

systemctl --no-pager daemon-reload
systemctl --no-pager enable parsec.service
systemctl --no-pager start parsec.service
