Using a TPM or an HSM
Sometimes it is desirable to use a Trusted Platform Module (TPM) or a Hardware Security Module (HSM) to further secure the chain of trust at a site.
Without a TPM or HSM the site is unsealed, i.e. the seal key to the Strongbox vault is made available to the site, using the following procedure.
Prerequisite: When the site is initially set up, the following procedure is followed to enable unseal at a later time if the entire site is restarted.
- A unique key, the seal key, is created by the site. This key is never stored or communicated outside the site in clear text.
- A token is obtained from the Control Tower. This token gives limited API access to the Control Tower for storing and fetching an unseal secret.
- The token is encrypted using a site-specific key pair (the Site Key), where the private part never leaves the site.
- The now encrypted token is broken into a number of pieces equal to the number of controllers at the site. This is done using the Shamir secret sharing algorithm. A key property of the Shamir algorithm is that the whole can be recovered from the parts if, and only if, a majority of shares are present.
- Each controller host at the site stores a unique Shamir share of the token.
- The seal key of the site is encrypted with the public key of the Site Key used to encrypt the token above. The result is stored at the Control Tower for later retrieval by the site, if needed. Since it is encrypted using the public key, it can only be decrypted using the private key, which is only known to the site.
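To make the majority property concrete, the short sketch below computes how many shares are needed for a few site sizes. It assumes that "majority" means a strict majority, i.e. floor(n/2) + 1; the exact threshold used by the system is not stated here, and the function name majority is only illustrative.

```shell
#!/bin/sh
# Sketch only: smallest number of shares that forms a strict majority
# of n controller hosts. Assumption: majority = floor(n/2) + 1.
majority() {
    echo $(( $1 / 2 + 1 ))
}

for n in 1 3 5; do
    echo "controllers=$n shares_needed=$(majority "$n")"
done
```

Under this assumption a three-controller site needs two shares, while a single-controller site needs only its one share.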
When a site is restarted it is initially sealed, i.e. the keys to decrypt the secret store are unknown to the site. In order to unseal the site the following procedure is used.
- Each controller host at the site has a piece of a token that can be used to request unseal from the Control Tower. The full token is initially split into a number of shares using the Shamir Secret Sharing algorithm and distributed to all controller hosts, as described above. The token can later be recovered from the shares provided that a majority of shares are present.
  When a site is started each host attempts to gather shares from the other controller hosts at the site. When enough shares are present the token is recovered.
- The recovered token is decrypted using the Site Key and a clear text version of the token is obtained.
- The clear text token is used to request the previously stored encrypted seal key from the Control Tower.
- The encrypted seal key is decrypted using the private key of the Site Key.
This is a fairly secure scheme if the site consists of multiple controller hosts, since a majority of hosts need to be present in order to recover the unseal token. If a single controller host is copied, or compromised, then access cannot be obtained through that host. However, if the site consists of a single host, then it would be desirable to have some additional security mechanism.
The problem is even worse if a site is allowed to perform automatic
unseal without contacting the Control Tower. This can be allowed by
setting the allow-local-unseal setting to true. It may be
desirable if a site is expected to be off-line for long stretches of
time.
When allow-local-unseal is true the procedure is a bit different
from above. When the system is initiated it encrypts the seal key
directly instead of encrypting the access token to the Control
Tower. The encrypted seal key is then split using the Shamir Secret
Sharing algorithm, and each share is stored on a controller host.
This works well when a site consists of multiple controller hosts, but works less well for single host sites since the seal key is then available directly on the host.
One way of improving the situation is to use a TPM or HSM to protect
the share stored by each host. On multi-host sites there will
be multiple shares, each protected by a TPM/HSM. On single host sites
the unseal token (or seal key in the case of allow-local-unseal) will
be protected by a TPM/HSM, vastly improving the situation.
Direct TPM 2.0 Support
The Edge Enforcer has built-in support for TPM 2.0 devices accessed
directly via tpm2-tools. This is the simplest way to enable TPM
protection and does not require any additional services.
When the /dev/tpmrm0 device is present on the host, the Edge
Enforcer automatically passes it into the container and uses
tpm2-tools to protect the Shamir share. A persistent AES-256-CFB key
is created under the owner hierarchy on first use and survives
reboots. The key is stored in TPM non-volatile memory at persistent handle 0x81000002 by default;
if that handle is already occupied by a foreign object, the next free
handle in the range 0x81000002–0x81000005 is used instead.
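The handle fallback described above can be sketched as follows. This is an illustration, not the Edge Enforcer's actual code: the function name next_free_handle is hypothetical, the input is assumed to be a list of occupied persistent handles in the form printed by tpm2_getcap handles-persistent (one per line, optionally prefixed with "- "), and for simplicity every listed handle is treated as a foreign object.

```shell
#!/bin/sh
# Sketch only: print the first unoccupied handle in 0x81000002..0x81000005.
# Occupied handles are read from stdin, one per line, optionally prefixed
# with "- ". Simplification: every listed handle counts as foreign.
next_free_handle() {
    occupied=$(sed -e 's/^[- ]*//' -e 's/ *$//' | tr 'A-F' 'a-f')
    h=$((0x81000002))
    last=$((0x81000005))
    while [ "$h" -le "$last" ]; do
        candidate=$(printf '0x%08x' "$h")
        if ! printf '%s\n' "$occupied" | grep -qx "$candidate"; then
            echo "$candidate"
            return 0
        fi
        h=$((h + 1))
    done
    return 1  # no free handle left in the range
}

# Example with a hypothetical listing where 0x81000002 is already taken.
printf '%s\n' '- 0x81000001' '- 0x81000002' | next_free_handle
```

On a host with tpm2-tools installed, the listing could instead be piped in with tpm2_getcap handles-persistent | next_free_handle.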
The TPM firmware must support the TPM2_CC_EncryptDecrypt2 command.
This is checked at initialisation; if the command is not available
TPM protection is skipped and the share is stored in plaintext (see
protect-with-tpm below for how to enforce TPM protection).
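An operator who wants to check the firmware requirement before adding a host could use something like the sketch below. It assumes that tpm2_getcap commands lists each supported command by name (for example a line such as TPM2_CC_EncryptDecrypt2:); the function name supports_encryptdecrypt2 and the sample listing are hypothetical, and this is not necessarily how the Edge Enforcer performs its own check.

```shell
#!/bin/sh
# Sketch only: succeed if a command listing on stdin mentions EncryptDecrypt2.
# Assumes the command is named as in `tpm2_getcap commands` output.
supports_encryptdecrypt2() {
    grep -qi 'EncryptDecrypt2'
}

# On a host with tpm2-tools installed (not run here):
#   tpm2_getcap commands | supports_encryptdecrypt2 && echo "supported"

# Example with a hypothetical, abbreviated listing.
if printf '%s\n' 'TPM2_CC_EncryptDecrypt:' 'TPM2_CC_EncryptDecrypt2:' | supports_encryptdecrypt2; then
    echo "EncryptDecrypt2 supported"
else
    echo "EncryptDecrypt2 missing"
fi
```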
No configuration is required for the default behaviour. When a host
is added and the TPM device is accessible there should be an INFO
level log message of the form "Using tpm2-tools to protect host (handle 0x81000002)." which identifies the handle that was selected.
If the host is later restored to hardware with a different TPM (or a host with no TPM), the system detects that the stored share cannot be decrypted and automatically reinitializes the Strongbox state after the reset timeout. After reinitialisation the host needs to fetch its configuration from the Control Tower. Since a site bundle can only be unwrapped once (for security reasons), an operator must first allow the host to retry by running the following command on the Control Tower:
supctl do system sites <site-name> reallow-site-unwrap
The same recovery applies when protect-with-tpm is set to require
and the TPM becomes permanently broken: the TPM must be repaired (and
supd typically restarted so that the TPM device is re-mounted into
the container) and the site must be allowed to unwrap again with the
command above.
Controlling TPM Protection per Site
Whether the Shamir share should be protected with a TPM is controlled
by the per-site protect-with-tpm setting:
| Value | Behaviour |
|---|---|
| best-effort | Default. Use the TPM (via Parsec or tpm2-tools) when it is present and functional; otherwise store the share in plaintext on disk. |
| require | The TPM must be usable. If it is not, the node refuses to leave the sealed state on first boot, and raises a security alert if the TPM becomes unavailable on a running site. |
| never | Never use the TPM, even when one is present. The share is always stored in plaintext. |
Example: enforce TPM protection for a site.
supctl merge system sites <site-name> <<EOF
protect-with-tpm: require
EOF
A change to protect-with-tpm is propagated to the site bundle and
applied at each controller host. On a brand-new host that has never
become operational, require combined with an unavailable TPM leaves
the node waiting (repeatedly logging protect-with-tpm=require but TPM protection is unavailable; deferring unseal share write.) until the
TPM can be used. On an already-operational host, flipping to require
while the TPM is broken falls back to plaintext storage so the node
stays consistent, and raises the alert described below.
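The combinations described above can be summarised as a small decision function. This is only a reading aid built from the table and the paragraph above: the function name share_storage, its yes/no arguments, and the result labels (tpm, plaintext, plaintext-with-alert, deferred) are all hypothetical and do not correspond to actual system output.

```shell
#!/bin/sh
# Sketch only: how the unseal share ends up being stored.
# Usage: share_storage POLICY TPM_USABLE OPERATIONAL
#   POLICY:      best-effort | require | never
#   TPM_USABLE:  yes | no
#   OPERATIONAL: yes | no  (has the host already become operational?)
share_storage() {
    policy=$1
    tpm_usable=$2
    operational=$3
    case "$policy" in
        never)
            echo plaintext ;;
        best-effort)
            if [ "$tpm_usable" = yes ]; then echo tpm; else echo plaintext; fi ;;
        require)
            if [ "$tpm_usable" = yes ]; then
                echo tpm
            elif [ "$operational" = yes ]; then
                echo plaintext-with-alert   # fallback, alert is raised
            else
                echo deferred               # brand-new host keeps waiting
            fi ;;
        *)
            echo "unknown policy: $policy" >&2
            return 2 ;;
    esac
}

share_storage require no no
share_storage require no yes
```

The two calls at the end cover the cases where require meets an unusable TPM: a brand-new host defers the write, while an already-operational host falls back to plaintext and raises the alert.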
Per-host TPM protection status
Each cluster host exposes a read-only protected-with-tpm boolean
indicating whether its on-disk unseal token share is currently
protected by a TPM. This is derived from the actual on-disk share
format, so it reflects the state after the most recent write and not
just the configured policy.
supctl show --site <site-name> system cluster hosts <host-name>
Example output (abbreviated):
hostname: h06
host-id: 17fdf47c-5b55-432a-85ec-da8d76f094ad
protected-with-tpm: true
controller: true
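For scripted checks, the leaf can be extracted from that output with standard tools. The sketch below assumes the output keeps the key: value layout shown in the example; the function name protected_with_tpm is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the value of the protected-with-tpm leaf from
# `supctl show` output on stdin (assumes "key: value" lines as shown above).
protected_with_tpm() {
    sed -n 's/^ *protected-with-tpm: *//p'
}

# Example using the abbreviated output from the text.
printf '%s\n' \
    'hostname: h06' \
    'host-id: 17fdf47c-5b55-432a-85ec-da8d76f094ad' \
    'protected-with-tpm: true' \
    'controller: true' | protected_with_tpm
```

Against a live system the input would come from supctl show --site <site-name> system cluster hosts <host-name> piped into protected_with_tpm.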
Alert: tpm-protection-unavailable
When protect-with-tpm is set to require but TPM-based protection
cannot be used on a node — neither Parsec nor tpm2-tools is available
— a tpm-protection-unavailable alert (severity major, kind
security) is published on the system:alerts Volga topic. The alert
carries the affected hostname and a short reason, and indicates
that the share has been written unprotected as a fallback so the site
can keep running.
Restore TPM availability (for example by installing missing TPM
libraries or fixing the /dev/tpmrm0 device) to resume protected
storage. The next write of the unseal share will again use the TPM and
the per-host protected-with-tpm leaf will flip back to true.
Accessing TPM/HSM using Parsec
To access a local TPM or HSM, the Edge Enforcer can
also rely on a system service called Parsec
(https://github.com/parallaxsecond/parsec). Parsec becomes an
integration point between the Edge Enforcer and any local TPM or HSM,
including HSMs that are not directly accessible via tpm2-tools.
When both a Parsec service and a direct TPM device are present, Parsec
takes priority over the direct tpm2-tools path.
The system assumes that a Parsec service is properly configured and
running. It further assumes that the service is configured to create
the parsec socket at /run/parsec/parsec.sock and that the service is
configured for auth_type = UnixPeerCredentials.
Parsec provides a common abstraction layer for both TPMs and HSMs and allows the local unseal token to be secured by a key stored in a TPM or an HSM.
It is possible to test the setup even without an actual TPM or HSM by letting the Parsec service use a software based emulator of a TPM.
Note that the Parsec service must be running before a host is added since the root of trust configuration is part of the very early setup of a host.
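Since the socket must exist before a host is added, a quick pre-flight check can help. The sketch below only tests that a Unix domain socket exists at the expected path; it does not verify that Parsec answers on it or that the authenticator is configured as required. The function name parsec_socket_ready is hypothetical.

```shell
#!/bin/sh
# Sketch only: succeed if a Unix domain socket exists at the given path
# (defaults to the location the system expects).
parsec_socket_ready() {
    [ -S "${1:-/run/parsec/parsec.sock}" ]
}

if parsec_socket_ready; then
    echo "Parsec socket found"
else
    echo "Parsec socket missing; start Parsec before adding the host"
fi
```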
Parsec Service for Testing
Setting up a Parsec service for testing is fairly straightforward. However, observe that this must be done before adding a host to the system.
Download the Parsec quickstart package from GitHub and unpack it. Note that the quickstart package has dependencies on specific glibc versions. If they do not match your system you may have to build locally (see Parsec Service with TPM below).
wget https://github.com/parallaxsecond/parsec/releases/download/1.3.0/quickstart-1.3.0-linux-x86_64.tar.gz
tar xfz quickstart-1.3.0-linux-x86_64.tar.gz
The package contains a config file for testing. Modify the parsec socket location to the production location.
cd quickstart-1.3.0-linux-x86_64/quickstart
sed -i -e 's|"./parsec.sock"|"/run/parsec/parsec.sock"|' config.toml
The config.toml will look like this:
[core_settings]
log_level = "info"
allow_root = true
[listener]
listener_type = "DomainSocket"
timeout = 200 # in milliseconds
socket_path = "/run/parsec/parsec.sock"
[authenticator]
auth_type = "UnixPeerCredentials"
[[key_manager]]
name = "sqlite-manager"
manager_type = "SQLite"
sqlite_db_path = "./sqlite-key-info-manager.sqlite3"
[[provider]]
provider_type = "MbedCrypto"
key_info_manager = "sqlite-manager"
Some directories also need to be created, if not present.
sudo mkdir -p /etc/parsec
sudo mkdir -p /run/parsec
sudo chown docker /run/parsec
sudo chmod 755 /run/parsec
sudo cp config.toml /etc/parsec
sudo mv ../bin/parsec /usr/bin/parsec
The Parsec service may now be started manually using the following command:
sudo /usr/bin/parsec --config /etc/parsec/config.toml
It should result in a printout like this:
[INFO parsec] Parsec started. Configuring the service...
[INFO parsec_service::key_info_managers::sqlite_manager] SQLiteKeyInfoManager - Found 0 key info mapping records
[INFO parsec_service::utils::service_builder] Creating a Mbed Crypto Provider.
[WARN parsec_service::front::domain_socket] Removing the existing socket file at /run/parsec/parsec.sock.
[INFO parsec] Parsec is ready.
As an alternative, if systemd is used, a service specification can
be created like this (as root):
cat > /etc/systemd/system/parsec.service <<EOF
[Unit]
Description=Parsec Service
Documentation=https://parallaxsecond.github.io/parsec-book/parsec_service/install_parsec_linux.html
[Service]
WorkingDirectory=/home/docker
ExecStart=/usr/bin/parsec --config /etc/parsec/config.toml
User=docker
Group=docker
[Install]
WantedBy=multi-user.target
EOF
It can then be enabled and started like this:
systemctl --no-pager daemon-reload
systemctl --no-pager enable parsec.service
systemctl --no-pager start parsec.service
When a host that has a Parsec service running is added, there should
be an INFO level log message saying "Using Parsec service to protect host." This indicates that the Shamir share assigned to the host
has been encrypted using the local Parsec service.
Parsec Service with TPM
In a production environment a real TPM or HSM should be used. In this case the Parsec service needs to be compiled with the proper drivers for the hardware that should be used. See https://parallaxsecond.github.io/parsec-book/getting_started/linux_x86.html and https://parallaxsecond.github.io/parsec-book/parsec_service/install_parsec_linux.html
For example, if an onboard TPM is used, Parsec can be prepared as follows:
Install some tools:
sudo apt install llvm-dev libclang-dev clang cmake rustc cargo
Clone the Parsec service repo:
git clone --branch 1.3.0 https://github.com/parallaxsecond/parsec.git
Build the source code with the TPM provider enabled:
cd parsec
cargo build --release --features "tpm-provider,direct-authenticator,unix-peer-credentials-authenticator"
Create a config file where the provider type is set to Tpm and
tcti points to "device:/dev/tpmrm0".
cat > config.toml <<EOF
[core_settings]
log_level = "info"
allow_root = true
[listener]
listener_type = "DomainSocket"
timeout = 3000 # in milliseconds
socket_path = "/run/parsec/parsec.sock"
[authenticator]
auth_type = "UnixPeerCredentials"
[[key_manager]]
name = "sqlite-manager"
manager_type = "SQLite"
sqlite_db_path = "/etc/parsec/sqlite-key-info-manager.sqlite3"
[[provider]]
provider_type = "Tpm"
key_info_manager = "sqlite-manager"
tcti = "device:/dev/tpmrm0"
EOF
The service may now be started in the same way as above, i.e.
create some directories, if not present, and copy the parsec
executable to /usr/bin/parsec.
sudo mkdir -p /etc/parsec
sudo mkdir -p /run/parsec
sudo chown docker /run/parsec
sudo chmod 755 /run/parsec
sudo cp config.toml /etc/parsec
sudo mv target/release/parsec /usr/bin/parsec
The Parsec service may now be started manually using the following command:
sudo /usr/bin/parsec --config /etc/parsec/config.toml
It should result in a printout like this:
[INFO parsec] Parsec started. Configuring the service...
[INFO parsec_service::key_info_managers::on_disk_manager] Found 0 mapping files
[INFO parsec_service::utils::service_builder] Creating a TPM Provider.
[INFO parsec_service::providers::tpm] Checking for ciphers supported by the TPM.
[INFO tss_esapi::context] Closing context.
[INFO tss_esapi::context] Context closed.
[INFO parsec] Parsec is ready.
As an alternative, if systemd is used, a service specification can
be created like this (as root):
cat > /etc/systemd/system/parsec.service <<EOF
[Unit]
Description=Parsec Service
Documentation=https://parallaxsecond.github.io/parsec-book/parsec_service/install_parsec_linux.html
[Service]
WorkingDirectory=/home/docker
ExecStart=/usr/bin/parsec --config /etc/parsec/config.toml
User=docker
Group=docker
[Install]
WantedBy=multi-user.target
EOF
It can then be enabled and started like this:
systemctl --no-pager daemon-reload
systemctl --no-pager enable parsec.service
systemctl --no-pager start parsec.service
When a host that has a Parsec service running is added, there should
be an INFO level log message saying "Using Parsec service to protect host." This indicates that the Shamir share assigned to the host
has been encrypted using the local Parsec service.