By Filipe Casal and Jim Miller

Trail of Bits is publicly disclosing two bugs that affect Shamir’s Secret Sharing implementation of Binance’s threshold signature scheme library (tss-lib) and most of its active forks. Here is the full list of affected repositories:

These bugs allow a malicious user of the threshold signature scheme to steal the secret keys of other users or to crash their nodes. Exploiting these vulnerabilities is simple: an attacker just needs to configure a malicious ID at the start of either the key generation protocol or the resharing protocol.

Threshold signature schemes are a powerful cryptographic object; however, they require complex, non-standardized primitives such as zero-knowledge proofs, commitment schemes, and verifiable secret shares. Unfortunately, aside from academic publications, there is essentially no guidance or documentation on implementing these schemes or their security pitfalls, which leads to several issues in practice, such as the two bugs we are disclosing today.

Along with our disclosure of these vulnerabilities, we are releasing ZKDocs, our documentation for non-standardized cryptographic primitives. We hope that our work in this area can benefit the larger cryptography community.

What are threshold signature schemes?

A threshold signature scheme is a protocol that allows a group of users to generate and control a private signing key. The users can jointly produce digital signatures on messages, but none can sign messages individually.

Many are familiar with the concept of multisignature (multisig) protocols, mainly used in cryptocurrency wallets that execute transactions only after receiving enough signatures from different users. The main difference between these schemes is that in multisig schemes, each user has a personal private/public key pair for signing, and in threshold signature schemes, each user holds one share of the same key. When signing with a multisig scheme, the number of signatures is proportional to the number of users; when signing with a threshold signature scheme, one group signature is produced.

Threshold signatures are complicated. The advanced, technical details of how these schemes work are out of scope for this blog post, but if you are curious about how these schemes work in practice or how they compare with other schemes, like multisig, you can find several blog posts describing them in more detail, such as here and here.

What is verifiable secret sharing (VSS)?

Secret sharing is a cryptographic protocol for splitting a secret key (or other secret data) into key shares. These key shares should look entirely random so that, on their own, they reveal no information about the underlying secret. Still, when enough shares are combined, the original secret can be recovered. The most common technique for secret sharing is Shamir’s Secret Sharing scheme, which leverages properties of polynomials.

The high-level idea behind Shamir’s scheme is that for n users, you want at least t of them (where t ≤ n) to recover the secret by combining their shares. To achieve this, we generate a random polynomial p of degree t-1 over a finite group with the constant term set to the secret value.

\displaystyle \begin{aligned}  p(x) = \mathsf{secret} + a_1 x + \ldots + a_{t-1} x^{t-1}  \end{aligned}

Then, we create the secret shares by evaluating the polynomial at n different points, one point for each user. One single point (or even a couple of points, depending on the value of t) reveals no information about the polynomial. However, if users combine enough points, they can recover the original polynomial using polynomial interpolation. Since the secret value is encoded in the polynomial, recovering the polynomial recovers the secret.

Threshold signature schemes use secret sharing to generate a signing key that is shared among multiple users, but in practice, most schemes have to use a more advanced version of secret sharing known as verifiable secret sharing (VSS). Often, users cannot assume that others running these protocols are honest. VSS allows users to verify that the shares they received were generated honestly. The most common VSS scheme was developed by Feldman. His scheme uses the same technique as Shamir’s scheme for generating shares (i.e., generating a random polynomial using the secret as the constant term) and creates additional values to make these shares verifiable.

From zero to hero

We are disclosing two bugs that affect Feldman’s verifiable secret sharing within different threshold signature scheme implementations. These bugs are not a result of some novel analysis that could not have been foreseen; on the contrary, these bugs stem from one of the few known weaknesses of secret sharing. We highlight them today not only due to the number of affected vendors but also because they are representative of a whole host of critical bugs that stem from the same recurring problem in non-standard cryptography: a lack of documentation and guidance.

The first bug is related to how secret shares are generated. Since we defined the constant term of the polynomial as the secret value, it is essential that, when generating shares, the x-value of the polynomial point is non-zero. If we create shares at the 0 point, then the polynomial evaluates to the constant term, which leaks the secret value entirely:

\begin{aligned}  p(0) &= \mathsf{secret} + a_1 0 + \ldots + a_{t-1} 0^{t-1} \\  &= \mathsf{secret}  \end{aligned}

Most implementations avoid this possibility altogether by evaluating the polynomial at values (1, 2, …, n), where n is the number of shares needed. However, some implementations evaluate the polynomials at specific values; for instance, Binance’s implementation evaluates the polynomial at each user’s unique ID value. Implementations designed in this way must verify that these IDs are non-zero; most succeed in doing this. However, some implementations forget that these sharing schemes operate over a finite group, so this zero check has to be performed modulo the group’s order! If this check is not performed, the secret value is immediately leaked to malicious users who set their unique IDs equal to the group’s order. If the group order is q, then:

\begin{aligned}  p(q) &= \mathsf{secret} + a_1 q + \ldots + a_{t-1} q^{t-1} \pmod{q} \\  &= \mathsf{secret} + a_1 0 + \ldots + a_{t-1} 0^{t-1} \pmod{q} \\  &= \mathsf{secret}  \end{aligned}

The affected implementations checked that the user IDs used to generate these secret shares were non-zero but did not perform this check modulo the elliptic curve group order.

From zero to crash

The second bug is related to the mishandling of modular arithmetic operations. Depending on the group of users signing a message, users calculate the Lagrangian coefficient, which is the product of terms of the form IDi / (IDi – SelfID). Since we are working on a finite field, we compute the division with the modular inverse of (IDi – SelfID). If a user’s IDi is modularly equal to the current user’s ID (SelfID), the subtraction will be modularly equal to zero—but zero does not have a modular inverse! The vulnerable implementations did not validate the modular inverse and would panic with a null dereference.

We often find these bugs in our audits; they are easy to miss without scrupulous attention to modular arithmetic details. Most times, there are even validations in place, but these are insufficient if the arguments are not checked in the context of the finite field.

When using modular arithmetic with a generic Big Integer class, take the following steps:

  • Always modularly reduce the numbers before validations, such as comparisons.
  • Always validate operations such as modular inverses and modular square roots. Depending on the API, either check the return value or catch errors to ensure that the function does not panic.

If you are still unsure or would like a second opinion, contact us for an audit.

ZKDocs

Today, we are releasing ZKDocs, our documentation for non-standardized cryptographic primitives. We hope it will help developers avoid these bugs in the future by providing comprehensive implementation details and security considerations of these protocols.

As we discovered more instances of these bugs, we began to think about why they were occurring and how we could prevent them from occurring in the future. Unfortunately, for non-standardized cryptographic protocols, the burden is on the developers to figure out all of the low-level implementation details and security pitfalls. To understand how limited the resources are, try searching for information on Feldman’s verifiable secret sharing scheme (the most common scheme of its kind!). The only results you will likely find are a Wikipedia article and Feldman’s original paper from 1987. Aside from that, you may be able to find some Stack Overflow discussions or old lecture notes. But that’s about it.

These schemes are complicated! With such limited documentation and guidance available, we shouldn’t be surprised that these types of bugs end up occurring in practice. With ZKDocs, we aim to fill in that gap. For instance, to read more about the details of the first bug related to zero-shares that we found, check out the secret sharing section in ZKDocs!

The “Shamir’s Secret Sharing Scheme” section of ZKDocs

Coordinated disclosure

October 19, 2021: Discovered secret data leaks in tss-lib
October 21, 2021: Reported to Binance
November 1–December 3, 2021: Internal discovery of issues affecting Clover, Keep Network, Swingby, THORChain, and ZenGo X
December 6, 2021: Reported to Clover, Keep Network, Swingby, THORChain, and ZenGo X

As of December 20, 2021, Binance, Keep Network, Swingby, THORChain, and ZenGo X have patched their implementations with the required fixes. The one exception is Clover, who has not replied to our emails.

We would like to thank the Binance, Keep Network, SwingBy, THORChain, and ZenGo X teams for working swiftly with us to address these issues.