Pitfall

Ambiguous Hash Encoding

What can go wrong. Many protocols need to hash several values together, such as group elements, integers, or commitments. In the Fiat-Shamir transform, for example, the challenge is just the hash of the transcript. The naive encoding concatenates the values with a delimiter, $H(m_1 \,\|\, D \,\|\, m_2 \,\|\, \cdots \,\|\, D \,\|\, m_n)$, where $D$ is a fixed byte sequence such as 0x00 or ||. This is not injective: because each $m_i$ is an arbitrary byte string that may itself contain $D$, two different input tuples can serialize to the same byte string, and therefore hash to the same value.

Security implication. Because the encoding is ambiguous, an adversary can shift element boundaries, changing which bytes of the input are interpreted as which value, without changing the hash output. In the context of discrete-log proofs, the adversary sends a single commitment stream whose bytes can be parsed in several ways, all of which hash to the same challenge. After observing the challenge bits, the adversary retroactively chooses the parse that makes the verification equation hold for every bit, producing a valid-looking proof of a discrete-log relation that the adversary does not satisfy. Applied to threshold-ECDSA signing, the adversary can forge the dlnproof (the discrete-log relation proof over the auxiliary RSA modulus used in GG18/GG20 setup), which leads to recovery of other parties’ secret shares and ultimately the shared key. The attack is documented by Hexens and catalogued as the TSSHOCK α-shuffle attack.

How to avoid. Make the encoding injective: length-prefix each element with a fixed-width tag; an 8-byte little-endian length is enough. Better still, use the protocol’s specified serialization format where one exists.

Example: bnb-chain/tss-lib variadic SHA512_256 (PR #233)

The audit finding KS-IOF-F-02 pointed out that bnb-chain’s tss-lib applied an ambiguous encoding by using a single dollar-sign delimiter with no per-element length tag.

The vulnerable helper represented that delimiter as '$' (source):

// common/hash.go — bnb-chain/tss-lib v1.3.5 (vulnerable), abridged
import (
    "crypto/sha512"
    "encoding/binary"
)

const hashInputDelimiter = byte('$')

func SHA512_256(in ...[]byte) []byte {
    var data []byte
    inLenBz := make([]byte, 8)
    binary.LittleEndian.PutUint64(inLenBz, uint64(len(in))) // counts inputs, not sizes
    data = append(data, inLenBz...)
    for _, bz := range in {
        data = append(data, bz...)
        data = append(data, hashInputDelimiter) // no length tag per element
    }
    // Abridged here: the digest step is condensed to a single stdlib call.
    digest := sha512.Sum512_256(data)
    return digest[:]
}

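The collision: SHA512_256([]byte("a$"), []byte("b")) and SHA512_256([]byte("a"), []byte("$b")) both serialize to the same bytes, namely the 8-byte input count (2 in both cases) followed by a$$b$, and therefore produce the same digest. The count prefix does not help, because both calls pass the same number of inputs. The fix (IoFinnet’s commit 369ec50, imported into bnb-chain/tss-lib as PR #233) appends an 8-byte length tag after each delimiter (source):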

// common/hash.go — bnb-chain/tss-lib v2.0.0 (fixed)
for _, bz := range in {
    data = append(data, bz...)
    data = append(data, hashInputDelimiter)
    dataLen := make([]byte, 8)
    binary.LittleEndian.PutUint64(dataLen, uint64(len(bz)))
    data = append(data, dataLen...) // length tag makes encoding injective
}