Poseidon Hash for ZK Applications
Many use cases of practical computational integrity proof systems such as SNARKs, STARKs, Bulletproofs, involve proving the knowledge of a preimage under a certain cryptographic hash function, which is expressed as a circuit over a large prime field. However, the majority of hash functions do not perform computations in finite field, (SHA-256 in Zcash cryptocurrency). As a result, more constraints must be added in the circuit to represent the operations (for example, XOR, AND, etc.) in hash function as arithmetic operations in finite field. Hence, the expressed circuit could become very expensive due to the enormous number of hash constraints, leading to a huge computational penalty. Therefore, new hash functions that performs natively with ZKP systems are needed.
In 2021, Grassi et al. GKRRS21 introduced , a cryptographic hash function that supports arithmetic operations for values in finite field, and therefore friendly with ZK applications. uses times fewer constraints per message bit than [Pedersen Hash], according to here. The goal of this whole tutorial is to give a simpler explainations about Poseidon hash described in the referenced paper.
Cryptographic sponge functions
is a sponge-based cryptographic hash function, so it is neccessary to understand the design of such hash functions to understand the components of . In this section, we look at the definitions and design of a sponged-based cryptographic hash function, before showing its applications in hash functions and other problems.
In 2007, the sponge construction was introduced by Guido Bertoni and others BDPA08.
A sponge construction or sponge function takes input bit data of arbitrary length (called a message) and outputs bit data of fixed length (called a hash digest). In the context of a sponge function, we can say that the message is absorbed to the sponge, and the digest is squeezed out.
A sponge construction hash phases:
- Absorb: the message is compressed iteratively.
- Squeeze: the digest is extracted iteratively.
Below is the construction of a sponge function:
- The first component (the leftmost) is the state memory and it is divided into parts: the Bitrate part of size and the Capacity part of size . is initially set to , and we denote the initial state as :
- The second component is the compress function whose input and output are of the same length. is updated whenever passing the compress function.
- The third component is the padding function which increases the size of the message to a suitable length (specifically, the length of the padded message is a multiple of bitrate ).
- The squeeze phase takes the Bitrate of as output parts, before going to the compress function . The final output: . If the length of is not a multiple of , it will be truncated.
Applications of cryptographic sponge functions
Sponge functions have both theoretical and practical uses. In theoretical cryptanalysis, a random sponge function is a sponge construction where is a random permutation or transformation, as appropriate. Random sponge functions capture more of the practical limitations of cryptographic primitives than does the widely used random oracle model, in particular the finite internal state BDPA08.
The sponge construction can also be used to build practical cryptographic primitives. For example, cryptographic sponge with a -bit state has been selected by NIST as the winner in the competition. The strength of derives from the intricate, multi-round permutation developed by its authors. The -redesign called RS16 refers to the sponge-construct to define the algorithm.
In GKRRS21, they introduced hash function that uses the cryptographic sponge function as its architecture with the compression function being the permutation .
Poseidon Hash Overview
Now that we understood the architecture of a sponge-based hash function, we are now ready to go to the details of . is the proposed hash function in the paper, which maps strings over (for a prime ) to fixed length strings over . Specifically, they define: where is the output length ( elements), usually .
The overall structure of hash is described in the figure below.
The hash function is constructed by instantiating a sponge function with the compression function being the permutation (denoted by the block labeled in the figure). The hashing design strategy is based on GLRRS20, which employes different round functions throughout the permutation to destroy all of its possible symmetries and structural properties.
The workflow of hash is described below, which is used the same in different use cases:
- Determine the capacity value and the input padding depending on each use cases.
- Split the obtained input in to chunks of size .
- Apply the permutation to the capacity element and the first chunk: .
- Add the result to the state and apply the permutation again until no more chunks are left.
- Output output elements from the bitrate part of the state. If needed, iterate the permutation one more time.
Given that the overall structure of is determined by the instatiation of , in the next Sections, we would go to the details of each block (or equavilently, the block labeled in the figure).
Poseidon permutation design
At the heart of the hash is the -based permutation . As mentioned before, employs efficient round functions which is applied many times in order ti make the permutation behave like a random permutation. The same round function is applied throughout the permutation to destroy its symmetries and structural properties.
This section explains in detail the design strategy, the detailed design of -based permutation, and the different instantiations of permutation in different use cases.
Hades design strategy
Overview
In , they mix rounds with full SBox layers and rounds with partial SBox layers. They want to strike the balance between:
- Full layers are expensive in software and ZK proof systems, but well resists statistical attacks.
- Partial layers are computationally cheap, but often serves as good as full ones against algebraic attacks.
Details
The Hades design strategy receives input a state and consists of initial rounds, in which SBoxes are applied to the full state. After these rounds, partial rounds in the middle contain a single SBox for each round, and the rest of the state goes through the nonliner layer unchanged (you can say that the rest goes through the identity functions ). Finally, full rounds at the end are applied again:
An overview of the construction of the permutation can be described in the figure below:
The round function
As in the figure, in each round, several functions (illustrated as the blocks) are applied to the state to transform it into a new state , these functions are called round functions. Each round function consists of components:
- , denoted by : essentially an addition of the state with a random constant.
- , denoted by or . This is simply the SBox substitution.
- , denoted by . This is the linear layer of the construction. It involves multiplication between the state and a MDS(Maximum Distance Separable) matrix. This is used to apply the wide trail strategy (explained in DJRV01) which helps provide arguments against statistical attacks.
Note that: we can use the same number of full rounds instead of partial rounds in the construction without decreasing the security. However, this leads to significantly higher computation costs in our applications.
Maximum Distance Separable
A matrix is called maximum distance saparable (MDS) iff it has a branch number . The branch number of is defined as where is the Hamming weight in wide trail terminology.
Equivalently, a matrix is MDS iff every submatrix of is non-singular (a non-singluar matrix is a matrix whose determinant is not ). This definition is easier to understand.
There will be a construction of these matrices in later sections, because we do not need regular MDS matrices, but secure MDS matrices.
Hades-based POSEIDON permutation design
Recall in the previous section, the permutation is a composition of round functions, each of which consists of components such as , , and , each takes input a state and produces a new state . In this Section, we describe the main components of the round function of .
Notation
The function takes input of words in , where is a prime of size and to denote the size of texts in bits.
AddRoundConstants component
This is the first component of the round function described in Hades design strategy. Given the input state , where each of is a word, it outputs where is the round constant vector that is randomly generated in a method described in Poseidon concrete instantiation. Essentially, this component performs a field addition between the state and the round constant.
The SubWords (SBox) layer
This is the second component of the round function described in Hades design strategy. Subtitution boxes (or S-Boxes) are widely used in cryptography. In this context of an architecture that resembles block ciphers, SBoxes are used to ensure Shannon's property of confusion.
Given the state , where each of is a word, this component outputs a new state Essentially, this component performs multiple field multiplications on the state.
In GKRRS21, the authors focus on the setting: where is the smallest positive integer that satisfies . These permutations are called "".
Specifically, they use:
- if
- if and
It turns out that , which is , is suitable for the prime subfields of the scalar field of BLS12-381 and BN254, which are two of the most popular prime fields in ZK applications.
MixColumns - The linear layer
This is the last layer (also called the linear layer) of the round function. Given the state . This layer outputs the new state
where is the secure Maximum Distance Separable matrix of size , which is explained in the Hades design strategy. The MDS matrices are randomly and securely generated using a method described in Poseidon concrete instantiations.
It is proved MS78 that a MDS matrix with elements in exists if .
One method of constructing such a matrix introduced in this paper is using a Cauchy matrix. For , where , the entries of the matrix are: where the entries of and are pairwise distinct and , where and
Constructing the matrix
The paper suggest the following method to generate matrices to avoid some known vulnerabilities which this note skipped:
- Randomly select pairwise distinct and where and where and .
- Determine if the matrix is secure using this test introduced in GRS20
- Repeat this procedure until we find a secure matrix.
Concrete POSEIDON instantiation
Main instance
- SBox function: The authors proposed using for all use cases
- With the security level , the size of the Capacity bits ( field element). And can be or to achieve the 2-to-1 or 4-to-1 compression functions.
- Psudeonumber generation: The paper uses Grain LFSR in self-shrinking mode. This can be used to generate the round constants and matrices. This usage can be reminiscent to nothing-up-my-sleeve numbers.
- MDS matrix: It is recommended that we should use the Cauchy matrix for the linear layer, which described in Hades-based permutation design.
The grain LFSR
The grain LFSR is used to generate pseudorandom numbers for the round constants and the MDS matrices described in Hades design strategy. The technical details of the LFSR is provided in Appendix E of GKRRS21. The state in Grain LFSR is bits in size and is computed as follows:
- Initialize the state with bits as follows:
- : describe the field.
- : describe the SBox.
- : binary representation of .
- : binary representation of .
- : binary representation of .
- : binary representation of .
- : set to .
- Update the bits:
- Discard the first bits.
- Evaluate bits in pairs: if the first bit is , output the second bit. If it is , discard the second bit.
If a randomly sampled integer is , we discard this value and take the next one. We generate numbers starting from the most significant bit. However, starting from MSB or LSB makes no difference regarding the security
Choosing number of rounds
Proved in Proposition 5.1, 5.2, and 5.3 in GKRRS21, this table represents the parameters that can protect the construction from statistical and algebraic attacks:
Construction | ||
---|---|---|
-- | ||
-- | ||
-- |