# Verifiable Random Function (VRF)

We present an overview of verifiable random functions (VRF) and describe a construction a VRF based on elliptic curves in \cite{PWHVNRG17}.

Informally speaking, a VRF is a function that generates values that looks indistinguishable from random, and these values can be verified if they were computed correctly. Later, we discuss a few cryptographic applications that VRF possibly plays an important building blocks in constructing them.

The chapter is separated into $$2$$ two major parts. In the first part, we state the formal definition of VRF including its syntax and security properties. Then we talk about the history of VRFs to see its development. In the second major part, we describe the VRF based on elliptic curve and our implementation of the VRF.

# Overview of VRF

In this chapter, we present an overview of VRFs. First, we give a short introduction of VRF including its intuition and importance in cryptography. After that, we discuss the formal definition of VRF and its security requirements. Finally, we talk about the history of VRF to see its development.

### Introduction

In cryptography, a verifiable random function (VRF) is a public key version of a pseudorandom function. It produces a pseudorandom output and a proof certifying that the output is computed correctly.

A VRF includes a pair of keys, named public and secret keys. The secret key, along with the input is used by the holder to compute the value of a VRF and its proof, while the public key is used by anyone to verify the correctness of the computation.

The issue with traditional pseudorandom functions is that their output cannot be verified without the knowledge of the seed. Thus a malicious adversary can choose an output that benefits him and claim that it is the output of the function. VRF solves this by introducing a public key and a proof that can be verified publicly while the owner can keep secret key to produce numbers indistinguishable from randomly chosen ones.

VRF has applications in various aspects. Among them, in internet security, it is used to provide privacy against offline enumeration (e.g. dictionary attacks) on data stored in a hash-based data structure irtf-vrf15. VRF is also used in lottery systems [MR02] and E-cashes [BCKL09].

### VRF Algorithms

Formally, a Verifiable random function consists of three algorithms $$(\mathsf{Gen}, \mathsf{Eval}, \mathsf{Verify})$$ where:

$$(pk,sk) \leftarrow \mathsf{Gen}(1^{\lambda})$$: This algorithm takes as input as a security parameter $$\lambda$$ and outputs a key pair $$(pk,sk)$$.

$$(Y,\pi) \leftarrow \mathsf{Eval}(X,sk)$$: This algorithm takes as input a secret key $$sk$$ and a value $$X$$ and outputs a value $$Y \in {0,1}^{out(\lambda)}$$ and a proof $$\pi$$.

$$b \leftarrow \mathsf{Verify}(pk,X,Y,\pi)$$: This algorithm takes an input a public key $$pk$$, a value $$X$$, a value $$Y$$, a proof $$\pi$$ and outputs a bit $$b$$ that determines whether $$Y=\mathsf{Eval}(X,sk)$$.

### VRF Security Properties

We need a VRF to satisfy the following properties:

Correctness: If $$(Y,\pi)=Eval(sk,X)$$ then $$Verify(pk,X,Y,\pi)=1$$

Uniqueness: There do not exist tuples $$(Y,\pi)$$ and $$Y',\pi'$$ with $$Y \ne Y'$$ and: $$\mathsf{Verify}(pk,X,Y,\pi)=\mathsf{Verify}(pk,X,Y',\pi')=1$$

Pseudorandomess: For any adversary $$\mathcal{A}$$ the probability $$|Pr[ExpRand^A_{VRF}(\lambda)=1]-\dfrac{1}{2}|$$ is negilible where $$ExpRand_{VRF}^{\mathcal{A}}(1^\lambda)$$ is defined as follows:

$$ExpRand_{VRF}^{\mathcal{A}}(1^\lambda)$$:

• $$(sk,pk) \leftarrow \mathsf{Gen}(1^{\lambda})$$
• $$(X^*,st) \leftarrow \mathcal{A}^{\mathcal{O_{VRF}}(.)}(pk)$$
• $$Y_0 \leftarrow \mathsf{Eval}(X*,sk)$$
• $$Y_1 \leftarrow \{0,1\}^{out(\lambda)}$$
• $$\{0,1\} {\stackrel{}{\leftarrow}} b$$
• $$b' \leftarrow \mathcal{A}(Y_b,st)$$
• Return $$b=b'$$

The oracle $$\mathcal{O_{VRF}}(.)$$ works as follow: Given an input $$x$$, it outputs the VRF value computed by $$x$$ and its proof.

In the paper of [PWHVNRG17], the authors stated that a VRF must also be collision resistant. This property is formally defined below:

Collision Resistant: Collision Resistant: For any adversarial prover $$\mathcal{A}=(\mathcal{A_1},\mathcal{A_2})$$ the probability $$Pr\left[ExpCol_{VRF}^\mathcal{A}(1^\lambda)=1\right]$$ is negligible where $$ExpCol_{VRF}^\mathcal{A}(1^\lambda)$$ is defined as follows:

$$ExpCol_{VRF}^\mathcal{A}(1^\lambda)$$:

• $$(pk,sk) \leftarrow \mathcal{A_1}(\mathsf{Gen}(1^{\lambda}))$$
• $$(X,X',st) \leftarrow \mathcal{A_2}(sk,pk)$$
• Return $$X \ne X'$$ and $$\mathsf{Eval}(X,sk)=\mathsf{Eval}(X',sk)$$

It is interesting to see that, VRF can be used for signing messages. However, several signing algorithms such as ECDSA cannot be used to build a VRF. For a given message and a secret key, there can be multiple valid signatures, thus an adversiral prover could produce different valid outputs from a given input, and choose the one that benefits him. This contradict the uniqueness property of VRF.

### History of Verifiable Random Function

Verifiable Random Function is introduced by Micali, Rabin and Vadhan [MRV99]. They defined a Verifiable Random Function (VUF) and gave a construction based on the RSA assumption [MRV99], then proved that a VRF can be constructed from a VUF [MRV99].

In 2002, Lysyanskaya [Unknown bib ref: Lys02] also followed the same path by constructing a VUF instead VRF. However, Lysyanskaya's VUF is based on Diffie-Hellman assumption instead, and it is the first construction that is based on Diffie-Hellman assumption.

In 2005, Dodis and Yampolsky [DY05] gave a direct and efficient construction using bilinear map, then improved. Later, during the 2010s, many constructions {{#cite HW10,BMR10,Jag15}} also used bilinear map, all of them used non standard assumptions to prove the security of their VRF.

In 2015, Hofheinz and Jager [HJ15] constructed a VRF that is based on a constant-size complexity assumption, namely the $$(n-1)$$-linear assumption. The $$(n-1)$$-linear assumption is not easier to break as $$n$$ grows larger, as opposed to all the assumptions used by previous constructions [Unknown bib ref: HS07].

In 2017, [PWHVNRG17] construct an efficient VRF based on elliptic curve that does not use bilinear map to produce output and verify, instead they only use hash functions and elliptic curve operations. In the other hand, their hash function is viewed as random oracle model in the construction.

In 2019, Nir Bitansky [Bit19] showed that VRF can be constructed from non-interactive witness-indistinguishable proof (NIWI).

In 2020, Esgin et al [EKSSZSC20] was the first to construct a VRF based on lattice based cryptography, which is resistant to attack from quantum computers. Thus, VRF remains as a good candidate for generating randomness in the future.

## VRF Based on Elliptic Curves (ECVRF)

In this chapter, we describe a VRF Based on Elliptic Curves in the paper of [PWHVNRG17]. The Internet Research Task Force (IRTF) also describes this VRF in their Internet-Draft irtf-vrf15. The security proof of the VRF is in \cite{PWHVNRG17}, we do not present it here. Then we will present our implementation of the VRF.

### Why using ECVRF

There are many VRF constructions, as listed in the history of VRF. However, ECVRF is chosen because its efficient in both proving and verification complexity, and can be make distributed.

For example, in the construction of Hohenberger and Waters [Unknown bib ref: HW10], the proof of the VRF contains $$\Theta(\lambda)$$ group elements. The number of evaluation steps is $$\Theta(\lambda)$$, because we need to compute $$\Theta(\lambda)$$ group elements. Verification require $$\Theta(\lambda)$$ computations using bilinear map, where $$\lambda$$ is the security parameter. It is unknown whether this VRF can be made distributed or not.

In the other hand, the proof size and the number of evaluation and verification steps of ECVRF are all constant and can be make distributed, as in the paper of Galindo et al [GLOW20]. The VRF construction of [DY05] also have constant proof size, evaluation and verification steps, and can be make distributed. However, in their paper, they require a cyclic group whose order is a 1000 bit prime, while ECVRF require a cyclic group whose order is a 256 bit prime such that the Decisional Diffie-Hellman (DDH) assumption holds. Hence, we can implement the VRF using Secp256k1, a curve used by Ethereum for creating digital signature. The embeeding degree of Secp256k1 is large, thus the DDH assumption is believed to be true.

### Public Parameters

Let $$\mathbb{G}$$ be a cyclic group of prime order $$p$$ with generator $$g$$. Denote $$\mathbb{Z}_p$$ to be the set of integers modulo $$p$$. Let $$\mathsf{EncodeToCurve}$$ be a hash function mapping a bit string to an element in $$\mathbb{G}$$. Let $$\mathsf{ChallengeGeneration}$$ be a hash function mapping arbitary input length to a $$256$$ bit integer.

Note that, in the paper of [PWHVNRG17], the functions $$\mathsf{EncodeToCurve}$$ and $$\mathsf{ChallengeGeneration}$$ are modeled as random oracle model. This is used to prove the security of the VRF.

The cofactor parameter mentioned in the irtf draft is set to $$1$$.

The $$\mathsf{Eval}$$ function is split into 2 functions: $$\mathsf{Prove}$$ and $$\mathsf{ProofToHash}$$. The $$\mathsf{Prove}$$ function returns the proof of the ECVRF, and the $$\mathsf{ProofToHash}$$, returns the ECVRF output.

### ECVRF Construction

$$\mathsf{KeyGen}(1^{k})$$: Choose a random secret value $$sk \in \mathbb{Z}_p$$ and the secret key is set to be $$sk$$. The public key is $$pk=g^{sk}$$.

$$\mathsf{Prove}(sk,X)$$: Given the secret key $$sk$$ and an input $$X$$, the proof $$\pi$$ of ECVRF is computed as follow:

1. Compute $$h=\mathsf{EncodeToCurve}(X,pk)$$.

2. Compute $$\gamma=h^{sk}$$.

3. Choose a value $$k$$ uniformly in $$\mathbb{Z}_p$$.

4. Compute $$c=\mathsf{ChallengeGeneration}(h,pk,gamma,g^k,h^k)$$

5. Compute $$s \equiv k-c.sk \pmod{q}$$

6. The proof $$\pi$$ of the VRF is computed as $$\pi=(\gamma,c,s)$$

$$\mathsf{ProofToHash}(gamma)$$: Given input $$\gamma$$ that is calculated during the $$\mathsf{Prove}$$ function, this function returns the output of ECVRF.

1. Compute $$gammastr=\mathsf{PointToString}(\gamma)$$

2. Let $$gammastr=PointToString(\gamma)$$

3. Let $$suite-string$$="0x01"

4. Let $$separator-front$$="0x03"

5. Let $$separator-back$$="0x00"

6. Let Y=$$\mathsf{keccak}(suite-string || seperator-front || gammastr || seperator-back)$$

7. Return Y

$$\mathsf{Verify}(pk,X,Y,\pi)$$: Given the public key $$pk$$, the VRF input $$X$$, the VRF output $$Y$$ and its proof $$\pi=(\gamma,c,s)$$, the verification step proceed as follow:

1. Check if $$\gamma$$ and $$pk$$ is on the curve

2. Compute $$u=pk^cg^s$$

3. Compute $$h=\mathsf{EncodeToCurve}(X,pk)$$

4. Compute $$v=\gamma^ch^s$$

5. Check if $$c=\mathsf{ChallengeGeneration}(h,pk,gamma,g^k,h^k)$$. If the check is valid, output $$Y=\mathsf{ProofToHash}(\gamma)$$, otherwise output $$Invalid$$.

### ECVRF Auxiliary Functions

In this section, we describe the construction of $$HashToCurve$$ and $$HashPoint$$ in the Internet-Draft of irtf. More details can be found in irtf-vrf15.

$$\mathsf{EncodeToCurve}(X,pk)$$: Given two group elements $$X, pk \in \mathbb{G}$$, the function output a hash value in $$\mathbb{Z}_p$$ as follow:

1. Let $$ctr=0$$

2. Let $$suite-string$$="0x01"

3. Let $$seperator-front$$="0x01"

4. Let $$seperator-back$$="0x00"

5. Compute $$pkstr=\mathsf{PointToString}(pk)$$

6. Define $$H$$ to be "INVALID"

7. While $$H$$ is "INVALID" or $$H$$ is the identity element of the group:

• Compute $$ctrstr=\mathsf{IntToString}(ctr)$$

• Compute $$hstr=\mathsf{keccak}$$$$( suite-string || seperator-front || pkstr || X || ctrstr || seperator-back)$$

• Compute $$H$$=$$\mathsf{StringToPoint}(hstr)$$

• Increment $$ctr$$ by $$1$$

8. Output $$H$$.

$$\mathsf{ChallengeGeneration}(P_1,P_2,...,P_n)$$: Given n elements in $$\mathbb{G}$$, the hash value is computed as follow:

1. Let $$suite-string$$="0x01"

2. Let $$seperator-front$$="0x02"

3. Initialize $$str=suite-string || seperator-front$$

4. For $$i=1,2,...,n$$:

• Update $$str= str || \mathsf{PointToString}(P_i)$$
5. Let $$separator-back$$="0x00"

6. Update $$str=str || separator-back$$

7. Update $$str=\mathsf{keccak}(str)$$

8. Compute $$c=\mathsf{StringToInt}(str)$$

9. Output $$c$$

The function $$\mathsf{PointToString}$$ converts a point of an elliptic curve to a string. Many programming supports this function. For example, in python, we can use $$str(G)$$ to return the string representation of a point$$G$$.

The function $$\mathsf{StringToPoint}$$ converts a string to a point of an elliptic curve. It is specified in section 2.3.4 of [SECG1]

### ECVRF implementation in Python

The implememtation of the ECVRF in python. The steps and details are written on the comments of the implementation. Below are the global variables. We use the ecdsa library, but the curve curve_256 of the library is replaced with the curve secp256k1. Instead of using SHA256 as in irtf-vrf15, we use the keccak hash function.

G = generator_256
ORDER = G.order()
order_minus_one=0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364140
INFINITY=Point(None,None,None)
suite_string=b"0x01"


#### ECVRF main functions

The main functions of the ECVRF.

##### The Key generation function

We create an ECVRF class and use the key generation function as the constructor of the class.

class ECVRF():

def __init__(self,sk=None):
if sk==None:
self.sk = random.randint(0,order_minus_one)
self.pk = G*self.sk
else:
self.sk = sk
self.pk = G*self.sk

##### The Prove function

The prove function of the ECVRF. The function closely follow the steps in Section 5.1 of irtf-vrf15

def prove(self, x):
# the evaluation function, based on the paper [PWHVNRG17]
# step 1 compute h

H = ECVRF_encode_to_curve_try_and_increment(self.pk,x,suite_string)

#step 2 let gamma=h^self.sk

gamma = H*self.sk

#step 3 choose a random k

k = random.randint(0,order_minus_one)

#step 4 compute c=Hash_point(g,h,g^sk,h^sk,g^k,h^k)

point_list=[H, self.pk, gamma, G*k, H*k]
c = ECVRF_challenge_generation(point_list)

#step 5 compute s=k-c*sk (mod order)
s = (k - c*sk)% ORDER

# the proof consists of gamma, c and s
pi = {'gamma': gamma, 'c': c,  's': s}

# the output is the keccak hash of gamma
y=proof_to_hash(gamma)

return {'output': y, 'proof': pi, 'public key': self.pk}

##### The ProofToHash function

The prove function of the ECVRF. The function closely follow the steps in Section 5.2 of irtf-vrf15

def proof_to_hash(gamma):

# the output is the keccak hash of gamma
hash = keccak.new(digest_bits=256)
hash.update(b"\x01")
hash.update(b"\x03")
hash.update(str(gamma).encode())
hash.update(b"\x00")
y = int(hash.hexdigest(), 16)


##### The Verify function

The verify function of the ECVRF. The function closely follow the steps in Section 5.3 of irtf-vrf15

def verify(self, x, y, pi, pk):
# this function, given an input x, a value y, a proof pi and
# the public key pk,
# verify if the output y was calculated correctly from x

gamma = pi['gamma']
c = pi['c']
s = pi['s']

#step 1 compute U=c*pk+G*s

U = c*pk + G*s

#step 2 compute V=c*gamma+H*s

H = ECVRF_encode_to_curve_try_and_increment(pk,x,suite_string)

#step 3 compute V=c*gamma+h*s

V = c*gamma + H*s

#calculate the value Hash_point(G,H,pk,gamma,U,V)

point_list=[H,pk,gamma,U,V]
c2 = ECVRF_challenge_generation(point_list)

#calculate the keccak hash of gamma

hash = keccak.new(digest_bits=256)
hash.update(str(gamma).encode())

#step 4 check if c=Hash_point(g,h,pk,gamma,u,v) and y=keccak(gamma)

return c == c2 and y == hash_to_proof(gamma)


#### ECVRF auxiliary functions

The auxiliary functions of the ECVRF.

##### The HashToCurve function

The HashToCurve of the converts a 256 bit integer into a point of the curve secp256k1. We ignore the cofactor check in irtf-vrf08, since the cofactor value is set to be 1.

def ECVRF_encode_to_curve_try_and_increment(pk, x, suite_string):

ctr=0
pk_string=str(pk).encode()
one_string=int(0).to_bytes(1,'big')
zero_string=int(1).to_bytes(1,'big')

#because the == operation in the elliptic curve class only compare
#two Points, we cannot use H=="INVALID" (can't compare a Point and a
# string) but instead use H==INFINITY
#because if H==INFINITY is also an invalid condition and it does not
#change anything.

H=INFINITY
while  H==INFINITY:
hash=keccak.new(digest_bits=256)
ctr_string=str(ctr).encode()
hash.update(suite_string)
hash.update(one_string)
hash.update(pk_string)
hash.update(str(x).encode())
hash.update(ctr_string)
hash.update(zero_string)
ctr+=1
hash=hash.digest()
H=string_to_curve(hash)
return H

##### The HashPoint function

The HashPoint function converts a list of point into a 256 bit integer. The function closely follow the steps in Section 5.4.3 of irtf-vrf08

def ECVRF_challenge_generation(point_list):
# based on the irtf internet draft
#we use the keccak instead of sha256

hash = keccak.new(digest_bits=256)
hash.update(b"\x02")
for i in point_list:
hash.update(str(i).encode())
hash.update(b"\x00")
return int(hash.hexdigest(), 16) % ORDER

##### The StringToCurve function

The StringtoCurve converts a string into a point of secp256k1. We only need to implement Step 1, Step 2.2 and Step 2.4.1 in [SECG1], since we use the curve secp256k1.

def string_to_curve(string):
#specified in 2.3.4 of https://www.secg.org/sec1-v2.pdf
#since the curve is secp256k1, then q=p is an odd prime
#we want to implement for secp256k1 therefore we will just implement step 1,
# 2.2 and 2.4.1

#Step 1

if string==int(2).to_bytes(1,'big'):
return INFINITY

#Step 2.2

x=string_to_field(string)
if x=="INVALID":
return INFINITY
p=secp256k1._CurveFp__p

#Step 2.4.1
# let t=x^3+7 (mod p)

t=(pow(x,3,p)+secp256k1._CurveFp__a*x+secp256k1._CurveFp__b)%p
QR=pow(t,(p-1)//2,p)
if QR==(p-1):
return INFINITY

# because p=3 (mod 4), we see that y=t^((p+1)/4)

beta=pow(t,(p+1)//4,p)
if beta%2==0:
return Point(secp256k1,x,beta)
else:
return Point(secp256k1,x,p-beta)

##### The StringToField function

The StringtoCurve converts a string into an element in $$\mathcal{Z}_p$$, where $$p=2^{256}-2^{32}-977$$. We only need to implement Step 1 and Step 2.3.6 in [SECG1].

def string_to_field(string):
#specified in 2.3.6 of https://www.secg.org/sec1-v2.pdf
#since i just want to implement for secp256k1, i will just implement step 1
#in fact, step 1 is just the function string_to_integer in part 2.3.8 of the
#same paper

x=0
for i in range (0,len(string)):
x+=pow(256,len(string)-1-i)*int(hex(string[i]),16)
if x<secp256k1._CurveFp__p:
return x
return "INVALID"


# Distributed Key Generation (DKG)

We give an overview of Distributed Key Generation (DKG) and describe the DKG protocol used in the paper [GJKR99]. This, along with the ECVRF, will be two main components for the Distributed Verifiable Random Function (DVRF) protocol used for generating pseudorandom values. First, we give an overview of DKG. Then, we mention Verifiable Secret Sharing (VSS), the main building block for a DKG protocol. Finally, we describe the DKG protocol of [GJKR99].

## Overview to DKG

Distributed key generation is a main component of threshold cryptosystems. It allows $$n$$ participants to take part and generate a pair of public key and secret key required for the threshold cryptosystem without having to rely on a trusted party (dealer). Each participant in additional receive a partial secret key and a partial public key. While the public key and partial public keys is seen by everyone, the private key is maintained as a (virtual) secret of a $$t,n$$ secret sharing scheme, where each share is the partial secret key of a participant [GJKR99]. No adversary with limited computational power can learn any information of the secret key unless it control the required amount of participants.

### DKG Security Properties

As in [GJKR99], we wish a $$(t,n)$$ DKG protocol to have the following properties:

Correctness:

1. There is an efficient algorithm that, given any $$t+1$$ shares provided by honest parties, output the same unique secret key $$sk$$.
2. All honest parties agree on the same value of the public key $$pk=g^{sk}$$.
3. $$sk$$ is uniformly distributed in $$\mathbb{Z}_p$$.

Secrecy:: For any adversary who controls at most $$t$$ participants, he cannot learn any additional information except the value $$pk=g^{sk}$$.

### Applications

DKG has been used in numerous aspects:

1. Threshold Encryption and Signatures: In threshold cryptosytems, one problem arises: single point of failure. If a single participant fails to follow the protocol, then the entire system will abort. When applied as a component of a threshold cryptosystem, $$(n,t)$$DKG solves this problem by ensuring that any $$t+1$$ participants who behave honestly will allow the protocol to execute successfully.

2. Identity based Cryptography (IBC): In IBC, a private-key generator (PKG) is required to generate a secret, called the master key and provide private keys to clients using it. Since he know the private key of the client, he can decrypt the messages of the client. A $$(n,t)$$ DKG solve this by distributing the PKG into many participants where any adversary who control at most $$t$$ parties cannot compute the master key, and the client receive his secret key by collecting $$t+1$$ partial private keys from the participants.

3. Distriuted Pseudo-random Functions: Pseudo-random Functions (PRF) and Verifiable Random Functions (VRF) is used to produce values that looks indistinguishable from random. Both require a secret key to compute the output. However, the secret key holder can select the value of the secret key to manipulate the output, or share the secret key with others to benefit himself. This can be prevented by applying a $$(n,t)$$ distributed version of PRF or VRF, thus no adversary can learn or affect the value of the secret key.

## Verifiable Secret Sharing (VSS)

In this chapter, we discuss about Verifiable Secret Sharing (VSS), the main building block for a DKG protocol. We state the syntax and security properties of a VSS, then describe the VSS construction due to Pedersen.

### Introduction

Secret Sharing was first introducted by Shamir in 1979 [Unknown bib ref: Sha79]. It allows a person to share a secret $$s$$ among $$n$$ parties such that any \textit{authorized} subset of parties can use all their shares to reconstruct the secret, while any other (non-authorized) subset learns nothing about the secret from their shares. Shamir proposed a secret sharing scheme in his paper [Unknown bib ref: Sha79]. However, in Shamir's scheme, parties can commit wrong shares, leading the protocol to reconstruct the wrong secret, thus a verification process of the shares is required. To solve this problem, Chor et al. [CGMA85] introduced Verifiable Secret Sharing.

### Syntax and Properties

#### Syntax

A $$(t,n)$$ Verifiable Secret Sharing consists of two phases: Share and Reconstruct as follow:

Share: The dealer $$D$$ has a secret value $$s$$. Initially, there are $$n$$ parties $$P_1, P_2, ..., P_n$$. The sharing phase may consist of several rounds of interaction between parties. At the end of the phase, each party $$P_i$$ holds a share $$s_i$$ that will be required to reconstruct the secret of the dealer later.

Reconstruct: In this phase, each party $$P_i$$ publishes its share $$s_i$$ from the sharing phase. Then, from these shares, a reconstruction function will be applied to output the original secret.

#### Properties

As in [BKP11], a VSS need to have the following properties:

Correctness: If $$D$$ is honest, then the reconstruction function outputs $$s$$ at the end of the Reconstruct phase and all honest parties agree on the value $$s$$.

Secrecy: If $$D$$ is honest, then any adversary that control at most $$t$$ parties does not learn any information about $$s$$ during the Share phase.

Commitment: If $$D$$ is dishonest, then at the end of the sharing phase there exists a value $$s^* \in \mathbb{F}_p$$ such that at the end of the Reconstruct phase all honest parties agree on $$s^*$$.

### Pedersen's Construction

We describe a VSS protocol from Pedersen [Ped91]. This scheme provides perfect information theoretic security against any adversary with unbounded computational power.

Share The dealer chooses two polynomials $$p(x)=a_0+a_1x+a_2x^2+...+a_tx^t$$ and $$p'(x)=b_0+b_1x+b_2x^2+...+b_tx^t$$ such that $$a_0=s$$. Then he compute $$s_i=p(i)$$ and $$s_i'=p'(i)$$. The dealer then broadcasts $$C_i=g^{s_i}h^{s_i'}$$. Then he gives the party $$P_i$$ the tuple $$(s_i,s_i')$$. This allows $$P_i$$ to check if $$(s_i,s_i')$$ is a valid share by checking that

$$g^{s_{i}}h^{s_{i}'} \stackrel{?}{=} \prod_{k=0}^{t}(C_{k})^{i^k} (*).$$

If a party $$P_i$$ receives a share that does not satisfy $$(*)$$, he will complains against the dealer. The dealer must reveal the share $$(s_i,s_i')$$ that satisfies for each complaining party $$P_i$$. If any of the revealed shares does not satisfy $$(1)$$ the equation, the dealer is marked invalid.

Reconstruct Each participant submits $$(s_i,s_i')$$. Everyone can verify if $$P_i$$ submitted the correct shares by checking if $$1$$ is satisfied. If $$P_i$$ receives more than $$t$$ complaints, then he is disqualified. Given $$t+1$$ valid shares, $$s_1,s_2,...,s_{t+1}$$ from parties $$P_{x_1},P_{x_2},...,P_{x_{t+1}}$$, the secret $$s$$ can be computed as: $$s= \sum_{i=0}^{t}s_i \left(\prod_{\substack{j=0 \ j\neq i}}^{t}\dfrac{x_j}{x_j-x_i}\right).$$

The security proof of the protocol can be found in [Ped91].

## DKG Construction

In this chapter, we describe the DKG protocol in the paper Gennaro et al [GJKR99]. The protocol can withstand up to $$\dfrac{n}{2}$$ dishonest participants, where $$n$$ is the number of participants. Despite the high communication cost, the protocol only need to be executed once in the first round of many threshold cryptosystems.

### Why Gennaro et al's Construction?

Despite there are numerous constructions for DKG, namely [GJKR99], there is a reason we choose the DKG protocol of Gennaro et al.

Previously, Pedersen was the first to propose a DKG construction [Ped91a]. However, Gennaro et al. proved that in Pedersen's DKG, an attacker can manipulate the result of the secret key, making it not uniformly distributed [GJKR99].

Canetti et al. [CGJKR99] give a construction of a DKG that is secure against an adaptive adversary. However, his construction has worse performance than Gennaro's, since each participant has to use Pedersen's VSS two times. In addition, no adaptive adversary has been able to successfully attack the construction of Gennato et al.

Numerous attempts have been made to reduce the communication cost for a DKG {{#cite KG09, CS04, CZAPGD20}}. However, all these schemes require a trusted party. This quite contradict the goal of a DKG.

This make the DKG construction of Gennaro et al. remains a simple, efficient and secure DKG protocol.

### Gennaro et al's Construction

The DKG protocol consists of two phases, namely, generating and extracting, working as follows:

Public Parameters: Let $$p$$ be a prime number. Let $$G$$ be a cyclic group of order $$p$$ with generators $$g$$ and $$h$$. The public parameters of the system are $$p,G,g,h$$.

Generating: This process works as follows:

1. Each participant $$P_i$$ chooses two random polynomials $$f_i(z)=a_{i0}+a_{i1}z+...+a_{it}z^t$$ and $$f_i'(z)=b_{i0}+b_{i1}z+...+b_{it}z^t$$ and broadcasts $$C_{ij}=g^{a_{ij}}h^{b_{ij}}$$ for $$j=0,1,...,t$$.
2. The participant $$P_i$$ then sends $$s_{ij}=f_i(j)$$ and $$s'_{ij}=f_i'(j)$$ to $$P_j$$.
3. Each participant $$P_j$$ verifies the shares he received from each $$P_i$$ by checking whether

$$g^{s_{ij}}h^{s_{ij}'}\stackrel{?}{=} \prod_{k=0}^{t}C_{ik}^{j^k}. (*)$$

If the check fails for some $$i$$, $$P_j$$ complains against $$P_i$$.

1. Each $$P_i$$ who receives a complaint from $$P_j$$ broadcasts $$s_{ij}$$ and $$s_{ij}'$$ that satisfy Equation $$(*)$$.
2. A participant $$P_i$$ is disqualified if he receives at least $$t+1$$ complaints or answers a complaint with value that does not satisfy Equation. Then a set $$\mathcal{QUAL}$$ of qualified participants is determined.
3. For each $$i$$, the secret key $$sk_i$$ of $$P_i$$ is equal to $$\sum_{j\in \mathcal{QUAL}} s_{ji}$$. For any set $$\mathcal{V}$$ of at least $$t+1$$ participants, the secret key $$sk$$ is equal to $$\sum_{i \in \mathcal{V}} sk_i\cdot\lambda_{i,\mathcal{V}}$$.

Extracting: The process works as follows:

1. Each participant $$P_i$$ in the set $$\mathcal{QUAL}$$ publishes $$A_{ij}=g^{a_{ij}}$$ for $$j=0,1,2,\dots,t$$.
2. Each participant $$P_j$$ verifies $$A_{ij}$$ for each $$i$$. Specifically, $$P_j$$ checks whether $$g^{s_{ij}}\stackrel{?}{=} \prod_{k=0}^{t}A_{ik}^{j^k}.$$ If the check fails for some $$i$$, $$P_j$$ complains against $$P_i$$.
3. For each $$i$$ that $$P_i$$ receives at least one valid complaint, all other parties run Pedersen VSS to reconstruct $$f_i(z)$$, and restore $$s_{i0}$$ and $$A_{ij}$$ for $$j=0,1,...,t$$. The public key is equal to $$pk= \prod_{i \in \mathcal{QUAL}}A_{i0}$$
4. The public key $$pk_i$$ of $$P_i$$ is calculated as $$pk_i=g^{sk_i}=\prod_{j \in \mathcal{QUAL}}g^{s_{ji}}= \prod_{j \in \mathcal{QUAL}}\prod_{k=0}^{t}A_{jk}^{i^k}$$

The security proof of the DKG protocol can be found in [GJKR99].

# KZG Polynomial Commitment Scheme

KZG polynomial commitment scheme [KZG10] plays an important role in making the polynomial constraints of PlonK's arithmetization become a zkSNARK [GWC19].

In this section, we provide the syntax and security requirements of a polynomial commitment scheme (PCS) in Polynomial Commitment Scheme - Definition. Then, we show some instantiations of the scheme by discussing its technical overview in Technical Overview.

Author(s): khaihanhtang

# Polynomial Commitment Scheme - Definition

In this section, we provide the definition of polynomial commitment scheme including syntax and security requirements. Syntax describes how the scheme works through algorithms while security requirements enforce the scheme to satisfy in order to make it secure.

Syntax is presented in Syntax and security requirements are presented in Security Requirements.

# Syntax

A polynomial commitment scheme, for polynomials over a field $$\mathbb{F}$$, is a tuple of $$5$$ algorithms $(\mathsf{Setup}, \mathsf{Commit}, \mathsf{VerifyPoly}, \mathsf{CreateWitness}, \mathsf{VerifyEval})$ working as follows:

• $$\mathsf{Setup}(1^\kappa, t):$$ On inputs security parameter $$1^\kappa$$ and a degree bound $$t$$, this algorithm returns a commitment key $$ck$$. The key $$ck$$ allows to commit any polynomial in $$\mathbb{F}[X]$$ whose degree is at most $$t$$.

Above we use the notation $$\mathbb{F}[X]$$ to denote the field extension of $$\mathbb{F}$$. For intuition, this field extension contains every polynomial of the form $$f(X) = \sum_{j = 0}^{\deg(f)} c_j \cdot X^j$$ where $$\deg(f)$$ is the degree of $$f(X)$$ and $$c_j \in \mathbb{F}$$ for all $$j \in \{0, \dots, \deg(f)\}$$. Hence, with the algorithm $$\mathsf{Setup}$$ on input $$t$$, we can assume that it allows to commit to any polynomial $$f$$ satisfying $$\deg(f) \leq t$$.

We may have a question about what will happen if we try to use $$ck$$ to commit to a polynomial whose degree is larger than $$t$$. In this case, the execution and correctness of the algorithms below or the security of the scheme are not guaranteed.

• $$\mathsf{Commit}\left(ck, f(X)\right):$$ On inputs commitment key $$ck$$ and polynomial $$f(X) \in \mathbb{F}[X]$$, this algorithm returns a commitment $$c$$ and an opening (or decommitment) $$d$$. We note that $$f(X)$$ here is recommended to have degree at most $$t$$ with respect to $$ck$$ output by $$\mathsf{Setup}(1^\kappa, t)$$.

• $$\mathsf{VerifyPoly}(ck, f(X), c, d):$$ On inputs commitment key $$ck$$, polynomial $$f(X) \in \mathbb{F}[X]$$, commitment $$c$$ and opening $$d$$, this algorithm deterministically returns a bit $$b \in \{0, 1\}$$ to specify whether $$c$$ is a correct commitment to $$f(X)$$. If $$c$$ is such a correct commitment, then $$b = 1$$. Otherwise, $$b = 0$$.

At this moment, we may wonder why $$\mathsf{Commit}$$ does output both $$c, d$$ and $$\mathsf{VerifyPoly}$$ does use both $$c, d$$. In fact, when participating in an interactive protocol, one party may commit to some secret by exposing commitment $$c$$ to other parties. This commitment $$c$$ guarantees that the secret behind, namely, polynomial $$f(X)$$ in this case, is still secret, guaranteed the hiding property to be discussed later.

On the other hand, since we abuse the word commit, it means that the party publishing $$c$$ has only one opening, namely, $$d$$, to show that $$c$$ is a correct commitment to $$f(X)$$. It is extremely hard for this party to show that $$c$$ is correct commitment to some other polynomial $$f'(X) \not= f(X)$$. This is guaranteed by the binding property of the polynomial commitment scheme, to be discussed later.

• $$\mathsf{CreateWitness}(ck, f(X), i, d):$$ On inputs commitment key $$ck$$, polynomial $$f(X)$$, index $$i$$ and opening $$d$$, this algorithm returns a witness $$w_i$$ to ensure that $$c$$, related to opening $$d$$, is commitment to $$f(X)$$ whose evaluation at index $$i$$ is equal to $$f(i)$$.

Let us explain in detail the use of $$\mathsf{CreateWitness}$$. Assume that a party published $$c$$ which is a commitment to $$f(X)$$. It then publishes a point $$(i, v)$$ and claims that $$f(i) = v$$ without exposing $$f(X)$$. By using the algorithm $$\mathsf{VerifyEval}$$ defined below, this claim is assured if $$f(i)$$ is actually equal to $$v$$. Moreover, for all other indices $$i'$$ satisfying $$i' \not= i$$, if $$f(i')$$ has not been opened before, then $$f(i')$$ is unknown to any party who does not know $$f(X)$$.

We also remark that, from basic algebra, if we know evaluations at $$\deg(f) + 1$$ distinct indices, we can recover the polynomial $$f(X)$$. Therefore, the above claim assumes that other parties do not know up to $$\deg(f) + 1$$ distinct evaluations.

• $$\mathsf{VerifyEval}(ck, c, i, v, w_i):$$ On inputs commitment key $$ck$$, commitment $$c$$, index $$i$$, evaluation $$v$$ and witness $$w_i$$, this algorithm returns $$\{0, 1\}$$ deciding whether $$c$$ is a commitment key $$f(X)$$ satisfying $$f(i) = v$$.

# Security Requirements

In this section, we briefly discuss the security requirement for polynomial commitment schemes.

A polynomial commitment scheme is said to be secure if it satisfies the following properties:

• Correctness. The correctness property says that if the scheme is executed honestly then the verification algorithms, namely, $$\mathsf{VerifyPoly}$$ and $$\mathsf{VerifyEval}$$, always returns $$1$$. In particular, assume that $$ck \leftarrow \mathsf{Setup}(1^\kappa, t)$$ and $$(c, d) \leftarrow \mathsf{Commit}(ck, f(X))$$. Then, $$\mathsf{VerifyPoly}(ck, f(X), c, d)$$ always returns $$1$$. And, for any $$w_i$$ output by $$\mathsf{CreateWitness}(ck, f(i), i, d)$$, algorithm $$\mathsf{VerifyEval}(ck, c, i, f(i), w_i)$$ always returns $$1$$.

• Polynomial Binding. For a given commitment key $$ck$$ output by $$\mathsf{Setup}(1^\lambda, t)$$, this property says that it is hard to output commitment $$c$$ and two tuples $$(f(X), d)$$ and $$(f'(X), d')$$ such that $$f(X)$$ and $$f'(X)$$ are distinct and have degrees at most $$t$$, and $$c$$ is commitment to both $$f(X)$$ and $$f'(X)$$ with respect to openings $$d$$ and $$d'$$, respectively. More precisely, it is hard to make $$\mathsf{VerifyPoly}(ck, f(X), c, d) = 1$$ and $$\mathsf{VerifyPoly}(ck, f'(X), c, d') = 1$$ if $$f(X) \not= f'(X)$$, $$\deg(f) \leq t$$ and $$\deg(f') \leq t$$.

• Evaluation Binding. The evaluation binding property says that a committed polynomial evaluating on an index $$i$$ cannot produce two different outcomes $$v$$ and $$v'$$. More precisely, an adversary cannot provide an index $$i$$ and $$2$$ tuples $$(v, w_i)$$ and $$(v', w'_i)$$ satisfying $$v \not= v'$$ such that $$\mathsf{VerifyEval}(ck, c, i, v, w_i) = 1$$ and $$\mathsf{VerifyEval}(ck, c, i, v', w'_i) = 1$$.

• Hiding. An adversary $$\mathcal{A}$$ who knows at most $$\deg(f)$$ evaluations of a committed polynomial $$f(X)$$ cannot determine any evaluation that it does not know before.

We remind that knowing $$\deg(f) + 1$$ different evaluations helps to determine polynomial, and hence all other evaluations. Therefore, we use the bound $$\deg(f)$$ here as a maxmimum number of evaluations that the adversary is allowed to know in order not to correctly obtain the evaluations of all other indices.

# Technical Overview

From the syntax of polynomial commitment scheme presented in Polynomial Commitment Scheme - Syntax, a realization, or equivalently, a construction, can be separated into $$2$$ components:

• Commiting polynomial includes the algorithms $$\mathsf{Commit}$$ and $$\mathsf{VerifyPoly}$$.
• Evaluating polynomial includes the algorithms $$\mathsf{CreateWitness}$$ and $$\mathsf{VerifyEval}$$.

Based on those components, we present the high level idea of $$2$$ constructions, namely, conditional and unconditional, of polynomial commitment scheme separated into $$3$$ little parts. The first and second parts, to be presented in Commitment to Polynomial Without Hiding Property and Correct Evaluation from the Commitment, respectively, focus on constructing the realization of conditional version. And, in the third part, to be presented in Dealing with Hiding, regarding condition and unconditional hidings, we discuss the modification of conditional version to achieve the unconditional one.

# Commitment to Polynomial Without Hiding Property

In the construction of [KZG10], the main idea to commit a polynomial is to evaluate it on a secret index. For example, assume that $$f(X) = a_0 + a_1 X + \dots + a_d X^d \in \mathbb{F}[X]$$. The secret index can be thought of as some value $$x$$ that the committer does not know. So, how can committer evaluate $$f(x)$$ on that secret index without any knowledge about it? In fact, cryptography can magically help you do that. For instance, by putting $$1, x, x^2, \dots, x^n$$ into the form of powers to some base element $$g$$, e.g., $$g^1, g^x, g^{x^2}, \dots, g^d$$, it helps to hide those values $$1, x, x^2, \dots, x^d$$. Moreover, it alows you to evaluate $$g^{f(x)}$$ as desired by computing $$(g^1)^{a_0} \cdot (g^x)^{a_1} \cdot (g^{x^2})^{a_2} \cdot \dots \cdot (g^{x^d})^{a_d} = g^{a_0 + a_1x + \dots a_d x^d} = g^{f(x)}.$$ Thus, $$g^{f(x)}$$ is computed without any knowledge about $$x$$. Hence, that is whatever the committer needs to do in the commit step, namely, executing the algorithm $$\textsf{Commit}$$ to output the commitment $$g^{f(x)}$$. So the commitment key $$ck$$ for this algorithm is the hidden indices wrapped under powers of $$g$$, namely, the set sequence $$(g^1, g^{x}, g^{x^2}, \dots, g^{x^d})$$. And, therefore, $$(g^1, g^{x}, g^{x^2}, \dots, g^{x^d})$$ is also the output of the algorithm $$\textsf{Setup}$$. At this point, we might think about a few things:

1. How to verify the commitment $$c = g^{f(x)}$$ by executing $$\textsf{VerifyPoly}$$.
2. How to guarantee that the commitment satisfies the polynomial binding property.

For the first question, to verify $$c$$, the decommitment of the construction is $$f(X)$$ itself. Committer simply send the entire polynomial $$f(X)$$ to verifier, namely, by sending coefficients $$a_0, \dots, a_d$$. Having the polynomial and the commitment key $$ck$$, the verifier can check easily by repeating steps similar to the algorithm $$\textsf{Commit}$$.

For the second question, to show the satisfaction of binding property, we can assume that the committer is able to break the binding property by introducing another polynomial $$f'(X) = a'_0 + a'_1X + \dots + a'_dX^d$$ where $$f'(X) \not= f(X)$$. So we have $$g^{f(x)} = c = g^{f'(x)}$$.

# Correct Evaluation from the Commitment

Warning. This part explains by the use of algebra. You may skip if you feel it is complicated.

For an index $$i$$ given to the committer, since committer knows $$f(X)$$, he can compute $$f(i)$$ definitely. The $$\mathsf{CreateWitness}$$ algorithm is constructed based on the fact that $$X - i$$ divides $$f(X) - f(i)$$. At this point, there is something difficult to realize here since it regards to the use of algebra. However, we know that $$f(i)$$ is the output of $$f(X)$$ on input $$i$$. Therefore, we see that $$i$$ is among the roots of $$g(X) = f(X) - f(i)$$, i.e., $$g(i) = 0$$ which says that $$i$$ is a root of $$g(X)$$. Therefore, $$X - i$$ divides $$f(X) - f(i)$$. Hence, to guarantee the evaluation binding property, committer needs to show that $$f(X) - f(i)$$ is divisible by $$X - i$$.

Example. Consider polynomial $$f(X) = 6X^3 + 25X^2 + 16X + 19$$ in $$\mathbb{Z}_{31}$$. Let $$i = 28$$. Then, $$f(28) = 151779 = 3$$ in $$\mathbb{Z} _{31}$$. Hence, $$X - 28 = X + 3$$ divides $$f(X) - 3$$. In fact, $$f(X) - 3 = 6X^3 + 25X^2 + 16X + 16 = (3X + 5)(2X + 30)(X + 3)\text{ in } \mathbb{Z} _{31}.$$ It is obvious that $$X + 3$$ is a factor of $$f(X) - 3$$.

Equivalently, we can say that $$v = f(i)$$ if and only if $$X - i$$ divides $$f(X) - f(i)$$.

To show such divisibility holds, we can compute $$\psi(X) = \frac{f(X) - v_i}{X - i}$$, where $$v_i$$ is assumed to be $$f(i)$$, and define witness $$w_i = g^{\psi(x)}$$ by using $$g^1, g^x, \dots, g^{x^d}$$ output by the algorithm $$\textsf{Setup}$$.

At this point, for the verifier to verify, committer needs to show that $$\psi(x) \cdot (x - i) + v_i = f(x)$$. Let's closely take a look at this formula. We observe the followings:

1. No one knows $$x$$. Hence, $$\psi(x)$$ is not known to anyone.
2. Committer and verifier know $$g^{\psi(x)}$$ which is equal to the commitment $$c$$. Moreover, they also know $$g^x, g^i, g^{v_i}$$ since $$g^x$$ belongs to commitment key $$ck$$, $$i$$ is public and $$v_i$$ is publicly claimed by committer.
3. Verifier can easily compute $$g^{x - i} = g^x / g^i$$.

Clearly, having $$g^{\psi(x)},g^{x - i}, g^{v_i}$$ and $$g^{f(x)}$$, we do not know any efficient way to compute $$g^{\psi(x)\cdot (x - i) + v_i}$$ since computing $$g^{\psi(x)\cdot (x-i)}$$ is hard due to Diffie-Hellman assumption.

## Using Bilinear Pairing to Handle Correct Multiplications

Recall the bilinear pairing $$e : \mathbb{G}\times \mathbb{G} \to \mathbb{G}_T$$ where $$\mathbb{G}$$ and $$\mathbb{G}_T$$ are some groups of the same cardinality. This bilinear pairing has $$2$$ properties: bilinearity and non-degeneracy. However, to avoid confusion, we only care about the bilinearity and temporarily skip the notice to non-degeneracy.

• Bilinearity. For $$g \in \mathbb{G}$$ and $$g_T \in \mathbb{G}_T$$, $$e(g^a, g^b) = e(g, g)^{ab}$$ for any $$a, b \in \mathbb{Z}_p$$ where $$p=|\mathbb{G}|$$.

The validity of the witness can be check easily by using pairing, namely, $$e(w_i, g^x / g^i)\cdot e(g,g)^{v_i} \stackrel{?}{=}e(c, g),$$ where $$c$$ is the commitment to $$f(X)$$. If the above identity holds, with non-negligible probability, it says that $$v_i = f(i)$$.

To show identity implies $$v_i = f(i)$$ with non-negligible probability, we consider $$w_i = g^{\psi(x)} = g^{\frac{f(x) - v_i}{x - i}}$$. Hence, \begin{align} e(w_i, g^x / g^i)\cdot e(g, g)^{v_i} &= e\left(g^{\frac{f(x) - v_i}{x - i}}, g^x / g^i\right)\cdot e(g, g)^{v_i}\\ &= e(g, g)^{f(x) - v_i}\cdot e(g, g)^{v_i} = e(g^{f(x)}, g) = e(c, g). \end{align}

Notice that, if the person providing $$w_i$$ and $$v_i$$ does not know $$f(X)$$, then we have the following possibilities:

• This person correctly guesses $$w_i$$ and $$v_i$$. This happens with negligible probability if we assumes that field cardinality, namely, number of elements in field $$\mathbb{F}$$, is large.
• The person incorrectly provides $$w_i$$ and $$v_i$$. Specificially, either $$v_i$$ is not equal to $$f(i)$$ or $$w_i$$ is incorrect. Assume that $$w_i = g^{h_i}$$. This case happens when $$x$$ is the root of $$h_i\cdot (X - i) \cdot v_i = f(X)$$. By Schwartz–Zippel lemma, this case also happens with negligible probability if the field cardinality is large and that person does not know $$x$$, as $$x$$ at the beginning was assumed to be hidden index.

# Dealing with Hiding

In the previous sections, namely, Commitment to Polynomial Without Hiding Property and Correct Evaluation from the Commitment, we discussed the high level idea of the construction of algorithms as well as the polynomial and evaluation binding properties. One remaining thing is the hiding property. In [KZG10], the authors proposed $$2$$ constructions from discrete logarithm assumption, for conditional hiding, and Pedersen commitment, for unconditional hiding.

Remark. We now make clear the $$2$$ notions here, namely, conditional hiding and unconditional hiding.

Conditional hiding of a commitment $$c$$ to a polynomial $$f(X)$$ is the property protecting the polynomial $$f(X)$$ from being compromised with a condition that some assumption employed is assumed to be hard. Usually, the hardness of the assumption is against probabilistic polynomial-time adversaries. Here, probabilistic polynomial-time adversaries stand for the machines that attack the scheme with limited amount of time, and this amount of time is upper-bounded by a polynomial on input the security parameter given to the scheme. Probabilistic polynomial time is a notion in the theory of computation. If you would like to know more about the detail, we prefer to check some textbooks. For example, we prefer [Sipser2012-introduction-to-theory-of-computation] in this blog.

On the other hand, unconditional hiding means that we cannot extract any information about the polynomial behind. For example, if $$f(X) = a_0 + a_1X + \dots + a_dX^d$$ and $$r(X) = r_0 + r_1X + \dots + r_dX^d$$, given that $$r_0, \dots, r_d$$ are independently and uniformly sampled from $$\mathbb{F}$$, then $$f(X) + r(X) = (a_0 + r_0) + (a_1 + r_1)X + \dots + (a_d + r_d)X^d$$ completely hides $$f(X)$$ since $$a_0 + r_0, a_1 + r_1, \dots, a_d + r_d$$ are uniform in $$\mathbb{F}$$.

## Conditional hiding from discrete logarithm assumption

The former, namely, discrete logarithm assumption, guarantees the hiding by its own hardness assumption. In particular, given $$g$$ and $$g^v$$ for some secret integer $$v$$, there is no way to extract any piece of information about $$v$$ from $$g$$ and $$g^v$$ due to hardness of discrete logarithm problem. Therefore, the hiding here is conditional, i.e., discrete logarithm assumption is hard.

## Unconditional hiding from Pedersen commitment

The latter, namely, using Pedersen commitment, exploits the use of an additional part to achieve unconditional hiding property, which is secure against powerful adversaries and not limited to PPT ones. Roughly speaking, the external part can be informally thought of as a commitment to a random polynomial with conditional hiding. To perform this, we extend the commitment key $$ck$$ to contain additional elements $$h^1, h^x, \dots, h^{x^d}$$ where $$h$$ is another generator of $$\mathbb{G}$$ different from $$g$$. Hence, the commitment key $$ck$$ now is the tuple $$(g^1, \dots, g^{x^d}, h^1, \dots, h^{x^d})$$. Therefore, whenever committing to a polynomial $$f(X) = a_0 + a_1X + \dots + a_dX^d$$, we additionally sample a polynomial $$r(X) = r_0 + r_1X + \dots + r_dX^d\in \mathbb{F}[X]$$. The sampling process can be conducted easily by sampling each $$r_i$$ uniformly and independently from $$\mathbb{Z}_{|F|} = \{0, \dots, |F| - 1\}$$.

The algorithm $$\textsf{Commit}$$ now can be evaluated by computing $$c = \prod_{i = 0}^d (g^{x^i})^{a_i} \cdot \prod_{i = 0}^d (h^{x^i})^{r_i} = g^{f(x)}\cdot h^{r(x)},$$ namely, the original conditional-hiding commitment to $$f(X)$$ multiplying with the condition-hiding commitment to random polynomial $$r(X)$$ becomes an unconditional commitment $$c$$ where auxiliary information $$\textsf{aux}$$ can be set to be the tuple $$(f(X), r(X))$$. Hence, now, the adversary knows nothing about the evaluations of any index in $$\mathbb{F}$$. We can see clearly that the commitment $$c$$ hides $$f(X)$$ unconditionally since $$r_0$$ is chosen uniformly from $$\mathbb{Z}_{|\mathbb{F}|}$$ and, hence, $$(h^{x^0})^{r_0}$$ is uniform in $$\mathbb{F}$$. It also implies that $$c$$ is uniform in $$\mathbb{F}$$.

Since $$c = g^{f(x)}\cdot h^{r(x)}$$, we can say that $$c$$ is the multiplication of two parts, namely, the message part $$g^{f(x)}$$ and the randomness part $$h^{r(x)}$$.

We now discuss how algorithms $$\textsf{CreateWitness}$$ and $$\textsf{VerifyEval}$$ work with respect to introductory of the additional part, namely, $$h^{r(x)}$$.

### Creating witness in unconditional hiding mode

For a given index $$i$$, the witness output by algorithm $$\textsf{CreateWitness}$$ is also a multiplication of $$2$$ parts. We simply call the message evaluation and randomness evaluation parts.

The message evaluation part is computed identically to the conditional version of the commitment scheme. That is, we compute the formula $$g^{\psi(x)}$$ where $$\psi(X) = \frac{f(X) - f(i)}{X - i}$$.

The randomness evaluation part is also conducted similarly. Notice that, since we employ $$r(X)$$ as a polynomial of degree $$d$$, we can compute witness for the correct evaluation on the same index $$i$$, namely, $$r(i)$$. This randomness evaluation part is equal to $$h^{\varphi(x)}$$ where $$\varphi(X) = \frac{r(X) - r(i)}{X - i}$$.

Remark. As previously said, $$x$$ is unknown to committer. Therefore, the computation of $$g^{\psi(x)}$$ and $$h^{\varphi(x)}$$ must depend on the commitment key $$ck$$ by employing the elements $$g^1, \dots, g^{x^{d - 1}}, h^1, \dots, h^{x^{d - 1}}$$. Notice that we do not you $$g^{x^d}$$ and $$h^{x^d}$$ since $$\psi(X)$$ and $$\varphi(X)$$ are polynomials of degrees at most $$d - 1$$.

The output of algorithm $$\textsf{CreateWitness}$$ is then equal to $$w_i = (w^\star_i, s_i)$$ where $$w^\star_i = g^{\psi(x)} \cdot h^{\varphi(x)}$$ is the multiplication of message evaluation and randomness evaluation parts and $$s_i = r(i)$$. Notice that $$r(i)$$ is attached to witness in order to help the evaluation verification algorithm, to be described below, work.

### Verifying correct evaluation in unconditional mode

The evaluation verification algorithm $$\textsf{VerifyEval}$$ receives as inputs the commitment key $$ck = (g^1, \dots, g^{x^d}, h^1, \dots, h^{x^d})$$, commitment $$c$$ to $$f(X)$$, index $$i$$, value $$v_i \in \mathbb{F}$$ assumed to be equal to $$f(i)$$ and $$w_i = (w^\star_i, s_i)$$. This algorithm is expected to return $$1$$ is $$v_i$$ is equal to $$f(i)$$ with the use of associated witness $$w_i$$.

To verify whether $$v_i = f(i)$$, it is worth to verify both correct computations of $$f(i)$$ and $$r(i)$$. More precisely, verifier needs to confirm that $$v_i = f(i)$$ and $$s_i = r(i)$$. To this end, again, we imitate the verification process of the conditional hiding case. Hence, we employ again bilinear pairing $$e : \mathbb{G}\times \mathbb{G} \to \mathbb{G}_T$$ which maps $$(g^a, g^b)$$ to $$e(g, g)^{ab}$$. However, there is an obstacle here. Observe that, in this pairing, we use only $$1$$ generator $$g$$ while our commitment employs $$2$$ different generators $$g$$ and $$h$$ both generating $$G$$. In order to enable the bilinear pairing, we enforce $$h = g^\gamma$$ for some hidden $$\gamma$$. This enforcement works because $$g$$ is the generator of $$\mathbb{G}$$ and $$h$$ belongs to $$\mathbb{G}$$. Hence, our commitment $$c$$ can be alternatively written as $$c = g^{f(x)}\cdot h^{r(x)} = g^{f(x)}\cdot g^{\gamma \cdot r(x)} = g^{f(x) + \gamma\cdot r(x)}$$ which is a conditional commitment to $$f(X) + \gamma\cdot r(X)$$ with $$\gamma$$ unknown to both committer and verifier.

Moreover, the witness $$w^\star_i = g^{\psi(x)}\cdot h^{\varphi(x)}$$ can also be interpreted as $$w^\star_i = g^{\psi(x)}\cdot h^{\varphi(x)}=g^{\psi(x)} \cdot g^{\gamma\cdot \varphi(x)} = g^{\psi(x) + \gamma\cdot \varphi(x)}$$ which is also a witness for correct evaluation at index $$i$$ with respect to polynomial $$f(X) + \gamma\cdot r(X)$$ whose $$\gamma$$ is not known to both parties, namely, committer and verifier.

We now observe that $$\psi(X) + \gamma\cdot \varphi(X) = \frac{f(X) - f(i)}{X - i} + \gamma\cdot \frac{r(X) - r(i)}{X - i}$$. Hence, it is worth to guarantee the following equation holds: $$\left(\frac{f(X) - f(i)}{X - i} + \gamma\cdot \frac{r(X) - r(i)}{X - i}\right)\cdot\left(X - i\right) - (f(i) + \gamma\cdot r(i)) = f(X) + \gamma \cdot r(X).$$

We now describe the process for verifying the above equation by employing the bilinear pairing function $$g : \mathbb{G}\times \mathbb{G} \to \mathbb{G}_T$$. Since the above equation has a multiplication, we apply the bilinear pairing by checking $$e\left(g^{\frac{f(x) - v_i}{x - i} + \gamma\cdot \frac{r(x) - s_i}{x - i}}, g^{x - i}\right)\cdot e\left(g^{-\left(v_i + \gamma\cdot s_i\right)}, g\right) = e\left(g^{f(x) + \gamma\cdot r(x)},g\right)$$ where $$x$$ and $$\gamma$$ are unknown to both parties. Since $$x$$ and $$\gamma$$ are not known to both parties, it is inconvenient to evaluate the bilinear pairing function. However, since $$ck = (g^1, \dots, g^{x^d}, h^1, \dots, h^{x^d})$$ and $$h = g^\gamma$$ are public inputs, we can replace the appearances of $$x$$ and $$\gamma$$ in the above equation by those of $$ck$$ and $$h$$. The above computation of bilinear pairing hence becomes $$e\left(w^\star, g^x / g^i\right)\cdot e\left(g^{-v_i}\cdot h^{-s_i}, g\right) = e(c, g)$$ since $$c = g^{f(x)}\cdot h^{r(x)} = g^{f(x) + \gamma\cdot r(x)}$$.

# PlonK

In this chapter, we will present the construction of [GWC19], i.e., permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge.

PlonK is a succinct non-interactive zero-knowledge argument (SNARK) system that proves the correct execution of a program, i.e., in this case, an arithmetic circuit with only addition $$(+)$$ and multiplication $$(\cdot)$$ operations.

As an overview of the construction, we separate it into $$2$$ parts. First, we transform the arithmetic circuit into a set of constraints, called arithmetization and represented under some form of polynomials. Then, by applying some proof technique, it compiles the arithmetization into the proof.

Author(s): khaihanhtang

# Plonk's Arithmetization

PlonK's arithmetization [GWC19] breaks the circuit into a batch of gates, namely, multiplications, additions, multiplications with constants and additions with constants. For each gate, the operation is transformed into a unified form with respective selectors, uniquely determined by the gate without assigned inputs and output. On the other hand, since breaking circuit into gates introduces the inconsistencies among wires, we additionally apply copy constraints to wires to guarantee that such inconsistencies unavailable.

We describe circuit specification in Circuit Specification. Then, we discuss how to break the circuit into gates and label wires in Breaking Circuit. Then we present unified form of gate constraints in Gate Constraints and handling copy constraints in Copy Constraints.

# Circuit Specification

Let $$\mathbb{F}$$ be a finite field. In this section, we describe the arithmetic circuit whose operations are over $$\mathbb{F}$$.

Let $$\ell_{\mathsf{in}} \in \mathbb{N}$$ be the number of input wires of the circuit $$\mathcal{C}$$. Assume that $$\mathcal{C}$$ has exactly $$n$$ gates. Each gate takes at most $$2$$ wires as inputs and returns $$1$$ output wires. In particular,

• Addition and multiplications gates takes $$2$$ inputs and return $$1$$ output.
• Gates of additions and multiplications with constants take $$1$$ input and return $$1$$ output.

Let's take a look at the following example.

Assume that $$f(u, v) = u^2 + 3uv + v + 5$$. Then the sequence are arranged in the following constraints, wrapped as below. $$\begin{cases} z_1 = u \cdot u &\text{(multiplication)},\\ z_2 = u \cdot v &\text{(multiplication)},\\ z_3 = z_2 \cdot 3 &\text{(multiplication with constant)},\\ z_4 = z_1 + z_3 &\text{(addition)},\\ z_5 = z_4 + v &\text{(addition)},\\ z_6 = z_5 + 5 &\text{(addition with constant)}. \end{cases}$$ The input size is $$\ell_{\mathsf{in}} = 2$$ for variables $$u, v$$ and the output is $$z_6$$.

# Breaking Circuit

To break the circuit into gates with wires separated, namely, no wire involves to $$2$$ or more gates, we use a set of labels $$\mathcal{I} = \{a_1, \dots, a_n, b_1, \dots, b_n, c_1, \dots, c_n\}$$ to denote the wire label of each gate. Let $$x : \mathcal{I} \to \mathbb{F}$$ be the function mapping wires to their respective wire values. Hence, $$x(id)$$ represents the value at wire $$id \in \mathcal{I}$$ For simplicity, we write $$x_{id}$$, in place of $$x(id)$$, for any $$id \in \mathcal{I}$$.

Specifically, for each $$i \in \{1, \dots, n\}$$, we denote by

• $$x_{c_i} = x_{a_i}\circ x_{b_i}$$ to denote the computation where $$\circ$$ is either addition or multiplication.
• $$x_{c_i} = x_{a_i}\circ c$$ to denote the computation where $$\circ$$ is either addition or multiplication with constant and $$c$$ is the constant.

At this point, we may be curious what value $$x_{b_i}$$ takes if $$\circ$$ is operation with constant. In fact, in this case x_{b_i} can be seen as garbage and can take any value from $$\mathbb{F}$$. This value will affect neither security nor correctness of the system since the constraints in PlonK's arithmetization guarantee that such a compromise will not happen.

Let's take a look at the following example, taken from Circuit Specification.

We have $$f(u, v) = u^2 + 3uv + v + 5$$ and the constraints below. $$\begin{cases} z_1 = u \cdot u &\text{(multiplication)},\\ z_2 = u \cdot v &\text{(multiplication)},\\ z_3 = z_2 \cdot 3 &\text{(multiplication with constant)},\\ z_4 = z_1 + z_3 &\text{(addition)},\\ z_5 = z_4 + v &\text{(addition)},\\ z_6 = z_5 + 5 &\text{(addition with constant)}. \end{cases}$$ By breaking circuit, we have the following constraints with respect to the above format, namely, using $$\mathcal{I} = \{a_1, \dots, a_6, b_1, \dots, b_6, c_1, \dots, c_6\}$$, where $$n = 6$$, and $$x : \mathcal{I}\to \mathbb{F}$$. $$\begin{cases} x_{c_1} = x_{a_1} \cdot x_{b_1},\\ x_{c_2} = x_{a_2} \cdot x_{b_2},\\ x_{c_3} = x_{a_3} \cdot 3,\\ x_{c_4} = x_{a_4} + x_{b_4},\\ x_{c_5} = x_{a_5} + x_{b_5}, \\ x_{c_6} = x_{a_6} + 5 \end{cases}\text{ where } \begin{cases} u = x_{a_1} = x_{a_2} = x_{b_1},\\ v = x_{b_2} = x_{b_5},\\ z_1 = x_{a_4} = x_{c_1},\\ z_2 = x_{a_3} = x_{c_2},\\ z_3 = x_{b_4} = x_{c_3},\\ z_4 = x_{a_5} = x_{c_4},\\ z_5 = x_{a_6} = x_{c_5},\\ z_6 = x_{c_6}.\ \end{cases}$$ Notice that $$v_{b_3}$$ and $$v_{b_6}$$ are dismissed. These values can be any elements from \mathbb{F} and do not have any effect to the arithmetization.

For equations in the system guaranteeing the correct computation of the operation, we call them gate constraints.

For equations in the system guaranteeing the equalities, or consistencies, among wires, we call them copy constraints.

We first discuss the transformation of gate constraints into a common unified form with publicly determined selectors in Gate Constraints. Then, we discuss the method for guaranteeing copy constraints in Copy Constraints.

# Gate constraints

At this stage, for each $$i \in \{1,\dots, n\}$$, we need to transform the computation of each gate to a unified form as follows: $$q^O_i \cdot x_{c_i} + q^L_i \cdot x_{a_i} + q^R_i \cdot x_{b_i} + q^M_i \cdot (x_{a_i} \cdot x_{b_i}) + q^C_i = 0$$ where $$q_i^O, q_i^L, q_i^R, q_i^M, q_i^C$$ are selectors uniquely determined by the corresponding gate. In particular,

• For addition gate, $$(q_i^O, q_i^L, q_i^R, q_i^M, q_i^C) = (-1, 1, 1, 0, 0)$$ since $$(-1) \cdot x_{c_i} + 1 \cdot x_{a_i} + 1 \cdot x_{b_i} + 0 \cdot (x_{a_i} \cdot x_{b_i}) + 0 = 0$$ is equivalent to $$x_{c_i} = x_{a_i} + x_{b_i}$$.

• For multiplication gate, $$(q_i^O, q_i^L, q_i^R, q_i^M, q_i^C) = (-1, 0, 0, 1, 0)$$ since $$(-1) \cdot x_{c_i} + 0 \cdot x_{a_i} + 0 \cdot x_{b_i} + 1 \cdot (x_{a_i} \cdot x_{b_i}) + 0 = 0$$ is equivalent to $$x_{c_i} = x_{a_i} \cdot x_{b_i}$$.

• For gate of addition with constant, $$(q_i^O, q_i^L, q_i^R, q_i^M, q_i^C) = (-1, 1, 0, 0, c)$$ since $$(-1) \cdot x_{c_i} + 1 \cdot x_{a_i} + 0 \cdot x_{b_i} + 0 \cdot (x_{a_i} \cdot x_{b_i}) + c = 0$$ is equivalent to $$x_{c_i} = x_{a_i} + c$$.

• For gate of multiplication with constant, $$(q_i^O, q_i^L, q_i^R, q_i^M, q_i^C) = (-1, c, 0, 0, 0)$$ since $$(-1) \cdot x_{c_i} + c \cdot x_{a_i} + 0 \cdot x_{b_i} + 0 \cdot (x_{a_i} \cdot x_{b_i}) + 0 = 0$$ is equivalent to $$x_{c_i} = x_{a_i} \cdot c$$.

We now take a look at the example achieved above, i.e., $$\begin{cases} x_{c_1} = x_{a_1} \cdot x_{b_1},\\ x_{c_2} = x_{a_2} \cdot x_{b_2},\\ x_{c_3} = x_{a_3} \cdot 3,\\ x_{c_4} = x_{a_4} + x_{b_4},\\ x_{c_5} = x_{a_5} + x_{b_5}, \\ x_{c_6} = x_{a_6} + 5. \end{cases}$$ In this example, we can transform the above system of equation into the unified form as follows: $$\begin{cases} (-1) \cdot x_{c_1} + 0 \cdot x_{a_1} + 0 \cdot x_{b_1} + 1 \cdot (x_{a_1} \cdot x_{b_1}) + 0 = 0,\\ (-1) \cdot x_{c_2} + 0 \cdot x_{a_2} + 0 \cdot x_{b_2} + 1 \cdot (x_{a_2} \cdot x_{b_2}) + 0 = 0,\\ (-1) \cdot x_{c_3} + 3 \cdot x_{a_3} + 0 \cdot x_{b_3} + 0 \cdot (x_{a_3} \cdot x_{b_3}) + 0 = 0,\\ (-1) \cdot x_{c_4} + 1 \cdot x_{a_4} + 1 \cdot x_{b_4} + 0 \cdot (x_{a_4} \cdot x_{b_4}) + 0 = 0,\\ (-1) \cdot x_{c_5} + 1 \cdot x_{a_5} + 1 \cdot x_{b_5} + 0 \cdot (x_{a_5} \cdot x_{b_5}) + 0 = 0,\\ (-1) \cdot x_{c_6} + 1 \cdot x_{a_6} + 0 \cdot x_{b_6} + 0 \cdot (x_{a_6} \cdot x_{b_6}) + 5 = 0. \end{cases}$$

# Copy Constraints

Recall that gate constraints do not enforce the equalities of wire values making inconsistencies across the circuit. We generalize copy constraints to the following problem.

Let $$k \in \{1, \dots, 3n\}$$ and $$\{i_1, \dots, i_k\} \subseteq \mathcal{I}$$ satisfying $$i_1 < i_2 < \dots < i_k$$. We would like to show that $$x_{i_1} = x_{i_2} = \dots = x_{i_k}.$$

The technique for proving this problem is tricky. We form the pairs of index-value and make them into a sequence $$\big((i_1, x_{i_1}), (i_2, x_{i_2}), \dots, (i_k, x_{i_k})\big).$$ It can be observed that if $$x_{i_1} = x_{i_2} = \dots = x_{i_k}$$, then, if we rotate the indices among the pairs, we achieve a sequence
$$\big((i_k, x_{i_1}), (i_1, x_{i_2}), \dots, (i_{k - 1}, x_{i_k})\big)$$ that is permutation of the previous sequence. Notice that we just rotated the indices $$1$$ step to the left and this is the recommended rotation. This fact helps imply the other direction of the fact. For more details, we use the following observation

Observation. $$\big((i_1, x_{i_1}), (i_2, x_{i_2}), \dots, (i_k, x_{i_k})\big)$$ is a permutation of $$\big((i_k, x_{i_1}), (i_1, x_{i_2}), \dots, (i_{k - 1}, x_{i_k})\big)$$ if and only if $$x_{i_1} = x_{i_2} = \dots = x_{i_k}$$.

Proof. The proof is as follows:

"$$\Leftarrow$$": This direction is straightforward.

"$$\Rightarrow$$": Since the two sequences are permutation of each other, for $$j \in \{1, \dots, k\}$$ we consider $$(i_j, x_{i_j})$$ from the first sequence. It can be seen that $$(i_j, x_{i_{(j \mod k) + 1}})$$ from the second sequence is the only one that is equal $$j \in \{1, \dots, k\}$$ since they share the unique prefix $$i_j$$. Hence, $$x_{i_j} = x_{i_{(j \mod k) + 1}}$$. Following this argument, we see that $$x_{i_1} = x_{i_2}, x_{i_2} = x_{i_3}, \dots, x_{i_{k - 1}} = x_{i_k}, x_{i_k} = x_{i_1}$$. Thus, $$x_{i_1} = x_{i_2} = \dots = x_{i_k}$$.

#### General paradigm for proving copy constraints of the entire circuit

Recall that $$x : \mathcal{I} \to \mathbb{F}$$. We can deterministically determine a partition of $$\mathcal{I}$$ such that $$\mathcal{I} = \bigcup_{j = 1}^{\ell_{\mathsf{in}} + n} \mathcal{I}_j$$

where $$\ell_{\mathsf{in}} + n$$ is the number of wires of the original circuits, namely, $$\ell_{\mathsf{in}}$$ input wires and $$n$$ output wires of all gates. Hence each subset $$\mathcal{I}_j$$ is the set of wire labels whose value are all equal to the same wire value of the original circuit. We hence obtain a rotation of indices for each subset $$\mathcal{I}_j$$. By unifying all those rotations, we achieve a permutation $$\sigma : \mathcal{I}\to\mathcal{I}$$ such that

$$\big((a_1, x_{a_1}), \dots, (a_n, x_{a_n}), (b_1, x_{b_1}), \dots, (b_n, x_{b_n}), (c_1, x_{c_1}), \dots, (c_n, x_{c_n})\big)$$

is a permutation of

$$\big((\sigma(a_1), x_{a_1}), \dots, (\sigma(a_n), x_{a_n}), (\sigma(b_1), x_{b_1}), \dots, (\sigma(b_n), x_{b_n}), (\sigma(c_1), x_{c_1}), \dots, (\sigma(c_n), x_{c_n})\big).$$ Such guaranteed permutation relation implies the consistencies among wires of the circuit.

Recall the example in Breaking Circuit with the following copy constraints. $$\begin{cases} u = x_{a_1} = x_{a_2} = x_{b_1},\\ v = x_{b_2} = x_{b_5},\\ z_1 = x_{a_4} = x_{c_1},\\ z_2 = x_{a_3} = x_{c_2},\\ z_3 = x_{b_4} = x_{c_3},\\ z_4 = x_{a_5} = x_{c_4},\\ z_5 = x_{a_6} = x_{c_5},\\ z_6 = x_{c_6}.\ \end{cases}$$ We achieve the partition $$\big\{\{a_1, a_2, b_1\}, \{b_2, b_5\}, \{a_4, c_1\}, \{a_3, c_2\}, \{b_4, c_3\}, \{a_5, c_4\}, \{a_6, c_5\}, \{c_6\}\big\}.$$ We hence can achive the permutation $$\sigma: \mathcal{I}\to\mathcal{I}$$ as follows: $$\begin{array}[ccc]\\ \sigma(a_1) = b_1, &\sigma(a_2) = a_1, &\sigma(b_1) = a_2,\\ \sigma(b_2) = b_5, &\sigma(b_5) = b_2,\\ \sigma(a_4) = c_1, &\sigma(c_1) = a_4,\\ \sigma(a_3) = c_2, &\sigma(c_2) = a_3,\\ \sigma(b_4) = c_3, &\sigma(c_3) = b_4,\\ \sigma(a_5) = c_4, &\sigma(c_4) = a_5,\\ \sigma(a_6) = c_5, &\sigma(c_5) = a_6,\\ \sigma(c_6) = c_6. \end{array}$$

# Halo 2 for Dummies

Halo 2 is succint non-interactive zero-knowledge argument of knowledge (zkSNARK) library for developing applications with an associated zkSNARK in order to prove their honesty in computing the programs. In this chapter, I present a simple implementation of a program, under the form of a $$2$$-variable polynomial, by using Halo 2.

Author(s): khaihanhtang

# PLONKish Arithemetization

We recommend readers who are not familiar with PlonK's arithmetization to read the article PlonK's Arithmetization. In this chapter, we further discuss a more customized version of PlonK's arithmetization, namely, PLONKish arithmetization. Customization aims to handle more general gates with more complicated structures rather than employing only multiplication, addition, multiplication with constants and addition with constants gates.

PLONKish arithmetization can be pictured as a table of the following types of columns:

• Constant columns for putting constants,
• Instant columns for putting public inputs,
• Advice columns for putting private inputs, literally known as witnesses,
• Selector columns for putting selectors.

For simplicity, in this section, we present a transformation from a program, represented by an arithmetic circuit, to the form of PLONKish arithmetization. This transformation can be showed by an example that proves the knowledge of a $$2$$-variable polynomial specified in A Simple Arithmetic Circuit. Then, we explain the method for transforming this polynomial to the form of PLONKish arithmetization in Transforming to PLONKish Arithmetization and programming in A Simple Halo 2 Program.

# A Simple Arithmetic Circuit

Let $$\mathbb{F}$$ be some finite field. We would like to show in this section the transformation from the program computing polynomial $$f(u, v) = u^2 + 3uv + v + 5$$ where $$u, v \in \mathbb{F}$$. With inputs $$u, v \in \mathbb{F}$$, the arithmetic circuit for this polynomial is equivalently represented by topologically following the sequence of computations below.

1. Compute $$u^2$$.
2. Compute $$uv$$.
3. Compute $$3uv$$.
4. Compute $$u^2 + 3uv$$.
5. Compute $$u^2 + 3uv + v$$.
6. Compute $$u^2 + 3uv + v + 5$$.

# Transforming to PLONKish Arithmetization

To setup the above sequence of computations, in A Simple Arithmetic Circuit, into PLONKish arithmetization, we specify a table to contain all possible variables appeared during the computing the arithmetic circuit for $$f(u,v) = u^2 + 3uv + v + 5$$. Then, for each row, we setup a the set gates, or equivalently gate constraints, that applies to the row.

## Specifying columns

We define the following tuple of columns $$(\mathsf{advice} _0, \mathsf{advice} _1, \mathsf{advice} _2, \mathsf{constant}, \mathsf{selector} _{\mathsf{add}}, \mathsf{selector} _{\mathsf{mul}}, \mathsf{selector} _{\mathsf{addc}}, \mathsf{selector} _{\mathsf{mulc}})$$ where

• $$\mathsf{advice}_0, \mathsf{advice}_1, \mathsf{advice}_2$$ are columns containing private inputs, i.e., values belonging to these columns are hidden to any verifier,
• $$\mathsf{constant}$$ is the column containing public constants appearing during the computation,
• $$\mathsf{selector} _{\mathsf{add}}, \mathsf{selector} _{\mathsf{mul}}, \mathsf{selector} _{\mathsf{addc}}, \mathsf{selector} _{\mathsf{mulc}}$$ contain public selectors corresponding to addition, multiplication, addition with constant, multiplication with constant, respectively, gates.

## Transforming to Constraints and Meaning of Columns

We now explain the intuition to the above setting of columns. To do this, we need to transform the sequence of computations in A Simple Arithmetic Circuit into $$2$$ parts, namely, gate constraints (or gate identities) and wire constraints (or wire identities). In particular, we transform \begin{align} t^{(1)} &= u^2, &t^{(3)} &= t^{(2)} \cdot 3 = 3uv, & t^{(5)} &= t^{(4)} + v = u^2 + 3uv + v,\\ t^{(2)} &= u v, & t^{(4)} &= t^{(1)} + t^{(3)} = u^2 + 3uv, &t^{(6)} &= t^{(5)} + 5 = u^2 + 3uv + v + 5 \end{align} to gate constraints

\begin{align} x ^{(1)} _{c} &= x ^{(1)} _{a} \cdot x ^{(1)} _{b}, & x ^{(3)} _{c} &= x ^{(3)} _{a} \cdot 3, & x ^{(5)} _{c} &= x ^{(5)} _{a} + x ^{(5)} _{b},\\ x ^{(2)} _{c} &= x ^{(2)} _{a} \cdot x ^{(2)} _{b}, & x ^{(4)} _{c} &= x ^{(4)} _{a} + x ^{(4)} _{b}, & x ^{(6)} _{c} &= x ^{(6)} _{a} + 5 \end{align}

and wire constraints

\begin{align} u &= x_{a}^{(1)} = x_{a}^{(2)} = x_{b}^{(1)}, &t^{(1)} &= x_{a}^{(4)} = x_{c}^{(1)}, &t^{(3)} &= x_{b}^{(4)} = x_{c}^{(3)}, &t^{(5)} &= x_{a}^{(6)} = x_{c}^{(5)},\\ v &= x_{b}^{(2)} = x_{b}^{(5)}, &t^{(2)} &= x_{a}^{(3)} = x_{c}^{(2)}, &t^{(4)} &= x_{a}^{(5)} = x_{c}^{(4)}, &t^{(6)} &= x_{c}^{(6)}. \end{align}

We note that $$x_b^{(3)}, x_b^{(6)}$$ are not set. To deal with these values, we simple set them to be equal to any random vales, since they do not affect the constraints defined above.

Notice that each equation in gate constrains receives either $$2$$ secret inputs or $$1$$ secret input and $$1$$ constant and returns $$1$$ secret output. Therefore, we use $$2$$ columns, namely, $$\mathsf{advice}_0$$ and $$\mathsf{advice}_1$$, to contain the secret inputs and $$1$$ column, namely, $$\mathsf{advice}_2$$, to contain the secret output. Moreover, we also use the column $$\mathsf{constant}$$ to contain possible public constants.

The remaining columns, namely, $$\mathsf{selector} _{\mathsf{add}}, \mathsf{selector} _{\mathsf{mul}}, \mathsf{selector} _{\mathsf{addc}}, \mathsf{selector} _{\mathsf{mulc}}$$, are employed to contain the selectors indicating the required constraints for each row, which corresponds to a constraint in the set of gate constraints specified above. We now clarify the employment of selectors to guarantee the correctness of gate constraints.

## Specifying Table Values and Gate Constraints

For each row $$i \in \{1, \dots, 6\}$$ of the table, we denote by the tuple $$(x_{a}^{(i)}, x_{b}^{(i)}, x_{c}^{(i)}, c^{(i)}, s_{\mathsf{add}}^{(i)}, s_{\mathsf{mul}}^{(i)}, s_{\mathsf{addc}}^{(i)}, s_{\mathsf{mulc}}^{(i)})$$ corresponding to the tuple of columns $$(\mathsf{advice} _0, \mathsf{advice} _1, \mathsf{advice} _2, \mathsf{constant}, \mathsf{selector} _{\mathsf{add}}, \mathsf{selector} _{\mathsf{mul}}, \mathsf{selector} _{\mathsf{addc}}, \mathsf{selector} _{\mathsf{mulc}})$$ devised above.

For each row $$i \in \{1,\dots, 6\}$$, we set the following $$4$$ constraints \begin{align} s_\mathsf{add}^{(i)}\cdot (x_{a}^{(i)} + x_{b}^{(i)} - x_{c}^{(i)}) &= 0, & s_\mathsf{mul}^{(i)}\cdot (x_{a}^{(i)} \cdot x_{b}^{(i)} - x_{c}^{(i)}) &= 0,\\ s_\mathsf{addc}^{(i)}\cdot (x_{a}^{(i)} + c^{(i)} - x_{c}^{(i)}) &= 0, & s_\mathsf{mulc}^{(i)}\cdot (x_{a}^{(i)} \cdot c^{(i)} - x_{c}^{(i)}) &= 0. \end{align}

Example. Assume that the $$i$$-th row corresponds to a multiplication gate. Hence, in this case, we set $$(s_{\mathsf{add}}^{(i)}, s_{\mathsf{mul}}^{(i)}, s_{\mathsf{addc}}^{(i)}, s_{\mathsf{mulc}}^{(i)}) = (0, 1, 0, 0)$$. We can see that only $$s_{\mathsf{mul}}^{(i)} = 1$$ while remaining selectors are set to $$0$$.

Therefore, whatever the values $$x_{a}^{(i)}, x_{b}^{(i)}, x_{c}^{(i)}, c^{(i)}$$ are set, the results always hold with respect to the gates $$s_\mathsf{add}^{(i)}\cdot (x_{a}^{(i)} + x_{b}^{(i)} - x_{c}^{(i)}) = 0, s_\mathsf{addc}^{(i)}\cdot (x_{a}^{(i)} + c^{(i)} - x_{c}^{(i)}) = 0$$ and $$s_\mathsf{mulc}^{(i)}\cdot (x_{a}^{(i)} \cdot c^{(i)} - x_{c}^{(i)}) = 0$$.

However, since $$s_{\mathsf{mul}}^{(i)} = 1$$, we can see that the gate $$s_\mathsf{mul}^{(i)}\cdot (x_{a}^{(i)} \cdot x_{b}^{(i)} - x_{c}^{(i)}) = 0$$ must guarantee $$x_{a}^{(i)} \cdot x_{b}^{(i)} = x_{c}^{(i)}$$.

# A Simple Halo 2 Program

Based on the specifications in Transforming to PLONKish Arithmetization, we now show an implementation in Rust for proving knowledge of input of $$f(u, v) = u^2 + 3uv + v + 5$$ mentioned in A Simple Arithmetic Circuit. The implementation can be found in Example of Halo2-PSE.

The following are a recommended for setting up a Rust implementation for Halo 2.

In Cargo.toml, specify the dependency

#![allow(unused)]
fn main() {
[dependencies]
halo2_proofs = { git = "https://github.com/privacy-scaling-explorations/halo2.git" }
}

In main.rs, we implement step-by-step as follows:

1. Specify columns by putting all possible columns into a custom struct MyConfig.
#![allow(unused)]
fn main() {
#[derive(Clone, Debug)]
struct MyConfig {
instance: Column<Instance>,
constant: Column<Fixed>,

// selectors
s_mul: Selector,
s_mul_c: Selector,
}
}
1. Define gates based on elements of the above defined struct.
#![allow(unused)]
fn main() {
impl<Field: FieldExt> FChip<Field> {
fn configure(
meta: &mut ConstraintSystem<Field>,
instance: Column<Instance>,
constant: Column<Fixed>,
) -> <Self as Chip<Field>>::Config {
// specify columns used for proving copy constraints
meta.enable_equality(instance);
meta.enable_constant(constant);
meta.enable_equality(*column);
}

// extract columns with respect to selectors
let s_mul = meta.selector();
let s_mul_c = meta.selector();

vec![s_add * (lhs + rhs - out)]
});

// define multiplication gate
meta.create_gate("mul", |meta| {
let s_mul = meta.query_selector(s_mul);
vec![s_mul * (lhs * rhs - out)]
});

// define addition with constant gate
let fixed = meta.query_fixed(constant, Rotation::cur());
vec![s_add_c * (lhs + fixed - out)]
});

// define multiplication with constant gate
meta.create_gate("mul with constant", |meta| {
let s_mul_c = meta.query_selector(s_mul_c);
let fixed = meta.query_fixed(constant, Rotation::cur());
vec![s_mul_c * (lhs * fixed - out)]
});

MyConfig {
instance,
constant,
s_mul,
s_mul_c,
}
}
}
}
1. Put values to the table and define wire constraints.
#![allow(unused)]
fn main() {
impl<Field: FieldExt> Circuit<Field> for MyCircuit<Field> {
type Config = MyConfig;
type FloorPlanner = SimpleFloorPlanner;

fn without_witnesses(&self) -> Self {
Self::default()
}

fn configure(meta: &mut ConstraintSystem<Field>) -> Self::Config {
let instance = meta.instance_column();
let constant = meta.fixed_column();
}

fn synthesize(
&self, config: Self::Config, mut layouter: impl Layouter<Field>
) -> Result<(), Error> {
// handling multiplication region
let t1 = self.u * self.u;
let t2 = self.u * self.v;
let t3 = t2 * Value::known(Field::from(3));

// define multiplication region
let (
(x_a1, x_b1, x_c1),
(x_a2, x_b2, x_c2),
(x_a3, x_c3)
) = layouter.assign_region(
|| "multiplication region",
|mut region| {
// first row
config.s_mul.enable(&mut region, 0)?;

// second row
config.s_mul.enable(&mut region, 1)?;

// third row
config.s_mul_c.enable(&mut region, 2)?;
region.assign_fixed(|| "constant 3",
config.constant.clone(), 2, || Value::known(Field::from(3)))?;

Ok((
(x_a1.cell(), x_b1.cell(), x_c1.cell()),
(x_a2.cell(), x_b2.cell(), x_c2.cell()),
(x_a3.cell(), x_c3.cell())
))
}
)?;

let t4 = t1 + t3;
let t5 = t4 + self.v;
let t6 = t5 + Value::known(Field::from(5));

let (
(x_a4, x_b4, x_c4),
(x_a5, x_b5, x_c5),
(x_a6, x_c6)
) = layouter.assign_region(
|mut region| {
// first row

// second row

// third row
region.assign_fixed(|| "constant 5",
config.constant.clone(), 2, || Value::known(Field::from(5)))?;
Ok((
(x_a4.cell(), x_b4.cell(), x_c4.cell()),
(x_a5.cell(), x_b5.cell(), x_c5.cell()),
(x_a6.cell(), x_c6.cell())
))
}
)?;

layouter.constrain_instance(x_c6, config.instance, 0)?;

// enforce copy constraints
layouter.assign_region(|| "equality",
|mut region| {
region.constrain_equal(x_a1, x_a2)?; // namely, x_a1 = x_a2
region.constrain_equal(x_a2, x_b1)?; // namely, x_a2 = x_b1

region.constrain_equal(x_b2, x_b5)?; // namely, x_b2 = x_b5

region.constrain_equal(x_a4, x_c1)?; // namely, x_a4 = x_c1

region.constrain_equal(x_a3, x_c2)?; // namely, x_a3 = x_c2

region.constrain_equal(x_b4, x_c3)?; // namely, x_b4 = x_c3

region.constrain_equal(x_a5, x_c4)?; // namely, x_a5 = x_c4

region.constrain_equal(x_a6, x_c5)?; // namely, x_a6 = x_c5
Ok(())
}
)?;
Ok(())
}
}
}

# Bibliography

[DY05] - Yevgeniy Dodis, Aleks, r Yampolskiy - A Verifiable Random Function with Short Proofs and Keys. - 2005. -

# Summary/Abstract

N/A

[HJ15] - Tibor Jager - Verifiable Random Functions from Weaker Assumptions. - 2015. -

# Summary/Abstract

N/A

[GWC19] - Ariel Gabizon, Zachary J. Williamson, Oana Ciobotaru - {PLONK:} Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge. - 2019. -

# Summary/Abstract

N/A

[Sipser2012-introduction-to-theory-of-computation] - Sipser, Michael - Introduction to the Theory of Computation. - 2012. -

# Summary/Abstract

N/A

[PWHVNRG17] - DimitriosPapadopoulos, Duane Wessels, Shumon Huque, Moni Naor, Jan Včelák, Leonid Reyzin, Sharon Goldberg - Making NSEC5 Practical for DNSSEC. - 2017. -

# Summary/Abstract

N/A

[EKSSZSC20] - Muhammed F. Esgin, Veronika Kuchta, Amin Sakzad, Ron Steinfeld, Zhenfei Zhang, Shifeng Sun, Shumo Chu - Practical Post-quantum Few-Time Verifiable Random Function with Applications to Algorand. - 2021. -

# Summary/Abstract

N/A

[SECG1] - Certicom Research - SEC1: Elliptic Curve Cryptography. - 2009. -

# Summary/Abstract

N/A

[BKP11] - Michael Backes, Aniket Kate, Arpita Patra - Computational Verifiable Secret Sharing Revisited. - 2011. -

# Summary/Abstract

N/A

[MR02] - Silvio Micali, Ronald L. Rivest - Micropayments Revisited. - 2002. -

# Summary/Abstract

N/A

[Ped91a] - Torben P. Pedersen - A Threshold Cryptosystem without a Trusted Party (Extended Abstract). - 1991. -

# Summary/Abstract

N/A

[CGJKR99] - Ran Canetti, Rosario Gennaro, Stanislaw Jarecki, Hugo Krawczyk, Tal Rabin - Adaptive Security for Threshold Cryptosystems. - 1999. -

# Summary/Abstract

N/A

[MRV99] - Silvio Micali, Michael O. Rabin, Salil P. Vadhan - Verifiable Random Functions. - 1999. -

# Summary/Abstract

N/A

[Ped91] - Torben P. Pedersen - Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing. - 1991. -

# Summary/Abstract

N/A

[CGMA85] - Benny Chor, Shafi Goldwasser, Silvio Micali, Baruch Awerbuch - Verifiable Secret Sharing and Achieving Simultaneity in the Presence of Faults (Extended Abstract). - 1985. -

# Summary/Abstract

N/A

[GLOW20] - David Galindo, Jia Liu, Mihai Ordean, Jin{-}Mann Wong - Fully Distributed Verifiable Random Functions and their Application to Decentralised Random Beacons. - 2021. -

# Summary/Abstract

N/A

[Bit19] - Nir Bitansky - Verifiable Random Functions from Non-interactive Witness-Indistinguishable Proofs. - 2020. -

# Summary/Abstract

N/A

[BCKL09] - Mira Belenkiy, Melissa Chase, Markulf Kohlweiss, Anna Lysyanskaya - Compact E-Cash and Simulatable VRFs Revisited. - 2009. -

# Summary/Abstract

N/A

[GJKR99] - Rosario Gennaro, Stanislaw Jarecki, Hugo Krawczyk, Tal Rabin - Secure Distributed Key Generation for Discrete-Log Based Cryptosystems. - 1999. -

# Summary/Abstract

N/A

[KZG10] - Aniket Kate, Gregory M. Zaverucha, Ian Goldberg - Constant-Size Commitments to Polynomials and Their Applications. - 2010. -

N/A