Zero Knowledge Proofs / Secure Computing

Notes and reference for some intro crypt + zero knowledge. IN PROGRESS.

divider

1. Intro

AI inference services require blind trust. Key security risks are privacy of the user’s input and integrity of the server’s output. Ideally, we want private and verifiable computation. The user should send an encrypted question to a trustworthy server which computes and the encrypted response. The user can then decryp the response and verify that it is the intended output of the model, without learning anything else about the model.

Three key capabilities to learn about:

Zero Knowledge Proofs

What is a proof system? The prover has some claim and creates proof π\pi. This proof is sent to the verifier, who will run an algorithm to either accept or reject this proof.

In a zero knowledge proof, a prover wants to convince a verifier that a statement is true without revealing secret evidence.

2. Trapdoor Matrices

Frievald’s Algorithm

FACT: For nonzero MFn×nM \in \F^{n\times n} and uniform random rFnr \in \F^n,

P(Mr=0)1/F.\P(M*r =0) \leq 1/| \F |.

Algorithm: we want to multiply two matrices A,BFn×nA,B \in \F^{n \times n} and check that the matrix returned, CC, is correct.

Now, check with a matrix:

Pseudorandom matrix

Matrix generation algorithm given a key k{0,1}λk \in \{0,1\}^\lambda.

RGen(k)R \leftarrow \text{Gen}(k)

defines a pseudorandom family if, for uniform random key, no algorithm (without the input kk) running in time TT can distinguish the matrix RR from random with probability better than T/2λT/2^\lambda.

notes

Trapdoored pseudorandom matrix

RGen(k)R \leftarrow \text{Gen}(k) is a pseudorandom n×nn \times n matrix and there is a special algorithm

yEval(k,X)y \leftarrow \text{Eval}(k,X)

that given any XFnX \in \F^n returns y=RXy = R*X and Eval\text{Eval} runs in time O(n2)O(n^2).

notes

3. Collision Resistance, Hardness

4. Private, verifiable computation (preliminaries)

Previously: DL and collision-resistant hash from DL

Groups of unknown order

A group where the order is computationally hard to compute, defined by group sampling alg

<G>,nGGen(1λ)<\mathbb{G}>, n \leftarrow \text{GGen}(1^\lambda)

for <G><\mathbb{G}> a succinct (size polynomial in λ\lambda) description of a group G\mathbb{G} and nn an upper bound on its order.

The hidden order assumption holds for GGen\text{GGen} if for any polynomial time algorithm AA and distribution:

P[ga=1]<negl(λ).\P [g^a = 1 ] < \text{negl}(\lambda).

RSA assumption (fill in)

GUO with trusted setup: the algorithm GGen\text{GGen} is allowed to randomly generate secret values that are discarded.

Ex: GGen\text{GGen} samples secret primes p,qp,q and outputs N=pqN = p*q defining the group ZN. \mathbb{Z}^*_N. This group has order (p1)(q1)(p-1)(q-1), which is easy to compute if NN can be factored.

Collision resistant hash from GUO

Collision-resistant based on hidden order assumption: if gx=gyg^x = g^y for xyx \neq y then gxy=1g^{x-y} = 1.

Statistical distance

The statistical distance between random variables X,YX, Y over a finite domain DD is :

Δ(X,Y)=αD12P(X=α)P(Y=α).\Delta(X,Y)= \sum_{\alpha \in D} \frac{1}{2}|P(X=\alpha) - P(Y=\alpha)|.

Sequences of random variables (Xn),(Yn)(X_n), (Y_n) are statistically indistinguishable if

Δ(Xn,Yn)=1/2n.\Delta(X_n, Y_n) = 1/ 2^n.

Sequences (Xn),(Yn)(X_n), (Y_n) on the same domain DD are computationally indistinguishable if for any polynomial time algorithm AA that receives an input xDx \in D and outputs 00 or 11:

P(A(Xn)=1)P(A(Yn)=1)=1/2n.|P(A(X_n) = 1) - P(A(Y_n) = 1)| = 1/2^n.

Learning parity with noise

The LPN assumption over a field F\mathbb{F} has a dimension parameter kk, noise rate ϵ\epsilon, and number of samples NN.

The assumption is that a list of nn samples of the form (ri,ris+ei)(r_i, r_i * s + e_i) is computationally indistinguishable from a list of nn samples of the form (ri,ui)(r_i, u_i) where

Pseudorandom matrix from LPN

R=AS+ER = A*S + E

where AFn×kA \in \mathbb{F}^{n \times k} and SFk×n S \in \mathbb{F}^{k \times n} and all entries of EE are sampled as eie_i on previous slide.

Each entry of RR has the form rij=aisj+eij.r_{ij} = a_i * s_j + e_{ij}.

For each sjs_j get nn samples of the form aisj+eijuij.a_i * s_j + e_{ij} \approx u_{ij}.

Languages, NP, and P

A language is a subset of strings L{0,1}.L \subset \{0,1\}^*.

A language LL is in NP if there exists a deterministic polynomial times algorithm DD and polynomial q()q(\cdot) such that:

A language LL is in P if there exists a polynomial time algorithm SS such that S(x)=1S(x) =1 if and only if xLx \in L.

NP relations

A binary relation RR is a subset of {0,1}×{0,1}.\{0,1\}^* \times \{0,1\}^*. The language LRL_R of a binary relation RR is the set

{x{0,1} : there exists w s.t. (x,w)R}.\{x \in \{0,1\}^* \text{ : there exists } w \text{ s.t. } (x,w) \in R \}.

RR is called an NP relation if the language LRL_R is in NP.

How to represent algorithms?

Arithmetic circuits

Fix a finite field F={0,...,p1}\mathbb{F} = \{0, ..., p-1\} for some prime p>2.p>2.

An arithmetic circuit is defined C:FnFC: \mathbb{F}^n \rightarrow \mathbb{F}. It is a directed acyclic graph (DAG) where internal nodes are labeled as +,,+, -, or ×\times. Inputs are labeled 1,x1,...,xn1, x_1, ..., x_n. This defines an nn-variate polynomial with an evaluation recipe.

C|C| usually denotes the number of multiplication gates in CC.

5. Private, verifiable computation

Representing algorithms (above).

Boolean circuits can be represented as arithmetic circuits over Fp\mathbb{F}_p:

Proof verification asymptotically faster than program evaluation = verifiable computation.

Sound: infeasible to produce valid proof if prover doesn’t know program ww Zero knowledge: proof doesn’t reveal anything about the program ww

Proof system syntax

Completeness property

Notation: <P(x,w),V(x)>< P(x,w), V(x)> denotes output of verifier on interaction with prover for input xx and witness ww.

For all (x,w)(x,w) s.t. C(x,w)=0C(x,w) = 0, there is a probability 1 that <P(x,w),V(x)>=1<P(x,w), V(x)> = 1.

Soundness property

Notation: <A(x,w),V(x)><A(x,w), V(x)> the output of verifier on interaction with an adversarial prover algorithm AA for input xx and witness ww.

For any xx where there does not exist ww such that C(x,w)=0C(x,w) = 0 and any polynomial time algorithm AA, the probability that <A(x,w),V(x)>=1<A(x,w), V(x)> = 1 is boundeed by negl(x).\text{negl}(|x|).

Weaker SNARKS with trusted setup

We believe there’s some trusted party that creates the parameters and then disappears. Risky because a broken trusted setup is very bad.

Sometimes we can use multi-party trusted setup.

Types of pre-processing setup:

Setup(1λ,C)public parameters(pk,vk).\text{Setup}(1^\lambda, C) \rightarrow \text{public parameters}(pk,vk).

Completeness property with setup

<P(pk, x,w), V(vk, x)> denotes output of verifier on interaction with prover for input xx and witness ww.

For all (pk,vk)Setup(1λ)(pk, vk) \leftarrow \text{Setup}(1^\lambda) (finish line from slides)

Soundness property with setup

from slides

Knowledge & Soundness

What does it mean to have knowledge?

Soundness is a aproperty of a proof system that ensures cheaters cannot convince the verifier of a false statement. Soundness     \implies if the prover succeeds, there must exist a witness (except with very small probability). This is too weak for some applications.

Knowledge soundness captures the property that the prover must somehow “know” the witness ww in order to succeed in the protocol.

A protocol is knowledge sound if there exists an extractor algorithm E that can obtain the witness w from any successful prover (informal).

The extractor:

For every adversarial prover AA and input xx that causes verifier to accept with non-negligible probability, EE outputs ww such that C(x,w)=0C(x,w) = 0 with high probability.

6. Proofs/Args of Knowledge, Zero Knowledge

Recap on knowledge soundness.

Proofs and arguments of knowledge

Oracle access to prover: Adversarial prover A is a stateful process that interacts with the verifier (internally runs a modified prover algorithm to get each next message).

If E is an algorithm that has oracle access to A then:

A protocol has δ(λ)\delta(\lambda) knowledge error if

Protocol is called proof of knowledge if δ(λ)\delta(\lambda) negligible.

Argument of knowledge: same definition as proof of knowledge, except extractor is only guaranteed to succeed for polynomial time adversary AA.

Succinctness, SNARKs

An interactive argument for circuit CC with nn gates and setup parameter λ\lambda is succinct if:

Succinct noninteractive arguments abbreviated SNARK. Zero-knowledge SNARK abbreviated zkSNARK.

The differences between SIA and SNARK:

Zero knowledge

notes

notes

Non-interactive zero knowledge

Sim can simulate the proof π\pi without the witness ww. Therefore, the proof reveals nothing about ww that an observer did not already know before seeing π\pi. How?

Clearly the Sim cannot produce a convincing proof π\pi*, otherwise soundness breaks. Then how can it simulate?

7. Polynomial Commitments and Classical ZKP

A polynomial commitment scheme (PCS) provides

A hiding/ZK PC is:

ZKP of DL

Problem: Given group elements g,hGg, h \in \mathcal{G} in a group of prime order pp, provide a ZK PoK of a witness x[0,p)x \in [0,p) such that ßgx=hg^x = h.

Prover message: sample r[0,p1]r \in [0, p-1] and send R=grR = g^r

Verifier challenge: sample c[0,2λ)c \in [0, 2^\lambda) and send cc

Prover response: send z=r+xcmodpz = r+x * c \mod p

Verifier decision: accept iff Rhc=gzR * h^c = g^z

Three round protocol of this form has a special name, the sigma protocol.

Special Soundness

A sigma protocol is special sound if from any two transcripts (m,c,z)(m,c,z) and (m,c,z)(m, c', z') where ccc \neq c' there is a polynomial time extractor that computes a witness.

Forking Lemma: Every special sound protocol with challenge space size 2λ2^\lambda has negligible knowledge error.

Honest verifier ZK

Assuming the verifier behaves honestly, we can simulate:

Simulator:

Fiat-Shamir Transform

A technique for taking an interactive PoK and creating a digital signature based on it. For sigma protocol, Fiat-Shamir transformation says instead of having verifier send a random challenge value to the prover, the prover can compute this value themselves by using a random function, leading us to the idea of random oracles.

Random oracle model

In the random oracle model, both prover and verifier algorithms have oracle access to a completely random function HH.

The random oracle model enables constructions of non-interactive proofs.

How can a simulator simulate a non-interactive proof without knowing a witness? In the random oracle model, the simulator can cheat by “re-programming” the random oracle at points (but must preserve the distribution of random oracle, e.g. re-program a point to uniform random output).

If a sigma protocol \prod is knowledge sound, then its Fiat-Shamir transform in the random oracle model is also knowledge sound.

HVZK to ZK via Fiat-Shamir Transform

Theorem: If a sigma protocol \prod satisfies HVZK, then assuming the random oracle model, the FS transform of \prod satisfies statistical zero-knowledge.

Schnorr Signatures

Apply Fiat-Shamir to ZKPoK of DL:

Public key: PK=gskPK = g^{sk} where sk[0,p)sk \in [0,p)

Signature on message m: (R,z)(R,z) s.t. gz=RPKcg^z = R*PK^c where c=H(m,PK,R)c = H(m, PK, R).

Group homomorphism

A group homomorphism is a function f:GHf : \mathbb{G} \rightarrow \mathbb{H} that satisfies f(g1g2)=f(g1)f(g2)f(g_1 \cdot g_2) = f(g_1) * f(g_2) for all g1,g2Gg_1, g_2 \in \mathbb{G}.

In additive notation, this is linearity:

f(g1+g2)=f(g1)+f(g2).f(g_1 + g_2) = f(g_1) + f(g_2).

Correlation intractability

A hash function family with sampling algorithm Gen(1λ)\text{Gen}(1^\lambda) is correlation intractable for a binary relation RR if for all polynomial time algorithms AA and the following distribution:

P(R(x,H(x))=1)<negl(λ).P(R(x, H(x)) = 1) < \text{negl}(\lambda).

CI and Fiat-Shamir

A hash family is correlation intractable if it is correlation intractable for all binary relations.

Theorem: If a sigma protocol \prod is knowledge sound, then its Fiat-Shamir transform using a correlation-intractable hash family is also knowledge sound.

Knowledge in Random Oracle Model

A Q-query random oracle algorithm is an algorithm with oracle access to random oracle HH that makes at most QQ queries to HH. An RO protocol is a noninteractive protocol between random oracle algorithms.

notes

8. More Polynomial Commitments + Classical ZKPs

FS transform of Sigma Protocols

Multiround Special Soundness

kk-ary special soundness: given kk-ary forking tree of proof transcripts for xLRx \in L_R, there is a polytime algorithm that extracts a witness ww s.t. (x,w)R(x,w) \in R.

Theorem: A kk-ary special sound μ\mu-round protocol is knowledge sound for constant kk and μO(logw)\mu \in O(\log |w|).

Simple PCP-based SNARK (Kilian, Micali)

notes

notes

notes

notes

Unfortunately, this method is impractical/the prover time is very high. Later: more efficient constructions, constructions that do not require random oracles.

9. Classical Succinct Proofs

None of the classical proofs examined previously have been succinct (communication has thus been proportional to size).

Recall PoK for DL: I know xx s.t. gx=yg^x = y over G\G. Send grg^r, return cc, send z=r+cz = r + c. Size of communication same as size of witness xx.

Consider x1,...,xnx_1, ..., x_n s.t. gixiy\prod g_i^{x_i} - y. A protocol with only O(logn)O(\log n) bits of communication?

As with Fiat-Shamir in PoK for DL, we can do something analogous for this situation.

Generalizing sigma protocols: Recall sigma protocol for PoK HPI (homomorphism preimage). We have a ff a homomorphism, f:ZpGf: \Z_p \rightarrow \G s.t. xgxx \mapsto g^x. Generalize mapping from any group HH to GG. Prover sends f(r)f(r), verifier returns cc, prover sends back f=r+cxf = r+ cx. Verifier checks that f(z)=f(r)f(x)cf(z) = f(r) * f(x)^c (for * the group operation).

The problem we’ve posed above is ParseError: KaTeX parse error: Undefined control sequence: \righarrow at position 11: f: \Z_p^n \̲r̲i̲g̲h̲a̲r̲r̲o̲w̲ ̲\G, e.g. with our multiple xix_i values we’re mapping vector xgxx \mapsto g^x. Check gizi=(giri)yc\prod g_i^{z_i} = (\prod g_i^{r_i}) * y^c for * the group operation.

This is not succinct!

notes

There are log2(n)log_2 (n) rounds (first round start with nn, cut in half each round until n=1n=1). In each round, we send two group elements CL,CRC_L, C_R, so total communication 2log2(n)2 log_2(n).

Recal Multiround Special Soundness (Lec 8). kk-ary special sound μ\mu round protocol is knowledge sound for const kk and μO(logw)\mu \in O(\log |w|).

10. SNARK Compilers

Compilation paradigm for SNARKs today: arithmetization -> information theoretic pf system -> SNARK.

Interactive Oracle Proofs

Pf systems today generally not built from probabilistically checkable proofs (PCPs) in the classical sense, computationally infeasible (1 round, oracle access to pf).

PCP: 1-round oracle proof, where prover sends poly sized string π\pi, and verifier has oracle access to query locations of π\pi without reading whole string.

IOP: interactive oracle proof, multi round. Verifier receives new pf strings each round and has oracle access to all pf strings received.

R1CS: we have matrices A,B,CFm×(n+1).A,B,C \in \F^{m \times (n+1)}. Input z=(1,x,w)z = (1,x,w) over Fn+1.\F^{n+1}. R1CS program accepts (x,w)(x,w) iff

(Az)(Bz)=(Cz)(A*z) \odot (B * z) = (C*z)

for \odot component wise product (Hadamard).

Linear PCP: PCP is a fn fπ:FnFf_\pi: \F^n \rightarrow \F. Equivalent:

notes

Ishai, Kushilevitz, Ostrovksy '07 show 4 round linear PCP with linear homomorphic encryption as compiler can convert linear PCPs into SNARKs.

Efficiency of IOPs

Some modern proof systems building SNARKs off of IOPs,

Interactive linear PCPs?

Linear IOPs (each round send linear PCP oracle, linear queries to prior oracles sent).

leading to \rightarrow poly IOPs.

(μ,d)(\mu, d) polynomial PCP

Generic reduction: replace coordinate query with 2-round polynomial IOP.

{point IOPs}{polynomial IOPs}{linear IOPs}.\{ \text{point IOPs} \} \subset \{\text{polynomial IOPs} \} \subset \{\text{linear IOPs}\}.

SNARKS from polynomial commitments: arithmetic circuit -> constraint system, polynomial testing -(using polynomial commitments)-> interactive argument -(fiat-shamir / hashing)-> non interactive proof.

IOPs with Preproc: in a setup phase, a preprocessor outputs several oracles. Any verifier can read the whole oracle to check its correctness. During ‘online’ interaction with prover, it may make queries to the preprocessed code.

Polynomial IOPs

Informally, PIOP efficiency and security: In a PIOP for NP relation RR with arithmetic complexity nn (size of the circuit computing RR):

PIOP not used in practice.

*PLONK: There exists a 3 round polynomial IOP for any NP relation RR (with arithmetic circuit complexity nn) with:

Constraint systems

Recall arithmetic circuits (one kind of constraint system). New: PLONK, R1CS.

notes

Arithmetic Circuit to R1CS

Theorem: For any circuit CC, the condition C(x,w)=0C(x,w) = 0 can be expressed as an R1CS program with mm constraints where ParseError: KaTeX parse error: Expected 'EOF', got '#' at position 5: m = #̲(\text{multipli…

notes

PLONK

notes

11. R1CS Linear IOP to SNARK

Recap on compilation paradigms:

R1CS to Linear PCP

We have matrices A,B,CFm×(n+1)A, B, C \in \F^{m \times (n+1)} and z=(1,x,w)Fn+1.z = (1,x,w) \in \F^{n+1}. Define degree m1m-1 polynomial f(X)f(X) by interpolating f(i)=(Az)i.f(i) = (A \cdot z)_i. Define degree m1m-1 polynomial g(X)g(X) by interpolating g(i)=(Bz)ig(i) = (B \cdot z)_i. Define degree 2m22m -2 polynomial h(X)h(X) as interpolation of h(i)=(Cz)ih(i) = (C \cdot z)_i for imi \leq m and h(i)=f(i)g(i)h(i) = f(i) g(i) for i=m+1,...,2m1i = m+1, ..., 2m-1.

Idea: if h=fgh = f \cdot g then (Az)(Bz)=Cz(A \cdot z) \circ (B \cdot z) = C \cdot z (\circ is Hadamard/comp-wise product). If prover is honest, then h=fgh= f \cdot g as defined above. Linear PCP verifier “checks” h(r)=f(R)g(r)h(r) = f(R) g(r) at random rr implies h=fgh = f \cdot g with probability 12mF.1 - \frac{2m}{| \F |}.

12. LPCP to SNARK and HVZK

Linear PCP to SNARK

Idea: prover can only output encodings of linear transformations of queries:

[a]=Enc(<q1R,w1>),[b]=Enc(<q2R,w2>),[c]=Enc(<q3R,w3>).[a] = Enc(<q_1^R, w_1>),[b] = Enc(<q_2^R, w_2>), [c] = Enc(<q_3^R, w_3>).

Verifier can check that ab=ca \cdot b = c using QuadTest. Remaining issue: how to force prover to use the same w1,w2,w3w_1, w_2, w_3?

Solution: add one more query that does a random linear check:

Prover forced to choose w1,w2,w3w_1, w_2, w_3 independently of the secret α,β,γ\alpha, \beta, \gamma.

Summary:

R1CS to linear PCP (previous)

then linear PCP to SNARK:

Proof is 4 elements, verifier does two QuadTests.

ZK Linear PCP

A linear PCP for a R1CS program (A,B,C)(A,B,C) is zero knowledge if there exists a simulator Sim that takes an input xxxx and verifier program VV' satisfying the following:

For all xFn1x \in \F^{n_1} such that wFmn1\exists w \in \F^{m-n_1} such that program accepts (x,w)(x,w) and all verifiers V’ making querires M’, Sim outputs:

Sim(x,V)(SP,SV,v)Sim(x,V') \rightarrow (S_P^*, S_V^*, v^*)

a random variable distributed identically to (SP,SV,Mπ)(S_P, S_V, M'_\pi) from

(SP,SV)S(A,B,C)(S_P, S_V) \leftarrow S(A,B,C)

and πP(SP,x,w)\pi \leftarrow P(S_P, x,w).

Honest Verifier Zero Knowledge

HVZK for LPCP: assume verifier follows the protocol, then Sim produces output (SP,SV,v)(S_P^*, S_V^*, v^*) which is distributed identically to (SP,SV,Mπ).(S_P, S_V, M\pi).

In transformation to SNARK, our trusted setup selects the query vectors honestly.

R1CS to HVZK LPCP

f(X)f(X)=αi[1,m](Xi)f'(X) - f(X) = \alpha \prod_{i \in [1,m]}(X-i)

implies

α=δ1f(0)(1)mm!\alpha = \frac{\delta_1 - f(0)}{(-1)^m m!}

(independent, uni random). Also,

f(r)=f(r)+αi(ri).f'(r) = f(r) + \alpha \prod_i (r-i).

13. Polynomial IOP for PLONK Constraints

Construction roadmap:

Arithmetic Circuit to Constraints

The input is a 2-fan-in arithmetic circuit with nn arithmetic gates and mm input/output wires. There are \ell public I/O wires. Any circuit can be padded with dummy I/O inputs so that =0\ell = 0 mod 33. Construct (s,σ,n,m,):(s, \sigma, n, m, \ell):

The permutation places toggether in cycles all indices assigned to left-input, right-input, output, and circuit I/O values that are equal according to the wiring.

xFx \in \F^\ell and wF3n+w \in \F^{3n+\ell} satisfy this constraint system iff the ww’s are valid assignments to the ii-th gate’s left/right/output values for all i[0,n)i \in [0, n) AND the first \ell public I/O values of the circuit are x0,...,x1,x_0, ..., x_{\ell -1}, equal to w3n,...,w3n+1.w_{3n},..., w_{3n + \ell -1}.

Constraints to Poly IOP

There’s a lot… Interpolating degree n=3n+n = 3n + \ell polynomial (selection polynomial), assignment polynomial, permutation polynomial.

Strawman polynomial IOP and efficient permutation argument. Permutation poly IOP, combined poly IOP, linearization optimization.

14. Polynomial IOP for PLONK Constraints, Cont.

Recap: Construction roadmap (arithmetic circuit -> constraint system -> polynomial IOP -(via polynomial commitment)-> interactive argument -(Fiat Shamir)-> SNARK).

Complete SNARK for Polynomial Constraints

Polynomial Commitment

Input: f(x)Fp(X)f(x) \in \F_p(X) degree at most dd.

Public coin interactive protocol (communication sublinear in dd):

Commitment is binding and evaluation is binding/argument of knowledge.

Compiling PIOP with Poly Commit

Polynomial Commitments

In an additive (homomorphic) polynomial commitment scheme, Cf+Cg=Cf+g.C_f + C_g = C_{f+g}. An additive polynomial commitment scheme can compile any Poly IOP to an argument with just 1 Eval execution.

PCS Batch Opening

Generic optimization for opening at multiple points with homomorphic polynomial commitments:

This generalizes to multiple polynomials at multiple points. Given {Ci}\{C_i\} commits to {fi}\{f_i\} and ParseError: KaTeX parse error: Expected 'EOF', got '}' at position 17: …{x_{ij}, y_{ij}}̲_{i \in [k], j … for i[k]i\in [k]:

Kate Batch Opening

Special case of Kate PCS: batch opening only 1 group element.

Final SNARK Size

Total size: 4-5 group elements, 5 field elemnts

Original PLONK size: 7 group elements, 7 field elements