Zero Knowledge Proofs / Secure Computing

Notes and references for some intro crypto + zero knowledge, moving towards secure AI content. IN PROGRESS.

1. Intro

AI inference services currently require blind trust. The key security risks are the privacy of the user's input and the integrity of the server's output. Ideally, we want private and verifiable computation: the user sends an encrypted question to the server, which computes and returns the encrypted response. The user can then decrypt the response and verify that it is the intended output of the model, without learning anything else about the model.

Three key capabilities to learn about:

Zero Knowledge Proofs

What is a proof system? The prover has some claim and creates a proof $\pi$. This proof is sent to the verifier, who runs an algorithm to either accept or reject it.

In a zero knowledge proof, a prover wants to convince a verifier that a statement is true without revealing secret evidence.

2. (TODO: make up lecture)

3. (TODO: make up lecture)

4. Private, verifiable computation (preliminaries)

Previously: discrete log (DL) and a collision-resistant hash from DL

Groups of unknown order

A group whose order is computationally hard to compute, defined by a group sampling algorithm:

$$(\langle \mathbb{G} \rangle, n) \leftarrow \text{GGen}(1^\lambda)$$

for $\langle \mathbb{G} \rangle$ a succinct (size polynomial in $\lambda$) description of a group $\mathbb{G}$ and $n$ an upper bound on its order.

The hidden order assumption holds for $\text{GGen}$ if, for any polynomial time algorithm $A$ that, given $(\langle \mathbb{G} \rangle, n) \leftarrow \text{GGen}(1^\lambda)$, outputs an element $g \in \mathbb{G}$ and a nonzero integer $a$:

$$\Pr[g^a = 1] < \text{negl}(\lambda).$$

RSA assumption (fill in)

GUO with trusted setup: the algorithm $\text{GGen}$ is allowed to randomly generate secret values that are then discarded.

Ex: $\text{GGen}$ samples secret primes $p, q$ and outputs $N = pq$, defining the group $\mathbb{Z}^*_N$. This group has order $(p-1)(q-1)$, which is easy to compute if $N$ can be factored.
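A minimal sketch of this trusted setup in Python, using sympy.randprime to sample the secret primes (the 256-bit size is an illustrative choice, not from the lecture):

```python
# Sketch: trusted-setup GGen for the RSA group Z_N^*.
from sympy import randprime

def ggen(bits=256):
    """Sample secret primes p, q; output the public modulus N = p*q."""
    p = randprime(2**(bits - 1), 2**bits)
    q = randprime(2**(bits - 1), 2**bits)
    # The order (p-1)*(q-1) is easy to compute here, which is exactly
    # why p and q must be discarded after setup.
    return p * q

N = ggen()
```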

Collision resistant hash from GUO

Collision resistance based on the hidden order assumption: hash as $H(x) = g^x$ for a fixed $g \in \mathbb{G}$. If $g^x = g^y$ for $x \neq y$ then $g^{x-y} = 1$, so a collision yields a nonzero integer $a = x - y$ with $g^a = 1$, contradicting the hidden order assumption.
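A sketch of the resulting hash, assuming the RSA-style group from the GGen sketch above (the choice of $g$ is illustrative):

```python
# Sketch: collision-resistant hash from a group of unknown order.
# A collision H(x) = H(y) with x != y gives g^(x-y) = 1, i.e. a
# nonzero multiple of the order of g, breaking hidden order.

def hash_guo(x: int, g: int, N: int) -> int:
    return pow(g, x, N)

h = hash_guo(2**1000 + 17, g=5, N=N)  # N from the GGen sketch above
```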

Statistical distance

The statistical distance between random variables $X, Y$ over a finite domain $D$ is:

$$\Delta(X,Y) = \sum_{\alpha \in D} \frac{1}{2} \left| \Pr(X=\alpha) - \Pr(Y=\alpha) \right|.$$
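This translates directly to code for distributions given as probability tables (the helper name is mine):

```python
# Statistical distance between two distributions over a finite domain,
# given as {outcome: probability} dictionaries.

def statistical_distance(p: dict, q: dict) -> float:
    domain = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in domain)

fair = {0: 0.5, 1: 0.5}
biased = {0: 0.6, 1: 0.4}
print(statistical_distance(fair, biased))  # 0.1
```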

Sequences of random variables $(X_n), (Y_n)$ are statistically indistinguishable if

$$\Delta(X_n, Y_n) \le \text{negl}(n),$$ e.g., a distance of $1/2^n$.

Sequences $(X_n), (Y_n)$ on the same domain $D$ are computationally indistinguishable if for any polynomial time algorithm $A$ that receives an input $x \in D$ and outputs $0$ or $1$:

$$\left| \Pr(A(X_n) = 1) - \Pr(A(Y_n) = 1) \right| \le \text{negl}(n).$$

Learning parity with noise

The LPN assumption over a field $\mathbb{F}$ has a dimension parameter $k$, noise rate $\epsilon$, and number of samples $N$.

The assumption is that a list of $N$ samples of the form $(r_i, r_i \cdot s + e_i)$ is computationally indistinguishable from a list of $N$ samples of the form $(r_i, u_i)$, where $r_i \leftarrow \mathbb{F}^k$ and $u_i \leftarrow \mathbb{F}$ are uniform, $s \in \mathbb{F}^k$ is a fixed secret, and each noise term $e_i$ is uniform in $\mathbb{F}$ with probability $\epsilon$ and $0$ otherwise.
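A sketch of the two sample distributions (the parameters and the noise model follow the completion above; all concrete values are illustrative):

```python
# Sketch: LPN samples (r_i, r_i . s + e_i) vs. uniform samples (r_i, u_i)
# over F_p. Parameter choices are illustrative only.
import random

p, k, eps = 101, 16, 0.1  # field size, dimension k, noise rate eps

def lpn_sample(s):
    r = [random.randrange(p) for _ in range(k)]
    e = random.randrange(p) if random.random() < eps else 0
    return r, (sum(ri * si for ri, si in zip(r, s)) + e) % p

def uniform_sample():
    return [random.randrange(p) for _ in range(k)], random.randrange(p)

s = [random.randrange(p) for _ in range(k)]  # the secret
samples = [lpn_sample(s) for _ in range(20)]
```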

Pseudorandom matrix from LPN

$$R = A \cdot S + E$$

where $A \in \mathbb{F}^{N \times k}$, $S \in \mathbb{F}^{k \times n}$, and all entries of $E$ are sampled like the noise terms $e_i$ above.

Each entry of $R$ has the form $r_{ij} = a_i \cdot s_j + e_{ij}$, where $a_i$ is the $i$-th row of $A$ and $s_j$ is the $j$-th column of $S$.

For each $s_j$ we get $N$ samples of the form $a_i \cdot s_j + e_{ij} \approx u_{ij}$, so by LPN each column of $R$ is computationally indistinguishable from uniform.
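A numpy sketch of the construction (dimensions are illustrative):

```python
# Sketch: pseudorandom N x n matrix R = A*S + E over F_p from LPN.
import numpy as np

p, k, N, n, eps = 101, 16, 64, 32, 0.1
rng = np.random.default_rng()

A = rng.integers(0, p, size=(N, k))
S = rng.integers(0, p, size=(k, n))
# Each entry of E is uniform in F_p with probability eps, else 0.
E = rng.integers(0, p, size=(N, n)) * (rng.random(size=(N, n)) < eps)

R = (A @ S + E) % p  # each column is pseudorandom under LPN
```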

Languages, NP, and P

A language is a subset of strings $L \subset \{0,1\}^*$.

A language $L$ is in NP if there exists a deterministic polynomial time algorithm $D$ and a polynomial $q(\cdot)$ such that $x \in L$ if and only if there exists $w$ with $|w| \le q(|x|)$ and $D(x, w) = 1$.

A language $L$ is in P if there exists a polynomial time algorithm $S$ such that $S(x) = 1$ if and only if $x \in L$.

NP relations

A binary relation $R$ is a subset of $\{0,1\}^* \times \{0,1\}^*$. The language $L_R$ of a binary relation $R$ is the set

$$\{x \in \{0,1\}^* : \text{there exists } w \text{ s.t. } (x,w) \in R\}.$$

$R$ is called an NP relation if the language $L_R$ is in NP.
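A concrete example, tying back to the RSA group: the factoring relation, where the instance is a modulus and the witness is a nontrivial factorization (the function name is mine):

```python
# Example NP relation: R = {(N, (p, q)) : N = p*q with p, q > 1}.
# Checking (x, w) in R takes polynomial time; finding a witness for
# a given N is believed to be hard.

def factoring_relation(N: int, w: tuple) -> bool:
    p, q = w
    return p > 1 and q > 1 and p * q == N

assert factoring_relation(15, (3, 5))
assert not factoring_relation(15, (1, 15))
```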

How to represent algorithms?

Arithmetic circuits

Fix a finite field $\mathbb{F} = \{0, \dots, p-1\}$ for some prime $p > 2$.

An arithmetic circuit is a map $C: \mathbb{F}^n \rightarrow \mathbb{F}$. It is a directed acyclic graph (DAG) whose internal nodes are labeled $+$, $-$, or $\times$ and whose inputs are labeled $1, x_1, \dots, x_n$. This defines an $n$-variate polynomial together with a recipe for evaluating it.

$|C|$ usually denotes the number of multiplication gates in $C$.
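A sketch of circuit evaluation, with the circuit given as a gate list in topological order (this representation is my own choice, not from the lecture):

```python
# Sketch: evaluate an arithmetic circuit over F_p. Wire 0 is the
# constant 1, wires 1..n are the inputs; each gate appends a wire.
p = 97

def eval_circuit(gates, inputs):
    wires = [1] + list(inputs)
    for op, i, j in gates:
        a, b = wires[i], wires[j]
        wires.append({'+': a + b, '-': a - b, '*': a * b}[op] % p)
    return wires[-1]

# C(x1, x2) = (x1 + x2) * x2 has one multiplication gate, so |C| = 1.
gates = [('+', 1, 2), ('*', 3, 2)]
print(eval_circuit(gates, [3, 4]))  # (3 + 4) * 4 = 28
```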

5. Private, verifiable computation

Representing algorithms (above).

Boolean circuits can be represented as arithmetic circuits over $\mathbb{F}_p$: for $a, b \in \{0,1\}$, encode $\text{AND}(a,b) = ab$, $\text{NOT}(a) = 1 - a$, and $\text{OR}(a,b) = a + b - ab$.
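A quick check that these encodings agree with the Boolean gates on $\{0,1\}$:

```python
# Boolean gates as arithmetic over F_p (valid for inputs in {0, 1}).
p = 97
AND = lambda a, b: (a * b) % p
NOT = lambda a: (1 - a) % p
OR = lambda a, b: (a + b - a * b) % p

for a in (0, 1):
    for b in (0, 1):
        assert AND(a, b) == int(a and b)
        assert OR(a, b) == int(a or b)
    assert NOT(a) == 1 - a
```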

Verifiable computation: proof verification is asymptotically faster than evaluating the program.

Sound: infeasible to produce a valid proof if the prover doesn't know the program $w$.

Zero knowledge: the proof doesn't reveal anything about the program $w$.

Proof system syntax

Completeness property

Notation: $\langle P(x,w), V(x) \rangle$ denotes the output of the verifier after interacting with the prover on input $x$ and witness $w$.

For all $(x,w)$ s.t. $C(x,w) = 0$, the probability that $\langle P(x,w), V(x) \rangle = 1$ is $1$.

Soundness property

Notation: $\langle A(x,w), V(x) \rangle$ denotes the output of the verifier after interacting with an adversarial prover algorithm $A$ on input $x$ and witness $w$.

For any $x$ for which there does not exist $w$ such that $C(x,w) = 0$, and any polynomial time algorithm $A$, the probability that $\langle A(x), V(x) \rangle = 1$ is bounded by $\text{negl}(|x|)$.

Weaker SNARKs with trusted setup

We assume there's some trusted party that creates the parameters and then disappears. This is risky: if the trusted setup is compromised, an adversary can forge proofs.

Sometimes we can use a multi-party trusted setup, which stays secure as long as at least one participant is honest.

Types of pre-processing setup:

$$\text{Setup}(1^\lambda, C) \rightarrow \text{public parameters } (pk, vk).$$
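A sketch of the syntax as a stub interface (the names are mine; this is just the shape, not a real construction):

```python
# Sketch: syntax of a pre-processing proof system. Setup binds the
# parameters to a circuit C; pk goes to the prover, vk to the verifier.
from typing import Any, Tuple

def setup(security_param: int, C: Any) -> Tuple[Any, Any]:
    """Setup(1^lambda, C) -> (pk, vk); may use secrets that are discarded."""
    raise NotImplementedError

def prove(pk: Any, x: Any, w: Any) -> Any:
    """Produce a proof pi that the prover knows w with C(x, w) = 0."""
    raise NotImplementedError

def verify(vk: Any, x: Any, proof: Any) -> bool:
    """Accept or reject pi; ideally much faster than evaluating C."""
    raise NotImplementedError
```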

Completeness property with setup

Notation: $\langle P(pk, x, w), V(vk, x) \rangle$ denotes the output of the verifier after interacting with the prover on input $x$ and witness $w$.

For all $(x, w)$ s.t. $C(x,w) = 0$ and all $(pk, vk) \leftarrow \text{Setup}(1^\lambda, C)$, the probability that $\langle P(pk, x, w), V(vk, x) \rangle = 1$ is $1$.

Soundness property with setup

For any $x$ for which there does not exist $w$ such that $C(x,w) = 0$, any polynomial time algorithm $A$, and $(pk, vk) \leftarrow \text{Setup}(1^\lambda, C)$, the probability that $\langle A(pk, x), V(vk, x) \rangle = 1$ is bounded by $\text{negl}(|x|)$.

Knowledge

What does it mean to have knowledge?

Soundness says that if the prover succeeds, there must exist a witness (except with very small probability). This is too weak for some applications: for example, when a witness always exists (every discrete-log public key has a corresponding secret key), plain soundness says nothing about whether the prover actually knows it.

“Knowledge soundness” captures the property that the prover must somehow “know” the witness $w$ in order to succeed in the protocol.

Knowledge soundness

A protocol is knowledge sound if there exists an extractor algorithm $E$ that can obtain the witness $w$ from any successful prover (informal).

$E$ can run $A$ for any number of steps, rewind $A$ to a previous state, and re-run $A$ on different inputs.

For every adversarial prover $A$ and input $x$ that causes the verifier to accept with non-negligible probability, $E$ outputs $w$ such that $C(x,w) = 0$ with high probability.