420 lines
24 KiB
Plaintext
420 lines
24 KiB
Plaintext
CS 294. The Learning with Errors Problem:
|
||
Introduction and Basic Cryptography
|
||
The learning with errors (LWE) problem was introduced in its current form in a seminal work
|
||
of Oded Regev for which he won the Gödel prize in 2018. In its typical form, the LWE problem
|
||
asks to solve a system of noisy linear equations. That is, it asks to find s ∈ Znq given
|
||
m
|
||
(ai , hai , si + ei ) : s ← Znq , ai ← Znq , ei ← χ i=1
|
||
|
||
(1)
|
||
|
||
where:
|
||
|
||
• Zq = Z/qZ denotes the finite ring of integers modulo q, Znq denotes the vector space of
|
||
dimension n over Zq ;
|
||
|
||
• χ is a probability distribution over Z which typically outputs “small” numbers, an example
|
||
being the uniform distribution over an interval [−B, . . . , B] where B q/2; and
|
||
|
||
• a ← D denotes that a is chosen according to the finite probability distribution D, a ← S
|
||
denotes that a is chosen uniformly at random from the (finite) set S.
|
||
|
||
In this first lecture, we will present various perspectives on the LWE (and the closely related “short
|
||
integer solutions” or SIS) problem, basic theorems regarding the different variants of these problems
|
||
and their basic cryptographic applications.
|
||
We will shortly derive LWE in a different way, “from first principles”, starting from a different
|
||
view, that of finding special solutions to systems of linear equations.
|
||
|
||
|
||
1 Solving Systems of Linear Equations
|
||
Consider the problem of solving a system of linear equations
|
||
|
||
Ae = b mod q (2)
|
||
|
||
given A ∈ Zn×mq and b ∈ Znq . This can be accomplished in polynomial time with Gaussian
|
||
elimination. However, slight variations of this problem become hard for Gaussian elimination and
|
||
indeed, we believe, for all polynomial-time algorithms. This course is concerned with two such
|
||
problems, very related to each other, called the SIS problem and the LWE problem.
|
||
|
||
1.1 The “Total” Regime and SIS
|
||
Assume that we now ask for solutions to equation 2 where e lies in some subset S ⊆ Zm
|
||
q . Typically
|
||
we will think of subsets S that are defined geometrically, for example:
|
||
|
||
• S = {0, 1}, which is the classical subset sum problem modulo q. More generally, S =
|
||
[−B . . . B]m is the set of all solutions where each coordinate can only take a bounded value
|
||
(absolute value bounded by some number B q/2). This will be the primary setting of
|
||
interest.
|
||
|
||
• S = Ball2R , the Euclidean ball of (small) radius R.
|
||
|
||
|
||
1
|
||
In all cases, we are asking for short solutions to systems of linear equations and hence this is called
|
||
the SIS (short integer solutions) problem.
|
||
The SIS problem SIS(n, m, q, B) as we will study is parameterized by the number of variables
|
||
m, the number of equations n, the ambient finite field Zq , and the bound on the absolute value of
|
||
the solutions B. Namely, we require that each coordinate ei ∈ [−B, −B + 1, . . . , B − 1, B].
|
||
To define an average-case problem, we need to specify the probability distributions for A and
|
||
b. We will, for the most part of this course, take A to be uniformly random in Zn×m q . There are
|
||
two distinct ways to define b. The first is in the “total” regime where we simply choose b from the
|
||
uniform distribution over Znq .
|
||
What does “total” mean? Total problems in NP are ones for which each problem instance has
|
||
a solution that can be verified given a witness, but the solution may be hard to find. An example
|
||
is the factoring problem where you are given a positive integer N and you are asked for its prime
|
||
factorization. A non-example is the 3-coloring problem where you are given a graph G and you
|
||
are asked for a 3-coloring; although this problem is in NP, it is not total as not every graph is
|
||
3-colorable.
|
||
|
||
Totality of SIS on the Average. Here, using a simple probabilistic argument, one can show
|
||
that (B-bounded) solutions are very likely to exist if (2B + 1)m q n , or m = Ω( nlog
|
||
log q
|
||
B ). We call
|
||
this regime of parameters the total regime or the SIS regime. Thus, roughly speaking, in the SIS
|
||
regime, m is large enough that we are guaranteed solutions (even exponentially many of them)
|
||
when A and b are chosen to be uniformly random. The problem then is to actually find a solution.
|
||
|
||
A Variant: homogenous SIS. The homogenous version of SIS asks for a non-zero solution to
|
||
equation 1 with the right hand side being 0, that is, Ae = 0 (mod q). This variant is worst-case
|
||
total as long as (B + 1)m > q n . That is, for every instance A is guaranteed to have a solution. We
|
||
leave the proof to the reader (Hint: Pigeonhole). SIS and hSIS are equivalent on the average-case.
|
||
We again leave the simple proof to the reader.
|
||
|
||
1.2 The Planted Regime and LWE
|
||
When m nlog log q
|
||
B , one can show again that there are likely to be no B-bounded solutions for a
|
||
uniformly random b and thus, we have to find a different, sensible, way to state this problem. To
|
||
do this, we first pick a B-bounded vector e and compute b as Ae mod q. In a sense, we plant the
|
||
solution e inside b. The goal now is to recover e (which is very likely to be unique) given A and
|
||
b. We call this the planted regime or the LWE regime.
|
||
But why is this LWE when it looks so different from Equation 1?
|
||
This is because the SIS problem in the planted regime is simply LWE in disguise. For, given
|
||
(m−n)×m
|
||
an LWE instance (A, yT = sT A + eT ), let A⊥ ∈ Zq be a full-rank set of vectors in the
|
||
right-kernel of A. That is,
|
||
A⊥ · At = 0 mod q
|
||
Then,
|
||
b := A⊥ · y = A⊥ · (At s + e) = A⊥ · e mod q
|
||
so (A⊥ , b) is an SIS instance SIS(m − n, m, q, B) whose solution is the LWE error vector. Further-
|
||
more, this is in the planted regime since one can show with an easy probabilistic argument that
|
||
the LWE error vector e is unique given (A, y).
|
||
|
||
2
|
||
The reader should also notice that we can run the reduction in reverse, creating an LWE
|
||
instance from a SIS instance. If the SIS instance is in the planted regime, this (reverse) reduction
|
||
will produce an LWE instance.
|
||
In summary, the only difference between the SIS and the LWE problems is whether they live
|
||
in the total world or the planted world, respectively. But the world you live in may make a
|
||
big difference. Algorithmically, so far, we don’t see a difference. In cryptography, SIS gives us
|
||
applications in “minicrypt” (such as one-way functions) whereas we need LWE for applications in
|
||
“cryptomania” and beyond (such as public-key encryption and fully homomorphic encryption).
|
||
|
||
Decision vs. Search for LWE. In the decisional version of LWE, the problem is to distinguish
|
||
between (A, yT := sT A + eT mod q) and a uniformly random distribution. One can show, through
|
||
a reduction that runs in poly(q) time, that the two problems are equivalent. The interesting
|
||
direction is to show that if there is a poly-time algorithm that solves the decision-LWE problem
|
||
for a uniformly random matrix A, then there is a poly-time algorithm that solves the search LWE
|
||
problem for a (possibly different and possibly larger) uniformly random matrix A0 . We will see a
|
||
search to decision reduction later in class.
|
||
|
||
1.3 Reductions Between SIS and LWE
|
||
SIS is at least as hard as LWE. We wish to show that if you have a solution for SIS w.r.t.
|
||
A, then it is immediate to solve decision-LWE w.r.t. A. Indeed, given a SIS solution e such that
|
||
Ae = 0 (mod q), and a vector bT , compute bT e (mod q). If b is an LWE instance, then
|
||
|
||
bT e = (sT A + xT )e = xT e (mod q)
|
||
|
||
which is a “small” number (as long as xT is small enough). On the other hand, if b is random,
|
||
then this quantity is uniformly random mod q (in particular, with a non-negligible probability, not
|
||
small). This gives us a distinguisher.
|
||
|
||
LWE is (quantumly) at least as hard as SIS. This turns out to be true, as we will see later
|
||
in the course.
|
||
|
||
1.4 SIS, LWE and Lattice Problems
|
||
SIS and LWE are closely related to lattices and lattice problems. We will have much to say about
|
||
this connection, in later lectures.
|
||
|
||
|
||
2 Basic Theorems
|
||
We start with some basic structural theorems on LWE and SIS.
|
||
|
||
2.1 Normal Form SIS and Short-Secret LWE
|
||
The normal form for SIS is where the matrix A is systematic, that is of the form A = [A0 ||I] where
|
||
n×(m−n)
|
||
A0 ∈ Zq .
|
||
|
||
Lemma 1. Normal-form SIS is as hard as SIS.
|
||
|
||
3
|
||
Proof. To reduce from normal-form SIS to SIS, simply multiply the input to normal-form SIS
|
||
(nfSIS), denoted [A0 ||I], on the left by a random matrix B ← Zn×n
|
||
q . We will leave it to the reader
|
||
to verify that the resulting matrix denoted A := B[A0 ||I] is uniformly random. Furthermore, a
|
||
solution to SIS on input (A, Bb0 ) gives us a solution to nfSIS on input (A0 , b0 ).
|
||
In the other direction, to reduce from SIS to normal-form SIS, write A as [A0 ||B] and gener-
|
||
ate [B−1 A0 ||I] as the normal-form SIS instance. Again, a solution to the normal form instance
|
||
(B−1 A0 , B−1 b) gives us a solution to SIS on input (A, b).
|
||
|
||
The corresponding version of LWE is called short-secret LWE where both the entries of s and
|
||
that of e are chosen from the error distribution χ. The proof of the following lemma follows along
|
||
the lines of that for normal form SIS and is left as an exercise. (Indeed, a careful reader will observe
|
||
that short-secret LWE is nothing but normal-form SIS in disguise.)
|
||
Lemma 2. There is a polynomial-time reduction from ssLWE(n, m, q, χ) to LWE(n, m, q, χ) and
|
||
one from LWE(n, m, q, χ) to ssLWE(n, m + n, q, χ).
|
||
|
||
We will continue to see more structural theorems about LWE through the course, but this
|
||
suffices for now.
|
||
|
||
|
||
3 Basic Cryptographic Applications
|
||
3.1 Collision-Resistant Hashing
|
||
A collision resistant hashing scheme H consists of an ensemble of hash functions {Hn }n∈N where
|
||
each Hn consists of a collection of functions that map n bits to m < n bits. So, each hash function
|
||
compresses its input, and by pigeonhole principle, it has collisions. That is, inputs x 6= y such that
|
||
h(x) = h(y). Collision-resistance requires that every p.p.t. adversary who gets a hash function
|
||
h ← Hn chosen at random fails to find a collision except with negligible probability.
|
||
|
||
Collision-Resistant Hashing from SIS. Here is a hash family Hn that is secure under SIS(n, m, q, B)
|
||
where n log q > m log(B + 1). Each hash function hA is parameterized by a matrix A ∈ Zn×m
|
||
q ,
|
||
takes as input e ∈ [0, . . . , B]m and outputs
|
||
|
||
hA (e) = Ae mod q
|
||
|
||
A collision gives us e, e0 ∈ [0, . . . , B]m where Ae = Ae0 mod q which in turn says that A(e − e0 ) =
|
||
0 mod q. Since each entry of e − e0 is in [−B, . . . , B], this gives us a solution to SIS(n, m, q, B).
|
||
|
||
3.2 Private-Key Encryption
|
||
A private-key encryption scheme has three algorithms: a probabilistic key generation Gen which,
|
||
on input a security parameter λ, generates a private key sk; a probabilistic encryption algorithm
|
||
Enc which, on input sk and a message m chosen from a message space M, generates a ciphertext
|
||
c; and a deterministic decryption algorithm Dec which, on input sk and the ciphertext c, outputs
|
||
a message m0 .
|
||
Correctness requires that for every sk generated by Gen and every m ∈ M,
|
||
|
||
Dec(sk, Enc(sk, m)) = m
|
||
|
||
4
|
||
The notion of security for private-key encryption is semantic security or equivalently, CPA-security,
|
||
as defined in the Pass-Shelat lecture notes (see References at the end of the notes.) In a nutshell,
|
||
this says that no probabilistic polynomial time (p.p.t.) adversary which gets oracle access to either
|
||
the Left oracle or the Right oracle can distinguish between the two. Here, the Left (resp. the Right)
|
||
oracle take as input a pair of messages (mL , mR ) ∈ M2 and outputs an encryption of mL (resp.
|
||
mR ).
|
||
|
||
Private-Key Encryption from LWE.
|
||
|
||
|
||
• Gen(1λ ): Compute n = n(λ), q = q(λ) and χ = χ(λ) in a way we will describe later in this
|
||
lecture. Let the private key sk be a uniformly random vector
|
||
|
||
sk := s ← Znq .
|
||
|
||
• Enc(sk, m): We will work with the message space M := {0, 1}. Larger message spaces can
|
||
be handled by encrypting each bit of the message independently. The ciphertext is
|
||
|
||
c := (a, b) := (a, sT a + e + mbq/2e mod q)
|
||
|
||
where a ← Znq and e ← χ is chosen from the LWE error distribution.
|
||
|
||
• Dec(sk, c = (a, b)): Output 0 if
|
||
|
||
b − sT a mod q < q/4
|
||
|
||
and 1 otherwise.
|
||
Lemma 3. The scheme above is correct if the support of the error distribution Supp(χ) ⊆ (−q/4, q/4)
|
||
and CPA-secure under the LWE assumption LWE(n, m = poly(n), q, χ).
|
||
Correctness and security are immediate and left as an exercise to the reader.
|
||
We left the issue of how to pick n, q and χ open, and indeed, they need to be chosen appropriately
|
||
for the scheme to be secure. Correctness and security give us constraints on these parameters
|
||
(see Lemma 3 above), but do not tell us how to completely specify them. To fully specify the
|
||
parameters, we need to ensure security against attackers “running in 2λ time” (this is the meaning
|
||
of the security parameter λ that we will use throughout this course) and to do that, we need to
|
||
evaluate the efficacy of various attacks on LWE which we will do (at least, asymptotically) in the
|
||
next lecture.
|
||
|
||
|
||
Open Problem 1.1. Construct a nice private-key encryption scheme from the hardness of SIS.
|
||
|
||
|
||
Note that SIS implies a one-way function directly. Together with generic transformations
|
||
in cryptography from one-way functions to pseudorandom generators (Håstad-Impagliazzo-Levin-
|
||
Luby) and from pseudorandom generators to pseudorandom functions (Goldreich-Goldwasser-Micali)
|
||
and from pseudorandom functions to private-key encryption (easy/folklore), this is possible. The
|
||
problem is to avoid the ugliness that results from using these general transformations.
|
||
|
||
5
|
||
3.3 Public-Key Encryption
|
||
A public-key encryption scheme is the same as private-key encryption except for two changes: first,
|
||
the key generation algorithm Gen outputs a public key pk as well as a private key sk; and second,
|
||
the encryption algorithm requires only the public key pk to encrypt. Security requires that a p.p.t.
|
||
adversary which is given pk (and thus can encrypt as many messages as it wants on its own) cannot
|
||
distinguish between an encryption of any two messages m0 , m1 ∈ M of its choice.
|
||
|
||
Public-Key Encryption from LWE (the LPR Scheme) There are many ways of doing this;
|
||
we will present the cleanest one due to Lyubashevsky-Peikert-Regev.
|
||
|
||
|
||
• Gen(1λ ): Compute n = n(λ), q = q(λ) and χ = χ(λ) in a way we will describe later in this lec-
|
||
ture. Let the private key sk be a random vector sk := s ← χn is chosen from the error distribution
|
||
and the public key is
|
||
pk := (A, yT := sT A + eT ) ∈ Zqn×n × Znq
|
||
where A is a uniformly random n-by-n matrix and e ← χn is chosen from the error distribu-
|
||
tion.
|
||
• Enc(sk, m): We will work with the message space M := {0, 1} as above. The ciphertext is
|
||
c := (a, b) := (Ar + x, yT r + x0 + mbq/2e mod q)
|
||
where r, x ← χn and x0 ← χ are chosen from the LWE error distribution.
|
||
• Dec(sk, c = (a, b)): Output 0 if
|
||
b − sT a mod q < q/4
|
||
and 1 otherwise.
|
||
p p
|
||
Lemma 4. The scheme above is correct if Supp(χ) ⊆ (− q/4(2n + 1), q/4(2n + 1)) and CPA-
|
||
secure under the LWE assumption LWE(n, m = 2(n + 1), q, χ).
|
||
Proof. For correctness, note that the decryption algorithm computes
|
||
b − sT a mod q = sT x + eT r + x0
|
||
p p
|
||
whose absolute value, as long as Supp(χ) ⊆ (− q/4(2n + 1), q/4(2n + 1)) is at most
|
||
q/4(2n + 1) · (2n + 1) = q/4 .
|
||
For security, we proceed by the following sequence of hybrid experiments.
|
||
|
||
Hybrid 0.m. The adversary gets pk and Enc(pk, m) where m ∈ {0, 1}.
|
||
|
||
Hybrid 1.m. Feed the adveresary with a “fake” public key pk
|
||
f computed as
|
||
f = (A, y) ← Zn×n × Zn
|
||
pk q q
|
||
|
||
and Enc(pk,
|
||
f m). This is indistinguishable from Hybrid 0 by the hardness of ssLWE(n, n, q, χ) and
|
||
therefore, by Lemma 2, LWE(n, 2n, q, χ).
|
||
|
||
6
|
||
Hybrid 2.m. Feed the adversary with pk
|
||
f and Enc(
|
||
g pk,f m) computed as
|
||
|
||
Enc(
|
||
g pk,f m) = (a, b0 + mbq/2e mod q)
|
||
|
||
where a ← Znq is uniformly random. This is indistinguishable from Hybrid 1 by ssLWE(n, n+1, q, χ)
|
||
or by Lemma 2, LWE(n, 2n + 1, q, χ), since the entire ciphertext can easily be rewritten as
|
||
|
||
A x 0
|
||
r+ + mod q
|
||
yT x0 mbq/2e
|
||
|
||
which, since y is now uniformly random, is n + 1 ssLWE samples and therefore can be indistin-
|
||
guishably replaced by
|
||
a 0
|
||
+ mod q
|
||
b0 mbq/2e
|
||
where a ← Znq and b0 ← Zq .
|
||
|
||
Hybrid 3.m. Feed the adversary with uniformly random numbers from the appropriate domains.
|
||
Follows from the previous expression for the fake ciphertext (random + anything = random).
|
||
For every m ∈ M, Hybrid 0.m is computationally indistinguishable from Hybrid 3.m. Furthermore,
|
||
Hybrid 3 is completely independent of m. Therefore, Hybrids 0.0 and 0.1 are computationally
|
||
indistinguishable from each other, establishing semantic security or CPA-security.
|
||
|
||
|
||
There are many ways to improve the rate of this encryption scheme, that is, lower the ratio of
|
||
(#bits in ciphertext)/(#bits in plaintext) and indeed, even achieve a rate close to 1. We can also
|
||
use these techniques as building blocks to construct several other cryptographic systems such as
|
||
oblivious transfer protocols. This public-key encryption scheme has its origins in earlier works of
|
||
Ajtai and Dwork (1997) and Regev (2004).
|
||
|
||
Public-Key Encryption from LWE (the Regev Scheme) We present a second public-key
|
||
encryption scheme due to Regev. We will only provide a sketch of the correctness and security
|
||
analysis and leave it as an exercise to the reader. We remark that the security proof relies on a
|
||
beautiful lemma called the “leftover hash lemma” (Impagliazzo, Levin and Luby 1990).
|
||
|
||
• Gen(1λ ): Compute n = n(λ), q = q(λ) and χ = χ(λ) in a way we will describe later in this
|
||
lecture. Let the private key sk be a random vector sk := s ← Znq is chosen uniformly at
|
||
random from Zq and the public key is
|
||
|
||
pk := (A, yT := sT A + eT ) ∈ Zn×n
|
||
q × Zm
|
||
q
|
||
|
||
where A is a uniformly random n-by-m matrix and e ← χn is chosen from the error distri-
|
||
bution. Here m = Ω(n log q).
|
||
|
||
Note the difference from LPR where the secret key had small entries. Note also that the
|
||
matrix A is somewhat larger than in LPR.
|
||
|
||
|
||
|
||
7
|
||
• Enc(sk, m): We will work with the message space M := {0, 1} as above. The ciphertext is
|
||
|
||
c := (a, b) := (Ar, yT r + mbq/2e mod q)
|
||
|
||
where r ← {0, 1}m . x0 ← χ is chosen from the LWE error distribution.
|
||
|
||
Note the difference from LPR where the vector r was chosen from the error distribution and
|
||
the first component of the ciphertext had an additive error as well. Roughly speaking, in Regev,
|
||
we will argue that the first component is statistically close to random, whereas in LPR, we
|
||
argued that it is computationally close to random under the decisional LWE assumption.
|
||
• Dec(sk, c = (a, b)): Output 0 if
|
||
|
||
b − sT a mod q < q/4
|
||
|
||
and 1 otherwise.
|
||
|
||
Decryption recovers mbq/2e plus an error eT r + x0 whose norm should be smaller than q/4
|
||
for the correctness of decryption. This is true as long as the support of the error distribution is
|
||
Supp(χ) ⊆ (−q/4(m + 1), q/4(m + 1)).
|
||
In the security proof, we first replace the public key with a uniformly random vector relying on
|
||
the LWE assumption. Once this is done, use the leftover hash lemma to argue that the ciphertext
|
||
is statistically close to random.
|
||
|
||
Public-Key Encryption from LWE (the dual Regev Scheme) We present yet another
|
||
public-key encryption scheme due to Gentry, Peikert and Vaikuntanathan called the “dual Regev”
|
||
scheme. The nice feature of this scheme, which will turn out to be important when we get to
|
||
identity-based encryption is that the distribution of the public key is really random. In other words,
|
||
any string could be a possible public key in the scheme.
|
||
|
||
• Gen(1λ ): Compute n = n(λ), q = q(λ) and χ = χ(λ) in a way we will describe later in this
|
||
lecture. Let the private key sk be a random vector sk := r ← {0, 1}m is chosen uniformly at
|
||
random with 0 or 1 entries and the public key is
|
||
|
||
pk := (A, a := Ar ∈ Zn×n
|
||
q × Zm
|
||
q )
|
||
|
||
where A is a uniformly random n-by-m matrix. Here m = Ω(n log q).
|
||
|
||
Note the difference from Regev where the private key here seems to have a component similar
|
||
to the first component of a Regev ciphertext. No wonder this is called “dual Regev”.
|
||
• Enc(sk, m): We will work with the message space M := {0, 1} as above. The ciphertext is
|
||
|
||
c := (yT , b) := (sT A + eT , sT a + x0 + mbq/2e mod q)
|
||
|
||
where s ← Znq and eT ← χm . x0 ← χ is chosen from the LWE error distribution.
|
||
|
||
• Dec(sk, c = (yT , b)): Output 0 if
|
||
|
||
b − yT r mod q < q/4
|
||
|
||
and 1 otherwise.
|
||
|
||
8
|
||
Open Problem 1.2. Construct a public-key encryption scheme from the hardness of LWE where
|
||
the support of the error distribution χ is large, namely [−cq, cq] for some constant c.
|
||
|
||
|
||
LWE with such large errors does imply a one-way function, and therefore, a private-key encryp-
|
||
tion scheme. The question therefore asks if there is a gap between the LWE parameters that gives
|
||
us public-key vs private-key encryption.
|
||
|
||
|
||
References
|
||
The primary reference for the cryptographic definitions in this lecture is lecture notes by Pass and
|
||
Shelat, available at this url.
|
||
|
||
|
||
|
||
|
||
9
|
||
|