{"id":1,"date":"2025-04-11T17:03:00","date_gmt":"2025-04-11T17:03:00","guid":{"rendered":"https:\/\/jacksonwalters.com\/blog\/?p=1"},"modified":"2025-12-17T20:25:21","modified_gmt":"2025-12-17T20:25:21","slug":"hello-world","status":"publish","type":"post","link":"https:\/\/jacksonwalters.com\/blog\/?p=1","title":{"rendered":"a practical intro to lattice-based cryptography"},"content":{"rendered":"<p>This post aims to cover some practical aspects of post-quantum cryptography and homomorphic encryption, topics I&#8217;ve been interested in over the past few months. Much of this work was performed during a Cryptography Fellowship with Zaiku Group, Ltd.<\/p>\n<p>These tools are meant to provide modern upgrades to some current cryptographic schemes which have been in use for decades. Namely, RSA (Rivest, Shamir, Adleman) and elliptic curve cryptography (ECDH, ECIES, ECDSA). These two approaches are related to each other in that they can both be phrased as discrete logarithm problems, though on different groups. For RSA, the group is the multiplicative group of the integers mod $p$, $(Z_p)^{\\times}$ for a prime $p$. For elliptic curves, we use the analogous group $E(F_p)$, the $F_p$-points of the curve E over a finite field $F_p$.<\/p>\n<p>RSA was introduced in 1977 and published in 1978: &#8220;A Method for Obtaining Digital Signatures and Public-Key Cryptosystems&#8221;; by Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman in the journal Communications of the ACM (Volume 21, Issue 2). The basic layout of RSA is as follows:<\/p>\n<p>RSA is a public-key cryptosystem based on the hardness of factoring large integers.<\/p>\n<p><strong>Key generation:<\/strong> Choose two large primes $p$ and $q$. Compute $n = pq$ and $\\varphi(n) = (p-1)(q-1)$. Pick a public exponent $e$, then compute the private exponent $d$ such that $ed \\equiv 1 \\text{ mod } \\varphi(n)$.<\/p>\n<p><strong>Public key:<\/strong> $(n, e)$<\/p>\n<p><strong>Private key:<\/strong> $d$<\/p>\n<p><strong>Encryption:<\/strong> $c = m^e \\text{ mod } n$<\/p>\n<p><strong>Decryption:<\/strong> $m = c^d \\text{ mod } n$<\/p>\n<p>Security relies on the difficulty of factoring $n$. If one factors $n$, then one can compute the totient $\\varphi(n)$, and then compute the modular inverse $d =e^{-1} \\text{ mod } \\varphi(n)$.<\/p>\n<p>RSA was first standardized by NIST in 1993 as part of the Digital Signature Standard (DSS) family, specifically in FIPS 186-1, which included RSA as an approved algorithm for digital signatures.<\/p>\n<p>There is a quantum algorithm which can be used to break RSA known as Shor&#8217;s algorithm. This algorithm makes use of the fact that $a^x \\text{ mod } n$ is a periodic function on $(Z\/nZ)^{\\times}$. If one is able to find the period $ r$ so that $a^r \\equiv 1 \\text{ mod } n$, then note $(a^{r\/2} &#8211; 1)(a^{r\/2} + 1) \\equiv 0 \\text{ mod } n $, so $gcd(a^{r\/2} &#8211; 1, a^{r\/2} + 1)$ is likely to be a factor.<\/p>\n<p>In order to extract the period, the function is essentially spread in an unobservered superposition across the entire (exponential) set of integers $x$ from $0$ to $2^n-1$, and the exponentials computed via controlled modular exponentiation. Then the iQFT (inverse quantum Fourier transform) is applied to this overall state. The QFT is fast, in that it runs in $O(n\\log(n))$ time despite operating on $2^n$ states, since it can operate on tensor products of $n$ qubits. This allows efficient and polynomial time extraction of the period $r$ after a quick classical computation via continued fractions to get the actual factors.<\/p>\n<p>This algorithm was discovered by Peter W. Shor: &#8220;Algorithms for quantum computation: discrete logarithms and factoring&#8221;, Presented at the 35th Annual Symposium on Foundations of Computer Science (FOCS) in 1994.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<p>For elliptic curves, we take a plane curve $E: y^2 = x^3 + Ax + B$ over a finite field $F_p$, whose points form an abelian group. Choose a base point $P$. Alice picks a secret number a, and lets $P_0 = aP$ (add $P$ to itself $a$ times ), her public key. Bob picks a secret number $b$ and computes $P_1 = bP$, his public key. For the key exchange, Alice computes $aP_1 = abP$ and Bob computes $bP_0 = abP$, so they get the same point on the curve, which can be used to derive a shared key. Note that given $(aP, P)$ it is very difficult to determine $a$. This is known as the elliptic curve discrete logarithm problem (ECDLP), and is analogous to RSA where here we are using the addition in the elliptic curve group $E(F_p)$.<\/p>\n<p>The key sizes for elliptic curves are much smaller than RSA (256 bit ECC ~ 3072 bit RSA).<\/p>\n<p>A variant of Shor&#8217;s algorithm which still performs Fourier sampling just over the abelian group of the elliptic curve $E(F_p)$ rather than the multiplicative group of integers mod $n$. Thus, this is an effective attack on the ECDLP.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<p>There is a lot of current discussion re: when quantum computers will become reliable, fault-tolerant, error corrected, and large enough to perform useful computation. Currently there are 127 qubit systems available publicly via IBM. Google also has similar sized machines. The best error correction codes appear to be surface codes, with a $d \\times d$ array with $d=7$ qubits, and then hundreds or thousands of physical qubits per logical qubit. This allows the correction of up to 3 errors.<\/p>\n<figure class=\"wp-block-image size-large\">\n<img loading=\"lazy\" decoding=\"async\" \n  src=\"https:\/\/jacksonwalters.com\/blog\/wp-content\/uploads\/2025\/12\/surface_codes.png\" width=\"512\" height=\"512\" \n  alt=\"Surface codes diagram\"\n\/><figcaption>Surface codes and logical qubits<\/figcaption><\/figure>\n<p>In general, it may take awhile before there are thousands of qubits required to break something like RSA-1024, which would mean factoring a 1024-bit integer. Estimates range, but something like 5-10 years seems plausible, especially given the exponential nature of progress with respect to number of qubits. To that end, NIST has stated that cryptographic schemes like RSA and ECC with 128 bit security levels will be deprecated by Dec. 31st 2030, and no longer allowed by Dec. 31st 2035. This means that the transition to post-quantum cryptography has already begun, and will probably accelerate over the next five years with many systems being replaced.<\/p>\n<figure class=\"wp-block-image size-large\">\n<img decoding=\"async\" \n  src=\"https:\/\/jacksonwalters.com\/blog\/wp-content\/uploads\/2025\/12\/number_of_qubits_over_time.png\"\n  alt=\"Number of qubits over time\"\n  style=\"max-width:500px; width:100%; height:auto;\"\n  loading=\"lazy\"\n\/><figcaption>Number of physical qubits over time<\/figcaption><\/figure>\n<p>\n    NIST, Transition to Post-Quantum Cryptography Standards:<br \/>\n    <a href=\"https:\/\/csrc.nist.gov\/pubs\/ir\/8547\/ipd\">https:\/\/csrc.nist.gov\/pubs\/ir\/8547\/ipd<\/a>\n<\/p>\n<p>There is a major issue in the meantime that encrypted data that is collected (&#8220;harvested&#8221;) now will be able to be decrypted later by sufficiently large quantum computers later (&#8220;harvest now, decrypt later&#8221;). Therefore it is crucial that these systems and databases are replaced as soon as possible.<\/p>\n<p>To this end, NIST ran a competition to determine the best among a selection of competing post-quantum cryptography standards. <\/p>\n<p><a href=\"https:\/\/csrc.nist.gov\/projects\/post-quantum-cryptography\/post-quantum-cryptography-standardization\">https:\/\/csrc.nist.gov\/projects\/post-quantum-cryptography\/post-quantum-cryptography-standardization<\/a><\/p>\n<p>In the end, the three winners were: <em>CRYSTALS-Dilithium, CRYSTALS-KYBER and SPHINCS+<\/em>, from which FIPS 203, FIPS 204, and FIPS 205 are respectively derived.<\/p>\n<ul class=\"wp-block-list\">\n<li>Crystals-Dilithium is a digital signature scheme, used for signing and verification. It is based on lattice problems like module-LWE and module-SIS.<\/li>\n<li>CRYSTALS-KYBER is a key encapsulation mechanism, used for generating and encrypting a shared symmetric key to be used with another symmetric key encryption scheme. These are based on lattice problems like module-LWE.<\/li>\n<li>SPHINCS+&nbsp;is a&nbsp;stateless hash-based digital signature scheme&nbsp;\u2014 meaning it relies purely on&nbsp;hash functions&nbsp;(like SHA-256) for its security, rather than number-theoretic problems or lattices.<\/li>\n<\/ul>\n<p>We&#8217;ll primarily focus on lattice-based cryptography methods in this post. These are cryptosystems based on algebraic lattices which are very similar to rings of integers in algebraic number fields (like the cyclotomic integers $\\mathbb{Z}[x]\/(x^n-1)$, but using $x^n+1$ instead, and working over $\\mathbb{Z}_q$ rather than $\\mathbb{Z}$).<\/p>\n<p>Each system is based on a computationally hard problem that currently does not have a polynomial time quantum algorithm to solve it. We&#8217;ll go into more detail later, but examples of these problems are:<\/p>\n<ul class=\"wp-block-list\">\n<li><em>Learning with errors (LWE)<\/em>: Given a set of noisy linear equations, find the secret vector. In mathematical terms, you&#8217;re given a set of noisy linear combinations of a secret vector, and the task is to recover the secret vector despite the noise.<\/li>\n<li><em>Shortest vector problem (SVP)<\/em>: Given a lattice, find the shortest non-zero vector in that lattice. This is known to be a hard problem in general lattices, and it\u2019s often used as the foundation for other cryptographic protocols. The hardness arises when the given basis is not orthogonal.<\/li>\n<li><em>Closest vector problem (CVP)<\/em>: Given a lattice and a target point, find the closest lattice point to that target. In cryptographic settings, this is used for things like encryption or key exchange protocols.<\/li>\n<li><em>ring-LWE<\/em>: This is a specialized version of LWE, where the secret and error terms live in a ring structure, which makes the problem more efficient but still computationally hard.<\/li>\n<li><em>module-LWE<\/em>: Similar to Ring-LWE, but operates in a module setting, which generalizes the ring structure to a rank $k$ module over the ring.<\/li>\n<\/ul>\n<p>In this post, we&#8217;ll primarily focus on ring-LWE, module-LWE, and FIPS 203.<\/p>\n<p>FIPS 203 is the module-lattice key encapsulation mechanism (ML-KEM) derived from CRYSTALS-Kyber. It does use module-LWE under the hood in the internal PKE (public key encryption) portion, which we&#8217;ll cover in detail later.<\/p>\n<p>We&#8217;ll begin with ring-LWE, a post-quantum fully homomorphic encryption scheme.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<h2 class=\"wp-block-heading\">ring-LWE, homomorphic encryption<\/h2>\n<ul>\n<li>slides: <a href=\"https:\/\/jacksonwalters.com\/blog\/wp-content\/uploads\/2025\/12\/ring-lwe_in_rust.pdf\">ring_lwe.pdf<\/a><\/li>\n<li>code:<a href=\"https:\/\/crates.io\/crates\/ring-lwe\">https:\/\/crates.io\/crates\/ring-lwe<\/a><\/li>\n<\/ul>\n<p>For modern cryptography, we aim to find problems that are impractical or ideally impossible to solve, even on a quantum computer.<\/p>\n<p>\n<br \/>\nThe &#8220;learning with errors&#8221; (LWE) problem, along with its ring-LWE variant, are examples of such problems. It involves distinguishing between two distributions:\n<\/p>\n<ul class=\"wp-block-list\">\n<li>A set of random linear equations perturbed by a small error (noise)<\/li>\n<li>A truly uniform random distribution<\/li>\n<\/ul>\n<p>\n<br \/>\nGiven $(\\bf{A}, \\bf{b}) \\in \\mathbb{Z}_q^{n \\times m} \\times \\mathbb{Z}_q^m$ where $\\bf{b} = \\bf{A}\\bf{s} + \\bf{e} \\text{ mod } q$:\n<\/p>\n<ul class=\"wp-block-list\">\n<li>$\\bf{A}$ is a known random matrix<\/li>\n<li>$\\bf{s}$ is a secret vector<\/li>\n<li>$\\bf{e}$ is a noise vector sampled from a narrow error distribution<\/li>\n<li>$q$ is a large modulus<\/li>\n<\/ul>\n<p>\n<br \/>\nThe goal is to recover the secret $\\bf{s}$ or distinguish $(\\bf{A},\\bf{b})$ from uniformly random samples.\n<\/p>\n<p>We&#8217;d like to ensure that instances of LWE are hard in the average case. This is done by proving reductions from well-known hard problems in computer science:<\/p>\n<ul class=\"wp-block-list\">\n<li>$\\bf{GapSVP}$: Shortest vector problem with a gap<\/li>\n<li>$\\bf{SIVP}$: Shortest independent vector problem<\/li>\n<\/ul>\n<p>These lattice problems are:<\/p>\n<ul class=\"wp-block-list\">\n<li>NP-hard in their exact versions<\/li>\n<li>Computationally hard in their approximate versions<\/li>\n<\/ul>\n<p>\n<br \/>\nThe variant we focus on is ring-LWE, where the lattices are number-theoretic and derived from ideals in certain polynomial rings:<br \/>\n\n<\/p>\n<p>$R_q = \\mathbb{Z}_q[x]\/(f(x))$<\/p>\n<p>\n<br \/>\nwhere $f(x)$ is typically a polynomial like $x^n &#8211; 1$. However, this choice is insecure. Instead, we use:<\/p>\n<p>$f(x) = x^n+1$<br \/>\n\n<\/p>\n<p>\nwhere $n$ is a power of two. This is the &#8220;anti-cyclotomic&#8221; ring of integers.<\/p>\n<p>The hard problem becomes:\n<\/p>\n<ul class=\"wp-block-list\">\n<li>$\\bf{Ideal-SVP}$: Shortest vector problem for ideal lattices<\/li>\n<\/ul>\n<p>Consider the ring $R = \\mathbb{Z}[x]\/(x^n + 1)$:<\/p>\n<ul class=\"wp-block-list\">\n<li>This is a $\\mathbb{Z}$-module of rank $n$.<\/li>\n<li>Any element $a(x) \\in R$ can be written as: $a(x) = \\sum_{i=0}^{n-1} c_i x^i$<\/li>\n<\/ul>\n<ul class=\"wp-block-list\">\n<li>The vector $(c_i)_{i=0}^{n-1}$ is the coefficient vector, making $R \\cong \\mathbb{Z}^n$.<\/li>\n<\/ul>\n<ul class=\"wp-block-list\">\n<li>$(x^n + 1)$ is an ideal, and elements $e(x) \\in R$ map to elements in this lattice.<\/li>\n<li>The error $e(x)$ corresponds to a short vector (in the Euclidean norm) in the lattice, representing a small perturbation.<\/li>\n<\/ul>\n<p>Ideal lattices inherit the ring&#8217;s structure, such as multiplication by polynomials.<\/p>\n<p>\n<br \/>\nSolving ring-LWE allows efficient recovery of short vectors in ideal lattices:\n<\/p>\n<ul class=\"wp-block-list\">\n<li>Ring-LWE enables recovery of the secret $s(x)$, which corresponds to information about the underlying lattice structure.<\/li>\n<li>Decoding the noisy lattice point perturbed by $e(x)$ reveals a short vector.<\/li>\n<\/ul>\n<p>The reduction shows that solving ring-LWE allows decoding perturbed lattice points for any ideal lattice in $R$, solving Ideal-SVP in the process.<\/p>\n<p>\n<br \/>\nOur goal is to introduce the simplest possible implementation of the ring-LWE encryption scheme.\n<\/p>\n<ul class=\"wp-block-list\">\n<li>Choose a moderately large prime $p$ and large $n$ such that n|p-1<\/li>\n<li>$n$ should be a power of two and $512$ or above<\/li>\n<li>Example: $p = 12289$<\/li>\n<\/ul>\n<p>Let $R_p := \\mathbb{F}_p[x]\/(x^n+1)$. <\/p>\n<p>This is a finite ring with $p^n$ elements. It is not a finite field, as $x^n+1$ factors modulo $p$ (though it is irreducible over $\\mathbb{Z}$).<\/p>\n<p>The elements look like:<\/p>\n<p>\n<br \/>\n$$R_p = \\{a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \\ldots + a_1x + a_0: a_i \\in \\mathbb{F}_p\\}$$\n<\/p>\n<p>\n<strong>Public Setup<\/strong><br \/>\n\n<\/p>\n<p>The public setup involves the following:<\/p>\n<ul class=\"wp-block-list\">\n<li>A prime $p$ and dimension $n$ resulting in $R_p$<\/li>\n<li>A moderately large integer $k \\in \\mathbb{Z}$<\/li>\n<li>A notion of small, as applied to elements of $R_p$<\/li>\n<\/ul>\n<p>These will be &#8220;ternary polynomials&#8221; with coefficients in ${-1,0,+1}$. The coefficients could also be drawn from a normal distribution centered at $0$ with some standard deviation.<\/p>\n<p>\n<strong>Key Generation: Private\/Public Keypair<\/strong>\n<\/p>\n<p>\n<br \/>\nBob creates a private\/public keypair as follows:\n<\/p>\n<ul class=\"wp-block-list\">\n<li>Bob selects a small random element $s$ of $R_p$<\/li>\n<li>Bob selects a small random element $e_1$ of $R_p$<\/li>\n<li>Bob defines $(a, b = as + e_1) \\in R_p \\times R_p$<\/li>\n<\/ul>\n<ul class=\"wp-block-list\">\n<li>The element $e_1$ can be discarded<\/li>\n<li>Bob keeps $s$ as his secret key<\/li>\n<li>Bob makes $(a,b)$ public as his public key<\/li>\n<\/ul>\n<p>\n<strong>Encryption by Alice<\/strong>\n<\/p>\n<p>\n<br \/>\nAlice encrypts a message $m$ as follows:\n<\/p>\n<ul class=\"wp-block-list\">\n<li>Select a small random $r \\in R_p$ (ephemeral key)<\/li>\n<li>Select small random $e_2, e_3 \\in R_p$<\/li>\n<li>Define $v = ar + e_2$, $w = br + e_3 + km$<\/li>\n<\/ul>\n<ul class=\"wp-block-list\">\n<li>Alice may discard $k$, $e_2$, and $e_3$<\/li>\n<li>The ciphertext is $(v, w)$, which is sent to Bob<\/li>\n<\/ul>\n<p>\n<strong>Decryption by Bob<\/strong><br \/>\n\n<\/p>\n<p>Bob decrypts the message:<\/p>\n<ul class=\"wp-block-list\">\n<li>Compute $x = w &#8211; vs$<\/li>\n<li>Round $x$ to the nearest multiple of $k$<\/li>\n<li>The result should be an integer; divide it by $k$ to reveal the message $m$<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\">Homomorphic Encryption<\/h3>\n<p>Homomorphic encryption allows computations to be performed on encrypted data without needing to decrypt it first. This enables privacy-preserving computations.<\/p>\n<ul class=\"wp-block-list\">\n<li>\n    <strong>Encryption<\/strong>: The data is encrypted using a homomorphic encryption scheme.\n<\/li>\n<li>\n    <strong>Computation<\/strong>: Operations such as addition or multiplication are performed directly on the encrypted data.\n<\/li>\n<li>\n    <strong>Decryption<\/strong>: The result of the computation is decrypted to reveal the final outcome.\n<\/li>\n<\/ul>\n<p>\nIf $enc(x)$ represents the encryption of data $x$, and $\\oplus$ denotes addition on encrypted data, and $\\otimes$ represents multiplication on encrypted data:\n<\/p>\n<ul>\n<li>$enc(x + y) = enc(x) \\oplus enc(y)$<\/li>\n<li>$enc(x \\times y) = enc(x) \\otimes enc(y)$<\/li>\n<\/ul>\n<p>Homomorphic encryption can be used in cloud computing, privacy-preserving machine learning, and secure multi-party computations.<\/p>\n<h3 class=\"wp-block-heading\">Code Examples<\/h3>\n<p>\ncode:<br \/>\n<a href=\"https:\/\/crates.io\/crates\/ring-lwe\">https:\/\/crates.io\/crates\/ring-lwe<\/a>\n<\/p>\n<p>Key generation:<\/p>\n<pre class=\"wp-block-code\">\n<code>pub fn keygen(params: &amp;Parameters, seed: Option&lt;u64&gt;) -&gt; (&#91;Polynomial&lt;i64&gt;; 2], Polynomial&lt;i64&gt;) {\n\n\/\/rename parameters\nlet (n, q, f, omega) = (params.n, params.q, &amp;params.f, params.omega);\n\n\/\/ Generate a public and secret key\nlet sk = gen_ternary_poly(n, seed);\nlet a = gen_uniform_poly(n, q, seed);\nlet e = gen_ternary_poly(n, seed);\nlet b = polyadd(&amp;polymul_fast(&amp;polyinv(&amp;a,q), &amp;sk, q, &amp;f, omega), &amp;polyinv(&amp;e,q), q, &amp;f); \/\/ b = -a*sk - e\n                                        \n\/\/ Return public key (b, a) as an array and secret key (sk)\n(&#91;b, a], sk)\n}<\/code>\n<\/pre>\n<p>We generate the secret key, $a$, $e$ as random polynomials, ensuring $a$ is spread out and uniform whereas $sk$ and $e$ are &#8220;small&#8221; ternary polynomials. We then compute $b = -a*sk &#8211; e$, yielding the public key $[b,a]$ and secret key $sk$.<\/p>\n<p>Encryption takes in a public key and plaintext polynomial and produces ciphertext:<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn encrypt( \n   pk: &amp;&#91;Polynomial&lt;i64&gt;; 2],    \/\/ Public key (b, a)\n   m: &amp;Polynomial&lt;i64&gt;,        \/\/ Plaintext polynomial\n   params: &amp;Parameters,       \/\/parameters (n,q,t,f)\n   seed: Option&lt;u64&gt;            \/\/ Seed for random number generator\n) -&gt; &#91;Polynomial&lt;i64&gt;; 2] {\n   let (n,q,t,f,omega) = (params.n, params.q, params.t, &amp;params.f, params.omega);\n   \/\/ Scale the plaintext polynomial. use floor(m*q\/t) rather than floor (q\/t)*m\n   let scaled_m = mod_coeffs(m * q \/ t, q);\n\n   \/\/ Generate random polynomials\n   let e1 = gen_ternary_poly(n, seed);\n   let e2 = gen_ternary_poly(n, seed);\n   let u = gen_ternary_poly(n, seed);\n\n   \/\/ Compute ciphertext components\n   let ct0 = polyadd(&amp;polyadd(&amp;polymul_fast(&amp;pk&#91;0], &amp;u, q, f, omega), &amp;e1, q, f),&amp;scaled_m,q,f);\n   let ct1 = polyadd(&amp;polymul_fast(&amp;pk&#91;1], &amp;u, q, f, omega), &amp;e2, q, f);\n\n   &#91;ct0, ct1]\n}\n<\/code>\n<\/pre>\n<p>We scale the plaintext polynomial via $\\text{scaled\\_m} = \\lfloor q\/t \\rceil m$ and mod coefficients by $q$. Then generate random ternary polynomials $e_1$, $e_2$, $u$. Then $ct_0 = pk[0]*u+e_1+\\text{scaled\\_m}$ and $ct_1 = pk[1]*u + e_2$, and the final ciphertext is the pair $[ct_0,ct_1]$.<\/p>\n<p>Finally, decryption:<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn decrypt(\n    sk: &amp;Polynomial&lt;i64&gt;,    \/\/ Secret key\n    ct: &amp;&#91;Polynomial&lt;i64&gt;; 2],        \/\/ Array of ciphertext polynomials\n    params: &amp;Parameters\n   ) -&gt; Polynomial&lt;i64&gt; {\n   let (_n,q,t,f,omega) = (params.n, params.q, params.t, &amp;params.f, params.omega);\n   let scaled\\_pt = polyadd(&amp;polymul_fast(&amp;ct&#91;1], sk, q, f, omega),&amp;ct&#91;0], q, f);\n   let mut decrypted_coeffs = vec!&#91;];\n   let mut s;\n   for c in scaled_pt.coeffs().iter() {\n      s = nearest_int(c*t,q);\n      decrypted_coeffs.push(s.rem_euclid(t));\n   }\n   Polynomial::new(decrypted_coeffs)\n}\n<\/code>\n<\/pre>\n<p>We compute $\\text{scaled\\_pt} = ct[1]*sk + ct[0]$, then for each coefficient $c$ in $\\text{scaled\\_pt}$ compute $\\lfloor ct\/q \\rceil$ and take the remainder modulo $t$. The polynomial of coefficients is the decrypted message.<\/p>\n<p>Keygen\/encrypt\/decrypt from start to finish:<\/p>\n<pre class=\"wp-block-code\">\n<code>        \nlet seed = None; \/\/set the random seed\nlet message = String::from(\"hello\");\nlet params = Parameters::default();\nlet keypair = keygen_string(&amp;params,seed);\nlet pk_string = keypair.get(\"public\").unwrap();\nlet sk_string = keypair.get(\"secret\").unwrap();\nlet ciphertext_string = encrypt_string(&amp;pk_string, &amp;message, &amp;params,seed);\nlet decrypted_message = decrypt_string(&amp;sk_string, &amp;ciphertext_string, &amp;params);\nassert_eq!(message, decrypted_message, \"test failed: {} != {}\", message, decrypted_message);\n<\/code>\n<\/pre>\n<p>We use serialization and base64 encoding to compress the output strings.<\/p>\n<p>For the homomorphic encryption, we only achieve a single product, so this is &#8220;leveled homomorphic encryption&#8221; at level one for the time being. We use a form of relinearization described here: <\/p>\n<p><a href=\"https:\/\/crypto.stackexchange.com\/questions\/113740\/how-to-correctly-multiply-homomorphically-two-rlwe-based-encrypted-numbers\/113741?noredirect=1#comment242085_113741\">https:\/\/crypto.stackexchange.com\/questions\/113740\/how-to-correctly-multiply-homomorphically-two-rlwe-based-encrypted-numbers\/113741<\/a><\/p>\n<p>We perform multiplications of the ciphertext polynomials and the divide by $\\Delta = q\/t$. The implementation is as follows:<\/p>\n<pre class=\"wp-block-code\">\n<code>\nlet seed = None; \/\/set the random seed\nlet mut params = Parameters::default();\nlet (q, t, f) = (params.q, params.t, &amp;params.f);\nparams.q = q*q;\nparams.omega = omega(params.q, 2*params.n);\n\n\/\/create polynomials from ints\nlet m0_poly = Polynomial::new(vec!&#91;1, 0, 1]);\nlet m1_poly = Polynomial::new(vec!&#91;0, 0, 1]);\n\n\/\/ Generate the keypair\nlet (pk, sk) = keygen(&amp;params,seed);\n\n\/\/ Encrypt plaintext messages\nlet u = encrypt(&amp;pk, &amp;m0_poly, &amp;params, seed);\nlet v = encrypt(&amp;pk, &amp;m1_poly, &amp;params, seed);\n\nlet plaintext_prod = polymul(&amp;m0_poly, &amp;m1_poly, t, &amp;f);\n\n\/\/compute product of encrypted data, using non-standard multiplication\nlet c0 = polymul(&amp;u&#91;0],&amp;v&#91;0],params.q,&amp;f);\nlet u0v1 = &amp;polymul(&amp;u&#91;0],&amp;v&#91;1],params.q,&amp;f);\nlet u1v0 = &amp;polymul(&amp;u&#91;1],&amp;v&#91;0],params.q,&amp;f);\nlet c1 = polyadd(u0v1,u1v0,params.q,&amp;f);\nlet c2 = polymul(&amp;u&#91;1],&amp;v&#91;1],params.q,&amp;f);\n\n\/\/compute c0 + c1*s + c2*s*s\nlet c1_sk = &amp;polymul(&amp;c1,&amp;sk,params.q,&amp;f);\nlet c2_sk_squared = &amp;polymul(&amp;polymul(&amp;c2,&amp;sk,params.q,&amp;f),&amp;sk,params.q,&amp;f);\nlet ciphertext_prod = polyadd(&amp;polyadd(&amp;c0,c1_sk,params.q,&amp;f),c2_sk_squared,params.q,&amp;f);\n\/\/let delta = q \/ t, divide coeffs by 1 \/ delta^2\nlet delta = q \/ t;\nlet decrypted_prod = mod_coeffs(Polynomial::new(ciphertext_prod.coeffs().iter().map(|&amp;coeff| nearest_int(coeff,delta * delta) ).collect::&lt;Vec&lt;_&gt;&gt;()),t);\n                                            \nassert_eq!(plaintext_prod, decrypted_prod, \"test failed: {} != {}\", plaintext_prod, decrypted_prod);\n<\/code>\n<\/pre>\n<p>We&#8217;d eventually like to improve this to be fully homomorphic encryption.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<h2 class=\"wp-block-heading\">module-lwe<\/h2>\n<ul>\n<li>slides: <a href=\"https:\/\/jacksonwalters.com\/blog\/wp-content\/uploads\/2025\/12\/module_lwe.pdf\">module_lwe.pdf<\/a><\/li>\n<li>code: <a href=\"https:\/\/crates.io\/crates\/module-lwe\">https:\/\/crates.io\/crates\/module-lwe<\/a><\/li>\n<\/ul>\n<p>While the polynomial ring structure of ring-LWE is rich and useful for things like homomorphic encryption (since they&#8217;re based on rings, the structure is readily available), this additional algebraic structure is thought to make this system potentially vulnerable to attacks that exploit these number theoretic aspects.<\/p>\n<p>There is a generalization of of ring-LWE called module-LWE which uses the rank $k$ free module $R_q^k$ instead of just the ring $R_q$ (when $k=1$), where again $R_q = \\mathbb{Z}_q\/(x^n+1)$ as in ring-LWE. The elements of $R_q^k$ are k-tuples of polynomials in $R_q$. <\/p>\n<p>Module-LWE works sort of like a hybrid between the original LWE scheme and ring-LWE, combining both matrix operations and polynomial multiplication. <\/p>\n<p>Advantages of module-LWE:<\/p>\n<ul class=\"wp-block-list\">\n<li>Scalability: The main way to increase the security of ring-LWE is to increase the polynomial modulus degree $n$, but since this is alwayts a power of two there is no way to make a small increase in security if you are close to the necessary threshold. On the other hand, in module-LWE you can also increase the module rank $k$, which can be any nonnegative integer.<\/li>\n<li>Parallelization: Many of the computations in module-LWE involve matrix operations, which can be easily parallelized.<\/li>\n<\/ul>\n<figure class=\"wp-block-image size-large\">\n<img decoding=\"async\" \n  src=\"https:\/\/jacksonwalters.com\/blog\/wp-content\/uploads\/2025\/12\/module_lwe_diagram.png\" \n  alt=\"module-lwe diagram for keygen, encrypt, ecrypt\"\n  style=\"max-width:600px; width:100%; height:auto;\"\n\/><figcaption>module-lwe diagram for keygen, encrypt, decrypt<\/figcaption><\/figure>\n<h3 class=\"wp-block-heading\">Public Setup<\/h3>\n<p>The public setup involves the following:<\/p>\n<ul class=\"wp-block-list\">\n<li>A prime $q$ and modulus degree $n$ resulting in the ring $R_q$<\/li>\n<li>Module rank $k$<\/li>\n<li>\n        A notion of &#8220;small&#8221;, as applied to elements of $R_q^k$<\/p>\n<ul class=\"wp-block-list\">\n<li>These are $k$-tuples of ternary polynomials, with coefficients in $\\{-1,0,+1\\}$<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\">Key Generation<\/h3>\n<p>Bob selects a private\/public keypair as follows:<\/p>\n<ul class=\"wp-block-list\">\n<li>Bob selects a small random element $|s_0 \\rangle$ of $R_q^{k+1}$<\/li>\n<li>Bob selects a small random element $|e_0 \\rangle$ of $R_q^{k+1}$<\/li>\n<li>Bob selects a uniformly random matrix $\\hat{A}$ in $R_q^{k \\times k}$<\/li>\n<li>Bob defines $|p_0 \\rangle := \\hat{A}|s_0 \\rangle + |e_0 \\rangle$<\/li>\n<\/ul>\n<p>The element $|e_0 \\rangle$ can be discarded.<\/p>\n<ul class=\"wp-block-list\">\n<li>Bob keeps $|s_0$ as his secret key<\/li>\n<li>Bob makes $(\\hat{A},|p_0\\rangle)$ public as his public key<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\">Encryption by Alice<\/h3>\n<p>Alice encrypts a message $m \\in \\mathbb{Z}_2[x]\/(x^n+1)$ as follows:<\/p>\n<ul class=\"wp-block-list\">\n<li>Alice selects a small random element $\\langle s_1 |$ of $R_q^{1 \\times k}$ (ephemeral key)<\/li>\n<li>Alice selects small random $\\langle e_1 | \\in R_q^{1 \\times k}$ and $e \\in R_q$<\/li>\n<li>Alice defines $\\langle p_1 | := \\langle s_1 | \\hat{A} + \\langle e_1 |$ and $c := \\lfloor q\/2 \\rceil m + e + \\langle s_1 | p_0 \\rangle$<\/li>\n<\/ul>\n<p>Alice may discard $\\langle s_1 |$, $\\langle e_1 |$, and $e$<\/p>\n<ul class=\"wp-block-list\">\n<li>The ciphertext is $(c, \\langle p_1 |)$, which is sent to Bob<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\">Decryption by Bob<\/h3>\n<p>Bob decrypts the message as follows:<\/p>\n<ul class=\"wp-block-list\">\n<li>Compute $x = c &#8211; \\langle p_1 | s_0 \\rangle$<\/li>\n<li>Divide by $\\lfloor q\/2 \\rceil$<\/li>\n<li>Round the coefficients to the nearest integer modulo $2$<\/li>\n<li>the result is the message $m$<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\">Correctness<\/h3>\n<p>$x = c &#8211; \\langle p_1 | s_0 \\rangle$<\/p>\n<p>$= (\\lfloor q\/2 \\rceil m + e + \\langle s_1 | p_0 \\rangle ) &#8211; (\\langle s_1 | \\hat{A} + \\langle e_1 |)|s_0 \\rangle$<\/p>\n<p>$= \\lfloor q\/2 \\rceil m + e + \\langle s_1 | \\hat{A} | s_0 \\rangle + \\langle s_1 | e_0 \\rangle &#8211; \\langle s_1 | \\hat{A} | s_0 \\rangle + \\langle e_1 | s_0 \\rangle$<\/p>\n<p>$= \\lfloor q\/2 \\rceil m + \\text{small polynomials}$<\/p>\n<p>so dividing by $\\lfloor q\/2 \\rceil$ and rounding coefficients to the nearest integer modulo $2$ returns $m$. <\/p>\n<h3 class=\"wp-block-heading\">Code Examples<\/h3>\n<p>\n    We&#8217;ve published our Rust implementation as a crate at <a href=\"https:\/\/crates.io\/crates\/module-lwe\">https:\/\/crates.io\/crates\/module-lwe<\/a>.\n<\/p>\n<p>The key generation is fairly straightforward:<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn keygen(\n    params: &amp;Parameters,\n    seed: Option&lt;u64&gt; \/\/random seed\n) -&gt; ((Vec&lt;Vec&lt;Polynomial&lt;i64&gt;&gt;&gt;, Vec&lt;Polynomial&lt;i64&gt;&gt;), Vec&lt;Polynomial&lt;i64&gt;&gt;) {\n    let (n,q,k,f,omega) = (params.n, params.q, params.k, &amp;params.f, params.omega);\n    \/\/Generate a public and secret key\n    let a = gen_uniform_matrix(n, k, q, seed);\n    let sk = gen_small_vector(n, k, seed);\n    let e = gen_small_vector(n, k, seed);\n    let t = add_vec(&amp;mul_mat_vec_simple(&amp;a, &amp;sk, q, &amp;f, omega), &amp;e, q, &amp;f);\n    \n    \/\/Return public key (a, t) and secret key (sk) as a 2-tuple\n    ((a, t), sk)\n}\n<\/code>\n<\/pre>\n<p>We essentially generate a uniformly random matrix with coefficients in $R_q$, generate the secret key and small error vector, compute $t$, and this gives the public and private keys.<\/p>\n<p>Encryption is also rather brief:<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn encrypt(\n    a: &amp;Vec&lt;Vec&lt;Polynomial&lt;i64&gt;&gt;&gt;,\n    t: &amp;Vec&lt;Polynomial&lt;i64&gt;&gt;,\n    m_b: &amp;Vec&lt;i64&gt;,\n    params: &amp;Parameters,\n    seed: Option&lt;u64&gt;\n) -&gt; (Vec&lt;Polynomial&lt;i64&gt;&gt;, Polynomial&lt;i64&gt;) {\n\n    \/\/get parameters\n    let (n, q, k, f, omega) = (params.n, params.q, params.k, &amp;params.f, params.omega);\n    \n    \/\/generate random ephermal keys\n    let r = gen_small_vector(n, k, seed);\n    let e1 = gen_small_vector(n, k, seed);\n    let e2 = gen_small_vector(n, 1, seed)&#91;0].clone(); \/\/ Single polynomial\n\n    \/\/compute nearest integer to q\/2\n    let half_q = nearest_int(q,2);\n\n    \/\/ Convert binary message to polynomial\n    let m = Polynomial::new(vec!&#91;half_q])*Polynomial::new(m_b.to_vec());\n\n    \/\/ Compute u = a^T * r + e_1 mod q\n    let u = add_vec(&amp;mul_mat_vec_simple(&amp;transpose(a), &amp;r, q, f, omega), &amp;e1, q, f);\n\n    \/\/ Compute v = t * r + e_2 - m mod q\n    let v = polysub(&amp;polyadd(&amp;mul_vec_simple(t, &amp;r, q, &amp;f, omega), &amp;e2, q, f), &amp;m, q, f);\n\n    (u, v)\n}\n<\/code>\n<\/pre>\n<p>We generate random ephemeral keys as small error vectors again with coefficients in $R_q$, compute $\\lfloor q\/2 \\rceil$, convert the binary message to a polynomial with $\\{0,1\\}$ coefficients, then compute the vector $u$ and polynomial $v$. The ciphertext is $(u,v)$.<\/p>\n<p>Finally, decrypt gives the plaintext message given the ciphertext and secret keys:<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn decrypt(\n    sk: &amp;Vec&lt;Polynomial&lt;i64&gt;&gt;,    \/\/secret key\n    u: &amp;Vec&lt;Polynomial&lt;i64&gt;&gt;, \/\/ciphertext vector\n    v: &amp;Polynomial&lt;i64&gt; ,\t\t\/\/ciphertext polynomial\n    params: &amp;Parameters\n) -&gt; Vec&lt;i64&gt; {\n    let (q, f, omega) = (params.q, &amp;params.f, params.omega); \/\/get parameters\n    let scaled_pt = polysub(&amp;v, &amp;mul_vec_simple(&amp;sk, &amp;u, q, &amp;f, omega), q, f); \/\/Compute v-sk*u mod q\n    let half_q = nearest_int(q,2); \/\/ compute nearest integer to q\/2\n    let mut decrypted_coeffs = vec!&#91;];\n    let mut s;\n    for c in scaled_pt.coeffs().iter() {\n        s = nearest_int(*c,half_q).rem_euclid(2);\n        decrypted_coeffs.push(s);\n    }\n    decrypted_coeffs\n}\n<\/code>\n<\/pre>\n<p>We compute $v &#8211; sk*u$, a polynomial, then again compute $\\lfloor q\/2 \\rceil$, and then compute $c \/ \\lfloor q\/2 \\rceil \\text{ mod } 2$ for each coefficient $c$. This yields the decrypted message.<\/p>\n<p>\n    Note that <code>module-lwe<\/code> is not fully homomorphic, but it is additively homomorphic.\n<\/p>\n<pre class=\"wp-block-code\">\n<code>  \nlet seed = None; \/\/set the random seed\nlet params = Parameters::default();\nlet (n, q, f) = (params.n, params.q, &amp;params.f);\n\nlet mut m0 = vec!&#91;1, 0, 1];\nm0.resize(n, 0);\nlet mut m1 = vec!&#91;0, 0, 1];\nm1.resize(n, 0);\nlet mut plaintext_sum = vec!&#91;1, 0, 0];\nplaintext_sum.resize(n, 0);\nlet (pk, sk) = keygen(&amp;params,seed);\n\n\/\/ Encrypt plaintext messages\nlet u = encrypt(&amp;pk.0, &amp;pk.1, &amp;m0, &amp;params, seed);\nlet v = encrypt(&amp;pk.0, &amp;pk.1, &amp;m1, &amp;params, seed);\n\n\/\/ Compute sum of encrypted data\nlet ciphertext_sum = (add_vec(&amp;u.0,&amp;v.0,q,f), polyadd(&amp;u.1,&amp;v.1,q,f));\n\n\/\/ Decrypt ciphertext sum u+v\nlet mut decrypted_sum = decrypt(&amp;sk, &amp;ciphertext_sum.0, &amp;ciphertext_sum.1, &amp;params);\ndecrypted_sum.resize(n, 0);\n\nassert_eq!(decrypted_sum, plaintext_sum, \"test failed: {:?} != {:?}\", decrypted_sum, plaintext_sum);\n<\/code>\n<\/pre>\n<p>That is, if we encrypt $u$, $v$ to get $enc(u)$ and $enc(v)$, then form the sum $enc(u)+enc(v)$, we have $dec(enc(u)+enc(v)) = u+v$, so we recover the original sum.<\/p>\n<p>We use a serialization and base64 encoding on the resulting arrays of ints to store the keys in a compact form.<\/p>\n<p>Here is the entire encryption process from start to finish:<\/p>\n<pre class=\"wp-block-code\">\n<code>  \nlet seed = None; \/\/set random seed\nlet message = String::from(\"hello\");\nlet params = Parameters::default();\nlet keypair = keygen_string(&amp;params,seed);\nlet pk_string = keypair.get(\"public\").unwrap();\nlet sk_string = keypair.get(\"secret\").unwrap();\nlet ciphertext_string = encrypt_string(&amp;pk_string, &amp;message, &amp;params,seed);\nlet decrypted_message = decrypt_string(&amp;sk_string, &amp;ciphertext_string, &amp;params);\nassert_eq!(message, decrypted_message, \"test failed: {} != {}\", message, decrypted_message);\n<\/code>\n<\/pre>\n<p>Note here we use the string methods which generate the keys, ciphertext, and message as a string.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<h2 class=\"wp-block-heading\">the number theoretic transform (NTT)<\/h2>\n<p>\n    <a href=\"https:\/\/crates.io\/crates\/ntt\">https:\/\/crates.io\/crates\/ntt<\/a>\n<\/p>\n<p>In both of the above schemes, we perform a certain number of polynomial multiplications. There is a way to significantly speed this operation up from $\\mathcal{O}(n^2)$ to $\\mathcal{O}(n\\log(n))$.<\/p>\n<p>Polynomials have the property that their multiplication can be viewed as a cyclic convolution of the coefficients. That is, polynomial multiplication is just $f \\star g$.<\/p>\n<p>In general, the Fourier transform will take a convolution of functions to a pointwise product, so that $\\mathcal{F}(f \\star g) = \\mathcal{F}(f) \\cdot \\mathcal{F}(g)$. <\/p>\n<p>We can then perform multiplication in the original domain by performing an inverse Fourier transform, that is:<\/p>\n<p>$f \\star g = \\mathcal{F}^{-1}(\\mathcal{F}(f) \\cdot \\mathcal{F}(g))$<\/p>\n<p>Since we are working over the discrete ring $\\mathbb{Z}_q[x]\/(x^n+1)$, we instead use a version of the NTT which is exact and takes arrays of integers to arrays of integers. This is called the &#8220;number theoretic transform&#8221;, and relies on a root of unity in the given coefficient ring.<\/p>\n<p>In $\\mathbb{Z}_q$, the roots of unity are given by examining the multiplicative group $(\\mathbb{Z}_q)^{\\times}$, which is just a cyclic group of order $q-1$ if $q$ is prime. Generally, the size of the group is $\\varphi(q)$, the Euler totient function, and will have cyclic factors of size $\\varphi(p_i^{k_i})$ for each prime factor $p_i$ of $q$. It is only cyclic when $q=2, 4, p^k, 2p^k$ which is the case we&#8217;ll focus on, in particular $p^k$.<\/p>\n<p>Thus, to find an $n^{th}$ root of unity, we need that $n \\mid \\varphi(q)$. If we then have a primitive root of unity $g$, an element which generates the entire multiplicative group $(\\mathbb{Z}_q)^{\\times}$, then $\\omega = g^{\\varphi(q)\/n}$ will be an $n^{th}$ root of unity. Again, $\\omega$ is just an integer modulo $q$.<\/p>\n<p>The NTT is then just: <\/p>\n<p>$\\hat{a}_k = \\sum_{j=0}^{n-1} a_j \\omega^{jk} \\text{ mod } q \\quad \\text{for } k=0, \\ldots, n-1$<\/p>\n<p>One can think of this as the discrete Fourier transform with a different root of unity. This operation can be represented by a matrix.<\/p>\n<p>In order to optimize this operation, we use the Cooley\u2013Tukey method which uses butterfly operations recursively split the arrays in half. This is a reason that $n$ must be a power of $2$. <\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn ntt(a: &amp;&#91;i64], omega: i64, n: usize, p: i64) -&gt; Vec&lt;i64&gt; {\nlet mut result = a.to_vec();\nlet mut step = n\/2;\nwhile step &gt; 0 {\n    let w_i  = mod_exp(omega, (n\/(2*step)).try_into().unwrap(), p);\n    for i in (0..n).step_by(2*step) { \n        let mut w = 1;\n        for j in 0..step {\n            let u = result&#91;i+j];\n            let v = result&#91;i+j+step];\n            result&#91;i+j] = mod_add(u,v,p);\n            result&#91;i+j+step] = mod_mul(mod_add(u,p-v,p),w,p);\n            w = mod_mul(w,w_i,p);\n        }\n    }\n    step\/=2;\n}\nresult\n}\n<\/code>\n<\/pre>\n<p>\n    <strong>n.b.<\/strong> If one needs an $n^{th}$ root of unity $\\text{mod } N$ and $N$ is composite, It&#8217;s possible to find an $n^{th}$ root of unity $\\omega_i$ for each cyclic factor of size $\\varphi(p_i^{k_i})$ as long as $n \\mid \\varphi(p_i^{k_i})$ for each $i$. Then one can use the Chinese remainder theorem isomorphism to pull back each $\\omega_i$ to get a root of unity $\\omega$ which satisfies the necessary properties, namely $\\sum_{j=0}^{n-1} \\omega^{jk} = 0$ for $1 \\le k &lt; n$.\n<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<h2 class=\"wp-block-heading\">module-lattice key encapsulation mechanism (ML-KEM), FIPS 203, Kyber<\/h2>\n<p>\n    code:<br \/>\n    <a href=\"https:\/\/crates.io\/crates\/mlkem-fips203\">https:\/\/crates.io\/crates\/mlkem-fips203<\/a>\n<\/p>\n<p>\n    FIPS 203 paper:<br \/>\n    <a href=\"https:\/\/nvlpubs.nist.gov\/nistpubs\/FIPS\/NIST.FIPS.203.pdf\">https:\/\/nvlpubs.nist.gov\/nistpubs\/FIPS\/NIST.FIPS.203.pdf<\/a>\n<\/p>\n<p>\n    Kyber paper:<br \/>\n    <a href=\"https:\/\/eprint.iacr.org\/2017\/634.pdf\">https:\/\/eprint.iacr.org\/2017\/634.pdf<\/a>\n<\/p>\n<p>\n    While both ring-LWE and module-LWE represent post-quantum, lattice-based cryptosystems, neither of them have been directly standardized yet. Ring-LWE is more useful for homomorphic encryption, with the potential future security vulnerabilities arising from the direct algebraic ring structure. Module-LWE is more useful as an internal component of what is called a <em>key encapsulation mechanism<\/em>, which is a way to generate a shared key <code>K<\/code>, and encrypt the key itself to get a ciphertext version <code>c<\/code>which can be transmitted over an insecure channel. This is meant to be used in conjunction with another symmetric key cryptosystem, for instance AES which was standardized by NIST in FIPS PUB 197&nbsp;on&nbsp;Nov. 26, 2001.\n<\/p>\n<p>\n    For our implementation, we initially made use of our <code>module-lwe<\/code> implementation but then decided to follow closely Giacomo Pope&#8217;s implementation in Python, <code>kyber-py<\/code>:\n<\/p>\n<p>\n    <a href=\"https:\/\/github.com\/giacomopope\/kyber-py\">https:\/\/github.com\/giacomopope\/kyber-py<\/a>\n<\/p>\n<p>\n    The end goal will be to create the key encapsulation mechanism, which generates a public key <code>pk<\/code> and secret key <code>sk<\/code>, encapsulates <code>pk<\/code>to get the shared key <code>K<\/code> and ciphertext <code>c<\/code>, then use the secret key <code>sk<\/code> to decapsulate <code>c<\/code> to obtain the shared key <code>K<\/code>.\n<\/p>\n<p>\n    We need to be careful about how we generate randomness. We use a DRBG (deterministic random bit generator) to generate a stream of random bytes if the seed is set, and otherwise use OS\/system randomness via <code>getrandom<\/code>.\n<\/p>\n<pre class=\"wp-block-code\">\n<code>\nuse aes_ctr_drbg::DrbgCtx;\n\n...    \n\n\/\/\/ Set the DRBG to be used for random bytes\n    pub fn set_drbg_seed(&amp;mut self, seed: Vec&lt;u8&gt;) {\n        let p = vec!&#91;48, 0]; \/\/ personalization string must be min. 48 bytes long\n        let mut drbg = DrbgCtx::new(); \/\/ instantiate the DRBG\n        drbg.init(&amp;seed, p); \/\/ initialize the DRBG with the seed\n        self.drbg = Some(drbg); \/\/ Store the DRBG in the struct\n    }\n\n...\n\nfn gen_random_bytes(size: usize, drbg: Option&lt;&amp;mut DrbgCtx&gt;) -&gt; Vec&lt;u8&gt; {\n    let mut out = vec!&#91;0; size];\n    if let Some(drbg) = drbg {\n        drbg.get_random(&amp;mut out);\n    }\n    else {\n        getrandom(&amp;mut out).expect(\"Failed to get random bytes\");\n    }\n    out\n}\n<\/code>\n<\/pre>\n<p>\n    We use two psuedorandom functions, <code>prf_2<\/code> and <code>prf_3<\/code>, one for each value of <code>eta=2<\/code> or <code>eta=3<\/code>, where the size of <code>Shake256<\/code> hash is either 128 or 192 bytes, respectively.\n<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn prf_2(s: Vec&lt;u8&gt;, b: u8) -&gt; Vec&lt;u8&gt; {\n    \/\/ Concatenate s and b\n    let mut m = s;\n    m.push(b);\n    \/\/ Apply shake_256 hash\n    let mut shake_256hasher = Shake256Hasher::&lt;128&gt;::default();\n    shake_256hasher.write(&amp;m);\n    let bytes_result = HasherContext::finish(&amp;mut shake_256hasher);\n    bytes_result&#91;0..].to_vec()\n}\n\npub fn prf_3(s: Vec&lt;u8&gt;, b: u8) -&gt; Vec&lt;u8&gt; {\n    \/\/ Concatenate s and b\n    let mut m = s;\n    m.push(b);\n    \/\/ Apply shake_256 hash\n    let mut shake_256hasher = Shake256Hasher::&lt;192&gt;::default();\n    shake_256hasher.write(&amp;m);\n    let bytes_result = HasherContext::finish(&amp;mut shake_256hasher);\n    bytes_result&#91;0..].to_vec()\n}\n<\/code>\n<\/pre>\n<p>\n    <code>eta<\/code> is a parameter related to the width of a CBD (centered binomial distribution), which is given as follows:\n<\/p>\n<p>$X = \\sum_{i=1}^{\\eta} (a_i &#8211; b_i), \\quad \\text{where } a_i, b_i \\sim \\text{Ber}(1\/2)$<\/p>\n<p>This allows us to generate random polynomials.<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn cbd(input_bytes: Vec&lt;u8&gt;, eta: usize, n:usize) -&gt; Polynomial&lt;i64&gt; {\n    assert_eq!(eta*n\/4, input_bytes.len(), \"input length must be eta*n\/4\");\n    let mut coefficients = vec!&#91;0;n];\n    let mut t = BigUint::from_bytes_le(&amp;input_bytes);\n    let mask = BigUint::from((1 &lt;&lt; eta)-1 as u64);\n    let mask2 = BigUint::from((1 &lt;&lt; 2*eta)-1 as u64);\n    for i in 0..n {\n        let x = t.clone() &amp; mask2.clone();\n        let a = (x.clone() &amp; mask.clone()).count_ones() as i64;\n        let b = ((x.clone() &gt;&gt; eta) &amp; mask.clone()).count_ones() as i64;\n        t &gt;&gt;= 2*eta;\n        coefficients&#91;i] = a-b;\n    }\n    Polynomial::new(coefficients)\n}\n<\/code>\n<\/pre>\n<p>We also use an extendable output function (XOF) to extend a random 32 byte input given two domain separation parameters to get an 840 byte random output:<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn xof(bytes32: Vec&lt;u8&gt;, i: u8, j: u8) -&gt; Vec&lt;u8&gt; {\n    \/\/ Concatenate bytes32, i, and j\n    let mut m = bytes32;\n    m.push(i);\n    m.push(j);\n    \/\/ Apply shake_128 hash\n    let mut shake_128hasher = Shake128Hasher::&lt;840&gt;::default();\n    shake_128hasher.write(&amp;m);\n    let bytes_result = HasherContext::finish(&amp;mut shake_128hasher);\n    bytes_result&#91;0..].to_vec()\n}\n<\/code>\n<\/pre>\n<p>\n    This is used to generate a random matrix from a seed. Note this is essentially the matrix from <code>module-lwe<\/code>, and is a random <code>k x k<\/code> matrix whose coefficients are polynomials in the ring $R_q$. In this case, we are setting <code>q=3329<\/code> as a global parameter, a prime, noting that $\\mathbb{Z}_q$ is a field.\n<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn generate_matrix_from_seed(\n    rho: Vec&lt;u8&gt;,\n    rank: usize,\n    n: usize,\n    transpose: bool,\n) -&gt; Vec&lt;Vec&lt;Polynomial&lt;i64&gt;&gt;&gt; {\n    let mut a_data = vec!&#91;vec!&#91;Polynomial::new(vec!&#91;]); rank]; rank];\n\n    for i in 0..rank {\n        for j in 0..rank {\n            let xof_bytes = xof(rho.clone(), j as u8, i as u8);\n            a_data&#91;i]&#91;j] = Polynomial::new(ntt_sample(xof_bytes, n));\n        }\n    }\n\n    if transpose {\n        matrix_transpose(a_data)\n    } else {\n        a_data\n    }\n}\n<\/code>\n<\/pre>\n<p>\n    Note that again we are using the <code>Polynomial&lt;i64&gt;<\/code> type because it is convenient, but it would make a lot of sense to implement this as a struct with the methods we need for <code>polyadd<\/code>, <code>polysub<\/code>, etc. We instead just opt to keep everything as utility functions for now.\n<\/p>\n<p>\n    We can use the <code>prf<\/code> functions to generate an error vector as well:\n<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn generate_error_vector(\n    sigma: Vec&lt;u8&gt;,\n    eta: usize,\n    b: u8,\n    k: usize,\n    poly_size: usize,\n) -&gt; (Vec&lt;Polynomial&lt;i64&gt;&gt;, u8) {\n    let mut elements = vec!&#91;Polynomial::new(vec!&#91;]); k];\n    let mut current_b = b;\n\n    for i in 0..k {\n        let prf_output: Vec&lt;u8&gt;;\n        if eta == 2 {\n            prf_output = prf_2(sigma.clone(), current_b);\n        } else if eta == 3 {\n            prf_output = prf_3(sigma.clone(), current_b);\n        } else {\n            panic!(\"eta must be 2 or 3\"); \/\/ Handle invalid eta values\n        }\n        assert_eq!(eta*poly_size\/4, prf_output.len(), \"eta*poly_size\/4 must be 128 or 192 (prf output length)\");\n        elements&#91;i] = cbd(prf_output, eta, poly_size);\n        current_b += 1;\n    }\n\n    (elements, current_b)\n}\n<\/code>\n<\/pre>\n<p>\n    Similarly we can generate a random polynomial using the <code>cbd<\/code> function:\n<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn generate_polynomial(\n    sigma: Vec&lt;u8&gt;,\n    eta: usize,\n    b: u8,\n    poly_size: usize,\n    q: Option&lt;i64&gt;,\n) -&gt; (Polynomial&lt;i64&gt;, u8) {\n    \/\/ get the prf_output depending on eta = 2, or eta = 3\n    let prf_output: Vec&lt;u8&gt;;\n    if eta == 2 {\n        prf_output = prf_2(sigma, b);\n    } else if eta == 3 {\n        prf_output = prf_3(sigma, b);\n    } else {\n        panic!(\"eta must be 2 or 3\"); \/\/ Handle invalid eta values\n    }\n    let poly = cbd(prf_output, eta, poly_size); \/\/ form the polynomial array from a centered binomial dist.\n    \/\/if a modulus is set, place coeffs in &#91;0,q-1]\n    if let Some(q) = q {\n        return (mod_coeffs(poly,q), b + 1);\n    }\n    (poly, b + 1)\n}\n<\/code>\n<\/pre>\n<p>Now that we&#8217;re able to generate random polynomials, error vectors, and matrices, we want to be able to compress\/encode and decompress\/decode our inputs and outputs. <\/p>\n<p>To compress a polynomial, we compress its coefficients using this function:<\/p>\n<pre class=\"wp-block-code\">\n    <code>fn compress_ele(x: i64, d: usize) -&gt; i64 {\n        let t = 1 &lt;&lt; d;\n        let y = (t * x.rem_euclid(3329) + 1664) \/ 3329; \/\/ n.b. 1664 = 3329 \/ 2\n        y % t\n    }<\/code>\n<\/pre>\n<p>\n    To encode a polynomial, we use a bit-width parameter<br \/>\n    <code>d<\/code><br \/>\n        which is usually 12 or 1:\n<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn encode_poly(poly: Polynomial&lt;i64&gt;, d: usize) -&gt; Vec&lt;u8&gt; {\nlet poly_mod = mod_coeffs(poly.clone(), 3329);\nlet mut t = BigUint::zero(); \/\/ Start with a BigUint initialized to zero\nlet mut coeffs = poly_mod.coeffs().to_vec(); \/\/ get the coefficients of the polynomial\ncoeffs.resize(256, 0); \/\/ ensure they're the right size\n\nfor i in 0..255 {\n    \/\/ OR the current coefficient then left shift by d bits\n    t |= BigUint::from(coeffs&#91;256 - i - 1] as u64); \/\/ Use BigUint for coefficients\n    t &lt;&lt;= d; \/\/ Equivalent to t = t * 2^d\n}\n\n\/\/ Add the last coefficient\nt |= BigUint::from(coeffs&#91;0] as u64);\n\n\/\/ Convert BigUint to a byte vector\nlet byte_len = 32 * d;\nlet mut result = t.to_bytes_le(); \/\/ Convert to little-endian bytes\nresult.resize(byte_len, 0); \/\/ Ensure the result is exactly `32 * d` bytes\n\nresult\n}\n<\/code>\n<\/pre>\n<p>Decode and decompress invert this. For vectors of polynomials, we just compress each polynomial, encode each polynomial, and extend the array of bytes.<\/p>\n<p>\n    The final important step before describing the actual PKE and encapsulation algorithms is the NTT. Above, we saw that the NTT depended on an $n^{th}$ root of unity. In this case, the other global Kyber parameter is <code>n=256<\/code>, which is the degree of the polynomials. Since $q=3329$, and since Cooley-Tukey requires a $2n^{th}$ root of unity (a square root of an $n^{th}$ root of unity), this would mean a $512^{th}$ root of unity, which doesn&#8217;t exist in $\\mathbb{Z}_q$ since $512 \\nmid 3328$. Therefore we need to do something different.\n<\/p>\n<p>Indeed, the FIPS 203 paper outlines on pg. 24 how to perform an NTT despite this limitation. It uses the Chinese remainder theorem to write the ring $R_q = \\mathbb{Z}_q[x]\/(x^{256}+1)$ as<\/p>\n<p>$T_q = \\bigoplus_{i=0}^{127}\\mathbb{Z}_q[x]\/(x^2 &#8211; \\zeta^{BitRev_7(i)+1})$<\/p>\n<p>which is a sum of $128$ quadratic terms. This isomorphism means we only need a $256^{th}$ root of unity, which we do have. We just need to compute the bit reversal function, and the $\\zeta$&#8217;s:<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn bit_reverse(i: i64, k: usize) -&gt; i64 {\n    let mut reversed = 0;\n    let mut n = i;\n    \n    for _ in 0..k {\n        reversed = (reversed &lt;&lt; 1) | (n &amp; 1);\n        n &gt;&gt;= 1;\n    }\n\n    reversed\n}\n\n...\n\nlet zetas: Vec&lt;i64&gt; = (0..128)\n    .map(|i| mod_exp(17, bit_reverse(i, 7), 3329))\n    .collect();\n<\/code>\n<\/pre>\n<p>Here <code>mod_exp<\/code> is just the modular exponentiation modulo <code>q<\/code>.<\/p>\n<p>The NTT is similar to the usual butterfly operations, but we pair coefficients:<\/p>\n<pre class=\"wp-block-code\">\n    <code>pub fn poly_ntt(poly: Polynomial&lt;i64&gt;, zetas: Vec&lt;i64&gt;) -&gt; Polynomial&lt;i64&gt; {\n        let mut coeffs = poly.coeffs().to_vec(); \/\/ Convert slice to Vec&lt;i64&gt;\n        coeffs.resize(256, 0); \/\/ Ensure uniform length\n        let mut k = 1;\n        let mut l = 128;\n        while l &gt;= 2 {\n            let mut start = 0;\n            while start &lt; 256 {\n                let zeta = zetas&#91;k];\n                k += 1;\n                for j in start..start+l {\n                    let t = zeta*coeffs&#91;j+l];\n                    coeffs&#91;j+l] = (coeffs&#91;j]-t).rem_euclid(3329);\n                    coeffs&#91;j] = (coeffs&#91;j]+t).rem_euclid(3329);\n                }\n                start += 2*l;\n            }\n            l &gt;&gt;= 1;\n        }\n        Polynomial::new(coeffs)\n    }<\/code>\n<\/pre>\n<p>\n    With this NTT (and its associated inverse), we can define the <code>ntt<\/code> for vectors as well by just operating on each coefficient. Since we will deal with polynomials and vectors in their NTT form, we multiply them pointwise, and so we need to carefully define the multiplication in $T_q$:\n<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn ntt_base_multiplication(a0:i64 , a1:i64, b0:i64, b1:i64, zeta:i64) -&gt; (i64, i64) {\n    let r0 = (a0*b0+zeta*a1*b1).rem_euclid(3329);\n    let r1 = (a1*b0+a0*b1).rem_euclid(3329);\n    (r0, r1)\n}\n\npub fn ntt_coefficient_multiplication(f_coeffs: Vec&lt;i64&gt;, g_coeffs: Vec&lt;i64&gt;, zetas: Vec&lt;i64&gt;) -&gt; Vec&lt;i64&gt; {\n    let mut new_coeffs = vec!&#91;];\n    \/\/ Multiply in each of the 128 Z_q&#91;x]\/(x^2-zeta) factors\n    for i in 0..64 {\n        let (r0,r1) = ntt_base_multiplication(\n            f_coeffs&#91;4*i+0],\n            f_coeffs&#91;4*i+1],\n            g_coeffs&#91;4*i+0],\n            g_coeffs&#91;4*i+1],\n            zetas&#91;64+i]);\n        let (r2,r3) = ntt_base_multiplication(\n            f_coeffs&#91;4*i+2],\n            f_coeffs&#91;4*i+3],\n            g_coeffs&#91;4*i+2],\n            g_coeffs&#91;4*i+3],\n            -zetas&#91;64+i]);\n        new_coeffs.append(&amp;mut vec!&#91;r0,r1,r2,r3]);\n    }\n    new_coeffs\n}\n<\/code>\n<\/pre>\n<p>With those definitions, we can define a dot product of vectors using this NTT multiplication:<\/p>\n<pre class=\"wp-block-code\">\n<code>\npub fn mul_vec_simple(v0: Vec&lt;Polynomial&lt;i64&gt;&gt;, v1: Vec&lt;Polynomial&lt;i64&gt;&gt;, q: i64, f: Polynomial&lt;i64&gt;, zetas: Vec&lt;i64&gt;) -&gt; Polynomial&lt;i64&gt; {\n    assert!(v0.len() == v1.len());\n    let mut result = Polynomial::new(vec!&#91;]);\n    for i in 0..v0.len() {\n        let v0_v1_mult = ntt_multiplication(v0&#91;i].clone(), v1&#91;i].clone(), zetas.clone());\n        result = polyadd(result, v0_v1_mult.clone(), q, f.clone());\n    }\n    mod_coeffs(result, q)\n}\n<\/code>\n<\/pre>\n<p>This allows us to then define multiplication of a matrix times a vector. Now we are ready to define the PKE and encapsulation operations.<\/p>\n<p>We use different sets of parameters for different levels of security.<\/p>\n<ul class=\"wp-block-list\">\n<li>\n        MLKEM512 (~128 bit security): <code>k=2<\/code>, <code>eta_1=3<\/code>, <code>eta_2=2<\/code>, <code>du=10<\/code>, <code>dv=4<\/code>\n    <\/li>\n<li>\n        MLKEM768 (~192 bit security): <code>k=3<\/code>, <code>eta_1=2<\/code>, <code>eta_2=2<\/code>, <code>du=10<\/code>, <code>dv=4<\/code>\n    <\/li>\n<li>\n        MLKEM1024 (~256 bit security): <code>k=4<\/code>, <code>eta_1=2<\/code>, <code>eta_2=2<\/code>, <code>du=11<\/code>, <code>dv=5<\/code>\n    <\/li>\n<\/ul>\n<p>We keep these in <code>parameters::Parameters<\/code>.<\/p>\n<p>The PKE keygen follows Algorithm 13 of FIPS 203:<\/p>\n<pre class=\"wp-block-code\">\n<code>    \npub fn _k_pke_keygen(\n    &amp;self,\n    d: Vec&lt;u8&gt;,\n) -&gt; (Vec&lt;u8&gt;, Vec&lt;u8&gt;) {\n    \/\/ Expand 32 + 1 bytes to two 32-byte seeds.\n    \/\/ Note: rho, sigma are generated using hash_g\n    let (rho, sigma) = hash_g(&#91;d.clone(), vec!&#91;self.params.k as u8]].concat());\n\n    \/\/ Generate A_hat from seed rho\n    let a_hat = generate_matrix_from_seed(rho.clone(), self.params.k, self.params.n, false);\n\n    \/\/ Set counter for PRF\n    let prf_count = 0;\n\n    \/\/ Generate the error vectors s and e\n    let (s, _prf_count) = generate_error_vector(sigma.clone(), self.params.eta_1, prf_count, self.params.k, self.params.n);\n    let (e, _prf_count) = generate_error_vector(sigma.clone(), self.params.eta_1, prf_count, self.params.k, self.params.n);\n\n    \/\/ the NTT of s as an element of a rank k module over the polynomial ring\n    let s_hat = vec_ntt(s, self.params.zetas.clone());\n    \/\/ the NTT of e as an element of a rank k module over the polynomial ring\n    let e_hat = vec_ntt(e, self.params.zetas.clone());\n    \/\/ A_hat @ s_hat + e_hat\n    let a_hat_s_hat = mul_mat_vec_simple(a_hat, s_hat.clone(), self.params.q, self.params.f.clone(), self.params.zetas.clone());\n    let t_hat = add_vec(a_hat_s_hat, e_hat, self.params.q, self.params.f.clone());\n\n    \/\/ Encode the keys\n    let mut ek_pke = encode_vector(t_hat, 12); \/\/ Encoding vec of polynomials to bytes\n    ek_pke.extend(rho); \/\/ append rho, output of hash function\n    let dk_pke = encode_vector(s_hat, 12); \/\/ Encoding s_hat for dk_pke\n\n    (ek_pke, dk_pke)\n}\n<\/code>\n<\/pre>\n<p>Note that this essentially computes $\\hat{t}=\\hat{A}\\hat{s} + \\hat{e}$ and encodes this vector to get the public key, and then just encodes $\\hat{s}$ to get the private key.<\/p>\n<p>Next, the PKE encryption:<\/p>\n<pre class=\"wp-block-code\">\n<code>    \npub fn _k_pke_encrypt(\n    &amp;self,\n    ek_pke: Vec&lt;u8&gt;,\n    m: Vec&lt;u8&gt;,\n    r: Vec&lt;u8&gt;,\n) -&gt; Result&lt;Vec&lt;u8&gt;, String&gt; {\n\n    let expected_len = ek_pke.len();\n    let received_len = 384 * self.params.k + 32;\n\n    if expected_len != received_len {\n        return Err(format!(\n            \"Type check failed: ek_pke length mismatch (expected {}, got {})\",\n            received_len, expected_len\n        ));\n    }\n\n    \/\/ Unpack ek\n    let (t_hat_bytes_slice, rho_slice) = ek_pke.split_at(ek_pke.len() - 32);\n    let t_hat_bytes = t_hat_bytes_slice.to_vec();\n    let rho = rho_slice.to_vec();\n\n    \/\/ decode the vector of polynomials from bytes\n    let t_hat = decode_vector(t_hat_bytes.clone(), self.params.k, 12);\n\n    \/\/ check that t_hat has been canonically encoded\n    if encode_vector(t_hat.clone(),12) != t_hat_bytes {\n        return Err(\"Modulus check failed: t_hat does not encode correctly\".to_string());\n    }\n\n    \/\/ Generate A_hat^T from seed rho\n    let a_hat_t = generate_matrix_from_seed(rho.clone(), self.params.k, self.params.n, true);\n\n    \/\/ generate error vectors y, e1 and error polynomial e2\n    let prf_count = 0;\n    let (y, _prf_count) = generate_error_vector(r.clone(), self.params.eta_1, prf_count, self.params.k, self.params.n);\n    let (e1, _prf_count) = generate_error_vector(r.clone(), self.params.eta_2, prf_count, self.params.k, self.params.n);\n    let (e2, _prf_count) = generate_polynomial(r.clone(), self.params.eta_2, prf_count, self.params.n, None);\n\n    \/\/ compute the NTT of the error vector y\n    let y_hat = vec_ntt(y, self.params.zetas.clone());\n\n    \/\/ compute u = intt(a_hat.T * y_hat) + e1\n    let a_hat_t_y_hat = mul_mat_vec_simple(a_hat_t, y_hat.clone(), self.params.q, self.params.f.clone(), self.params.zetas.clone());\n    let a_hat_t_y_hat_intt = vec_intt(a_hat_t_y_hat, self.params.zetas.clone());\n    let u = add_vec(a_hat_t_y_hat_intt, e1, self.params.q, self.params.f.clone());\n\n    \/\/decode the polynomial mu from the bytes m\n    let mu = decompress_poly(decode_poly(m, 1),1);\n\n    \/\/compute v = intt(t_hat.y_hat) + e2 + mu\n    let t_hat_dot_y_hat = mul_vec_simple(t_hat, y_hat, self.params.q, self.params.f.clone(), self.params.zetas.clone());\n    let t_hat_dot_y_hat_intt = poly_intt(t_hat_dot_y_hat, self.params.zetas.clone());\n    let t_hat_dot_y_hat_intt_plus_e2 = polyadd(t_hat_dot_y_hat_intt.clone(), e2.clone(), self.params.q, self.params.f.clone());\n    let v = polyadd(t_hat_dot_y_hat_intt_plus_e2, mu, self.params.q, self.params.f.clone());\n\n    \/\/ compress vec u, poly v by compressing coeffs, then encode to bytes using params du, dv\n    let c1 = encode_vector(compress_vec(u,self.params.du),self.params.du);\n    let c2 = encode_poly(compress_poly(v,self.params.dv),self.params.dv);\n\n    \/\/return c1 + c2, the concatenation of two encoded polynomials\n    Ok(&#91;c1, c2].concat())\n\n}\n<\/code>\n<\/pre>\n<p>We unpack $\\hat{t}$ (and check correctness), then generate $\\hat{A}^{T}$, generate error vectors $y$, $e_1$, and error polynomial $e_2$. Then we compute $\\hat{y}$ (the NTT), and $u = intt(\\hat{A}^T*\\hat{y}) + e_1$ (a vector). We decode the message $m$ as $\\mu$, then compute $v = intt(\\hat{t}\\cdot\\hat{y}) + e_2 + \\mu$ (a polynomials). We then compress vector $u$ and polynomial $v$ to get the ciphertext $c_1$, $c_2$.<\/p>\n<p>Decryption is as follows:<\/p>\n<pre class=\"wp-block-code\">\n<code>    \npub fn _k_pke_decrypt(&amp;self, dk_pke: Vec&lt;u8&gt;, c: Vec&lt;u8&gt; ) -&gt; Vec&lt;u8&gt; {\n\n    \/\/ encoded size\n    let n = self.params.k * self.params.du * 32;\n    \n    \/\/ break ciphertext into two encoded parts\n    let (c1, c2) = c.split_at(n);\n    let c1 = c1.to_vec();\n    let c2 = c2.to_vec();\n\n    \/\/ decode and decompress c1, c2, dk_pke into vector u, polynomial v, secret key\n    let u = decompress_vec(decode_vector(c1, self.params.k, self.params.du), self.params.du);\n    let v = decompress_poly(decode_poly(c2, self.params.dv), self.params.dv);\n    let s_hat = decode_vector(dk_pke, self.params.k, 12);\n\n    \/\/ compute u_hat, the NTT of u\n    let u_hat = vec_ntt(u, self.params.zetas.clone());\n\n    \/\/ compute w = v - (s_hat.u_hat).from_ntt()\n    let s_hat_dot_u_hat = mul_vec_simple(s_hat, u_hat, self.params.q, self.params.f.clone(), self.params.zetas.clone());\n    let s_hat_dot_u_hat_intt = poly_intt(s_hat_dot_u_hat, self.params.zetas.clone());\n    let w = polysub(v, s_hat_dot_u_hat_intt, self.params.q, self.params.f.clone());\n\n    \/\/ compress and encode w to get message m\n    let m = encode_poly(compress_poly(w,1),1);\n\n    m\n\n}\n<\/code>\n<\/pre>\n<p>We break the incoming ciphertext bytes into two parts, $c_1$, $c_2$. We then recover $u$, $v$, $\\hat{s}$ from above. We compute the NTT $\\hat{u}$. Finally we compute $w = v &#8211; (\\hat{s} \\cdot \\hat{u})$, and compress and encode to get $m$. <\/p>\n<p>That&#8217;s it! Here&#8217;s a full example of running the PKE to perform keygen\/encrypt\/decrypt. Keep in mind these are internal functions:<\/p>\n<pre class=\"wp-block-code\">\n<code>    \n\/\/ run the basic PKE with a uniformly random message polynomial\nlet params = Parameters::mlkem512(); \/\/ initialize default parameters\nlet mut mlkem = MLKEM::new(params); \nmlkem.set_drbg_seed(vec!&#91;0x42; 48]); \/\/ Example 48-byte seed\nlet d = (mlkem.params.random_bytes)(32, mlkem.drbg.as_mut());\nlet (ek_pke, dk_pke) = mlkem._k_pke_keygen(d); \/\/ Generate public and private keys for PKE\nlet rand_bytes_os = (mlkem.params.random_bytes)(mlkem.params.n, None); \/\/get random bytes from OS\nlet rand_coeffs: Vec&lt;i64&gt; = rand_bytes_os.into_iter().map(|byte| byte as i64).collect(); \/\/ convert to i64 array\nlet m_poly = Polynomial::new(rand_coeffs); \/\/ create a polynomial from the random coefficients\nlet m = encode_poly(compress_poly(m_poly,1),1); \/\/ compress and encode the message polynomial\nlet r = vec!&#91;0x01, 0x02, 0x03, 0x04]; \/\/ Example random bytes for encryption\nlet c = match mlkem._k_pke_encrypt(ek_pke, m.clone(), r) { \/\/perform encryption, handling potential errors\n    Ok(ciphertext) =&gt; ciphertext,\n    Err(e) =&gt; panic!(\"Encryption failed: {}\", e),\n};\nlet m_dec = mlkem._k_pke_decrypt(dk_pke, c); \/\/ perform the decryption\nassert_eq!(m, m_dec); \/\/ check if the decrypted message matches the original message\n<\/code>\n<\/pre>\n<p>\n    There are three algorithms 16, 17, 18 for <code>_keygen_internal<\/code>, <code>_encaps_internal<\/code>, and <code>_decaps_internal<\/code>:\n<\/p>\n<pre class=\"wp-block-code\">\n<code>    \npub fn _keygen_internal(&amp;self, d: Vec&lt;u8&gt;, z: Vec&lt;u8&gt;) -&gt; (Vec&lt;u8&gt;, Vec&lt;u8&gt;) {\n    \n    let (ek_pke, dk_pke) = self._k_pke_keygen(d);\n\n    let ek = ek_pke;\n    let dk = &#91;dk_pke, ek.clone(), hash_h(ek.clone()), z].concat();\n\n    (ek, dk)\n}\n<\/code>\n<\/pre>\n<pre class=\"wp-block-code\">\n<code>    \npub fn _encaps_internal(&amp;self, ek: Vec&lt;u8&gt;, m: Vec&lt;u8&gt;) -&gt; Result&lt;(Vec&lt;u8&gt;, Vec&lt;u8&gt;), String&gt; {\n    let (shared_k, r) = hash_g(&#91;m.clone(), hash_h(ek.clone())].concat());\n\n    let c = self._k_pke_encrypt(ek, m, r)?; \/\/ Propagate error with `?`\n\n    Ok((shared_k, c))\n}\n<\/code>\n<\/pre>\n<pre class=\"wp-block-code\">\n<code>    \npub fn _decaps_internal(&amp;self, dk: Vec&lt;u8&gt;, c: Vec&lt;u8&gt;) -&gt; Result&lt;Vec&lt;u8&gt;, String&gt;{\n\n    \/\/ NOTE: ML-KEM requires input validation before returning the result of\n    \/\/ decapsulation. These are performed by the following three checks:\n    \/\/\n    \/\/ 1) Ciphertext type check: the byte length of c must be correct\n    \/\/ 2) Decapsulation type check: the byte length of dk must be correct\n    \/\/ 3) Hash check: a hash of the internals of the dk must match\n    \/\/\n    \/\/ Unlike encaps, these are easily performed in the kem decaps\n\n    if c.len() != 32 * (self.params.du * self.params.k + self.params.dv) {\n        return Err(format!(\n            \"ciphertext type check failed. Expected {} bytes and obtained {}\",\n            32 * (self.params.du * self.params.k + self.params.dv),\n            c.len()\n        ));\n    }\n\n    if dk.len() != 768 * self.params.k + 96{\n        return Err(format!(\n            \"decapsulation type check failed. Expected {} bytes and obtained {}\",\n            768 * self.params.k + 96,\n            dk.len()\n        ));\n    }\n\n    \/\/ Parse out data from dk as Vec&lt;u8&gt;\n    let dk_pke = dk&#91;0..384 * self.params.k].to_vec();\n    let ek_pke = dk&#91;384 * self.params.k..768 * self.params.k + 32].to_vec();\n    let h = dk&#91;768 * self.params.k + 32..768 * self.params.k + 64].to_vec();\n    let z = dk&#91;768 * self.params.k + 64..].to_vec();\n\n    \/\/ Ensure the hash-check passes\n    if hash_h(ek_pke.clone()) != h{\n        return Err(\"hash check failed\".to_string());\n    }\n\n    \/\/ Decrypt the ciphertext\n    let m_prime = self._k_pke_decrypt(dk_pke, c.clone());\n\n    \/\/ Re-encrypt the recovered message\n    let (k_prime, r_prime) = hash_g(&#91;m_prime.clone(),h].concat());\n    let k_bar = hash_j(&#91;z,c.clone()].concat());\n\n    \/\/ Here the public encapsulation key is read from the private\n    \/\/ key and so we never expect this to fail the TypeCheck or ModulusCheck\n    let c_prime = match self._k_pke_encrypt(ek_pke.clone(), m_prime.clone(), r_prime.clone()) {\n        Ok(ciphertext) =&gt; ciphertext,\n        Err(e) =&gt; panic!(\"Encryption failed: {}\", e),\n    };\n\n    \/\/ If c != c_prime, return K_bar as garbage\n    \/\/ WARNING: for proper implementations, it is absolutely\n    \/\/ vital that the selection between the key and garbage is\n    \/\/ performed in constant time\n    let shared_k = select_bytes(k_bar, k_prime, c == c_prime);\n\n    Ok(shared_k)\n}\n<\/code>\n<\/pre>\n<p>Note now this function may fail if the length is wrong for the ciphertext or secret key, in which case we return an error string.<\/p>\n<p>\n    Now for the final functions, the MLKEM <code>keygen<\/code>, <code>encaps<\/code>, <code>decaps<\/code>:\n<\/p>\n<pre class=\"wp-block-code\">\n<code>    \npub fn keygen(&amp;mut self) -&gt; (Vec&lt;u8&gt;, Vec&lt;u8&gt;) {\n    let d = (self.params.random_bytes)(32, self.drbg.as_mut());\n    let z = (self.params.random_bytes)(32, self.drbg.as_mut());\n    let (ek, dk) = self._keygen_internal(d,z);\n    return (ek, dk)\n}\n<\/code>\n<\/pre>\n<pre class=\"wp-block-code\">\n<code>\t\npub fn encaps(&amp;mut self, ek: Vec&lt;u8&gt;) -&gt; Result&lt;(Vec&lt;u8&gt;, Vec&lt;u8&gt;), String&gt; {\n    let m = (self.params.random_bytes)(32, self.drbg.as_mut());\n    let (shared_k, c) = self._encaps_internal(ek, m)?; \/\/ Propagate error with `?`\n    Ok((shared_k, c))\n}\n<\/code>\n<\/pre>\n<pre class=\"wp-block-code\">\n<code>\t\npub fn decaps(&amp;self, dk: Vec&lt;u8&gt;, c: Vec&lt;u8&gt;) -&gt; Result&lt;Vec&lt;u8&gt;, String&gt; {\n    let shared_k_prime = self._decaps_internal(dk, c)?; \/\/ Propagate error with `?`\n    Ok(shared_k_prime)\n}\n<\/code>\n<\/pre>\n<p>That&#8217;s it! We can perform the keygen\/encaps\/decaps as follows, which takes less than a millisecond on an M1 (2020) machine:<\/p>\n<pre class=\"wp-block-code\">\n<code>        \nlet params = Parameters::mlkem512();\nlet mut mlkem = MLKEM::new(params);\nlet (ek, dk) = mlkem.keygen();\nlet (shared_k,c) = match mlkem.encaps(ek) {\n    Ok(ciphertext) =&gt; ciphertext,\n    Err(e) =&gt; panic!(\"Encryption failed: {}\", e),\n};\nlet shared_k_decaps = match mlkem.decaps(dk,c) {\n    Ok(decapsulated_shared_key) =&gt; decapsulated_shared_key,\n    Err(e) =&gt; panic!(\"Decryption failed: {}\", e),\n};\nassert_eq!(shared_k, shared_k_decaps);\n<\/code>\n<\/pre>\n<p>\n    <strong>n.b.<\/strong> None of this is written in constant-time, nor resistant to side-channel attacks, nor been audited for security. It&#8217;s simply meant to get people oriented to these new post-quantum cryptography systems, in a strongly-typed and fast language.\n<\/p>\n<p>\n    There are surely optimized and secure implementations. It sounds like <a href=\"https:\/\/www.openssl.org\/\">OpenSSL <\/a> is releasing their implementation tomorrow. I encourage you to head on over to the <a href=\"https:\/\/www.bouncycastle.org\/\" data-type=\"link\" data-id=\"https:\/\/www.bouncycastle.org\">Bouncy Castle<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post aims to cover some practical aspects of post-quantum cryptography and homomorphic encryption, topics I&#8217;ve been interested in over the past few months. Much of this work was performed during a Cryptography Fellowship with Zaiku Group, Ltd. These tools are meant to provide modern upgrades to some current cryptographic schemes which have been in [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[],"class_list":["post-1","post","type-post","status-publish","format-standard","hentry","category-cryptography"],"_links":{"self":[{"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1"}],"version-history":[{"count":156,"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1\/revisions"}],"predecessor-version":[{"id":204,"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1\/revisions\/204"}],"wp:attachment":[{"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jacksonwalters.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}