Commit 725bf3d8 authored by Benoit Viguier's avatar Benoit Viguier
Browse files

moving some text (not reworked)

parent afeac8a6
......@@ -6,168 +6,9 @@ We then discuss which proofs are required.
\subsection{Implementation} \label{sec:impl}
Given a natural number $n$ and a value $x \in \F{p}$, Curve25519 is a function over a $\F{p}$-restricted
$x$-coordinate computing a scalar multiplication on $E(\F{p^2})$.
As a result of this restriction, all computations are done over $\F{p}$.
Numbers in that field can be represented with 256 bits.
We represent them in 8-bit limbs (respectively 16-bit limbs),
making use of a base $2^8$ (respectively $2^{16}$).
Consequently, inputs of the Curve25519 function are seen as arrays of bytes.
Computations inside this function makes use of the 16-bit limbs representation.
Those are placed into 64-bits signed container in order to mitigate overflows or underflows.
typedef long long i64;
typedef i64 gf[16];
Notice this does not guaranty a unique representation of each number. i.e.\\
$\exists x,y \in$ \TNaCle{gf} such that
$$x \neq y\ \ \land\ \ x \equiv y \pmod{2^{255}-19}$$
On the other hand it allows simple definitions of addition (\texttt{A}),
substraction (\texttt{Z}), and school-book multiplication (\texttt{M}).
% and squaring (\texttt{S}).
sv A(gf o,const gf a,const gf b) {
int i;
FOR(i,16) o[i]=a[i]+b[i];
sv Z(gf o,const gf a,const gf b) {
int i;
FOR(i,16) o[i]=a[i]-b[i];
sv M(gf o,const gf a,const gf b) {
i64 i,j,t[31];
FOR(i,31) t[i]=0;
FOR(i,16) FOR(j,16) t[i+j]+=a[i]*b[j];
FOR(i,15) t[i]+=38*t[i+16];
FOR(i,16) o[i]=t[i];
To avoid overflows, carries are propagated by the \texttt{car25519} function.
sv car25519(gf o)
int i;
FOR(i,16) {
% In order to simplify the verification of this function,
% we extract the last step of the loop $i = 15$.
% \begin{lstlisting}[language=Ctweetnacl]
% sv car25519(gf o)
% {
% int i;
% i64 c;
% FOR(i,15) {
% o[(i+1)]+=o[i]>>16;
% o[i]&=0xffff;
% }
% o[0]+=38*(o[15]>>16);
% o[15]&=0xffff;
% }
% \end{lstlisting}
At the end of the Montgomery ladder, \texttt{inv25519} computes the inverse over \Zfield.
It uses Fermat's little theorem by the exponentiation to
$2^{255}-21$ with the Square-and-multiply algorithm.
% It takes advantage of the shape of the number by not doing the multiplications only twice.
The last step of \texttt{crypto\_scalarmult} is the packing of the limbs: an array of 32 bytes.
It first performs 3 carry propagations in order to guarantee
that each 16-bit limbs values are between $0$ and $2^{16}$.
Then computes a modulo reduction by $\p$ using iterative substraction and
conditional swapping. This guarantees a unique representation in $\Zfield$.
After which each 16-bit limbs are splitted into 8-bit limbs.
sv pack25519(u8 *o,const gf n)
int i,j;
i64 b;
gf t,m={0};
FOR(j,2) {
m[0]=t[0]- 0xffed;
for(i=1;i<15;i++) {
FOR(i,16) {
The full Montgomery ladder is defined as follow.
First extract and clamp the value of $n$. Then unpack the value of $p$.
Compute the Montgomery ladder over the clamped $n$ and $p$, pack the result into $q$.
int crypto_scalarmult(u8 *q,
const u8 *n,
const u8 *p)
u8 z[32];
i64 r;
int i;
gf x,a,b,c,d,e,f;
FOR(i,31) z[i]=n[i];
FOR(i,16) {
for(i=254;i>=0;--i) {
return 0;
\subsection{What needs to be proven?}
This diff is collapsed.
......@@ -325,3 +325,11 @@ note = {\url{
note = {\url{}}
title = {25519 naming, Posting to the CFRG mailing list},
author = {Daniel J. Bernstein},
year = {2008},
note = {\url{}}
Daniel J. Bernstein. 25519 naming. , 2014. . 2
This diff is collapsed.
XXX: TweetNaCl (find real-world use of TweetNaCl?) uses tweetnacl-js (as JS port of tweetnacl) for their webclient
Keybase client:
Zeromq a messaging queue system uses tweetnacl to provide portability to it's users.
``TweetNaCl is the
first cryptographic library that allows correct functionality to be verified
by auditors with reasonable effort''
......@@ -9,20 +13,30 @@ by auditors with reasonable effort''
XXX: One core component of TweetNaCl (and NaCl) is X25519, mention use
of X25519 in the wild.
Originally, the name ``Curve25519'' referred to this keyexchange protocol,
but Bernstein suggested to rename the scheme to X25519 and to use the name
Curve25519 for the underlying elliptic curve~\cite{Ber14}.
We make use of this notation in this paper.
TweetNaCl\cite{BGJ+15} is a compact reimplementation of the
NaCl\cite{BLS12} high security cryptographic library.
TweetNaCl~\cite{BGJ+15} is a compact reimplementation of the
NaCl~\cite{BLS12} high security cryptographic library.
It does not aim for high speed application and has been optimized for source
code compactness (100 tweets). It maintains some degree of readability in order
to be easily auditable.
This library makes use of Curve25519\cite{Ber06}, a function over a \F{p}-restricted
This library makes use of Curve25519~\cite{Ber06}, a function over a \F{p}-restricted
$x$-coordinate computing a scalar multiplication on $E(\F{p^2})$, where $p$ is
the prime number $\p$ and $E$ is the elliptic curve $y^2 = x^3 + 486662 x^2 + x$.
This function is used by a wide variety of applications\cite{this-that-use-curve25519}
This function is used by a wide variety of applications~\cite{this-that-use-curve25519}
to establish a shared secret over an insecure channel.
\subheading{Contribution of this paper}
\todo{Proof that TweetNaCl's X25519 code correctly implements math definition from 25519 paper}
\todo{State additional contributions, e.g., extension of EC framework by Bartiza and Strub.}
Implementing cryptographic primitives without any bugs is difficult.
While tests provides with code coverage, they still don't cover 100\% of the
possible input values. In order to get formal guaranties a software meets its
......@@ -30,38 +44,32 @@ specifications, two methodologies exist.
The first one is by synthesizing a formal specification and generating machine
code by refinment in order to get a software correct by construction.
This approach is being used in e.g. the B-method\cite{Abrial:1996:BAP:236705},
F* \cite{DBLP:journals/corr/BhargavanDFHPRR17}, or with Coq \cite{CpdtJFR}.
This approach is being used in e.g. the B-method~\cite{Abrial:1996:BAP:236705},
F*~\cite{DBLP:journals/corr/BhargavanDFHPRR17}, or with Coq~\cite{CpdtJFR}.
However this first method cannot be applied to an existing piece of software.
In such case we need to write the specifications and then verify the correctness
of the implementation.
\subheading{Contribution of this paper}
\todo{Proof that TweetNaCl's X25519 code correctly implements math definition from 25519 paper}
\todo{State additional contributions, e.g., extension of EC framework by Bartiza and Strub.}
\subheading{Our Formal Approach.}
Verifying an existing cryptographic library, we use the second approach.
Our method can be seen as a static analysis over the input values coupled
with the formal proof that the code of the algorithm matches its specification.
We use Coq\cite{coq-faq}, a formal system that allows us to machine-check our proofs.
A famous example of its use is the proof of the Four Color Theorem \cite{gonthier2008formal}.
The CompCert, a C compiler\cite{Leroy-backend} proven correct and sound is being build on top of it.
We use Coq~\cite{coq-faq}, a formal system that allows us to machine-check our proofs.
A famous example of its use is the proof of the Four Color Theorem~\cite{gonthier2008formal}.
The CompCert, a C~compiler~\cite{Leroy-backend} proven correct and sound is being build on top of it.
To prove its correctness, CompCert uses multiple intermediate languages. The first step of CompCert is done by the parser \textit{clightgen}.
It takes as input C code and generates its Clight\cite{Blazy-Leroy-Clight-09} translation.
It takes as input C code and generates its Clight~\cite{Blazy-Leroy-Clight-09} translation.
Using this intermediate representation Clight, we use the Verifiable Software Toolchain
(VST)\cite{2012-Appel}, a framework which uses Separation logic\cite{1969-Hoare,Reynolds02separationlogic}
(VST)~\cite{2012-Appel}, a framework which uses Separation logic~\cite{1969-Hoare,Reynolds02separationlogic}
and shows that the semantics of the program satisfies a functionnal specification in Coq.
VST steps through each line of Clight using a strongest post-condition strategy.
We write a specification of the crypto scalar multiplication of TweetNaCl and using
VST we prove that the code matches our definitions.
Bartzia and Strub wrote a formal library for elliptic curves\cite{DBLP:conf/itp/BartziaS14}.
Bartzia and Strub wrote a formal library for elliptic curves~\cite{DBLP:conf/itp/BartziaS14}.
We extend it to support Montgomery curves. With this formalization, we prove the
correctness of a generic Montgomery ladder and show that our specification is an instance of it.
......@@ -69,21 +77,21 @@ correctness of a generic Montgomery ladder and show that our specification is an
\todo{Separate verification of existing code from generating proof-carrying code.}
Similar approaches have been used to prove the correctness of OpenSSL HMAC~\cite{Beringer2015VerifiedCA}
Similar approaches have been used to prove the correctness of OpenSSL HMAC~\cite{Beringer2015VerifiedCA}
and SHA-256~\cite{2015-Appel}. Compared to their work
our approaches makes a complete link between the C implementation and the formal
mathematical definition of the group theory behind elliptic curves.
Using the synthesis approach, Zinzindohou{\'{e}} et al. wrote an verified extensible
library of elliptic curves\cite{Zinzindohoue2016AVE}. This served as ground work for the
cryptographic library HACL*\cite{zinzindohoue2017hacl} used in the NSS suite from Mozilla.
library of elliptic curves~\cite{Zinzindohoue2016AVE}. This served as ground work for the
cryptographic library HACL*~\cite{zinzindohoue2017hacl} used in the NSS suite from Mozilla.
Coq does not only allows verification but also synthesis.
Using correct-by-construction finite field arithmetic\cite{Philipoom2018CorrectbyconstructionFF}
one can synthesize\cite{Erbsen2016SystematicSO} certified elliptic curve
implementations\cite{Erbsen2017CraftingCE}. These implementation are now being used in BoringSSL\cite{fiat-crypto}.
Using correct-by-construction finite field arithmetic~\cite{Philipoom2018CorrectbyconstructionFF}
one can synthesize~\cite{Erbsen2016SystematicSO} certified elliptic curve
implementations~\cite{Erbsen2017CraftingCE}. These implementation are now being used in BoringSSL~\cite{fiat-crypto}.
Curve25519 implementations has already been under the scrutiny \cite{Chen2014VerifyingCS}.
Curve25519 implementations has already been under the scrutiny~\cite{Chen2014VerifyingCS}.
However in this paper we provide a proofs of correctness and soundness of a C program up to
the theory of elliptic curves.
......@@ -121,4 +129,3 @@ of in order to avoid unnecessary work.
% \end{itemize}
% Add comparison with Fiat-crypto
\section{Proving equivalence of X25519 in C and Coq}
In this section we describe techniques used to prove the equivalence between the
Clight description of TweetNaCl and Coq functions producing similar behaviors.
\subsection{Number Representation and C Implementation}
As described in Section \ref{sec:impl}, numbers in \TNaCle{gf} are represented
in base $2^{16}$ and we use a direct mapping to represent that array as a list
integers in Coq. However in order to show the correctness of the basic operations,
we need to convert this number as a full integer.
Let \Coqe{ZofList} : $\Z \rightarrow \texttt{list} \Z \rightarrow \Z$, a parametrized map by $n$ betwen a list $l$ and its
it's little endian representation with a base $2^n$.
We define it in Coq as:
Fixpoint ZofList {n:Z} (a:list Z) : Z :=
match a with
| [] => 0
| h :: q => h + (pow 2 n) * ZofList q
We define a notation where $n$ is $16$.
Notation "Z16.lst A" := (ZofList 16 A).
We also define a notation to do the modulo, projecting any numbers in $\Zfield$.
Notation "A :GF" := (A mod (2^255-19)).
Remark that this representation is different from \Coqe{Zmodp}.
However the equivalence between operations over $\Zfield$ and $\F{p}$ is easily proven.
Using these two definitions, we proved intermediates lemmas such as the correctness of the
multiplication \Coqe{M} where \Coqe{M} replicate the computations and steps done in C.
For all list of integers \texttt{a} and \texttt{b} of length 16 representing
$A$ and $B$ in $\Zfield$, the number represented in $\Zfield$ by the list \Coqe{(M a b)}
is equal to $A \times B \bmod \p$.
And seen in Coq as follows:
Lemma mult_GF_Zlength :
forall (a:list Z) (b:list Z),
Zlength a = 16 ->
Zlength b = 16 ->
(Z16.lst (M a b)) :GF =
(Z16.lst a * Z16.lst b) :GF.
\subsection{Inversions in \Zfield}
In a similar fashion we define a Coq version of the inversion mimicking
the behavior of \TNaCle{inv25519} over \Coqe{list Z}.
sv inv25519(gf o,const gf a)
gf c;
int i;
for(i=253;i>=0;i--) {
if(i!=2 && i!=4) M(c,c,a);
FOR(i,16) o[i]=c[i];
We specify this with 2 functions: a recursive \Coqe{pow_fn_rev} to to simulate the for loop and a simple
\Coqe{step_pow} containing the body. Note the off by one for the loop.
Definition step_pow (a:Z) (c g:list Z) : list Z :=
let c := Sq c in
if a <>? 1 && a <>? 3
then M c g
else c.
Function pow_fn_rev (a:Z) (b:Z) (c g: list Z)
{measure Z.to_nat a} : (list Z) :=
if a <=? 0
then c
let prev := pow_fn_rev (a - 1) b c g in
step_pow (b - 1 - a) prev g.
This \Coqe{Function} requires a proof of termination. It is done by proving the
Well-foundness of the decreasing argument: \Coqe{measure Z.to_nat a}. Calling
\Coqe{pow_fn_rev} 254 times allows us to reproduce the same behavior as the \texttt{Clight} definition.
Definition Inv25519 (x:list Z) : list Z :=
pow_fn_rev 254 254 x x.
Similarily we define the same function over $\Z$.
Definition step_pow_Z (a:Z) (c:Z) (g:Z) : Z :=
let c := c * c in
if a <>? 1 && a <>? 3
then c * g
else c.
Function pow_fn_rev_Z (a:Z) (b:Z) (c:Z) (g: Z)
{measure Z.to_nat a} : Z :=
if (a <=? 0)
then c
let prev := pow_fn_rev_Z (a - 1) b c g in
step_pow_Z (b - 1 - a) prev g.
Definition Inv25519_Z (x:Z) : Z :=
pow_fn_rev_Z 254 254 x x.
And prove their equivalence in $\Zfield$.
Lemma Inv25519_Z_GF : forall (g:list Z),
length g = 16 ->
(Z16.lst (Inv25519 g)) :GF =
(Inv25519_Z (Z16.lst g)) :GF.
In TweetNaCl, \TNaCle{inv25519} computes an inverse in $\Zfield$. It uses the
Fermat's little theorem by doing an exponentiation to $2^{255}-21$.
This is done by applying a square-and-multiply algorithm. The binary representation
of $p-2$ implies to always do a multiplications aside for bit 2 and 4, thus the if case.
To prove the correctness of the result we can use multiple strategies such as:
\item Proving it is special case of square-and-multiply algorithm applied to
a specific number and then show that this number is indeed $2^{255}-21$.
\item Unrolling the for loop step-by-step and applying the equalities
$x^a \times x^b = x^{(a+b)}$ and $(x^a)^2 = x^{(2 \times a)}$. We prove that
the resulting exponent is $2^{255}-21$.
We use the second method for the benefits of simplicity. However it requires to
apply the unrolling and exponentiation formulas 255 times. This can be automated
in Coq with tacticals such as \Coqe{repeat}, but it generates a proof object which
will take a long time to verify.
\subsection{Speeding up with Reflections}
In order to speed up the verification, we use a technique called reflection.
It provides us with flexibility such as we don't need to know the number of
times nor the order in which the lemmas needs to be applied (chapter 15 in \cite{CpdtJFR}).
The idea is to \textit{reflect} the goal into a decidable environment.
We show that for a property $P$, we can define a decidable boolean property
$P_{bool}$ such that if $P_{bool}$ is \Coqe{true} then $P$ holds.
$$reify\_P : P_{bool} = true \implies P$$
By applying $reify\_P$ on $P$ our goal become $P_{bool} = true$.
We then compute the result of $P_{bool}$. If the decision goes well we are
left with the tautology $true = true$.
To prove that the \Coqe{Inv25519_Z} is computing $x^{2^{255}-21}$,
we define a Domain Specific Language.
Let \Coqe{expr_inv} denote an expression which is either a term;
a multiplication of expressions; a squaring of an expression or a power of an expression.
And Let \Coqe{formula_inv} denote an equality between two expressions.
Inductive expr_inv :=
| R_inv : expr_inv
| M_inv : expr_inv -> expr_inv -> expr_inv
| S_inv : expr_inv -> expr_inv
| P_inv : expr_inv -> positive -> expr_inv.
Inductive formula_inv :=
| Eq_inv : expr_inv -> expr_inv -> formula_inv.
The denote functions are defined as follows:
Fixpoint e_inv_denote (m:expr_inv) : Z :=
match m with
| R_inv =>
| M_inv x y =>
(e_inv_denote x) * (e_inv_denote y)
| S_inv x =>
(e_inv_denote x) * (e_inv_denote x)
| P_inv x p =>
pow (e_inv_denote x) (Z.pos p)
Definition f_inv_denote (t : formula_inv) : Prop :=
match t with
| Eq_inv x y => e_inv_denote x = e_inv_denote y
All denote functions also take as an argument the environment containing the variables.
We do not show it here for the sake of readability.
Given that an environment, \Coqe{term_denote} returns the appropriate variable.
With such Domain Specific Language we have the equality between:
(Eq_inv (M_inv R_inv (S_inv R_inv))
(P_inv R_inv 3))
= (x * x^2 = x^3)
On the right side, \Coqe{(x * x^2 = x^3)} depends on $x$. On the left side,
\texttt{(Eq\_inv (M\_inv R\_inv (S\_inv R\_inv)) (P\_inv R\_inv 3))} does not depend on $x$.
This allows us to use computations in our decision precedure.
We define \Coqe{step_inv} and \Coqe{pow_inv} to mirror the behavior of
\Coqe{step_pow_Z} and respectively \Coqe{pow_fn_rev_Z} over our DSL and
we prove their equality.
Lemma step_inv_step_pow_eq :
forall (a:Z) (c:expr_inv) (g:expr_inv),
e_inv_denote (step_inv a c g) =
step_pow_Z a (e_inv_denote c) (e_inv_denote g).
Lemma pow_inv_pow_fn_rev_eq :
forall (a:Z) (b:Z) (c:expr_inv) (g:expr_inv),
e_inv_denote (pow_inv a b c g) =
pow_fn_rev_Z a b (e_inv_denote c) (e_inv_denote g).
We then derive the following lemma.
With an appropriate choice of variables,
\Coqe{pow_inv} denotes \Coqe{Inv25519_Z}.
In order to prove formulas in \Coqe{formula_inv},
we have the following a decidable procedure.
We define \Coqe{pow_expr_inv}, a function which returns the power of an expression.
We then compare the two values and decide over their equality.
Fixpoint pow_expr_inv (t:expr_inv) : Z :=
match t with
| R_inv => 1
(* power of a term is 1. *)
| M_inv x y =>
(pow_expr_inv x) + (pow_expr_inv y)
(* power of a multiplication is
the sum of the exponents. *)
| S_inv x =>
2 * (pow_expr_inv x)
(* power of a squaring is the double
of the exponent. *)
| P_inv x p =>
(Z.pos p) * (pow_expr_inv x)
(* power of a power is the multiplication
of the exponents. *)
Definition decide_e_inv (l1 l2:expr_inv) : bool :=
(pow_expr_inv l1) ==? (pow_expr_inv l2).
Definition decide_f_inv (f:formula_inv) : bool :=
match f with
| Eq_inv x y => decide_e_inv x y
We prove our decision procedure correct.
For all formulas $f$, if the decision over $f$ returns \Coqe{true},
then the denoted equality by $f$ is true.
Which is formalized as:
Lemma decide_formula_inv_impl :
forall (f:formula_inv),
decide_f_inv f = true ->
f_inv_denote f.
By reification to over DSL (lemma \ref{reify}) and by applying our decision (lemma \ref{decide}).
we proved the following theorem.
\Coqe{Inv25519_Z} computes an inverse in \Zfield.
Theorem Inv25519_Z_correct :
forall (x:Z),
Inv25519_Z x = pow x (2^255-21).
From \Coqe{Inv25519_Z_correct} and \Coqe{Inv25519_Z_GF}, we conclude the
functionnal correctness of the inversion over \Zfield.
\Coqe{Inv25519} computes an inverse in \Zfield.
Corollary Inv25519_Zpow_GF :
forall (g:list Z),
length g = 16 ->
Z16.lst (Inv25519 g) :GF =
(pow (Z16.lst g) (2^255-21)) :GF.
\subsection{Packing and other Applications of Reflection}
We prove the functional correctness of \Coqe{Inv25519} with reflections.
This technique can also be used where proofs requires some computing or a small and
finite domain of variable to test e.g. for all $i$ such that $0 \le i < 16$.
Using reflection we prove that we can split the for loop in \TNaCle{pack25519} into two parts.
for(i=1;i<15;i++) {
The first loop is computing the substraction while the second is applying the carrying.
for(i=1;i<15;i++) {
for(i=1;i<15;i++) {
This loop separation allows simpler proofs. The first loop is seen as the substraction of a number in \Zfield.
We then prove that with the iteration of the second loop, the number represented in \Zfield stays the same.
This leads to the proof that \TNaCle{pack25519} is effectively reducing mod $\p$ and returning a number in base $2^8$.
Lemma Pack25519_mod_25519 :
forall (l:list Z),
Zlength l = 16 ->
Forall (fun x => -2^62 < x < 2^62) l ->
ZofList 8 (Pack25519 l) = (Z16.lst l) mod (2^255-19).
......@@ -2,6 +2,7 @@
\subsection{The X25519 key exchange}