Commit 39d97c51 authored by Benoit Viguier

more text

parent dc9fe4e1
......@@ -7,156 +7,8 @@ when using VST.
\subsection{The Trusted Base}
Our proof relies on a trusted base, i.e., a foundation of specifications
and implementations that must stay correct with respect to their specifications.
One should not be able to prove a false statement in that system e.g. by proving
an inconsistency.
In our case we rely on:
\begin{itemize}
\item \textbf{Calculus of Inductive Constructions}. The intuitionistic logic
used by Coq must be consistent in order to trust the proofs. We additionally
assume functional extensionality as an axiom, which is consistent with that logic.
$$(\forall x, f(x) = g(x)) \implies f = g$$
\begin{lstlisting}[language=Coq]
Lemma f_ext: forall (A B:Type),
forall (f g: A -> B),
(forall x, f(x) = g(x)) -> f = g.
\end{lstlisting}
\item \textbf{Verifiable Software Toolchain}. This framework, developed at
Princeton, allows a user to prove that \texttt{Clight} code matches a pure Coq
specification. However, one must trust that the framework properly captures and
maps the Clight behavior to the basic pure Coq functions. At the beginning of
the project we found inconsistencies and reported them to the authors.
\item \textbf{CompCert}. The formally proven compiler. We trust that the Clight
model correctly captures the C standard. (VERIFY THIS, WHICH STANDARD ?).
Our proof also assumes that the TweetNaCl code will behave as expected if
compiled with CompCert. We do not provide guarantees for other C compilers
such as Clang or GCC.
\item \textbf{\texttt{clightgen}}. The tool making the translation from \textbf{C} to
\textbf{Clight}. It is the first step of the compilation.
VST does not support the direct verification of \texttt{o[i] = a[i] + b[i]}.
This required us to rewrite the lines into:
\begin{lstlisting}[language=C]
aux1 = a[i];
aux2 = b[i];
o[i] = aux1 + aux2;
\end{lstlisting}
The trust of the proof relied on the trust of a correct translation from the
initial version of \textit{TweetNaCl} to \textit{TweetNaclVerificable}.
While this problem is still present, the CompCert developers provided us with
the \texttt{-normalize} option for \texttt{clightgen}, which takes care of
generating auxiliary variables in order to automatically derive these steps.
The changes required to make C code verifiable are now minimal.
\item Last but not least, we must trust: the \textbf{Coq kernel} and its
associated libraries; the \textbf{OCaml compiler} with which we compiled Coq;
the \textbf{OCaml runtime} and the \textbf{CPU}. Those are common to all proofs
done with this architecture \cite{2015-Appel,coq-faq}.
\end{itemize}
\subsection{Using the Verifiable Software Toolchain}
The Verifiable Software Toolchain uses a strongest postcondition strategy.
The user must first write a formal specification in Coq of the function to be verified.
This should be as close as possible to the behavior of the C implementation.
This will simplify the proof and help with stepping through the Clight version of the software.
With the range of inputs defined, VST steps mechanically through each instruction
and asks the user to verify auxiliary goals such as array accesses being within bounds, or the absence of overflows/underflows.
We call this specification a low-level specification. A user will then have an easier
time proving that this low-level specification matches a simpler, higher-level one.
In order to further speed up the verification process, note that to
prove \VSTe{crypto_scalarmult}, a user only needs the specification of e.g. \VSTe{M}.
This provides multiple advantages: the verification by the Coq kernel can be done
in parallel and multiple users can work on proving different functions at the same time.
For the sake of completeness we proved all intermediate functions.
Memory aliasing is the next point a user should pay attention to. The way VST
deals with separation logic is similar to a consumer-producer problem.
A simple specification of \texttt{M(o,a,b)} will assume three distinct memory shares.
When called with three memory shares (\texttt{o, a, b}), all three of them will be consumed.
However, assuming this naive specification when \texttt{M(o,a,a)} is called (squaring),
the first two memory shares (\texttt{o, a}) are consumed and VST will expect a third
memory share, where the last \texttt{a} is pointing, which does not \textit{exist} anymore.
Examples of such cases are summarized in Fig.~\ref{tk:MemSame}.
\begin{figure}[h]
\include{tikz/memory_same_sh}
\caption{Aliasing and Separation Logic}
\label{tk:MemSame}
\end{figure}
This forces the user to either define multiple specifications for a single function
or state in the specification which aliasing case is being used.
For our specifications of functions with 3 arguments, hereafter named \texttt{o, a, b},
we define an additional parameter $k$ with values in
$\{0,1,2,3\}$:
\begin{itemize}
\item if $k=0$ then \texttt{o} and \texttt{a} are aliased.
\item if $k=1$ then \texttt{o} and \texttt{b} are aliased.
\item if $k=2$ then \texttt{a} and \texttt{b} are aliased.
\item else there is no aliasing.
\end{itemize}
This solution allows us to perform case analysis over the possible aliasing.
\subsection{Verifying \texttt{for} loops}
The final state of a \texttt{for} loop is usually computed by a simple recursive function.
However, we must define invariants which are true for each iteration.
Assume we want to prove a decreasing loop where indexes go from 3 to 0.
Define a function $g : \N \rightarrow State \rightarrow State $ which takes as input an integer for the index and a state, and returns a state.
It simulates the body of the \texttt{for} loop.
Assume its recursive call: $f : \N \rightarrow State \rightarrow State $, which iteratively applies $g$ with decreasing index:
\begin{equation*}
f ( i , s ) =
\begin{cases}
s & \text{if } i = 0 \\
f( i - 1 , g ( i - 1 , s )) & \text{otherwise}
\end{cases}
\end{equation*}
Then we have:
\begin{align*}
f(4,s) &= g(0,g(1,g(2,g(3,s))))
% \\
% f(3,s) &= g(0,g(1,g(2,s)))
\end{align*}
To prove the correctness of $f(4,s)$, we need to prove that the intermediate steps
$g(3,s)$; $g(2,g(3,s))$; $g(1,g(2,g(3,s)))$; $g(0,g(1,g(2,g(3,s))))$ are correct.
Due to the computation order of recursive functions, our loop invariant for $i\in\{0;1;2;3;4\}$ cannot use $f(i)$.
To solve this, we define an auxiliary function with an accumulator such that given $i\in\{0;1;2;3;4\}$, it will compute the first $i$ steps of the loop.
We then prove that for the complete number of steps, the functions with and without the accumulator return the same result.
We formalized this result in a generic way as follows:
\begin{Coq}
Variable T : Type.
Variable g : nat -> T -> T.
Fixpoint rec_fn (n:nat) (s:T) :=
match n with
| 0 => s
| S n => rec_fn n (g n s)
end.
Fixpoint rec_fn_rev_acc (n:nat) (m:nat) (s:T) :=
match n with
| 0 => s
| S n => g (m - n - 1) (rec_fn_rev_acc n m s)
end.
Definition rec_fn_rev (n:nat) (s:T) :=
rec_fn_rev_acc n n s.
Lemma Tail_Head_equiv :
forall (n:nat) (s:T),
rec_fn n s = rec_fn_rev n s.
\end{Coq}
Using this formalization, we prove that the 255 steps of the Montgomery ladder in C perform the same computations as the ones defined in Algorithm \ref{montgomery-double-add}.
\subsection{Time and Space Complexity}
......
\section{Conclusion}
\label{Conclusion}
\subsection{TCB of the proof}
Any formal system relies on a trusted base. In this section we describe our
chain of trust.
\todo{Explain and compare to, e.g., F*}
\subsection{Trusted Core Base of the proof}
Our proof relies on a trusted base, i.e., a foundation of specifications
and implementations that must stay correct with respect to their specifications.
One should not be able to prove a false statement in that system e.g. by proving
an inconsistency.
In our case we rely on:
\begin{itemize}
\item \textbf{Calculus of Inductive Constructions}. The intuitionistic logic
used by Coq must be consistent in order to trust the proofs. We additionally
assume functional extensionality as an axiom, which is consistent with that logic.
$$(\forall x, f(x) = g(x)) \implies f = g$$
\begin{lstlisting}[language=Coq]
Lemma f_ext: forall (A B:Type),
forall (f g: A -> B),
(forall x, f(x) = g(x)) -> f = g.
\end{lstlisting}
\item \textbf{Verifiable Software Toolchain}. This framework, developed at
Princeton, allows a user to prove that \texttt{Clight} code matches a pure Coq
specification. However, one must trust that the framework properly captures and
maps the Clight behavior to the basic pure Coq functions. At the beginning of
the project we found inconsistencies and reported them to the authors.
\item \textbf{CompCert}. The formally proven compiler. We trust that the Clight
model correctly captures the C standard. (VERIFY THIS, WHICH STANDARD ?).
Our proof also assumes that the TweetNaCl code will behave as expected if
compiled with CompCert. We do not provide guarantees for other C compilers
such as Clang or GCC.
\item \textbf{\texttt{clightgen}}. The tool making the translation from \textbf{C} to
\textbf{Clight}. It is the first step of the compilation.
VST does not support the direct verification of \texttt{o[i] = a[i] + b[i]}.
This required us to rewrite the lines into:
\begin{lstlisting}[language=C]
aux1 = a[i];
aux2 = b[i];
o[i] = aux1 + aux2;
\end{lstlisting}
The trust of the proof relied on the trust of a correct translation from the
initial version of \textit{TweetNaCl} to \textit{TweetNaclVerificable}.
While this problem is still present, the CompCert developers provided us with
the \texttt{-normalize} option for \texttt{clightgen}, which takes care of
generating auxiliary variables in order to automatically derive these steps.
The changes required to make C code verifiable are now minimal.
\item Last but not least, we must trust: the \textbf{Coq kernel} and its
associated libraries; the \textbf{OCaml compiler} with which we compiled Coq;
the \textbf{OCaml runtime} and the \textbf{CPU}. Those are common to all proofs
done with this architecture \cite{2015-Appel,coq-faq}.
\end{itemize}
\subsection{Proof toolchain}
......@@ -3,9 +3,8 @@ for network communication, encryption, decryption, signatures, etc.
TweetNaCl~\cite{BGJ+15} is its compact reimplementation.
It does not aim for high speed application and has been optimized for source
code compactness (100 tweets). It maintains some degree of readability in order
to be easily auditable.
TweetNaCl is being used by ZeroMQ~\cite{zmq} messaging queue system to provide
portability to its users.
to be easily auditable. For example TweetNaCl is being used by ZeroMQ~\cite{zmq}
messaging queue system to provide portability to its users.
% ``TweetNaCl is the first cryptographic library that allows correct functionality
% to be verified by auditors with reasonable effort''~\cite{BGJ+15}
......@@ -15,14 +14,14 @@ portability to its users.
One core component of TweetNaCl (and NaCl) is the key exchange protocol X25519~\cite{rfc7748}.
This protocol is being used by a wide variety of applications~\cite{this-that-use-curve25519}
such as SSH~\cite{rfc4253}, Signal Protocol, Tor, Zcash, TLS to establish a shared secret over
such as SSH, Signal Protocol, Tor, Zcash, TLS to establish a shared secret over
an insecure channel.
This library makes use of Curve25519~\cite{Ber06}, a function over a \F{p}-restricted
$x$-coordinate computing a scalar multiplication on $E(\F{p^2})$, where $p$ is
the prime number $\p$ and $E$ is the elliptic curve $y^2 = x^3 + 486662 x^2 + x$.
Originally, the name ``Curve25519'' referred to this keyexchange protocol,
Originally, the name ``Curve25519'' referred to this key exchange protocol,
but Bernstein suggested to rename the scheme to X25519 and to use the name
Curve25519 for the underlying elliptic curve~\cite{Ber14}.
We make use of this notation in this paper.
......@@ -31,7 +30,7 @@ We make use of this notation in this paper.
We provide a mechanized formal proof of the correctness of X25519
implementation in TweetNaCl. This is done by first extending the formal library for
elliptic curves~\cite{DBLP:conf/itp/BartziaS14} of Bartzia and Strub to
support Curve25519. Then proving that the code correctly implements the definitions
support Curve25519. Then we prove that the code correctly implements the definitions
from the original paper by Bernstein~\cite{Ber14}.
% Implementing cryptographic primitives without any bugs is difficult.
......@@ -55,9 +54,10 @@ Our method can be seen as a static analysis over the input values coupled
with the formal proof that the code of the algorithm matches its specification.
We use Coq~\cite{coq-faq}, a formal system that allows us to machine-check our proofs.
A famous example of its use is the proof of the Four Color Theorem~\cite{gonthier2008formal}.
The CompCert, a C~compiler~\cite{Leroy-backend} proven correct and sound is being build on top of it.
To prove its correctness, CompCert uses multiple intermediate languages. The first step of CompCert is done by the parser \textit{clightgen}.
Famous examples of its use include the proof of the Four Color Theorem~\cite{gonthier2008formal} and
CompCert, a C~compiler~\cite{Leroy-backend} which is proven correct and sound by being built on top of it.
In its proof, CompCert uses multiple intermediate languages and shows equivalence between them.
The first step of the compilation by CompCert is done by the parser \textit{clightgen}.
It takes as input C code and generates its Clight~\cite{Blazy-Leroy-Clight-09} translation.
Using this intermediate representation Clight, we use the Verifiable Software Toolchain
......
......@@ -4,6 +4,55 @@
In this section we describe techniques used to prove the equivalence between the
Clight description of TweetNaCl and Coq functions producing similar behaviors.
\todo{SUBSECTION?}
Verifying \texttt{crypto\_scalarmult} also requires verifying all the functions
subsequently called: \texttt{unpack25519}; \texttt{A}; \texttt{Z}; \texttt{M};
\texttt{S}; \texttt{car25519}; \texttt{inv25519}; \texttt{set25519}; \texttt{sel25519};
\texttt{pack25519}.
We prove that the implementation of Curve25519 is \textbf{sound} \ie
\begin{itemize}
\item absence of out-of-bounds array accesses (memory safety).
\item absence of overflows/underflows in the arithmetic.
\end{itemize}
We also prove that TweetNaCl's code is \textbf{correct}:
\begin{itemize}
\item Curve25519 is correctly implemented (we get what we expect).
\item Operations on \texttt{gf} (\texttt{A}, \texttt{Z}, \texttt{M}, \texttt{S})
are equivalent to operations ($+,-,\times,x^2$) in \Zfield.
\item The Montgomery ladder does compute a scalar multiplication between a natural number and a point.
\end{itemize}
In order to prove the soundness and correctness of \texttt{crypto\_scalarmult},
we first create a skeleton of the Montgomery ladder with abstract operations which
can be instantiated over lists, integers, field elements...
A high-level specification (over a generic field $\K$) allows us to prove the
correctness of the ladder with respect to the theory of elliptic curves.
This high-level specification does not rely on the parameters of Curve25519.
By instantiating $\K$ with $\Zfield$, and the parameters of Curve25519 ($a = 486662, b = 1$),
we define a middle-level specification.
Additionally, we provide a low-level specification close to the \texttt{C} code
(over lists of $\Z$). We show this specification to be equivalent to the
\textit{semantic version} of C (\texttt{Clight}) with VST.
This low-level specification gives us the soundness assurance.
By showing that operations over instances ($\K = \Zfield$, $\Z$, list of $\Z$) are
equivalent, we bridge the gap between the low-level and the high-level specification
with Curve25519 parameters.
As such, we prove all specifications to be equivalent (Fig.~\ref{tk:ProofStructure}).
This guarantees the correctness of the implementation.
\begin{figure}[h]
\include{tikz/specifications}
\caption{Structural construction of the proof}
\label{tk:ProofStructure}
\end{figure}
\subsection{Correctness Specification}
We show the soundness of TweetNaCl by proving the following specification matches a pure Coq function.
......@@ -106,7 +155,7 @@ Theorem Crypto_Scalarmult_Correct:
\subsection{Number Representation and C Implementation}
As described in Section \ref{sec:impl}, numbers in \TNaCle{gf} are represented
As described in Section \ref{preliminaries:B}, numbers in \TNaCle{gf} are represented
in base $2^{16}$ and we use a direct mapping to represent that array as a list of
integers in Coq. However, in order to show the correctness of the basic operations,
we need to convert this number into a full integer.
......@@ -152,7 +201,7 @@ Lemma mult_GF_Zlength :
\subsection{Inversions in \Zfield}
In a similar fashion we define a Coq version of the inversion mimicking
We define a Coq version of the inversion mimicking
the behavior of \TNaCle{inv25519} over \Coqe{list Z}.
\begin{lstlisting}[language=Ctweetnacl]
sv inv25519(gf o,const gf a)
......@@ -186,7 +235,7 @@ Function pow_fn_rev (a:Z) (b:Z) (c g: list Z)
\end{lstlisting}
This \Coqe{Function} requires a proof of termination. It is done by proving the
Well-foundness of the decreasing argument: \Coqe{measure Z.to_nat a}. Calling
well-foundedness of the decreasing argument: \Coqe{measure Z.to_nat a}. Calling
\Coqe{pow_fn_rev} 254 times allows us to reproduce the same behavior as the \texttt{Clight} definition.
\begin{lstlisting}[language=Coq]
Definition Inv25519 (x:list Z) : list Z :=
......@@ -375,7 +424,7 @@ Theorem Inv25519_Z_correct :
Inv25519_Z x = pow x (2^255-21).
\end{Coq}
From \Coqe{Inv25519_Z_correct} and \Coqe{Inv25519_Z_GF}, we conclude the
From \Coqe{Inv25519_Z_GF} and \Coqe{Inv25519_Z_correct}, we conclude the
functional correctness of the inversion over \Zfield.
\begin{corollary}
\Coqe{Inv25519} computes an inverse in \Zfield.
......@@ -411,7 +460,7 @@ for(i=1;i<15;i++) {
}
\end{lstlisting}
This loop separation allows simpler proofs. The first loop is seen as the subtraction of a number in \Zfield.
We then prove that with the iteration of the second loop, the number represented in \Zfield stays the same.
We then prove that with the iteration of the second loop, the number represented in $\Zfield$ stays the same.
This leads to the proof that \TNaCle{pack25519} is effectively reducing mod $\p$ and returning a number in base $2^8$.
\begin{Coq}
......@@ -421,3 +470,102 @@ Zlength l = 16 ->
Forall (fun x => -2^62 < x < 2^62) l ->
ZofList 8 (Pack25519 l) = (Z16.lst l) mod (2^255-19).
\end{Coq}
\subsection{Using the Verifiable Software Toolchain}
The Verifiable Software Toolchain uses a strongest postcondition strategy.
The user must first write a formal specification in Coq of the function to be verified.
This should be as close as possible to the behavior of the C implementation.
This will simplify the proof and help with stepping through the Clight version of the software.
With the range of inputs defined, VST steps mechanically through each instruction
and asks the user to verify auxiliary goals such as array accesses being within bounds, or the absence of overflows/underflows.
We call this specification a low-level specification. A user will then have an easier
time proving that this low-level specification matches a simpler, higher-level one.
In order to further speed up the verification process, note that to
prove \VSTe{crypto_scalarmult}, a user only needs the specification of e.g. \VSTe{M}.
This provides multiple advantages: the verification by the Coq kernel can be done
in parallel and multiple users can work on proving different functions at the same time.
For the sake of completeness we proved all intermediate functions.
Memory aliasing is the next point a user should pay attention to. The way VST
deals with separation logic is similar to a consumer-producer problem.
A simple specification of \texttt{M(o,a,b)} will assume three distinct memory shares.
When called with three memory shares (\texttt{o, a, b}), all three of them will be consumed.
However, assuming this naive specification when \texttt{M(o,a,a)} is called (squaring),
the first two memory shares (\texttt{o, a}) are consumed and VST will expect a third
memory share, where the last \texttt{a} is pointing, which does not \textit{exist} anymore.
Examples of such cases are summarized in Fig.~\ref{tk:MemSame}.
\begin{figure}[h]
\include{tikz/memory_same_sh}
\caption{Aliasing and Separation Logic}
\label{tk:MemSame}
\end{figure}
This forces the user to either define multiple specifications for a single function
or state in the specification which aliasing case is being used.
For our specifications of functions with 3 arguments, hereafter named \texttt{o, a, b},
we define an additional parameter $k$ with values in
$\{0,1,2,3\}$:
\begin{itemize}
\item if $k=0$ then \texttt{o} and \texttt{a} are aliased.
\item if $k=1$ then \texttt{o} and \texttt{b} are aliased.
\item if $k=2$ then \texttt{a} and \texttt{b} are aliased.
\item else there is no aliasing.
\end{itemize}
This solution allows us to perform case analysis over the possible aliasing.
\subsection{Verifying \texttt{for} loops}
The final state of a \texttt{for} loop is usually computed by a simple recursive function.
However, we must define invariants which are true for each iteration.
Assume we want to prove a decreasing loop where indexes go from 3 to 0.
Define a function $g : \N \rightarrow State \rightarrow State $ which takes as input an integer for the index and a state, and returns a state.
It simulates the body of the \texttt{for} loop.
Assume its recursive call: $f : \N \rightarrow State \rightarrow State $, which iteratively applies $g$ with decreasing index:
\begin{equation*}
f ( i , s ) =
\begin{cases}
s & \text{if } i = 0 \\
f( i - 1 , g ( i - 1 , s )) & \text{otherwise}
\end{cases}
\end{equation*}
Then we have:
\begin{align*}
f(4,s) &= g(0,g(1,g(2,g(3,s))))
% \\
% f(3,s) &= g(0,g(1,g(2,s)))
\end{align*}
To prove the correctness of $f(4,s)$, we need to prove that the intermediate steps
$g(3,s)$; $g(2,g(3,s))$; $g(1,g(2,g(3,s)))$; $g(0,g(1,g(2,g(3,s))))$ are correct.
Due to the computation order of recursive functions, our loop invariant for $i\in\{0;1;2;3;4\}$ cannot use $f(i)$.
To solve this, we define an auxiliary function with an accumulator such that given $i\in\{0;1;2;3;4\}$, it will compute the first $i$ steps of the loop.
We then prove that for the complete number of steps, the functions with and without the accumulator return the same result.
We formalized this result in a generic way as follows:
\begin{Coq}
Variable T : Type.
Variable g : nat -> T -> T.
Fixpoint rec_fn (n:nat) (s:T) :=
match n with
| 0 => s
| S n => rec_fn n (g n s)
end.
Fixpoint rec_fn_rev_acc (n:nat) (m:nat) (s:T) :=
match n with
| 0 => s
| S n => g (m - n - 1) (rec_fn_rev_acc n m s)
end.
Definition rec_fn_rev (n:nat) (s:T) :=
rec_fn_rev_acc n n s.
Lemma Tail_Head_equiv :
forall (n:nat) (s:T),
rec_fn n s = rec_fn_rev n s.
\end{Coq}
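As a worked illustration of these definitions (an unfolding by hand, not part of the formal development), fix $m = 4$ and write $r_i$ for \Coqe{rec_fn_rev_acc i 4 s}; the name $r_i$ is a shorthand introduced only for this example.
Unfolding the definition gives:
\begin{align*}
r_0 &= s \\
r_1 &= g(3, r_0) = g(3,s) \\
r_2 &= g(2, r_1) = g(2,g(3,s)) \\
r_3 &= g(1, r_2) = g(1,g(2,g(3,s))) \\
r_4 &= g(0, r_3) = g(0,g(1,g(2,g(3,s))))
\end{align*}
Each $r_i$ is the state after the first $i$ iterations, which is the form a loop invariant needs, and $r_4$ coincides with $f(4,s)$, as stated by \Coqe{Tail_Head_equiv} for $n = 4$.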
Using this formalization, we prove that the 255 steps of the Montgomery ladder in C perform the same computations as the ones defined in Algorithm \ref{montgomery-double-add}.
......@@ -206,8 +206,8 @@
% \newcommand{\PowMul}{{\text{\sf PowMul}}}
% \newcommand{\PowMulPsi}{{\text{\sf PowMul}_{\psi}}}
\newcommand{\ie}{{\it i.e.}\;}
\newcommand{\eg}{{\it e.g.}\;}
\newcommand{\ie}{{\textit{i.e.}}\;}
\newcommand{\eg}{{\textit{e.g.}}\;}
\newcommand{\p}{\ensuremath{2^{255}-19}}
\newcommand{\Zfield}{\ensuremath{\mathbb{Z}_{2^{255}-19}}}
\newcommand{\Ffield}{\ensuremath{\mathbb{F}_{2^{255}-19}}}
\newcommand{\Zfield}{\ensuremath{\mathbb{Z}_{\p}}}
\newcommand{\Ffield}{\ensuremath{\mathbb{F}_{\p}}}
\section{Preliminaries}
\label{preliminaries}
In this section, we describe X25519 and its TweetNaCl implementation.
We then provide a brief description of the formal tools we use in our proofs.
\subsection{The X25519 key exchange}
\label{preliminaries:A}
% \begin{definition}
% Let $E$ be the elliptic curve defined by $y^2 = x^3 + 486662 x^2 + x$.
% \end{definition}
% \begin{lemma}
% For any value $x \in \F{p}$, for the elliptic curve $E$ over $\F{p^2}$
% defined by $y^2 = x^3 + 486662 x^2 + x$, there exist a point $P$ over $E(\F{p^2})$
% \end{lemma}
For any value $x \in \F{p}$, for the elliptic curve $E$ over $\F{p^2}$
defined by $y^2 = x^3 + 486662 x^2 + x$, there exists a point $P$ over $E(\F{p^2})$
such that $x$ is the $x$-coordinate of $P$. Remark that $x$ is also the $x$-coordinate of $-P$.
\F{p}, \F{p^2}, Coordinates, X-coordinates in \F{p}.
Given a natural number $n$ and $x$, X25519 returns the $x$-coordinate of the
scalar multiplication of $P$ by $n$. Note that the result is the same with $-P$.
XXX: math definition from Curve25519 paper.
Using X25519, RFC~7748~\cite{rfc7748} formalized a Diffie–Hellman key-exchange algorithm.
Each party generates a secret random number $S_a$ (respectively $S_b$), and computes $P_a$ (respectively $P_b$),
the $x$-coordinate of the scalar multiplication of the base point, where $x = 9$, by $S_a$ (respectively $S_b$).
The parties exchange $P_a$ and $P_b$ and compute their shared secret with X25519
over $S_a$ and $P_b$ (respectively $S_b$ and $P_a$).
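The agreement of both parties can be sketched as follows, writing $n \cdot P$ for scalar multiplication, $x(P)$ for the $x$-coordinate of a point, and $B$ for the base point with $x(B) = 9$. This notation is assumed here for illustration only, and the clamping of the scalars is omitted.
\begin{align*}
P_a &= \text{X25519}(S_a, 9) = x(S_a \cdot B), \qquad P_b = \text{X25519}(S_b, 9) = x(S_b \cdot B) \\
\text{X25519}(S_a, P_b) &= x\big(S_a \cdot (S_b \cdot B)\big) = x\big((S_a S_b) \cdot B\big) = x\big(S_b \cdot (S_a \cdot B)\big) = \text{X25519}(S_b, P_a)
\end{align*}
Because a point and its opposite share the same $x$-coordinate, it does not matter which of the two points with $x$-coordinate $P_b$ (respectively $P_a$) is used.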
\subsection{X25519 in TweetNaCl}
\label{preliminaries:B}
\subheading{Arithmetic in \Ffield}
Given a natural number $n$ and a value $x \in \F{p}$, X25519 is a function over a $\F{p}$-restricted
$x$-coordinate computing a scalar multiplication on $E(\F{p^2})$.
As a result of this restriction, all computations are done over $\F{p}$.
\subheading{Arithmetic in \Ffield} In X25519, all computations are done over $\F{p}$.
Numbers in that field can be represented with 256 bits.
We represent them in 8-bit limbs (respectively 16-bit limbs),
making use of a base $2^8$ (respectively $2^{16}$).
Consequently, inputs of the X25519 function are seen as arrays of bytes.
Consequently, inputs of the X25519 function are seen as arrays of bytes
in a little-endian format.
Computations inside this function make use of the 16-bit limb representation.
Those limbs are placed into 64-bit signed containers in order to mitigate overflows or underflows.
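Concretely, and as a sketch in which the names $b_i$ and $a_i$ are introduced only for illustration, a little-endian byte array $b_0,\dots,b_{31}$ and a limb array $a_0,\dots,a_{15}$ denote the integers
\begin{equation*}
\sum_{i=0}^{31} b_i \cdot 2^{8i}
\qquad\text{and}\qquad
\sum_{i=0}^{15} a_i \cdot 2^{16i}
\end{equation*}
respectively. Since each $a_i$ is stored in a signed 64-bit container, a limb may temporarily lie outside $[0, 2^{16})$ during a computation; the sum above still defines the integer it represents.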
\begin{lstlisting}[language=Ctweetnacl]
......@@ -79,7 +100,7 @@ substraction (\texttt{Z}), and school-book multiplication (\texttt{M}).
% }
% \end{lstlisting}
Inverse in \Zfield are computed with \texttt{inv25519}.
Inverses in $\Zfield$ are computed with \texttt{inv25519}.
It computes the exponentiation by $2^{255}-21$ with the square-and-multiply algorithm.
Correctness follows from Fermat's little theorem.
Notice that in this case the inverse of $0$ is defined as $0$.
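The argument can be spelled out as follows (a short derivation added for illustration, with $p = \p$). For $x \not\equiv 0 \pmod{p}$, Fermat's little theorem gives $x^{p-1} \equiv 1 \pmod{p}$, hence
\begin{equation*}
x \cdot x^{p-2} \equiv 1 \pmod{p}, \qquad p - 2 = 2^{255} - 21,
\end{equation*}
so $x^{2^{255}-21}$ is the inverse of $x$, while for $x = 0$ the same exponentiation yields $0$.
The exponent has the binary expansion
\begin{equation*}
2^{255} - 21 = (2^{255} - 1) - 2^2 - 2^4 = \sum_{i \in \{0,\dots,254\} \setminus \{2,4\}} 2^i ,
\end{equation*}
so a square-and-multiply loop over the bits of the exponent performs a multiplication at every bit position except positions 2 and 4.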
......@@ -123,8 +144,14 @@ substraction (\texttt{Z}), and school-book multiplication (\texttt{M}).
}
\end{lstlisting}
\subheading{The Montgomery ladder} The full ladder is defined as follow.
\subheading{The Montgomery ladder}
\todo{How do we describe the ladder here ? Do we use the description by Timmy ?
Do we use the simpler description in the RFC ?}
The \strikethrough{full ladder} is defined as follows.
First extract and clamp the value of $n$. Then unpack the value of $p$.
As per RFC~7748~\cite{rfc7748}, set its most significant bit to 0.
Compute the Montgomery ladder over the clamped $n$ and $p$, pack the result into $q$.
\begin{lstlisting}[language=Ctweetnacl]
int crypto_scalarmult(u8 *q,
......@@ -177,47 +204,12 @@ substraction (\texttt{Z}), and school-book multiplication (\texttt{M}).
\end{lstlisting}
\subsection{Coq and VST}
\label{preliminaries:C}
\todo{Describe Coq}
\todo{Describe VST}
\todo{Describe Hoare Logic}
Verifying \texttt{crypto\_scalarmult} also requires verifying all the functions
subsequently called: \texttt{unpack25519}; \texttt{A}; \texttt{Z}; \texttt{M};
\texttt{S}; \texttt{car25519}; \texttt{inv25519}; \texttt{set25519}; \texttt{sel25519};
\texttt{pack25519}.
We prove that the implementation of Curve25519 is \textbf{sound} \ie
\begin{itemize}
\item absence of out-of-bounds array accesses (memory safety).
\item absence of overflows/underflows in the arithmetic.
\end{itemize}
We also prove that TweetNaCl's code is \textbf{correct}:
\begin{itemize}
\item Curve25519 is correctly implemented (we get what we expect).
\item Operations on \texttt{gf} (\texttt{A}, \texttt{Z}, \texttt{M}, \texttt{S})
are equivalent to operations ($+,-,\times,x^2$) in \Zfield.
\item The Montgomery ladder does compute a scalar multiplication between a natural number and a point.
\end{itemize}
In order to prove the soundness and correctness of \texttt{crypto\_scalarmult},
we first create a skeleton of the Montgomery ladder with abstract operations which
can be instantiated over lists, integers, field elements...
A high-level specification (over a generic field $\K$) allows us to prove the
correctness of the ladder with respect to the theory of elliptic curves.
This high-level specification does not rely on the parameters of Curve25519.
By instantiating $\K$ with $\Zfield$, and the parameters of Curve25519 ($a = 486662, b = 1$),
we define a middle-level specification.
Additionally, we provide a low-level specification close to the \texttt{C} code
(over lists of $\Z$). We show this specification to be equivalent to the
\textit{semantic version} of C (\texttt{Clight}) with VST.
This low-level specification gives us the soundness assurance.
By showing that operations over instances ($\K = \Zfield$, $\Z$, list of $\Z$) are
equivalent, we bridge the gap between the low-level and the high-level specification
with Curve25519 parameters.
As such, we prove all specifications to be equivalent (Fig.~\ref{tk:ProofStructure}).
This guarantees the correctness of the implementation.
\begin{figure}[h]
\include{tikz/specifications}
\caption{Structural construction of the proof}
\label{tk:ProofStructure}
\end{figure}
\todo{Describe Separation Logic}