\subsection{Applying the Verifiable Software Toolchain}
\label{subsec:with-VST}
\begin{sloppypar}
We now turn our focus to the formal specification of \TNaCle{crypto_scalarmult}.
We use our definition of X25519 from the RFC in the Hoare triple and prove
its correctness.
\end{sloppypar}
\subheading{Specifications.}
We show the soundness of TweetNaCl by proving a correspondence between
the C version of TweetNaCl and a pure function in Coq formalizing the RFC.
% why "pure" ?
% A pure function is a function where the return value is only determined by its
% input values, without observable side effects (Side effect are e.g. printing)
This defines the equivalence between the Clight representation and our Coq
definition of the ladder (\coqe{RFC}).
\begin{lstlisting}[language=CoqVST]
Definition crypto_scalarmult_spec :=
DECLARE _crypto_scalarmult_curve25519_tweet
WITH
v_q: val, v_n: val, v_p: val, c121665:val,
sh : share,
q : list val, n : list Z, p : list Z
(*------------------------------------------*)
PRE [ _q OF (tptr tuchar),
_n OF (tptr tuchar),
_p OF (tptr tuchar) ]
PROP (writable_share sh;
Forall (fun x => 0 <= x < 2^8) p;
Forall (fun x => 0 <= x < 2^8) n;
Zlength q = 32; Zlength n = 32;
Zlength p = 32)
LOCAL(temp _q v_q; temp _n v_n; temp _p v_p;
gvar __121665 c121665)
SEP (sh [{ v_q }] <<(uch32)-- q;
sh [{ v_n }] <<(uch32)-- mVI n;
sh [{ v_p }] <<(uch32)-- mVI p;
Ews [{ c121665 }] <<(lg16)-- mVI64 c_121665)
(*------------------------------------------*)
POST [ tint ]
PROP (Forall (fun x => 0 <= x < 2^8) (RFC n p);
Zlength (RFC n p) = 32)
LOCAL(temp ret_temp (Vint Int.zero))
SEP (sh [{ v_q }] <<(uch32)-- mVI (RFC n p);
sh [{ v_n }] <<(uch32)-- mVI n;
sh [{ v_p }] <<(uch32)-- mVI p;
Ews [{ c121665 }] <<(lg16)-- mVI64 c_121665
\end{lstlisting}
In this specification we state preconditions like:
\begin{itemize}
\item[] \VSTe{PRE}: \VSTe{_p OF (tptr tuchar)}\\
The function \TNaCle{crypto_scalarmult} takes as input three pointers to
arrays of unsigned bytes (\VSTe{tptr tuchar}) \VSTe{_p}, \VSTe{_q} and \VSTe{_n}.
\item[] \VSTe{LOCAL}: \VSTe{temp _p v_p}\\
Each pointer represent an address \VSTe{v_p},
\VSTe{v_q} and \VSTe{v_n}.
\item[] \VSTe{SEP}: \VSTe{sh [{ v_p $\!\!\}\!\!]\!\!\!$ <<(uch32)-- mVI p}\\
In the memory share \texttt{sh}, the address \VSTe{v_p} points
to a list of integer values \VSTe{mVI p}.
\item[] \VSTe{Ews [{ c121665 $\!\!\}\!\!]\!$ <<(lg16)-- mVI64 c_121665}\\
In the global memory share \texttt{Ews}, the address \VSTe{c121665} points
to a list of 16 64-bit integer values corresponding to $a/4 = 121665$.
\item[] \VSTe{PROP}: \VSTe{Forall (fun x => 0 <= x < 2^8) p}\\
In order to consider all the possible inputs, we assume each
element of the list \texttt{p} to be bounded by $0$ included and $2^8$
excluded.
\item[] \VSTe{PROP}: \VSTe{Zlength p = 32}\\
We also assume that the length of the list \texttt{p} is 32. This defines the
complete representation of \TNaCle{u8[32]}.
\end{itemize}
As postcondition we have conditions like:
\begin{itemize}
\item[] \VSTe{POST}: \VSTe{tint}\\
The function \TNaCle{crypto_scalarmult} returns an integer.
\item[] \VSTe{LOCAL}: \VSTe{temp ret_temp (Vint Int.zero)}\\
The returned integer has value $0$.
\item[] \VSTe{SEP}: \VSTe{sh [{ v_q $\!\!\}\!\!]\!\!\!$ <<(uch32)-- mVI (RFC n p)}\\
In the memory share \texttt{sh}, the address \VSTe{v_q} points
to a list of integer values \VSTe{mVI (RFC n p)} where \VSTe{RFC n p} is the
result of the \TNaCle{crypto_scalarmult} of \VSTe{n} and \VSTe{p}.
\item[] \VSTe{PROP}: \VSTe{Forall (fun x => 0 <= x < 2^8) (RFC n p)}\\
\VSTe{PROP}: \VSTe{Zlength (RFC n p) = 32}\\
We show that the computation for \VSTe{RFC} fits in \TNaCle{u8[32]}.
\end{itemize}
\TNaCle{crypto_scalarmult} computes the same result as \VSTe{RFC}
in Coq provided that inputs are within their respective bounds: arrays of 32 bytes.
The correctness of this specification is formally proven in Coq as
\coqe{Theorem body_crypto_scalarmult}.
% \begin{sloppypar}
% This specification (proven with VST) shows that \TNaCle{crypto_scalarmult} in C.
% \end{sloppypar}
% The Verifiable Software Toolchain uses a strongest postcondition strategy.
% The user must first write a formal specification of the function he wants to verify in Coq.
% This should be as close as possible to the C implementation behavior.
% This will simplify the proof and help with stepping through the Clight version of the software.
% With the range of inputs defined, VST mechanically steps through each instruction
% and ask the user to verify auxiliary goals such as array bound access, or absence of overflows/underflows.
% We call this specification a low level specification. A user will then have an easier
% time to prove that his low level specification matches a simpler higher level one.
% In order to further speed-up the verification process, it has to be know that to
% prove the specification \TNaCle{crypto_scalarmult}, a user only need the specification of e.g. \TNaCle{M}.
% This provide with multiple advantages: the verification by the Coq kernel can be done
% in parallel and multiple users can work on proving different functions at the same time.
% For the sake of completeness we proved all intermediate functions.
\subheading{Memory aliasing.}
The semicolon in the \VSTe{SEP} parts of the Hoare triples represents the \emph{separating conjunction} (often written as a star), which means that
the memory shares of \texttt{q}, \texttt{n} and \texttt{p} do not overlap.
In other words,
we only prove correctness of \TNaCle{crypto_scalarmult} when it is called without aliasing.
But for other TweetNaCl functions, like the multiplication function \texttt{M(o,a,b)}, we cannot ignore aliasing, as it is called in the ladder in an aliased manner.
In VST, a simple specification of this function will assume that the pointer arguments
point to non-overlapping space in memory.
When called with three memory fragments (\texttt{o, a, b}),
the three of them will be consumed. However assuming this naive specification
when \texttt{M(o,a,a)} is called (squaring), the first two memory areas (\texttt{o, a})
are consumed and VST will expect a third memory section (\texttt{a}) which does not \emph{exist} anymore.
Examples of such cases are illustrated in \fref{tikz:MemSame}.
\begin{figure}[h]%
\centering%
\include{tikz/memory_same_sh}%
\caption{Aliasing and Separation Logic}%
\label{tikz:MemSame}%
\end{figure}
As a result, a function must either have multiple specifications or specify which
aliasing case is being used.
The first option would require us to do very similar proofs multiple times for a same function.
We chose the second approach: for functions with 3 arguments, named hereafter \texttt{o, a, b},
we define an additional parameter $k$ with values in $\{0,1,2,3\}$:
\begin{itemize}
\item if $k=0$ then \texttt{o} and \texttt{a} are aliased.
\item if $k=1$ then \texttt{o} and \texttt{b} are aliased.
\item if $k=2$ then \texttt{a} and \texttt{b} are aliased.
\item else there is no aliasing.
\end{itemize}
In the proof of our specification, we do a case analysis over $k$ when needed.
This solution does not cover all the possible cases of aliasing over 3 pointers
(\eg \texttt{o} = \texttt{a} = \texttt{b}) but it is enough to satisfy our needs.
\subheading{Improving verification speed.}
To make the verification the smoothest, the Coq formal definition of the function
should be as close as possible to the C implementation.
Optimizations of such definitions are often counter-productive as they increase the
amount of proofs required for \eg bounds checking, loop invariants.
In order to further speed-up the verification process, to prove the specification
\TNaCle{crypto_scalarmult}, we only need the specification of the subsequently
called functions (\eg \TNaCle{M}).
This provide with multiple advantages: the verification by Coq can be
done in parallel and multiple users can work on proving different functions at
the same time.
% We proved all intermediate functions.