4.1-VST.tex 8.26 KB
 Benoit Viguier committed Jan 15, 2020 1 2 3 4 \subsection{Applying the Verifiable Software Toolchain} \label{subsec:with-VST} \begin{sloppypar}  benoit committed Sep 28, 2020 5 6 7  We now turn our focus to the formal specification of \TNaCle{crypto_scalarmult}. We use our definition of X25519 from the RFC in the Hoare triple and prove its correctness.  Benoit Viguier committed Jan 15, 2020 8 9 10 11 \end{sloppypar} \subheading{Specifications.} We show the soundness of TweetNaCl by proving a correspondence between  benoit committed Sep 29, 2020 12 the C version of TweetNaCl and a pure function in Coq formalizing the RFC.  Benoit Viguier committed Jan 15, 2020 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 % why "pure" ? % A pure function is a function where the return value is only determined by its % input values, without observable side effects (Side effect are e.g. printing) This defines the equivalence between the Clight representation and our Coq definition of the ladder (\coqe{RFC}). \begin{lstlisting}[language=CoqVST] Definition crypto_scalarmult_spec := DECLARE _crypto_scalarmult_curve25519_tweet WITH v_q: val, v_n: val, v_p: val, c121665:val, sh : share, q : list val, n : list Z, p : list Z (*------------------------------------------*) PRE [ _q OF (tptr tuchar), _n OF (tptr tuchar), _p OF (tptr tuchar) ] PROP (writable_share sh; Forall (fun x => 0 <= x < 2^8) p; Forall (fun x => 0 <= x < 2^8) n; Zlength q = 32; Zlength n = 32; Zlength p = 32) LOCAL(temp _q v_q; temp _n v_n; temp _p v_p; gvar __121665 c121665) SEP (sh [{ v_q }] <<(uch32)-- q; sh [{ v_n }] <<(uch32)-- mVI n; sh [{ v_p }] <<(uch32)-- mVI p; Ews [{ c121665 }] <<(lg16)-- mVI64 c_121665) (*------------------------------------------*) POST [ tint ] PROP (Forall (fun x => 0 <= x < 2^8) (RFC n p); Zlength (RFC n p) = 32) LOCAL(temp ret_temp (Vint Int.zero)) SEP (sh [{ v_q }] <<(uch32)-- mVI (RFC n p); sh [{ v_n }] <<(uch32)-- mVI n; sh [{ v_p }] <<(uch32)-- mVI p; Ews [{ c121665 }] <<(lg16)-- mVI64 c_121665 \end{lstlisting} In this specification we state preconditions like: \begin{itemize}  benoit committed Sep 28, 2020 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72  \item[] \VSTe{PRE}: \VSTe{_p OF (tptr tuchar)}\\ The function \TNaCle{crypto_scalarmult} takes as input three pointers to arrays of unsigned bytes (\VSTe{tptr tuchar}) \VSTe{_p}, \VSTe{_q} and \VSTe{_n}. \item[] \VSTe{LOCAL}: \VSTe{temp _p v_p}\\ Each pointer represent an address \VSTe{v_p}, \VSTe{v_q} and \VSTe{v_n}. \item[] \VSTe{SEP}: \VSTe{sh [{ v_p $\!\!\}\!\!]\!\!\!$ <<(uch32)-- mVI p}\\ In the memory share \texttt{sh}, the address \VSTe{v_p} points to a list of integer values \VSTe{mVI p}. \item[] \VSTe{Ews [{ c121665 $\!\!\}\!\!]\!$ <<(lg16)-- mVI64 c_121665}\\ In the global memory share \texttt{Ews}, the address \VSTe{c121665} points to a list of 16 64-bit integer values corresponding to $a/4 = 121665$. \item[] \VSTe{PROP}: \VSTe{Forall (fun x => 0 <= x < 2^8) p}\\ In order to consider all the possible inputs, we assume each element of the list \texttt{p} to be bounded by $0$ included and $2^8$ excluded. \item[] \VSTe{PROP}: \VSTe{Zlength p = 32}\\ We also assume that the length of the list \texttt{p} is 32. This defines the complete representation of \TNaCle{u8[32]}.  Benoit Viguier committed Jan 15, 2020 73 74 75 76 \end{itemize} As postcondition we have conditions like: \begin{itemize}  benoit committed Sep 28, 2020 77 78 79 80 81 82 83 84 85 86 87  \item[] \VSTe{POST}: \VSTe{tint}\\ The function \TNaCle{crypto_scalarmult} returns an integer. \item[] \VSTe{LOCAL}: \VSTe{temp ret_temp (Vint Int.zero)}\\ The returned integer has value $0$. \item[] \VSTe{SEP}: \VSTe{sh [{ v_q $\!\!\}\!\!]\!\!\!$ <<(uch32)-- mVI (RFC n p)}\\ In the memory share \texttt{sh}, the address \VSTe{v_q} points to a list of integer values \VSTe{mVI (RFC n p)} where \VSTe{RFC n p} is the result of the \TNaCle{crypto_scalarmult} of \VSTe{n} and \VSTe{p}. \item[] \VSTe{PROP}: \VSTe{Forall (fun x => 0 <= x < 2^8) (RFC n p)}\\ \VSTe{PROP}: \VSTe{Zlength (RFC n p) = 32}\\ We show that the computation for \VSTe{RFC} fits in \TNaCle{u8[32]}.  Benoit Viguier committed Jan 15, 2020 88 89 \end{itemize}  Benoit Viguier committed Feb 14, 2020 90 91 \TNaCle{crypto_scalarmult} computes the same result as \VSTe{RFC} in Coq provided that inputs are within their respective bounds: arrays of 32 bytes.  Benoit Viguier committed Jan 15, 2020 92   Benoit Viguier committed Feb 14, 2020 93 The correctness of this specification is formally proven in Coq as  Benoit Viguier committed Jan 15, 2020 94 95 \coqe{Theorem body_crypto_scalarmult}.  Benoit Viguier committed Jan 15, 2020 96 97 98 % \begin{sloppypar} % This specification (proven with VST) shows that \TNaCle{crypto_scalarmult} in C. % \end{sloppypar}  Benoit Viguier committed Jan 15, 2020 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121  % The Verifiable Software Toolchain uses a strongest postcondition strategy. % The user must first write a formal specification of the function he wants to verify in Coq. % This should be as close as possible to the C implementation behavior. % This will simplify the proof and help with stepping through the Clight version of the software. % With the range of inputs defined, VST mechanically steps through each instruction % and ask the user to verify auxiliary goals such as array bound access, or absence of overflows/underflows. % We call this specification a low level specification. A user will then have an easier % time to prove that his low level specification matches a simpler higher level one. % In order to further speed-up the verification process, it has to be know that to % prove the specification \TNaCle{crypto_scalarmult}, a user only need the specification of e.g. \TNaCle{M}. % This provide with multiple advantages: the verification by the Coq kernel can be done % in parallel and multiple users can work on proving different functions at the same time. % For the sake of completeness we proved all intermediate functions. \subheading{Memory aliasing.} The semicolon in the \VSTe{SEP} parts of the Hoare triples represents the \emph{separating conjunction} (often written as a star), which means that the memory shares of \texttt{q}, \texttt{n} and \texttt{p} do not overlap. In other words, we only prove correctness of \TNaCle{crypto_scalarmult} when it is called without aliasing. But for other TweetNaCl functions, like the multiplication function \texttt{M(o,a,b)}, we cannot ignore aliasing, as it is called in the ladder in an aliased manner.  Benoit Viguier committed Feb 15, 2020 122 In VST, a simple specification of this function will assume that the pointer arguments  Benoit Viguier committed Jan 15, 2020 123 124 125 126 point to non-overlapping space in memory. When called with three memory fragments (\texttt{o, a, b}), the three of them will be consumed. However assuming this naive specification when \texttt{M(o,a,a)} is called (squaring), the first two memory areas (\texttt{o, a})  Benoit Viguier committed Feb 15, 2020 127 are consumed and VST will expect a third memory section (\texttt{a}) which does not \emph{exist} anymore.  Benoit Viguier committed Jan 15, 2020 128 129 Examples of such cases are illustrated in \fref{tikz:MemSame}. \begin{figure}[h]%  benoit committed Sep 28, 2020 130 131 132 133  \centering% \include{tikz/memory_same_sh}% \caption{Aliasing and Separation Logic}% \label{tikz:MemSame}%  Benoit Viguier committed Jan 15, 2020 134 135 136 137 138 139 140 \end{figure} As a result, a function must either have multiple specifications or specify which aliasing case is being used. The first option would require us to do very similar proofs multiple times for a same function. We chose the second approach: for functions with 3 arguments, named hereafter \texttt{o, a, b}, we define an additional parameter $k$ with values in $\{0,1,2,3\}$: \begin{itemize}  benoit committed Sep 28, 2020 141 142 143 144  \item if $k=0$ then \texttt{o} and \texttt{a} are aliased. \item if $k=1$ then \texttt{o} and \texttt{b} are aliased. \item if $k=2$ then \texttt{a} and \texttt{b} are aliased. \item else there is no aliasing.  Benoit Viguier committed Jan 15, 2020 145 146 \end{itemize} In the proof of our specification, we do a case analysis over $k$ when needed.  Benoit Viguier committed Feb 14, 2020 147 148 This solution does not cover all the possible cases of aliasing over 3 pointers (\eg \texttt{o} = \texttt{a} = \texttt{b}) but it is enough to satisfy our needs.  Benoit Viguier committed Jan 17, 2020 149   benoit committed Sep 28, 2020 150 \subheading{Improving verification speed.}  Benoit Viguier committed Jan 17, 2020 151 To make the verification the smoothest, the Coq formal definition of the function  Benoit Viguier committed Feb 14, 2020 152 should be as close as possible to the C implementation.  Benoit Viguier committed Jan 17, 2020 153 Optimizations of such definitions are often counter-productive as they increase the  Benoit Viguier committed Feb 14, 2020 154 amount of proofs required for \eg bounds checking, loop invariants.  Benoit Viguier committed Jan 17, 2020 155 156 157 158  In order to further speed-up the verification process, to prove the specification \TNaCle{crypto_scalarmult}, we only need the specification of the subsequently called functions (\eg \TNaCle{M}).  Benoit Viguier committed Feb 14, 2020 159 This provide with multiple advantages: the verification by Coq can be  Benoit Viguier committed Jan 17, 2020 160 done in parallel and multiple users can work on proving different functions at  Benoit Viguier committed Feb 14, 2020 161 162 the same time. % We proved all intermediate functions.