A4_reviews-UsenixA.tex 15 KB
Newer Older
benoit's avatar
benoit committed
1
2
\newpage
\section{Usenix Security 2020 Review \#324A}
benoit's avatar
benoit committed
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71

\begin{tabular}{rl}
    \toprule
    Review recommendation & 3. Major revision                  \\
    Writing quality       & 3. Adequate                        \\
    Reviewer interest     & 2. I might go to a talk about this \\
    Reviewer expertise    & 3. Knowledgeable                   \\
    \bottomrule
\end{tabular}




\begin{center}
    \subheading{===== Paper summary =====}
\end{center}

The Networking and Cryptography library (NaCl) admits a compact implementation,
TweetNaCl, that leans toward clarity over efficiency.
A key component of these libraries is the key exchange protocol poetically known
as X25519, after the parameters of the elliptic curve over which it carries its
computations.

This paper presents a proof of correctness of the TweetNaCl implementation of X25519.
This result is thorough and offers novel contributions in several aspects.
First it is a formal proof conducted in the Coq Proof Assistant, hence relying on a
minimalist trust base.
Second, it proves the correctness of the existing C implementation of the protocol
(except for minor technical necessary changes), as opposed to a re-implementation
designed with its formal proof in mind.
Finally, it not only prove the correctness of the C implementation against a
purely functional formalization of the RFC standard of the protocol, but also
the correctness of this formalized standard against (a formalization of) the purely
"mathematical" definition of the protocol as introduced by Bernstein.

The functional refinement of the C implementation is established using VST,
the separation logic for C developed at Princeton. Beyond being the adequate
tool for such a task, it also in principle allows for a compilation of TweetNaCl
through CompCert guaranteeing the result to be transported down to the assembly.

The equivalence between the RFC specification and the mathematical counterpart
is conducted by (significantly) extending the work on formal elliptic curves by
Bartzia and Strub. It relies on ssreflect and MathComp, but is overall a more
traditional Coq proof.

The complete formal development is provided as an artifact.

\begin{center}
    \subheading{===== Strengths =====}
\end{center}
I believe this paper describes a valuable work and contribution, that I appreciated learning about.
\begin{itemize}
    \item The authors follow a thorough and valuable methodology, establishing excellent guarantees on the implementation considered.
    \item It is great to see a development such as VST being used in this context.
    \item Additionally, going the extra-mile to both refine the C code against RFC and prove this specification to be equivalent to the mathematical model is indeed I believe the right way to go.
    \item They position their paper in a quite dense corpus of work in a quite satisfactory way.
    \item The provided artifact showcase a significant proof effort.
    \item I found barely a couple of typos while reading the paper.
    \item The conclusion is particularly nice.
    \item The diff provided between the verified C code and the original code is excellent. Overall there is a great care in stating clearly and precisely what has been proved.
\end{itemize}

\begin{center}
    \subheading{===== Weaknesses =====}
\end{center}

Such a paper is difficult to write. The authors have visibly devoted great efforts to tackle this difficulty, but it remains a challenging read.
\begin{itemize}
    \item The paper takes us from one technical point to another in a manner that seems arbitrary at times, and hinders the overall structure. The problem is quite global, but a typical example is Definition 2.3: as a non-expert, it is hard to understand why this notion is important to introduce here, and it is essentially not used anywhere in the paper.
72
73
74
75
76
77
\end{itemize}
\begin{answer}{EEEEEE}
    Computations over Curve25519 are done in $\F{p}$, as a result an easy mistake is to assume the curve defined over $\F{p}$ for all $x \in \F{p}$. However Curve25519 is defined over the quadratic extension which makes the Curve25519 over $\F{p}$ and its quadratic twist isomorphic over $\F{p^2}$ (Definition 2.3).
\end{answer}

\begin{itemize}
benoit's avatar
benoit committed
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
    \item Figure 1 and Figure 4 are great, and have the potential to help so much the reader against the previous issue. Unfortunately, they are not commenting, and hence fail to do so!
    \item The protocol considered is standard, and its implementation made to be of reasonable complexity. The paper should therefore really aim to:
          \begin{enumerate}
              \item promote a new approach (full stack, use of VST, use of the B\&S elliptic curve library,...)
              \item explain all specific difficulties and unexpected problems that came across to smooth the path to similar projects in the future;
              \item give a sense of whether it would scale and/or be reusable for similar/related projects. In particular, the choice of TweetNaCl is never really motivated. How much of a challenge would proving the same result over a more complex/efficient implementation be?
          \end{enumerate}

          The paper does a good job at 1, but unfortunately a quite poor at 2 and 3.
    \item Skimming through the development convinces me that this is a consequent work. But the paper does not help evaluating this effort. It is always difficult to find relevant metrics to evaluate such things, but it would be great to get a sense of the aspects of the work that turned out difficult, and the extent of these difficulties.
\end{itemize}

\begin{center}
    \subheading{===== Detailed comments for authors =====}
\end{center}
My feelings are essentially summed up through the strengths and weaknesses described above. I find this work to be a great contribution, but regret having a hard time getting as much as I would like out of the paper.

You mention in the rebuttal to a previous submission that constant time is not a property that fits in VST's Framework. This is indeed true, and absolutely fair not to cover it in this work, but a short discussion about constant time would nonetheless be interesting I think, for instance in the conclusion. In particular, we are getting quite close to having the adequate facilities for establishing Constant Time of real implementation in the style of this work: this year at POPL was presented an extension of CompCert proved to transport Constant Time from CLight to assembly:
"Formal Verification of a Constant-Time Preserving C Compiler", POPL'20, Barthe et al.

On a side note, I failed to compile the project: setting up an opam switch with Coq pinned to 8.8.2 as instructed, I had dependency issues with respect to `coq-coqprime`when running `opam install --deps-only coq-verif-tweetnacl `. Having installed manually `coq-coqprime`, I then ended up with the following error:

{\footnotesize\texttt{
    "The following dependencies couldn't be met:\\
    - coq-verif-tweetnacl $\rightarrow$ coq-stdpp \\
    unknown package\\
    No solution found, exiting"}}

Here are a few linear comments:

\begin{itemize}
    \item \textbf{page 1, column 2:}\\
          - "the program satisfy" $\rightarrow$ "the program satisfies"\\
          - "the result of this paper can readily be used in mechanized proofs of higher-level protocols (...)"
          This statement should probably be made more precise, it is quite unclear to me what you mean exactly.

    \item \textbf{page 2:}\\
          Figure 1 is great, but would deserve a lengthy explanation!
116
117
118
119
120
121
\end{itemize}
\begin{answer}{EEEEEE}
    We added an additional description of the figure.
\end{answer}

\begin{itemize}
benoit's avatar
benoit committed
122
123
124

    \item \textbf{page 3, column 1:}\\
          Definition 2.3: It's been very unclear to me as a non-expert in cryptography and this protocole in particular what was the purpose of this definition.\\
125
          {\color{gray}"depending of" $\rightarrow$ "depending on"}\\
benoit's avatar
benoit committed
126
127
128
129
          "depending of the value of the kth bit" $\rightarrow$ unclear what k is at this point

    \item \textbf{page 3, column 2:}\\
          Algorithm 1: unclear what "m" is\\
130
          {\color{gray}"RFC 7748 standardize" $\rightarrow$   "RFC 7748 standardizes"}
benoit's avatar
benoit committed
131
132
133
134
135
136
137
138
139
140
141
142

    \item \textbf{page 4, column 2:}\\
          It's minor, but it is more shiny nowadays to cite The Odd Order theorem that the Four Color theorem as a mathematical achievement in Coq\\
          CompCert $\rightarrow$ Wrong hyphenation\\
          Hoare-seq $\rightarrow$ It is clearly hard to find the balance with respect to what should be assumed on the crypto side, and what should be assumed on the verification side. I nonetheless think that this should be omitted.\\
          "Separation logic is an extension of Hoare logic" $\rightarrow$ Minor, but the work "extension" does not quite sit right with me. The model is quite intrinsically different.

    \item \textbf{page 5, column 2:}\\
          "'To implement (...)" $\rightarrow$ I am very confused by this. The whole paragraph is an unannounced quote, it would need context/explanation.

    \item \textbf{page 6, column 1:}\\
          In ListofZn\_fp $\rightarrow$ The use of fuel might deserve a comment. Don't you end up having to prove at some point that you can always compute ahead of time an overapproximation of the fuel needed? Wouldn't it have been simple to use the strong recursion principle of naturals to define the function?
benoit's avatar
benoit committed
143

144
145
146
          \begin{answer}{EEEEEE}
              In our case the fuel is used to garantee to have as an output a list of 32 elements. This allows to prove that for all List of 32 bytes, ListofZn\_fp (ZofList L) = L. With this lemma at hand we can later simplify some of the expressions.
          \end{answer}
benoit's avatar
benoit committed
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163

    \item \textbf{page 6, column 2:}\\
          "and the same code as a pure Coq function" $\rightarrow$ I would rephrase, the term "same code" is misleading.
          Specification: I think explaining the structure of a VST statement would be necessary to help an unfamiliar reader understand this specification.

    \item \textbf{page 7:}\\
          Discussion on memory aliasing is great, I would have liked more of this kind through the paper.\\
          Figure 2: I had to fish back a little for the definition of "sh", but "Ews" has really never been defined I believe.\\
          "Improving speed" $\rightarrow$ of what? This whole paragraph is quite hard to read. In particular be careful that it is not obvious to the reader whether you are speeding up the verification process or the runtime of the implementation. In particular it was unclear to me what you were referring to by saying "Optimizations of such definitions".\\
          The following paragraph also is a bit cryptic. I assume you are saying that identifying finely the dependencies between definitions allow for parallelizing the work? Arguably, simply admitting temporarily yon the fly any specification needed  achieves the same.\\
          "Numbers in gf" $\rightarrow$ Please remind the reader what "gf" is. Good section overall

    \item \textbf{page 8:}\\
          "Using reflection (...)" $\rightarrow$ Would deserve more explanations.\\
          Figure 3: Please comment generously this figure, it looks great but it is frustrating to try to decipher it without help.

    \item \textbf{page 9:}\\
164
165
          {\color{gray}"the type of field which characteristic" $\rightarrow$ "whose characteristic"\\
          "The value of add is proven to be on the curve (with coercion)" $\rightarrow$ This parenthesis is too cryptic, it should probably be dropped.}
benoit's avatar
benoit committed
166
167
168
169
170
171
172
173
174
175
176

    \item \textbf{page 11:}\\
          Figure 4: this one is the apex: it would deserve a full column of explanations

    \item \textbf{Conclusion:}\\
          Great wrap up overall.\\
          CompCert: "However, when compiling (...)" $\rightarrow$ I am quite confused about this sentence. Do you mean when compiling the verified C code with gcc? If so, beyond the question whether CompCert matches C17 (which it doesn't, it matches C99), the real problem is to assume that gcc is bug free! I would expect with this whole approach that the expectation is to run a protocole compiled with CompCert.\\
          clightGen: "VST does not support (...)" $\rightarrow$ Hard to undertand the relation between this sentence and the previous one.\\
          Extending our work: What about proving other NaCl implementations?
\end{itemize}

177
178
179
180
\begin{answer}{EEEEEE}
    \todo{Freek : which version of ISO.}
\end{answer}

benoit's avatar
benoit committed
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
\begin{center}
    \subheading{===== Requested changes =====}
\end{center}
\begin{itemize}
    \item Please provide high level explanations to your three Figures describing the infrastructure.
    \item Please reduce slightly the width of the technical material covered, and use the gained space to provide a bit more context to the one covered
    \item Please provide more get away lessons from this development that would benefit new similar works
    \item Please give a sense of the difficulties encountered during the formalization.
\end{itemize}

\begin{center}
    \subheading{===== Questions for authors' response =====}
\end{center}

\begin{itemize}
    \item What made TweetNaCl the right choice for this project?
\end{itemize}

\begin{answer}{EEEEEE}
    One goal of the project was to investigate how suitable the combination of
    VST+Coq is for verifying correctness of existing C code for asymmetric
    crypto primitives. The X25519 implementation in TweetNaCl was chosen because
    it is relatively simple, it has some real-world use cases, and the original
    paper claims that the library should be verifiable.
\end{answer}

\begin{itemize}
    \item Would following the same approach for other implementation radically change the proof effort?
\end{itemize}

\begin{answer}{EEEEEE}
    We would expect that the proof effort for another C implementation of X25519
    does not change drastically, as long as it does not use any features that are
    not supported by VST (e.g., the 128-bit integers used by the donna64
    implementation). The proof relating the RFC to the mathematical definition
    does not change for other implementations.
\end{answer}

\begin{itemize}
    \item Does compiling TweetNaCl with CompCert rather than gcc impact the performance beyond what is acceptable?
\end{itemize}

\begin{answer}{EEEEEE}
    For the X25519 implementation in TweetNaCl, CompCert generates code that is
    about 6x slower than code generated with gcc. While this sounds like a lot, it
    may not be too much of an issue for projects that use TweetNaCl, because they
    chose for a library that prioritizes simplicity over performance in the first
    place. A more serious issue however can be the non-free CompCert license.
\end{answer}

\begin{itemize}
    \item If so, what trust do you consider this proof effort to bring to a gcc compiled implementation?
\end{itemize}

\begin{answer}{EEEEEE}
    We are not aware of any bugs that were introduced into ECC software by bugs in
    gcc; so from a practical, crypto-engineering point of view we rule out all
    classes of bugs that users are typically concerned about. From a more
    theoretical point of view, relying on gcc (or any other unverified C compiler)
    massively expands the TCB and should be reason for concern.
\end{answer}