\section*{Usenix Security 2020 Review \#324A} \begin{tabular}{rrp{.6\textwidth}} \toprule Review recommendation & 3. & Major revision \\ Writing quality & 3. & Adequate \\ Reviewer interest & 2. & I might go to a talk about this \\ Reviewer expertise & 3. & Knowledgeable \\ \bottomrule \end{tabular} \begin{center} \subheading{===== Paper summary =====} \end{center} The Networking and Cryptography library (NaCl) admits a compact implementation, TweetNaCl, that leans toward clarity over efficiency. A key component of these libraries is the key exchange protocol poetically known as X25519, after the parameters of the elliptic curve over which it carries its computations. This paper presents a proof of correctness of the TweetNaCl implementation of X25519. This result is thorough and offers novel contributions in several aspects. First it is a formal proof conducted in the Coq Proof Assistant, hence relying on a minimalist trust base. Second, it proves the correctness of the existing C implementation of the protocol (except for minor technical necessary changes), as opposed to a re-implementation designed with its formal proof in mind. Finally, it not only prove the correctness of the C implementation against a purely functional formalization of the RFC standard of the protocol, but also the correctness of this formalized standard against (a formalization of) the purely "mathematical" definition of the protocol as introduced by Bernstein. The functional refinement of the C implementation is established using VST, the separation logic for C developed at Princeton. Beyond being the adequate tool for such a task, it also in principle allows for a compilation of TweetNaCl through CompCert guaranteeing the result to be transported down to the assembly. The equivalence between the RFC specification and the mathematical counterpart is conducted by (significantly) extending the work on formal elliptic curves by Bartzia and Strub. It relies on ssreflect and MathComp, but is overall a more traditional Coq proof. The complete formal development is provided as an artifact. \begin{center} \subheading{===== Strengths =====} \end{center} I believe this paper describes a valuable work and contribution, that I appreciated learning about. \begin{itemize} \item The authors follow a thorough and valuable methodology, establishing excellent guarantees on the implementation considered. \item It is great to see a development such as VST being used in this context. \item Additionally, going the extra-mile to both refine the C code against RFC and prove this specification to be equivalent to the mathematical model is indeed I believe the right way to go. \item They position their paper in a quite dense corpus of work in a quite satisfactory way. \item The provided artifact showcase a significant proof effort. \item I found barely a couple of typos while reading the paper. \item The conclusion is particularly nice. \item The diff provided between the verified C code and the original code is excellent. Overall there is a great care in stating clearly and precisely what has been proved. \end{itemize} \begin{center} \subheading{===== Weaknesses =====} \end{center} Such a paper is difficult to write. The authors have visibly devoted great efforts to tackle this difficulty, but it remains a challenging read. \begin{itemize} \item The paper takes us from one technical point to another in a manner that seems arbitrary at times, and hinders the overall structure. The problem is quite global, but a typical example is Definition 2.3: as a non-expert, it is hard to understand why this notion is important to introduce here, and it is essentially not used anywhere in the paper. \end{itemize} \begin{answer} The notion of the twist of Curve25519 is quite central in the proof of the second main theorem of our paper, which is linking our formalization of the RFC to the mathematical definition of Curve25519. We therefore feel like it should not be omitted from the preliminaries, but we are of course open to suggestions how to motivate the definition better at this early point of the paper. \end{answer} \begin{itemize} \item Figure 1 and Figure 4 are great, and have the potential to help so much the reader against the previous issue. Unfortunately, they are not commenting, and hence fail to do so! \end{itemize} \begin{answer} We extended the discussion of Figures 1, 3, and 4 to address this point. \end{answer} \begin{itemize} \item The protocol considered is standard, and its implementation made to be of reasonable complexity. The paper should therefore really aim to: \begin{enumerate} \item promote a new approach (full stack, use of VST, use of the B\&S elliptic curve library,...) \item explain all specific difficulties and unexpected problems that came across to smooth the path to similar projects in the future; \item give a sense of whether it would scale and/or be reusable for similar/related projects. In particular, the choice of TweetNaCl is never really motivated. How much of a challenge would proving the same result over a more complex/efficient implementation be? \end{enumerate} The paper does a good job at 1, but unfortunately a quite poor at 2 and 3. \item Skimming through the development convinces me that this is a consequent work. But the paper does not help evaluating this effort. It is always difficult to find relevant metrics to evaluate such things, but it would be great to get a sense of the aspects of the work that turned out difficult, and the extent of these difficulties. \end{itemize} \begin{answer} We extended the discussion in the conclusion and in particular added a paragraph on lessons learned. \end{answer} \begin{center} \subheading{===== Detailed comments for authors =====} \end{center} My feelings are essentially summed up through the strengths and weaknesses described above. I find this work to be a great contribution, but regret having a hard time getting as much as I would like out of the paper. You mention in the rebuttal to a previous submission that constant time is not a property that fits in VST's Framework. This is indeed true, and absolutely fair not to cover it in this work, but a short discussion about constant time would nonetheless be interesting I think, for instance in the conclusion. In particular, we are getting quite close to having the adequate facilities for establishing Constant Time of real implementation in the style of this work: this year at POPL was presented an extension of CompCert proved to transport Constant Time from CLight to assembly: "Formal Verification of a Constant-Time Preserving C Compiler", POPL'20, Barthe et al. On a side note, I failed to compile the project: setting up an opam switch with Coq pinned to 8.8.2 as instructed, I had dependency issues with respect to `coq-coqprime`when running `opam install --deps-only coq-verif-tweetnacl `. Having installed manually `coq-coqprime`, I then ended up with the following error: {\footnotesize\texttt{ "The following dependencies couldn't be met:\\ - coq-verif-tweetnacl $\rightarrow$ coq-stdpp \\ unknown package\\ No solution found, exiting"}} \begin{answer} We were not able to reproduce this error, but now make the code available in an anonymous GitHub repository that also contains a virtual-machine image with all software that is required to reproduce the proofs. \end{answer} Here are a few linear comments: \begin{itemize} \item \textbf{page 1, column 2:}\\ - "the program satisfy" $\rightarrow$ "the program satisfies"\\ - "the result of this paper can readily be used in mechanized proofs of higher-level protocols (...)" This statement should probably be made more precise, it is quite unclear to me what you mean exactly. \end{itemize} \begin{answer} \begin{itemize} \item[$-$] Typo fixed (the satisfy/satisfies actually refers to the semantics of a program in this sentence). \item[$-$]We rephrased the second statement to (hopefully) clarify what we mean. \end{itemize} \end{answer} \begin{itemize} \item \textbf{page 2:}\\ Figure 1 is great, but would deserve a lengthy explanation! \end{itemize} \begin{answer} We added an additional description of the figure. \end{answer} \begin{itemize} \item \textbf{page 3, column 1:}\\ Definition 2.3: It's been very unclear to me as a non-expert in cryptography and this protocole in particular what was the purpose of this definition.\\ {\color{gray}"depending of" $\rightarrow$ "depending on"}\\ "depending of the value of the kth bit" $\rightarrow$ unclear what k is at this point \end{itemize} \begin{answer} \begin{itemize} \item[$-$] Regarding Definition 2.3 see our answer above. \item[$-$] Typo fixed. \item[$-$] We updated the text to clarify what we mean by the $k$th bit. \end{itemize} \end{answer} \begin{itemize} \item \textbf{page 3, column 2:}\\ Algorithm 1: unclear what "m" is\\ "RFC 7748 standardize" $\rightarrow$ "RFC 7748 standardizes" \end{itemize} \begin{answer} Clarified that $m$ is an upper bound on the bitlength of the scalar. Typo fixed. \end{answer} \begin{itemize} \item \textbf{page 4, column 2:}\\ It's minor, but it is more shiny nowadays to cite The Odd Order theorem that the Four Color theorem as a mathematical achievement in Coq\\ CompCert $\rightarrow$ Wrong hyphenation\\ Hoare-seq $\rightarrow$ It is clearly hard to find the balance with respect to what should be assumed on the crypto side, and what should be assumed on the verification side. I nonetheless think that this should be omitted.\\ "Separation logic is an extension of Hoare logic" $\rightarrow$ Minor, but the work "extension" does not quite sit right with me. The model is quite intrinsically different. \end{itemize} \begin{answer} \begin{itemize} \item[$-$] While the Odd Order theorem is shinier for the complexity of the work, it may not be as well known as the Four Color theorem. This lack of exposition makes its proof less impressive to readers not familiar with the subject. \item[$-$] We fixed the hyphenation. \item[$-$] The Hoare logic is not a known subject for most cryptographers not familiar with formal methods. In our opinion, the Hoare-Sec rule is the easiest rule with material (as opposed to Hoare-Skip) to understand by its composition nature and as it also relates to how instructions are read in the source code of a program. \item[$-$] The Separation Logic was introduced by their authors \emph{``an extension of Hoare logic''}. See See \url{https://www.cs.cmu.edu/~jcr/seplogic.pdf}. \end{itemize} \end{answer} \begin{itemize} \item \textbf{page 5, column 2:}\\ "'To implement (...)" $\rightarrow$ I am very confused by this. The whole paragraph is an unannounced quote, it would need context/explanation. \end{itemize} \begin{answer} Fixed, we added an introduction sentence. \end{answer} \begin{itemize} \item \textbf{page 6, column 1:}\\ In ListofZn\_fp $\rightarrow$ The use of fuel might deserve a comment. Don't you end up having to prove at some point that you can always compute ahead of time an overapproximation of the fuel needed? Wouldn't it have been simple to use the strong recursion principle of naturals to define the function? \end{itemize} \begin{answer} In our case the fuel is used to garantee to have as an output a list of 32 elements. This allows us to prove that for all List of 32 bytes, ListofZn\_fp (ZofList L) = L. With this lemma at hand we can later simplify some of the expressions. \end{answer} \begin{itemize} \item \textbf{page 6, column 2:}\\ "and the same code as a pure Coq function" $\rightarrow$ I would rephrase, the term "same code" is misleading. Specification: I think explaining the structure of a VST statement would be necessary to help an unfamiliar reader understand this specification. \end{itemize} \begin{answer} We rephrased this paragraph and in particular do no not say ``same code'' anymore. \end{answer} \begin{itemize} \item \textbf{page 7:}\\ Discussion on memory aliasing is great, I would have liked more of this kind through the paper.\\ Figure 2: I had to fish back a little for the definition of "sh", but "Ews" has really never been defined I believe.\\ "Improving speed" $\rightarrow$ of what? This whole paragraph is quite hard to read. In particular be careful that it is not obvious to the reader whether you are speeding up the verification process or the runtime of the implementation. In particular it was unclear to me what you were referring to by saying "Optimizations of such definitions".\\ The following paragraph also is a bit cryptic. I assume you are saying that identifying finely the dependencies between definitions allow for parallelizing the work? Arguably, simply admitting temporarily on the fly any specification needed achieves the same.\\ "Numbers in gf" $\rightarrow$ Please remind the reader what "gf" is. Good section overall \end{itemize} \begin{answer} \begin{itemize} \item[$-$] We added a description of "Ews" in the precondition paragraph, this should clarify the global memory share name. \item[$-$] We clarified that we improve the speed of the verification effort; we also clarified what we mean by ``optimizing the definitions''. \item[$-$] In order to verify a file, Coq needs the compiled proof of dependencies. However in the case of VST it is possible to split the specification from the proof. As a result the proof of the full scalar multiplication does not require the proof of the the multiplication in \F{p}, only its specification. \item[$-$] We reminded the reader that "gf" is a type defined through a C \texttt{typedef} in TweetNaCl. \end{itemize} \end{answer} \begin{itemize} \item \textbf{page 8:}\\ "Using reflection (...)" $\rightarrow$ Would deserve more explanations.\\ Figure 3: Please comment generously this figure, it looks great but it is frustrating to try to decipher it without help. \end{itemize} \begin{answer} \begin{itemize} \item[$-$] We removed the reflection mention, more explanations would require too many implementation details. \item[$-$] We added a paragraph to describe the content of Figure 3. \end{itemize} \end{answer} \begin{itemize} \item \textbf{page 9:}\\ "the type of field which characteristic" $\rightarrow$ "whose characteristic"\\ "The value of add is proven to be on the curve (with coercion)" $\rightarrow$ This parenthesis is too cryptic, it should probably be dropped. \end{itemize} \begin{answer} Fixed. \end{answer} \begin{itemize} \item \textbf{page 11:}\\ Figure 4: this one is the apex: it would deserve a full column of explanations \end{itemize} \begin{answer} In addition to Figure 4, we added a full paragraph providing the red line of the proof of this theorem. We hope to provide suficiant insights of the dependencies between lemmas to arrive into the final theorem. \end{answer} \begin{itemize} \item \textbf{Conclusion:}\\ Great wrap up overall.\\ CompCert: "However, when compiling (...)" $\rightarrow$ I am quite confused about this sentence. Do you mean when compiling the verified C code with gcc? If so, beyond the question whether CompCert matches C17 (which it doesn't, it matches C99), the real problem is to assume that gcc is bug free! I would expect with this whole approach that the expectation is to run a protocole compiled with CompCert.\\ clightGen: "VST does not support (...)" $\rightarrow$ Hard to undertand the relation between this sentence and the previous one.\\ Extending our work: What about proving other NaCl implementations? \end{itemize} \begin{answer} \begin{itemize} \item[$-$] Sentence about other compilers (like GCC or Clang) is rephrased. \item[$-$] Added some context to explain the relation between \texttt{clightgen} and VST. \item[$-$] ``Other NaCl implementations'': It is not exactly clear to us what the reviewer means. There are other cryptographic primitives in TweetNaCl and of course also in NaCl. Large parts of our proof would be re-usable for a proof of Ed25519 signatures in TweetNaCl (which we mention). Correctness proofs of other cryptographic primitives like Salsa20 or SHA-512 would not be able to re-use much from our proof; verification of optimized (assembly) implementations in NaCl would need a different approach. \end{itemize} \end{answer} \begin{center} \subheading{===== Requested changes =====} \end{center} \begin{itemize} \item Please provide high level explanations to your three Figures describing the infrastructure. \end{itemize} \begin{answer} We added high-level description of Figures 1, 3 and 4. They should help the reader to follow the line of the proof. \end{answer} \begin{itemize} \item Please reduce slightly the width of the technical material covered, and use the gained space to provide a bit more context to the one covered \end{itemize} \begin{answer} We removed various technical details, in particular in Section 5. We used the space to extend the discussion of various other points raised by the reviewers. \end{answer} \begin{itemize} \item Please provide more get away lessons from this development that would benefit new similar works \end{itemize} \begin{answer} We added a ``lessons learned'' paragraph in the conclusion. \end{answer} \begin{itemize} \item Please give a sense of the difficulties encountered during the formalization. \end{itemize} \begin{answer} Many difficulties were of technical nature; we cover some of this now in the ``lessons learned'' paragraph. \end{answer} \begin{center} \subheading{===== Questions for authors' response =====} \end{center} \begin{itemize} \item What made TweetNaCl the right choice for this project? \end{itemize} \begin{answer} One goal of the project was to investigate how suitable the combination of VST+Coq is for verifying correctness of existing C code for asymmetric crypto primitives. The X25519 implementation in TweetNaCl was chosen because it is relatively simple, it has some real-world use cases, and the original paper claims that the library should be verifiable. \end{answer} \begin{itemize} \item Would following the same approach for other implementation radically change the proof effort? \end{itemize} \begin{answer} We would expect that the proof effort for another C implementation of X25519 does not change drastically, as long as it does not use any features that are not supported by VST (e.g., the 128-bit integers used by the donna64 implementation). The proof relating the RFC to the mathematical definition does not change for other implementations. \end{answer} \begin{itemize} \item Does compiling TweetNaCl with CompCert rather than gcc impact the performance beyond what is acceptable? \end{itemize} \begin{answer} For the X25519 implementation in TweetNaCl, CompCert generates code that is about 6$\times$ slower than code generated with gcc. While this sounds like a lot, it may not be too much of an issue for projects that use TweetNaCl, because they chose for a library that prioritizes simplicity over performance in the first place. A more serious issue however can be the non-free CompCert license. \end{answer} \begin{itemize} \item If so, what trust do you consider this proof effort to bring to a gcc compiled implementation? \end{itemize} \begin{answer} We are not aware of any bugs that were introduced into ECC software by bugs in gcc; so from a practical, crypto-engineering point of view we rule out all classes of bugs that users are typically concerned about. From a more theoretical point of view, relying on gcc (or any other unverified C compiler) massively expands the TCB and should be reason for concern. \end{answer}