diff --git a/implementation.tex b/implementation.tex
index f87fc28efd87499812458f6b7e1653e6f9b6ccc4..e581a65694e211d697c92682f6b06ddeb6a473ec 100644
--- a/implementation.tex
+++ b/implementation.tex
@@ -48,7 +48,7 @@ These operations are---as well as the rest of the code---implemented in constant
 At the end of the double-and-add algorithm, we end up with a representation of $R = [k]P$ in projective coordinates.
 We compute the affine representation of $x_R$ and $y_R$ by computing the inverse of $Z_R$.
-Like most implementations of Curve25519 scalar multiplciation,
+Like most implementations of Curve25519 scalar multiplication,
 we use Fermat's little theorem and raise $Z_R$ to the power $2^{255} - 21$ to obtain $Z_R^{-1}$.
 We chose not to exploit the optimization described in~\cite{BY19}, because the implementations we compare to have not had the opportunity to implement this technique;
@@ -173,8 +173,8 @@ This substitutes $8\mathbf{a}$ for $4\mathbf{m}$ in \Add{}, and $10\mathbf{a}$ f
 Last, we found that shuffling the \texttt{ymm} registers turns out to be relatively weak and expensive.
-That is because Sandy Bridge has no arbitrary shuffle instruction,
-such as \texttt{vpermq}.
+That is because Sandy Bridge has no arbitrary shuffle instruction
+(like the \texttt{vpermq} instruction from AVX2).
 To shuffle every value in a \texttt{ymm} register into the correct lane, we would need at least two µops on port 5.
 Then it is cheaper to put all the values in the first lane, and
diff --git a/intro.tex b/intro.tex
index 62cac742b0e5adf3de4f2734ffbdfbda71dc4ab0..00a1901ba8a77fe41b3b641b98b07902464d7cbc 100644
--- a/intro.tex
+++ b/intro.tex
@@ -84,7 +84,7 @@ properties of the Secure Scuttlebutt Gossip protocol and Tendermint's secure han
 In 2015, Hamburg presented the ``Decaf'' technique~\cite{Ham17}, which removes the cofactor of twisted Edwards curves through a clever encoding.
 He later refined the technique to ``Ristretto'' (see ~\cite{ristretto}), which is
-now proposed in the crypto form research group (CFRG) of IETF for standardization~\cite{VGT+19}.
+now proposed in the crypto forum research group (CFRG) of IETF for standardization~\cite{VGT+19}.
 The Decaf and Ristretto encodings come at some computational cost and also added complexity of the implementation, but it eliminates the burden of handling the cofactor in protocol design.
diff --git a/prelim.tex b/prelim.tex
index ac4cf9db401093c89d368e24229bea9f93d66dab..25863c775b95d3a3fbe7253e2b49aaa99db6b07c 100644
--- a/prelim.tex
+++ b/prelim.tex
@@ -32,7 +32,7 @@ Indeed, the Renes-Costello-Batina complete addition formulas have a specialized
 The second reason is that various cryptographic standards have adopted these kinds of curves~\cite{ETSI07,BSI12,FIPS186-4,Brainpool,SEC2}. Our results will apply to more commonly used curves if we mimic the standards.
 \subheading{Twist security.}
-In the case an implementor uses formulas that do not depend on any of the constants $a$ and $b$, they could choose to omit checking whether the input point lies on the curve. To prevent invalid curve attacks in this case, $E$'s twist ($E^d$) must also be of prime order. Then, the first valid value for $b$ is $13318$.
+In the case an implementor uses formulas that do not depend on any of the constants $a$ and $b$, they could choose to omit checking whether the input point lies on the curve. To prevent invalid-curve attacks in this case, $E$'s twist ($E^d$) must also be of prime order. Then, the first valid value for $b$ is $13318$.
 \subheading{Point validation.}\label{sec:pointvalidation}
 All scalar-multiplication algorithms on Curve13318---or any short
diff --git a/results.tex b/results.tex
index 1eea1cf46875e291352458131ce0933770ff8054..d79a2cd9a71a6a712589b2eb114f8d96e55e4d92 100644
--- a/results.tex
+++ b/results.tex
@@ -105,7 +105,7 @@ I.e.\ it is based on the formulas from Bosma and Lenstra~(\cite{BL95}),
 }
 Their variable-basepoint scalar-multiplication runs in $278\unit{kcc}$ on the Sandy Bridge microarchitecture.
-Comparing that measurement with ours, suggests that the complete formulas add---%
+Comparing that measurement to ours, suggests that the complete formulas add---%
 relative to their incomplete formulas based on conditional masking---%
 an overhead of about $40\%$,
 which strongly affirms the overhead measured by Renes, Costello, and Batina.