Commit 929d2cc4 authored by Paul Fiterau Brostean's avatar Paul Fiterau Brostean

Some updates

parent 626786b8
......@@ -20,11 +20,15 @@ We have adapted the setting of timing parameters to each implementation.
\label{fig:sshserver}
\end{figure*}
OpenSSH was learned using a full alphabet, whereas DropBear and BitVise were learned using a restricted alphabet (as defined in Subsection~\ref{subsec:alphabet}). The primary reason for using a restricted alphabet was to speed up learning. Most inputs excluded were inputs that either didn't change behavior (like \textsc{debug} or \textsc{unimpl}), or that proved costly time-wise,
and were not critical to penetrating all layers. A concrete example is the user/password based authentication inputs (\textsc{ua\_pw\_ok} and
\textsc{ua\_pw\_nok}). It would take the system 2-3 seconds to respond to an invalid password, a typical countermeasure to slow down
brute force attacks. By contrast, public key authentication resulted in quick responses. The \textsc{disconnect} input presented similar
challenges, as it would take a varying amount of time until the system responded. This was particularly problematic for BitVise.
OpenSSH was learned using a full alphabet, whereas DropBear and BitVise were learned using a restricted alphabet (as defined in Subsection~\ref{subsec:alphabet}).
The main reason for using a restricted alphabet reduce learning times. Based on the model learned for OpenSSH (the first implementation analyzed) and the specification, we excluded
inputs that seemed unlikely to produce state change (like \textsc{debug} or \textsc{unimpl}). We also excluded inputs that proved costly time-size, yet were not were not needed to
visit all states in our happy flow. We excluded, for example, the user/password based authentication inputs (\textsc{ua\_pw\_ok} and
\textsc{ua\_pw\_nok}) as it they would take the system 2-3 seconds to respond to. By contrast, public key authentication resulted in quick responses. For a similar reason we excluded
\textsc{disconnect}.
%The \textsc{disconnect} input presented similar
%challenges, as it would take a varying amount of time before the system responded.
%Not only that, but failed password authentication
%attempts are also likely to trigger security mechanisms that would block subsequent authentication attempts. While this is
......@@ -39,11 +43,12 @@ distinguishing sequence. The exhaustive variant generates tests for all possible
namely, that the learned model is correct unless the (unknown) model of the implementation has at least {\dk} more states than the learned hypothesis. The random variant produces tests
with randomly generated middle sections. No formal confidence is provided, but past experience shows this to be more effective at finding counterexamples since {\dk}
can be set to higher values. We executed a random test suite with {\dk} of 4 comprising 40000 tests for OpenSSH, and 20000 tests for BitVise and DropBear.
We then ran an exhaustive test suite with {\dk} of 2 for all implementations.
We then ran an exhaustive test suite with {\dk} of 2 for all implementations. %We have additionally included in our suit suite counterexamples to the formalized specification generated by the model checker. To that end, we built a java adapter that would automatically run the model checker on the hypothesis, and transform any counterexamples into tests. This proved essential in learning DropBear, as the last counterexample was generated by the model checker.
Table~\ref{tab:experiments} describes the exact versions of the systems analyzed together with statistics on learning and testing, namely:
(1) the number of states in the learned model, (2) the number of hypotheses built during the learning process and (3) the total number of learning and test queries run. For test queries, we only consider those run on the last hypothesis. All learned models plus the specifications checked can be found at \url{https://gitlab.science.ru.nl/pfiteraubrostean/Learning-SSH-Paper/tree/master/models}.
(1) the number of states in the learned model, (2) the number of hypotheses built during the learning process and (3) the total number of learning and test queries run. For test queries, we only consider those run on the last hypothesis. All learned models plus the specifications checked can be found at \url{https://gitlab.science.ru.nl/pfiteraubrostean/Learning-SSH-Paper/tree/master/models}. The statistics
give a glimpse into the issue of scalability. Assuming each input took 0.5 seconds to process, and an average query length of 10, to perform 40000 queries would have taken roughly 55 hours. This is consistent with the time experiments took, which span several days. The long duration compelled us to resort to restricted alphabets, which lead to reduction in the number of queries needed. Our work could have benefited from parallel execution.
%BitVise: MemQ: 24996 TestQ: 58423
%Dropbear: MemQ: 3561 TestQ: 30629
%OpenSSH: MemQ: 19836 TestQ: 76418
......@@ -66,7 +71,7 @@ DropBear v2014.65 & 29 & 8 & 8357
The large number of states is down to several reasons. First of all, some systems exhibited buffering behavior. In particular, BitVise would queue
responses for higher layer inputs sent during key re-exchange, and would deliver them all at once after rekeying was done. Rekeying was also
a major contributor to the number of states. For each state where rekeying is possible, the sequence of transitions constituting the complete
rekeying process should lead back to that state. This leads to two additional rekeying states for every state allowing rekey. Many states were also added due to {\dmapper} generated outputs such as \textsc{ch\_none} or \textsc{ch\_max}, outputs which signal that no channel is open or that the maximum number of channels have been opened.
rekeying process should lead back to that state. This leads to two additional rekeying states for every state allowing rekey. Many states were also added due to the {\dmapper} generated outputs \textsc{ch\_none} or \textsc{ch\_max}, outputs which signal that no channel is open or that the maximum number of channels have been opened.
Figure~\ref{fig:sshserver} shows the model learned for OpenSSH, with some edits to improve readability. The happy flow, in green, is fully explored in the model and mostly matches our earlier description of it\footnote{The only exception is in the Transport layer, where unlike in our happy flow definition, the server is the first to send the \textsc{newkeys} message. This is also accepted behavior, as the protocol does not specify which side should send \textsc{newkeys} first.}. Also explored is what happens when a rekeying sequence is attempted. We notice that rekeying is only allowed in states of the Connection layer. Strangely, for these states, rekeying is not state preserving, as the generated output on receiving a \textsc{sr\_auth}, \textsc{sr\_conn} or \textsc{kex30} changes from \textsc{unimpl} to \textsc{no\_resp}. This leads to two sub-clusters of states, one before the first rekey, the other afterward. In all other states, the first step of a
rekeying (\textsc{kexinit}) yields (\textsc{unimpl}), while the last step (\textsc{newkeys}) causes the system to disconnect.
......
\section{The learning setup} \label{sec:setup}
%This chapter will cover the setup used to infer the state machines. We provide a general setup outline in Section~\ref{components}. The tested SSH servers are described in Section~\ref{suts}, which were queried with the alphabet described in Section~\ref{alphabet}. Section~\ref{setup-handling} will cover the challenging SUT behaviour faced when implementing the mapper, and the adaptations that were made to overcome these challenges. Section~\ref{layers-individual} will discuss the relation between state machines for individual layers and the state machine of the complete SSH protocol. The conventions on visualisation of the inferred state machines are described in Section~\ref{visualisation}.
......@@ -153,11 +154,9 @@ The {\dmapper} must provide a translation between abstract messages
and well-formed SSH messages: it has to translate abstract inputs
listed in Tables~\ref{trans-alphabet}-\ref{conn-alphabet} to actual
SSH packets, and translate the SSH packets received in answer
to our abstract outputs.
A special case here occurs when no output is received from the
{\dsut}; in that case the {\dmapper} gives back to the learner a
\textsc{no\_resp} message, to indicate that a time-out occurred.
to our abstract outputs. If no answer it received on an input, the
{\dmapper} must return an output indicating timeout, which in our case
is the \textsc{no\_resp} message.
The sheer complexity of the {\dmapper} meant that it was easier to
adapt an existing SSH implementation, rather than construct the
......@@ -172,9 +171,9 @@ logic with one dictated by messages received from the {\dlearner}.
The {\dmapper} maintains a set of state variables to record parameters
of the ongoing session, including the server's preferences
for key exchange and encryption algorithm, parameters of these
protocols, and -- once it has been established -- the session key.
These parameters are updated when receiving messages from the server,
and are used to concretize inputs to actual SSH messages to the server.
protocols, and, once it has been established, the session key.
These parameters are updated when receiving messages from the server
and used to concretize inputs to actual SSH messages to the server.
For example, upon receiving a \textsc{kexinit} from the {\dsut}, the {\dmapper} saves
the {\dsut}'s preferences for key exchange, hashing and encryption
......@@ -234,7 +233,7 @@ to succeed. Sources of non-determinism are:
\item Timing is another source of non-deterministic behavior. For
example, the {\dmapper} might time-out before the {\dsut} had
sent its response. Some {\dsuts} also behave
unexpectedly when a new query is received too shortly after the
unexpectedly when a new input is received too shortly after the
previous one. Hence in our experiments we adjusted time-out
periods accordingly so that neither of these events occur, and the {\dsut}
behaves deterministically all the time.
......@@ -248,9 +247,9 @@ to succeed. Sources of non-determinism are:
\end{enumerate}
%
To detect non-determinism, the {\dmapper} caches all observations
in an SQLite database and verifies if a new observations are consistent
with previous ones. If not, it raises a warning, which then needs to be manually investigated. We acted on
each case until we found a setting under which behavior was deterministic.
in an SQLite database and verifies if new observations are consistent
with previous ones. If not, it raises a warning, which then needs to be manually investigated. We analyzed
each warning until we found a setting under which behavior was deterministic.
The cache also acts as a cheap source of responses for already answered queries.
Finally, by re-loading the cache from a previous experiment, we were able to start
......
......@@ -102,8 +102,8 @@ A key note is that properties are checked not on the actual concrete model of th
escapeinside={(*}{*)},%
}
Before introducing the properties, we mention some basic predicates and conventions we use in their definition. The happy flow in SSH consists in a series of steps, the user first exchanges keys, then requests for the authentication service, followed up by supplying valid credentials to authenticate, concluded by opening of a channel. Whereas the first step is complex, the subsequent steps can be captured by the simple predicates {\dreqauth}, {\dvauth} and {\dopchan} respectively. The predicates are defined in terms of the output generated at a given moment, with certain values of this output indicating that the step was performed successfully. For example, \textsc{ch\_open\_success} indicates that a channel has been opened successfully. Sometimes we also need the input that generated the output, so as to distinguish this step from a different step. In particular, requesting the authentication service is distinguished from requesting the connection service by \textsc{sr\_auth}.
To these predicates, we add predicates for valid, invalid and all authentication methods, a predicate for the receipt of \textsc{newkeys} from the server, and receipt of \textsc{kexinit}, which can also be seen as initiation of key (re-) exchange. These latter predicates have to be tweaked in accordance with the input alphabet used and with the output the {\dsut} generated (\textsc{kexinit} could be sent in different packaging, either alone, or joined by a different message). Their formulations correspond to the OpenSSH setting. Finally, by {\dconnlost} we define a predicate suggesting that connection was lost, and by {\dend}, the end condition for most higher layer properties.
Before introducing the properties, we mention some basic predicates and conventions we use in their definition. The happy flow in SSH consists in a series of steps, the user (1) exchanges keys, (2) requests for the authentication service, (3) supplies valid credentials to authenticate and finally (4) opens a channel. Whereas step (1) is complex, the subsequent steps can be captured by the simple predicates {\dreqauth}, {\dvauth} and {\dopchan} respectively. The predicates are defined in terms of the output generated at a given moment, with certain values of this output indicating that the step was performed successfully. For example, \textsc{ch\_open\_success} indicates that a channel has been opened successfully. Sometimes we also need the input that generated the output, so as to distinguish this step from a different step. In particular, requesting the authentication service is distinguished from requesting the connection service by \textsc{sr\_auth}.
To these predicates, we add predicates for valid, invalid and all authentication methods, a predicate for the receipt of \textsc{newkeys} from the server, and receipt of \textsc{kexinit}, which can also be seen as initiation of key (re-) exchange. These latter predicates have to be tweaked in accordance with the input alphabet used and with the output the {\dsut} generated (\textsc{kexinit} could be sent in different packaging, either alone, or joined by a different message). Their formulations correspond to the OpenSSH setting. Finally, by {\dconnlost} we define a predicate suggesting that connection was lost, and by {\dend}, the condition after which higher layer properties no longer have to hold.
\begin{center}
\begin{lstlisting}[basicstyle=\footnotesize]
(*{\dreqauth}*) := inp=SR_AUTH & out=SR_ACCEPT;
......@@ -124,7 +124,7 @@ Our formulation uses NuSMV syntax.
%We occasionally rely on past modalities operators such Once (O) and Since (S), which are uncommon, but are supported by NuSMV.
We also use the weak until operator W, which is not supported by NuSMV, but can be easily defined in terms of the until operator
U and globally operator G that are supported:
$p\,W\,q\, =\, p \,U\, q\, | \, G\, p$. Many of the higher layer properties we formulate should hold only until a disconnect or a key (re-)exchange happens, hence the definition of the {\dend} predicate. This is because the RFC's don't specify what should happen when no connection exists. Moreover, higher layer properties in the RFC's only apply outside of rekey sequences, as inside a rekey sequence, the RFC's suggest implementations to reject all higher layer inputs, regardless of the state before the rekey.
$p\,W\,q\, =\, p \,U\, q\, | \, G\, p$. Many of the higher layer properties we formulate should hold only until a disconnect or a key (re-)exchange happens, hence the definition of the {\dend} predicate. This is because the RFC's don't specify what should happen when no connection exists. Moreover, higher layer properties in the RFC's only apply outside of rekey sequences, as inside a rekey sequence, the RFC's advise implementations to reject all higher layer inputs, regardless of the state before the rekey.
%In the actual specification, W was replaced by this re-formulation. %Finally, we may define properties which we later use when defining other properties. This feature again isn't supported by NuSMV, hence the properties appear in expanded form in the run specification.
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment