Commit 1cc821bd authored by Paul Fiterau Brostean's avatar Paul Fiterau Brostean
Browse files

Updated sections, in particular sut section

parent c7463cc3
\section{Learning results} \label{sec:result}
\newcommand{\dk}{\emph{k}}
We use the setup described in Section~\ref{sec:setup} to learn models for OpenSSH, BitVise and DropBear SSH server implementations.
OpenSSH represents the focal point, as it is the most popular implementation of SSH (with over 80 percent of market share in 2008)
OpenSSH represents the focal point, as it is the most popular implementation of SSH (with over 80 percent of market share in 2008~\cite{Albrecht2009Plaintext})
and also, the default server for many UNIX-based systems. DropBear is an alternative to OpenSSH designed for low resource
systems. BitVise is a well known proprietary Windows-only SSH implementation.
......@@ -11,10 +11,18 @@ Certain arrangements had to be made including the setting of timing parameters t
OpenSSH was learned using a full alphabet, whereas DropBear and BitVise were learned using a reduced alphabet. Both versions of
the alphabets are described in Subsection~\ref{subsec:alphabet}. The primary reason for using a reduced alphabet was to reduce learning times.
Most inputs excluded were inputs that either didn't change behavior (like \textsl{debug} or \textsl{unimpl}), or that triggered behavior
predictably similar to other inputs. As an example, \textsl{ua\_pw\_ok} contours the same behavior as \textsl{ua\_pk\_ok}. But while authenticating
with a public key was done quickly, authenticating with a username/password proved expensive (it would take the system 2-3 seconds to respond to
false credentials \textsl{ua\_pw\_ok}). The \textsl{disconnect} proved expensive in a similar way.
Most inputs excluded were inputs that either didn't change behavior (like \textsc{debug} or \textsc{unimpl}), or that proved costly time-wise,
and were not critical to penetrating all layers. A concrete example is the user/password based authentication inputs (\textsc{ua\_pw\_ok} and
\textsc{ua\_pw\_nok}). It would take the system 2-3 seconds to respond to an invalid password, perhaps in an attempt to thwart brute force attacks.
By contrast, public key authentication resulted in quick responses. The \textsc{disconnect} input presented similar
challenges, particularly for BitVise.
%Not only that, but failed password authentication
%attempts are also likely to trigger security mechanisms that would block subsequent authentication attempts. While this is
%As an example, \textsl{ua\_pw\_ok} contours the same behavior as \textsl{ua\_pk\_ok}. But while authenticating
%with a public key was done quickly, authenticating with a username/password proved time consuming (it would take the system 2-3 seconds to respond to
%invalid credentials \textsl{ua\_pw\_ok}). The \textsl{disconnect} proved expensive in a similar way.
For testing, we used random and exhaustive variants of testing algorithm described in
\cite{SMJV15}, which generate efficient test suites. Tests generated comprise an access sequence, a middle section of length {\dk} and a
......@@ -22,7 +30,7 @@ distinguishing sequence. The exhaustive variant for a set {\dk}, generates tests
namely, that the learned model is correct unless the (unknown) model of the implementation has at least {\dk} more states. The random variant produces tests
with randomly generated middle sections. No formal confidence is provided, but past experience shows this to be more effective at finding counterexamples since {\dk}
can be set to higher values. We executed a random test suite with {\dk} of 4 comprising 40000 tests for OpenSSH, and 20000 tests for BitVise and DropBear.
We then ran an exhaustive test suite with {\dk} of 2 for for all implementations.
We then ran an exhaustive test suite with {\dk} of 2 for for all implementations.
Table~\ref{tab:experiments} describes the exact versions of the systems analyzed together with statistics on learning and testing, namely:
(1) the number of states in the learned model, (2) the number of hypotheses built during the learning process and (3) the total number of
......@@ -44,9 +52,12 @@ DropBear v2014.65 & 17 & 2 & tbc
The large number of states is down to several reasons. First of all, some systems exhibited buffering behavior. In particular, BitVise would queue
responses for inputs sent during a key re-exchange and would deliver them all once the exchange was done.
A considerable number of states were added due to {\dmapper} generated outputs such as \textsl{ch\_none} or \textsl{ch\_max}, outputs which signal that no channel is open or
that the maximum number of channels have been opened.
responses for higher layer inputs sent during key re-exchange, and would deliver them all at once, after the exchange was done. Re-exchanging keys (rekey-ing) was also
a major contributor to the number of states. In states permitting rekey, following the sequence of transitions comprising the rekey should lead back to the starting state. This
leads to 2 additional rekey states for every state permitting rekey. A considerable number of states were also added due to {\dmapper} generated outputs such as \textsl{ch\_none} or \textsl{ch\_max}, outputs which signal that no channel is open or that the maximum number of channels have been opened.
%Figure~{fig:sshserver} shows
%To give a concrete example, the {\dmapper} on every \textsl{ch\_open} saves a channel identifier and sends
%a corresponding message to the {\dsut}. If \textsl{ch\_open} is called again, the {\dmapper} responds with a \textsl{ch\_max}. The channel identifier is removed
%by a \textsl{ch\_close} input leading to pairs of identical states with and without the channel identifier, even in states where channels are not relevant (like for example states prior to authentication).
......
......@@ -11,7 +11,7 @@ The learning setup consists of three components: a \emph{learner}, \emph{mapper}
The learner uses LearnLib ~\cite{LearnLib2009}, a Java library implementing $L^{\ast}$ based algorithms for learning Mealy machines. The {\dmapper} is based on Paramiko, an open source SSH implementation written in Python\footnote{Paramiko is available at \url{http://www.paramiko.org/}}. We opted for Paramiko because its code is relatively well structured and documented. The {\dsut} can be any existing implementation of a SSH server. The {\dlearner} communicates with the {\dmapper} over sockets. A graphical representation of our setup is shown in Figure~\ref{fig:components}.
\begin{figure}
\centering
\includegraphics[scale=0.60]{setup.png}
\includegraphics[scale=0.40]{components.png}
\caption{The SSH learning setup.}
\label{fig:components}
\end{figure}
......
......@@ -9,7 +9,11 @@ SSHv2 follows a client-server paradigm consisting of three components (Figure~\r
\item The \textit{connection protocol} (RFC 4254~\cite{rfc4254}) allows the encrypted channel to be multiplexed in different channels. These channels enable a user to run multiple processes, such as terminal emulation or file transfer, over a single SSH connection.
\end{itemize}
Each layer has its own specific messages. The SSH protocol is interesting in that outer layers do not encapsulate inner layers. This means that different layers can interact. Hence, it makes sense to learn SSH as a whole, instead of analyzing its constituent layers independently. We review the happy flow for each layer, outlining the relevant messages which are later used in learning.
Each layer has its own specific messages. The SSH protocol is interesting in that outer layers do not encapsulate inner layers. This means that different layers can interact. Hence, it makes sense to learn SSH as a whole, instead of analyzing its constituent layers independently. We review each layer, outlining the relevant messages which are later used in learning. We refer to as \textit{happy flow},
the common case of a successful operation. The SSH protocol happy flow at a high level consists of the following steps: (1) the client first establishes a TCP connection with the server, (2) the two sides negotiate encryption algorithms, and subsequently establish one-time session keys via a negotiated key exchange method, further communication is secured using these keys, (3) the client authenticates to the server by providing valid credentials and finally, (4) the client accesses services on the server like for example the terminal service. We omit the first step in our detailed description. The following steps are an integral part of each of SSH's layers.
%perhaps room for figure
%Different layers are identified by their message numbers. These message numbers will form the basis of the state fuzzing. The SSH protocol is especially interesting because outer layers do not encapsulate inner layers. This means that different layers can interact. One could argue that this is a less systematic approach, in which a programmer is more likely to make state machine-related errors.
\begin{figure}[!hb]
......@@ -20,28 +24,34 @@ Each layer has its own specific messages. The SSH protocol is interesting in th
\end{figure}
\subsection{Transport layer}\label{ssh-run-trans}
SSH provides end-to-end encryption based on pseudo-random keys which are securely negotiated during key exchange. The negotiation begins with the \textsc{kexinit} message, which exchanges the preferences for negotiable parameters. Subsequently, the actual key exchange using the negotiated algorithm takes place, using \textsc{kex30} and \textsc{kex31} messages using Diffie-Hellman as negotiated key exchange algorithm. The keys are used from the moment the \textsc{newkeys} command has been issued by both parties. A subsequent \textsc{sr\_auth} requests the authentication service. %We consider an transport layer state machine secure if there is no path from the initial state to the point where the authentication service is invoked without exchanging and employing cryptographic keys.
SSH provides end-to-end encryption based on pseudo-random keys which are securely negotiated as part of a \textsl{key exchange} method.
Key exchange begins by the two sides exchanging their preferences for negotiable parameters relating to the actual key exchange algorithm used, as well as encryption, compression and hashing algorithms preferred and supported by either side. Preferences are sent via the \textsc{kexinit} message. Subsequently, key exchange using the negotiated algorithm takes place. Following this algorithm, one-time session keys for encryption and hashing are generated by each side, together with an identifier for the session. Diffie-Hellman is the main key exchange algorithm, and the only one required for support by the RFC. Under the Diffie-Hellman scheme, \textsc{kex30} and \textsc{kex31} are exchanged and new session keys are produced. These keys are used from the moment the \textsc{newkeys} command has been issued by both parties. A subsequent \textsc{sr\_auth} requests the authentication service. The happy flow thus consists of the succession of the three steps comprising key exchange, followed up by a successful authentication service request. The sequence is shown in Figure~\ref{fig:sshcomponents}.
\begin{figure}[!hb]
\includegraphics[width=0.5\textwidth]{hf-trans.pdf}
\includegraphics[scale=0.32]{hf-trans.pdf}
\caption{The happy flow for the transport layer.}
\label{fig:sshcomponents}
\end{figure}
\subsection{User authentication layer}\label{ssh-run-auth}
RFC 4252~\cite{rfc4252} defines four authentication methods (password, public-key, host-based and none). The authentication request includes a user name, service name and authentication data, which includes both the authentication method as well as the data needed to perform the actual authentication, such the password or public key. The happy flow for this layer is defined as the sequence that results in a successful authentication. The messages \textsc{ua\_pw\_ok} and \textsc{ua\_pk\_ok} achieve this for respectively password or public key authentication.
\textsl{Key re-exchange}, or rekey, is a near identical process, with the difference being that instead of taking place at the beginning, it takes place once session keys are already in place. The purpose is to renew session keys so as to foil potential replay attacks. It follows the same steps as key exchange. Messages exchanged as part of it are encrypted using the old set of keys, messages exchanged afterward are encrypted using the new keys.
%We consider an transport layer state machine secure if there is no path from the initial state to the point where the authentication service is invoked without exchanging and employing cryptographic keys.
\subsection{Authentication layer}\label{ssh-run-auth}
Once the authentication service has been summoned, authentication can commence. To this end, RFC 4252~\cite{rfc4252} defines four authentication methods (password, public-key, host-based and none). The authentication request includes a user name, service name and authentication data, which includes both the authentication method as well as the data needed to perform the actual authentication, such the password or public key. The happy flow for this layer is defined as the sequence that results in a successful authentication. The messages \textsc{ua\_pw\_ok} and \textsc{ua\_pk\_ok} achieve this for respectively password or public key authentication.
%We consider a user authentication layer state machine secure if there is no path from the unauthenticated state to the authenticated state without providing correct credentials.
\begin{figure}[!ht]
\includegraphics[width=0.5\textwidth]{hf-auth.pdf}
\includegraphics[scale=0.45]{hf-auth.pdf}
\caption{The happy flow for the user authentication layer.}
\label{fig:sshcomponents}
\end{figure}
\subsection{Connection protocol}\label{ssh-run-conn}
The connection protocol's requests may opening and closing channels of different types, with each type providing access to specific services. From the various services available, we focus on the remote terminal over a session channel, a standout feature of SSH. A happy flow would involve opening a session channel, \textsc{ch\_open}, requesting a terminal \textsc{ch\_request\_pty}, sending and managing data via the messages \textsc{ch\_send\_data}, \textsc{ch\_adj\_win}, \textsc{ch\_send\_eof}, and eventually closing the channel via \textsc{ch\_close}.
\subsection{Connection layer}\label{ssh-run-conn}
Successful authentication makes available services of the Connection layer. The Connection layer enables the user to open and close channels of different types, with each type providing access to specific services. From the various services available, we focus on the remote terminal over a session channel, a standout feature of SSH. A happy flow would involve opening a session channel, \textsc{ch\_open}, requesting a terminal \textsc{ch\_request\_pty}, sending and managing data via the messages \textsc{ch\_send\_data}, \textsc{ch\_adj\_win}, \textsc{ch\_send\_eof}, and eventually closing the channel via \textsc{ch\_close}.
%Because the connection protocol offers a wide range of functionalities, we it is hard to define a single happy flow. Requesting a terminal is one of the main features of SSH and has therefore been selected as the happy flow. This behaviour is typically triggered by the trace \textsc{ch\_open}; \textsc{ch\_request\_pty}. Other
......@@ -49,7 +59,7 @@ The connection protocol's requests may opening and closing channels of different
%TODO Perhaps change this figure so to reflect text
\begin{figure}[!ht]
\includegraphics[width=0.5\textwidth]{hf-conn.pdf}
\includegraphics[scale=0.35]{hf-conn.pdf}
\caption{The happy flow for the connection layer.}
\label{fig:sshcomponents}
\end{figure}
\ No newline at end of file
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment