Commit fcbbed49 authored by Paul Fiterau Brostean's avatar Paul Fiterau Brostean

Made it a bit shorted. Added cropped versions of imgs

parent 6c4feb2a
......@@ -25,9 +25,9 @@ Limitations of the work, hence possibilities for future work, are several. First
produce a concretization of the abstract models. Consequently, model checking results cannot be fully transferred to the actual implementations. Formal definition
of the mapper and concretization of the learned models (as defined in \cite{AJUV15}) would tackle this. The {\dmapper} also caused considerable redundancy in the learned models; tweaking the abstractions used, in particular those for managing channels, could alleviate this problem while also improving learning times. This in turn would facilitate learning using expanded alphabets instead of resorting to restricted alphabets.
Furthermore, the {\dmapper} abstraction could be refined, to give more
insight into the implementations. In particular, parameters,
such as the session identifier or data sent over channels, could be extracted from the {\dmapper} and potentially handled by existing Register Automata learners\cite{ralib2015,tomte2015}. These learners
can infer systems with parameterized alphabets, state variables and simple operations on data. Finally, we ignored all timing-related behavior, as it could not be handled by the classical learners used; there is preliminary work on learning timed automata\cite{GrinchteinJL10} which could use timing behavior.
insight into the implementations. In particular, parameters
such as the session identifier could be extracted from the {\dmapper} and potentially handled by existing Register Automata learners\cite{ralib2015,tomte2015}. These learners
can infer systems with parameterized alphabets, state variables and simple operations on data. Finally, we suppressed all timing-related behavior, as it could not be handled by the classical learners used; there is preliminary work on learning timed automata\cite{GrinchteinJL10} which could use timing behavior.
Despite these limitations, our work provides a compelling application of learning and model checking in a security setting, on a widely used protocol. We hope this lays
......
......@@ -78,8 +78,8 @@ Model learning has previously been used to infer state machines of
EMV bank cards~\cite{Aarts2013Formal}, electronic
passports~\cite{Aarts2010Inference}, hand-held readers for online
banking~\cite{Chalupar2014Automated}, and implementations of
TCP~\cite{TCP2016} and TLS~\cite{RuiterProtocol}. Some of these case
studies relied on manual analysis of the learned models
TCP~\cite{TCP2016} and TLS~\cite{RuiterProtocol}. Some of these
studies relied on manual analysis of learned models
\cite{Aarts2013Formal,Aarts2010Inference,RuiterProtocol}, but some
also used model checkers \cite{TCP2016,Chalupar2014Automated}.
......
......@@ -11,7 +11,7 @@ We have adapted the setting off timing parameters to each implementation.
\begin{figure*}
\centering
\includegraphics[scale=0.30]{ssh-server}
\includegraphics[scale=0.25]{ssh-server}
\caption{Model of the OpenSSH server. {\normalfont States are collected in 3 clusters,
indicated by the rectangles, where each cluster corresponds to
one of the protocol layers.
......@@ -60,6 +60,7 @@ DropBear v2014.65 & 17 & 2 & 19863
\end{tabular}
\caption{Statistics for learning experiments}
\label{tab:experiments}
\vspace{-10mm}
\end{table}
The large number of states is down to several reasons. First of all, some systems exhibited buffering behavior. In particular, BitVise would queue
......@@ -73,7 +74,7 @@ is possible. Many states were also added due to {\dmapper} generated outputs suc
Figure~\ref{fig:sshserver} shows the model learned for OpenSSH, with some edits to improve readability. The happy flow, in green, is fully explored in the model and mostly matches our earlier description of it\footnote{The only exception is in the transport layer, where unlike in our happy flow definition, the server is the first to send the \textsc{newkeys} message. This is also accepted behavior, as the protocol does not specify which side should send \textsc{newkeys} first.}. Also explored is what happens when a rekeying sequence is attempted. We notice that rekeying is only allowed in states of the Connection layer. Strangely, for these states, rekeying is not state preserving, as the generated output on receiving a \textsc{sr\_auth}, \textsc{sr\_conn} or \textsc{kex30} changes from \textsc{unimpl} to \textsc{no\_resp}. This leads to two sub-clusters of states, one before the first rekey, the other afterward. In all other states, the first step of a
rekeying (\textsc{kexinit}) yields (\textsc{unimpl}), while the last step (\textsc{newkeys}) causes the system to disconnect.
We found it strange how systems reacted to \textsc{sr\_conn}, the request for services of the Connection layer. These services can be accessed once the user had authenticated, without the need of an explicit service request. That in itself was not strange, because authentication messages already mention that connection services should start after authentication \footnote{This is a technical detail, the message format of authentication messages requires a field which says the service started after authentication. The only option is to start Connection layer services. }.
We were puzzled by how systems reacted to \textsc{sr\_conn}, the request for services of the Connection layer. These services can be accessed once the user had authenticated, without the need of an explicit service request. That in itself was not strange, because authentication messages already mention that connection services should start after authentication \footnote{This is a technical detail, the message format of authentication messages requires a field which says the service started after authentication. The only option is to start Connection layer services. }.
What was strange was that an explicit request either lead to \textsc{unimpl}/\textsc{no\_resp} with no state change, as in the case of OpenSSH, or termination of the connection, as in the case of BitVise. The latter was particularly strange, as in theory, once authenticated, the user should always have access to the service, and not be disconnected on requesting this service. Only DropBear seems to respond positively (\textsc{sr\_accept}) to \textsc{sr\_conn} after authentication.
We also notice the intricate authentication behavior:
......
......@@ -5,13 +5,13 @@
%\section{Components}\label{components}
The learning setup consists of three components: the {\dlearner}, {\dmapper} and {\dsut}. The {\dlearner} generates abstract inputs, representing SSH messages. The {\dmapper} transforms these messages into well-formed SSH packets and sends them to the {\dsut}. The {\dsut} sends response packets back to the {\dmapper}, which in turn, translates these packets to abstract outputs. The {\dmapper} then sends the abstract outputs back to the {\dlearner}.
The learning setup consists of three components: the {\dlearner}, the {\dmapper} and the {\dsut}. The {\dlearner} generates abstract inputs, representing SSH messages. The {\dmapper} transforms these messages into well-formed SSH packets and sends them to the {\dsut}. The {\dsut} sends response packets back to the {\dmapper}, which in turn, translates these packets to abstract outputs. The {\dmapper} then sends the abstract outputs back to the {\dlearner}.
The {\dlearner} uses LearnLib ~\cite{LearnLib2009}, a Java library implementing $L^{\ast}$ based algorithms for learning Mealy machines. The {\dmapper} is based on Paramiko, an open source SSH implementation written in Python\footnote{Paramiko is available at \url{http://www.paramiko.org/}}. We opted for Paramiko because its code is relatively well structured and documented. The {\dsut} can be any existing implementation of an SSH server. The three components communicate over sockets, as shown in Figure~\ref{fig:components}.
\begin{figure}
\centering
\includegraphics[scale=0.29]{components.pdf}
\includegraphics[scale=0.29]{components_cropped.pdf}
\caption{The SSH learning setup.}
\label{fig:components}
\end{figure}
......@@ -80,6 +80,7 @@ convey its own parameter preferences before key exchange can proceed. Also inclu
\textsc{sr\_conn}* & Requests the connection protocol~\cite[p. 23]{rfc4253}
\end{tabular}
\caption{Transport layer inputs}
\vspace{-7mm}
\label{trans-alphabet}
\end{table}
......@@ -97,6 +98,7 @@ The Authentication layer defines one single client message type in the form of t
\textsc{ua\_pw\_nok} & Provides an invalid name/password combination~\cite[p. 10]{rfc4252} \\
\end{tabular}
\caption{Authentication layer inputs}
\vspace{-7mm}
\label{auth-alphabet}
\end{table}
......@@ -119,6 +121,7 @@ The restricted alphabet only supports the most general channel management inputs
\textsc{ch\_request\_pty}* & Requests terminal emulation~\cite[p. 11]{rfc4254} \\
\end{tabular}
\caption{Connection layer inputs}
\vspace{-7mm}
\label{conn-alphabet}
\end{table}
%The learning alphabet comprises of input/output messages by which the {\dlearner} interfaces with the {\dmapper}. Section~\ref{sec:ssh} outlines essential inputs, while Table X provides a summary
......@@ -202,7 +205,7 @@ decision to built it by adapting an existing client implementation.
%\emph{buffering} behaviour of the server. These complication are
%discussed below.
SSH implementations can exhibit non-determistic behaviour. The
SSH implementations can exhibit non-determistic behavior. The
learning algorithm cannot cope with non-determinism -- learning will
not terminate -- so this has to be detected, which our {\dmapper} does.
There are a few sources of non-determinism in SSH:
......
......@@ -60,11 +60,11 @@ a different output sequence for both Mealy machines, that is, $A_{\CH}(\sigma) \
%
The MAT framework can be used to learn black box models of software.
If the behavior of a software system, or System Under Learning ({\dsut}), can be described by some unknown Mealy machine $\M$,
then a membership query can be implemented by sending inputs to the software system and observing the resulting outputs.
then a membership query can be implemented by sending inputs to the {\dsut} and observing resulting outputs.
An equivalence query can be approximated using a conformance testing tool \cite{LeeY96} via a finite number of \emph{test queries}.
A test query consists of asking the {\dsut} for the response to an input sequence
$\sigma \in I^{\ast}$, similar to a membership query.
Note that this cannot exclude the possibility that there is more behaviour
Note that this cannot rule out that there is more behavior
that has not been discovered. For a recent overview of model learning
algorithms for this setting see \cite{Isberner2015}.
......
......@@ -158,7 +158,7 @@ Outputs \textsc{ch\_max} and \textsc{ch\_none} are still generated because of a
\subsection{Security properties}
In SSH, upper layer services assume some notions of security, notions which are ensured by mechanisms in the lower layers. These mechanisms should have to be first engaged for the respective upper layer services to become available. As an example, the authentication service should only become available after exchanging and employing of kryptographic keys (key exchange) was done in the Transport layer, otherwise the service would be running over an unencrypted channel. Requests for this service should therefore not succeed unless key exchange was performed successfully.
Key exchange implies three steps which have to be performed in order but may be interleaved by other actions. Successful authentication should necessarily imply successful execution of the key exchange steps. We can tell each key exchange step were successful from the values of the input and output variables. Successful authentication request is indicated by the predicate defined earlier, {\dreqauth}. Following these principles, we define the LTL specification in Property~\ref{prop:sec-trans}, where O is the once operator. Formula $O p$ is true at time $t$ if $p$ held in at least one of the previous time steps $t' \leq t$.
Key exchange implies three steps which have to be performed in order but may be interleaved by other actions. Successful authentication necessarily imply successful execution of the key exchange steps. We can tell each key exchange step were successful from the values of the input and output variables. Successful authentication request is indicated by the predicate defined earlier, {\dreqauth}. Following these principles, we define the LTL specification in Property~\ref{prop:sec-trans}, where O is the once operator. Formula $O p$ is true at time $t$ if $p$ held in at least one of the previous time steps $t' \leq t$.
% SR_AUTH_AUTH -> SR_AUTH
......@@ -279,6 +279,7 @@ G ( (hasReqAuth & !O out=UA_SUCCESS) ->
\end{property}
\begin{property}
\vspace{-4mm}
\begin{lstlisting}[basicstyle=\footnotesize]
G ( (out=UA_SUCCESS) ->
X G (out!=UA_SUCCESS) )
......@@ -350,6 +351,7 @@ In particular, we used {\dvauth} instead of $out=UA\_SUCCESS$ to suggest success
\end{tabular}
\caption{Model checking results}
\label{tab:mcresults}
\vspace{-10mm}
\end{table}
%\end{center}
......
......@@ -11,9 +11,10 @@ SSHv2 follows a client-server paradigm. The protocol consists of three layers (F
\begin{figure}[t] % was !hb
\centering
\includegraphics[scale=0.35]{SSH_protocols.png}
\includegraphics[scale=0.20]{SSH_protocols.png}
\caption{SSH protocol layers}
\label{fig:sshcomponents}
\vspace{-3mm}
\end{figure}
......@@ -30,9 +31,11 @@ SSH runs over TCP, and provides end-to-end confidentiality and integrity using s
Subsequently, key exchange using the negotiated algorithm takes place. Following this algorithm, one-time session keys for encryption and hashing are generated by each side, together with an identifier for the session. The main key exchange algorithm is Diffie-Hellman, which is also the only one required by the RFC. For the Diffie-Hellman scheme, \textsc{kex30} and \textsc{kex31} are exchanged to establish fresh session keys. These keys are used from the moment the \textsc{newkeys} command has been issued by both parties. A subsequent \textsc{sr\_auth} requests the authentication service. The happy flow thus consists of the succession of the three steps comprising key exchange, followed up by a successful authentication service request. The sequence is shown in Figure~\ref{fig:hf-trans}.
\begin{figure}[!hb]
\includegraphics[scale=0.285]{hf-trans.pdf}
%\vspace{-1cm}
\includegraphics[scale=0.285]{hf-trans_cropped.pdf}
\caption{The happy flow for the transport layer.}
\label{fig:hf-trans}
\vspace{-3mm}
\end{figure}
\textsl{Key re-exchange}~\cite[p. 23]{rfc4253}, or \textsl{rekeying}, is an almost identical process, the difference being that instead of taking place at the beginning, it takes place once session keys are already in place. The purpose is to renew session keys so as to foil potential replay attacks~\cite[p. 17]{rfc4251}. It follows the same steps as key exchange. A fundamental property of rekeying is that it should preserve the state; that is, after the rekeying procedure is completed, the protocol should be in the same state as
......@@ -48,9 +51,11 @@ Once a secure tunnel has been established, the client can authenticate. For this
%We consider a user authentication layer state machine secure if there is no path from the unauthenticated state to the authenticated state without providing correct credentials.
\begin{figure}[!ht]
\includegraphics[scale=0.45]{hf-auth.pdf}
%\vspace{-1cm}
\includegraphics[scale=0.45]{hf-auth_cropped.pdf}
\caption{The happy flow for the user authentication layer.}
\label{fig:hf-auth}
\vspace{-3mm}
\end{figure}
......@@ -66,7 +71,9 @@ Successful authentication makes services of the connection layer available. The
%TODO Perhaps change this figure so to reflect text
\begin{figure}[!ht]
\includegraphics[scale=0.35]{hf-conn.pdf}
%\vspace{-1cm}
\includegraphics[scale=0.35]{hf-conn_cropped.pdf}
\caption{The happy flow for the connection layer.}
\label{fig:hf-conn}
\vspace{-3mm}
\end{figure}
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment