Commit a7a9048f authored by Erik Poll

fixed some times & condensed

parent 7145da0a
...@@ -9,10 +9,10 @@ Abstraction was provided by a {\dmapper} component placed between the ...@@ -9,10 +9,10 @@ Abstraction was provided by a {\dmapper} component placed between the
{\dlearner} and the {\dsut}. The {\dmapper} was constructed from an {\dlearner} and the {\dsut}. The {\dmapper} was constructed from an
existing SSH implementation. The input alphabet of the {\dmapper} existing SSH implementation. The input alphabet of the {\dmapper}
explored key exchange, setting up a secure connection, several explored key exchange, setting up a secure connection, several
authentication methods and opening and closing channels over which the authentication methods, and opening and closing channels over which the
terminal service could be requested. We used two input alphabets, a terminal service could be requested. We used two input alphabets, a
full version for OpenSSH, and a restricted version for the other full version for OpenSSH, and a restricted version for Bitvise and
implementations. The restricted alphabet was still sufficient to DropBear. The restricted alphabet was still sufficient to
explore most aforementioned behavior. explore most aforementioned behavior.
We encountered several challenges. Firstly, building a {\dmapper} presented a considerable technical challenge, as it required re-structuring of an actual We encountered several challenges. Firstly, building a {\dmapper} presented a considerable technical challenge, as it required re-structuring of an actual
......
...@@ -86,7 +86,7 @@ also used model checkers \cite{TCP2016,Chalupar2014Automated}. ...@@ -86,7 +86,7 @@ also used model checkers \cite{TCP2016,Chalupar2014Automated}.
Instead of using active learning as we do, it is also possible to use Instead of using active learning as we do, it is also possible to use
passive learning to obtain protocol state machines passive learning to obtain protocol state machines
\cite{Wang2011Inferring}. Here network traffic is observed, and not \cite{Wang2011Inferring}. Here network traffic is observed, and not
actively generated. This can then also provide a probabilistic actively generated. This can then provide a probabilistic
characterization of normal network traffic, but it cannot uncover characterization of normal network traffic, but it cannot uncover
implementation flaws that occur in strange message flows, which is our implementation flaws that occur in strange message flows, which is our
goal. goal.
......
...@@ -21,8 +21,9 @@ We have adapted the setting of timing parameters to each implementation. ...@@ -21,8 +21,9 @@ We have adapted the setting of timing parameters to each implementation.
\end{figure*} \end{figure*}
OpenSSH was learned using a full alphabet, whereas DropBear and BitVise were learned using a restricted alphabet (as defined in Subsection~\ref{subsec:alphabet}). OpenSSH was learned using a full alphabet, whereas DropBear and BitVise were learned using a restricted alphabet (as defined in Subsection~\ref{subsec:alphabet}).
The main reason for using a restricted alphabet reduce learning times. Based on the model learned for OpenSSH (the first implementation analyzed) and the specification, we excluded The reason for using a restricted alphabet was to reduce learning times. Based on the model learned for OpenSSH (the first implementation analyzed) and the specification, we excluded
inputs that seemed unlikely to produce state change (like \textsc{debug} or \textsc{unimpl}). We also excluded inputs that proved costly time-wise (like \textsc{disconnect}), yet were not were not needed to visit all states in our happy flow. We excluded, for example, the user/password based authentication inputs (\textsc{ua\_pw\_ok} and inputs that seemed unlikely to produce state change (such as
\textsc{debug} or \textsc{unimpl}). We also excluded inputs that proved costly time-wise (such as \textsc{disconnect}) but were not needed to visit all states in the happy flow. We excluded, for example, the user/password based authentication inputs (\textsc{ua\_pw\_ok} and
\textsc{ua\_pw\_nok}) as they would take the system 2-3 seconds to respond to. By contrast, public key authentication resulted in quick responses. \textsc{ua\_pw\_nok}) as they would take the system 2-3 seconds to respond to. By contrast, public key authentication resulted in quick responses.
%The \textsc{disconnect} input presented similar %The \textsc{disconnect} input presented similar
...@@ -45,8 +46,8 @@ can be set to higher values. We executed a random test suite with {\dk} of 4 com ...@@ -45,8 +46,8 @@ can be set to higher values. We executed a random test suite with {\dk} of 4 com
%To that end, we built a java adapter that would automatically run the model checker on the hypothesis, and transform any counterexamples into tests. This proved essential in learning DropBear, as the last counterexample was generated by the model checker. %To that end, we built a java adapter that would automatically run the model checker on the hypothesis, and transform any counterexamples into tests. This proved essential in learning DropBear, as the last counterexample was generated by the model checker.
Table~\ref{tab:experiments} describes the exact versions of the systems analyzed together with statistics on learning and testing, namely: Table~\ref{tab:experiments} describes the exact versions of the systems analyzed together with statistics on learning and testing:
(1) the number of states in the learned model, (2) the number of hypotheses built during the learning process and (3) the total number of learning and test queries run. For test queries, we only consider those run on the last hypothesis. All learned models plus the specifications checked can be found at \url{https://gitlab.science.ru.nl/pfiteraubrostean/Learning-SSH-Paper/tree/master/models}. The statistics (1) the number of states in the learned model, (2) the number of hypotheses built during the learning process and (3) the total number of learning and test queries run. For test queries, we only consider those run on the last hypothesis. All learned models and the properties checked are at \url{https://gitlab.science.ru.nl/pfiteraubrostean/Learning-SSH-Paper/tree/master/models}. The statistics
give a glimpse into the issue of scalability. Assuming each input took 0.5 seconds to process, and an average query length of 10, to perform 40000 queries would have taken roughly 55 hours. This is consistent with the time experiments took, which spanned several days. The long duration compelled us to resort to restricted alphabets, which led to a reduction in the number of queries needed. Our work could have benefited from parallel execution. give a glimpse into the issue of scalability. Assuming each input took 0.5 seconds to process, and an average query length of 10, to perform 40000 queries would have taken roughly 55 hours. This is consistent with the time experiments took, which spanned several days. The long duration compelled us to resort to restricted alphabets, which led to a reduction in the number of queries needed. Our work could have benefited from parallel execution.
%BitVise: MemQ: 24996 TestQ: 58423 %BitVise: MemQ: 24996 TestQ: 58423
%Dropbear: MemQ: 3561 TestQ: 30629 %Dropbear: MemQ: 3561 TestQ: 30629
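% Editorial sketch, not part of the commit: the 55-hour estimate quoted above can be checked with simple arithmetic, assuming the stated 0.5 seconds per input and an average query length of 10. The symbols $n_q$ (number of queries), $\ell$ (average query length) and $t_{in}$ (time per input) are introduced here for illustration only.
\[
T \approx n_q \cdot \ell \cdot t_{in} = 40000 \cdot 10 \cdot 0.5\,\mathrm{s} = 200000\,\mathrm{s} \approx 55.6\ \mathrm{hours}
\]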
......
...@@ -100,7 +100,7 @@ convey its own parameter preferences before key exchange can proceed. Also inclu ...@@ -100,7 +100,7 @@ convey its own parameter preferences before key exchange can proceed. Also inclu
\label{trans-alphabet} \label{trans-alphabet}
\end{table} \end{table}
The Authentication layer defines one single client message type in the form of the authentication request~\cite[p. 4]{rfc4252}. Its parameters contain all information needed for authentication. Four authentication methods exist: none, password, public key and host-based. Our mapper supports all methods except the host-based authentication because some SUTs don't support this feature. Both the public key and password methods have \textsc{ok} and \textsc{nok} variants, which provide respectively correct and incorrect credentials. Our restricted alphabet supports only public key authentication, as the implementations processed this faster than the other authentication methods. The Authentication layer defines a single client message type for the authentication requests~\cite[p. 4]{rfc4252}. Its parameters contain all information needed for authentication. Four authentication methods exist: none, password, public key and host-based. Our mapper supports all methods except host-based authentication because some SUTs don't support this feature. Both the public key and password methods have \textsc{ok} and \textsc{nok} variants, which provide respectively correct and incorrect credentials. Our restricted alphabet supports only public key authentication, as the implementations processed this faster than the other authentication methods.
\begin{table}[!ht] \begin{table}[!ht]
\centering \centering
...@@ -118,9 +118,9 @@ The Authentication layer defines one single client message type in the form of t ...@@ -118,9 +118,9 @@ The Authentication layer defines one single client message type in the form of t
\label{auth-alphabet} \label{auth-alphabet}
\end{table} \end{table}
The Connection layer allows the client to manage channels and to request/run services over them. In accordance with our learning goal, The Connection layer allows clients to manage channels and request services over them. In accordance with our learning goal,
our mapper only supports inputs for requesting terminal emulation, plus inputs for channel management as shown in Table~\ref{conn-alphabet}. our mapper only supports inputs for requesting terminal emulation, plus inputs for channel management as shown in Table~\ref{conn-alphabet}.
The restricted alphabet only supports the most general channel management inputs. Those excluded are not expected to produce state change. The restricted alphabet only supports the most general channel management inputs, and excludes those not expected to produce state change.
\begin{table}[!ht] \begin{table}[!ht]
...@@ -141,7 +141,7 @@ The restricted alphabet only supports the most general channel management inputs ...@@ -141,7 +141,7 @@ The restricted alphabet only supports the most general channel management inputs
\label{conn-alphabet} \label{conn-alphabet}
\end{table} \end{table}
\emph{The output alphabet} subsumes all messages an SSH server generates, which may include, with identical meaning, any of the messages defined as inputs. They also include responses to various requests: \textsc{kex31}~\cite[p. 21]{rfc4253} as reply to \textsc{kex30}, \textsc{sr\_succes} in response to service requests (\textsc{sr\_auth} and \textsc{sr\_conn}), \textsc{ua\_success} and \textsc{ua\_failure}~\cite[p. 5,6]{rfc4252} in response to authentication requests, and \textsc{ch\_open\_success}~\cite[p. 6]{rfc4254} and \textsc{ch\_success}~\cite[p. 10]{rfc4254} , in positive response to \textsc{ch\_open} and \textsc{ch\_request\_pty} respectively. To these outputs, we add \textsc{no\_resp} for when the {\dsut} generates no output, and the special outputs \textsc{ch\_none}, \textsc{ch\_max} and \textsc{no\_conn}, and \textsc{buffered}, which we discuss in the next Subsections. \emph{The output alphabet} includes all messages an SSH server generates, which may include, with identical meaning, any of the messages defined as inputs. This also includes responses to various requests: \textsc{kex31}~\cite[p. 21]{rfc4253} as reply to \textsc{kex30}, \textsc{sr\_succes} in response to service requests (\textsc{sr\_auth} and \textsc{sr\_conn}), \textsc{ua\_success} and \textsc{ua\_failure}~\cite[p. 5,6]{rfc4252} in response to authentication requests, and \textsc{ch\_open\_success}~\cite[p. 6]{rfc4254} and \textsc{ch\_success}~\cite[p. 10]{rfc4254} , in positive response to \textsc{ch\_open} and \textsc{ch\_request\_pty} respectively. To these outputs, we add \textsc{no\_resp} for when the {\dsut} generates no output, and the special outputs \textsc{ch\_none}, \textsc{ch\_max} and \textsc{no\_conn}, and \textsc{buffered}, which we discuss in the next subsections.
%The learning alphabet comprises of input/output messages by which the {\dlearner} interfaces with the {\dmapper}. Section~\ref{sec:ssh} outlines essential inputs, while Table X provides a summary %The learning alphabet comprises of input/output messages by which the {\dlearner} interfaces with the {\dmapper}. Section~\ref{sec:ssh} outlines essential inputs, while Table X provides a summary
%of all messages available at each layer. \textit{\textit{}} %of all messages available at each layer. \textit{\textit{}}
...@@ -160,7 +160,9 @@ is the \textsc{no\_resp} message. ...@@ -160,7 +160,9 @@ is the \textsc{no\_resp} message.
The sheer complexity of the {\dmapper} meant that it was easier to The sheer complexity of the {\dmapper} meant that it was easier to
adapt an existing SSH implementation, rather than construct the adapt an existing SSH implementation, rather than construct the
{\dmapper} from scratch. Paramiko already provides mechanisms for {\dmapper} from scratch.
After all, in many ways the {\dmapper} acts similarly to an SSH client.
Paramiko already provides mechanisms for
encryption/decryption, as well as routines for constructing and encryption/decryption, as well as routines for constructing and
sending the different types of packets, and for receiving them. These sending the different types of packets, and for receiving them. These
routines are called by control logic dictated by Paramiko's own state routines are called by control logic dictated by Paramiko's own state
...@@ -186,7 +188,7 @@ negotiated earlier in place of the older ones, if such existed. ...@@ -186,7 +188,7 @@ negotiated earlier in place of the older ones, if such existed.
The {\dmapper} also contains a buffer for storing opened channels, which is initially empty. The {\dmapper} also contains a buffer for storing opened channels, which is initially empty.
On a \textsc{ch\_open} from the learner, the {\dmapper} adds a channel to the buffer On a \textsc{ch\_open} from the learner, the {\dmapper} adds a channel to the buffer
with a randomly generated channel identifier, on a \textsc{ch\_close}, it removes the channel with a randomly generated channel identifier; on a \textsc{ch\_close}, it removes the channel
(if there was any). The buffer size, or the maximum number of opened channels, is limited to one. Initially the buffer is empty. The {\dmapper} also stores the sequence number of the last received message from the {\dsut}. This number is then used when constructing \textsc{unimpl} inputs. (if there was any). The buffer size, or the maximum number of opened channels, is limited to one. Initially the buffer is empty. The {\dmapper} also stores the sequence number of the last received message from the {\dsut}. This number is then used when constructing \textsc{unimpl} inputs.
In the following cases, inputs are answered by the {\dmapper} directly In the following cases, inputs are answered by the {\dmapper} directly
...@@ -204,8 +206,6 @@ responds with a \textsc{no\_conn} message, as sending further messages to the {\ ...@@ -204,8 +206,6 @@ responds with a \textsc{no\_conn} message, as sending further messages to the {\
% messages to the {\dsut} is pointless in that case; % messages to the {\dsut} is pointless in that case;
%\end{enumerate} %\end{enumerate}
% %
In many ways, the {\dmapper} acts similar to an SSH client, hence the
decision to built it by adapting an existing client implementation.
\subsection{Practical complications} \subsection{Practical complications}
...@@ -278,7 +278,7 @@ complete are all these messages processed. This leads to a ...@@ -278,7 +278,7 @@ complete are all these messages processed. This leads to a
\textsc{newkeys} response (indicating rekeying has completed), \textsc{newkeys} response (indicating rekeying has completed),
directly followed by all the responses to the buffered requests. This directly followed by all the responses to the buffered requests. This
would lead to non-termination of the learning algorithm, as for every would lead to non-termination of the learning algorithm, as for every
sequence of buffered messages the response is different. To sequence of buffered messages the response differs. To
prevent this, we treat the sequence of queued responses as the single prevent this, we treat the sequence of queued responses as the single
output \textsc{buffered}. output \textsc{buffered}.
......
...@@ -92,8 +92,8 @@ tabsize=2 ...@@ -92,8 +92,8 @@ tabsize=2
\begin{abstract} \begin{abstract}
We apply model learning on three SSH implementations to infer state machine models, and then use model checking We apply model learning on three SSH implementations to infer state machine models, and then use model checking
to verify that these models satisfy basic security properties and conform to the RFCs. Our analysis showed that to verify that these models satisfy basic security properties and conform to the RFCs. Our analysis showed that
all tested SSH server models satisfy the stated security properties. all tested SSH server models satisfy the stated security properties,
However, our analysis uncovered several violations of the standard. but uncovered several violations of the standard.
%Frits: I would say the fingerprinting is a detail, standard violations much more important. %Frits: I would say the fingerprinting is a detail, standard violations much more important.
%The state machines of the implementations differ significantly, allowing them to be %The state machines of the implementations differ significantly, allowing them to be
%effectively fingerprinted. %effectively fingerprinted.
......
...@@ -251,7 +251,6 @@ machine learning algorithms}, ...@@ -251,7 +251,6 @@ machine learning algorithms},
abstract = {The secure shell ({SSH}) protocol is one of the most popular cryptographic protocols on the Internet. Unfortunately, the current {SSH} authenticated encryption mechanism is insecure. In this paper, we propose several fixes to the {SSH} protocol and, using techniques from modern cryptography, we prove that our modified versions of {SSH} meet strong new chosen-ciphertext privacy and integrity requirements. Furthermore, our proposed fixes will require relatively little modification to the {SSH} protocol and to {SSH} implementations. We believe that our new notions of privacy and integrity for encryption schemes with stateful decryption algorithms will be of independent interest.}, abstract = {The secure shell ({SSH}) protocol is one of the most popular cryptographic protocols on the Internet. Unfortunately, the current {SSH} authenticated encryption mechanism is insecure. In this paper, we propose several fixes to the {SSH} protocol and, using techniques from modern cryptography, we prove that our modified versions of {SSH} meet strong new chosen-ciphertext privacy and integrity requirements. Furthermore, our proposed fixes will require relatively little modification to the {SSH} protocol and to {SSH} implementations. We believe that our new notions of privacy and integrity for encryption schemes with stateful decryption algorithms will be of independent interest.},
author = {Bellare, M. and Kohno, T. and Namprempre, C.}, author = {Bellare, M. and Kohno, T. and Namprempre, C.},
journal = {ACM Trans. Inf. Syst. Secur.}, journal = {ACM Trans. Inf. Syst. Secur.},
month = may,
number = {2}, number = {2},
pages = {206--241}, pages = {206--241},
publisher = {ACM}, publisher = {ACM},
...@@ -357,7 +356,7 @@ machine learning algorithms}, ...@@ -357,7 +356,7 @@ machine learning algorithms},
author = {Aarts, F. and Ruiter, J. {de} and Poll, E.}, author = {Aarts, F. and Ruiter, J. {de} and Poll, E.},
booktitle = {Software Testing, Verification and Validation Workshops (ICSTW)}, booktitle = {Software Testing, Verification and Validation Workshops (ICSTW)},
pages = {461--468}, pages = {461--468},
publisher = {IEEE CS}, publisher = {IEEE},
title = {Formal Models of Bank Cards for Free}, title = {Formal Models of Bank Cards for Free},
year = {2013} year = {2013}
} }
......