Commit 028a7869 authored by Paul Fiterau Brostean's avatar Paul Fiterau Brostean
Browse files

Merge branch 'master' of gitlab.science.ru.nl:pfiteraubrostean/Learning-SSH-Paper

parents 88acfa77 c13141c8
......@@ -3,7 +3,7 @@
We have combined model learning with abstraction techniques to infer models of the OpenSSH, Bitvise and DropBear SSH server implementations. We have also
formalized several security and functional properties drawn from the SSH RFC specifications. We have verified these
properties on the learned models using model checking and have uncovered several minor standard violations.
The security critical properties were met by all implementations.
The security-critical properties were met by all implementations.
Abstraction was provided by a {\dmapper} component placed between the
{\dlearner} and the {\dsut}. The {\dmapper} was constructed from an
......@@ -19,15 +19,15 @@ We encountered several challenges. Firstly, building a {\dmapper} presented a co
SSH implementation. Secondly, because we used classical learning algorithms, we had to ensure that the abstracted implementation behaved
like a (deterministic) Mealy Machine. Here time-induced non-determinism was difficult to eliminate. Buffering also presented problems,
leading to a considerable increase in the number of states. Moreover, the systems analyzed were relatively slow, which meant learning took
several days\marginpar{\tiny Erik: For a single server, right??}. This was compounded by the size of the learning alphabet, and it forced us into using a reduced alphabet for two of the analyzed implementations.
several days. This was compounded by the size of the learning alphabet, and it forced us into using a reduced alphabet for two of the implementations.
Limitations of the work, hence possibilities for future work, are several. First of all, the {\dmapper} was not formalized, unlike in~\cite{TCP2016}, thus we did not
produce a concretization of the abstract models. Consequently, model checking results cannot be fully transferred to the actual implementations. Formal definition
of the mapper and concretization of the learned models (as defined in \cite{AJUV15}) would tackle this. The {\dmapper} also caused considerable redundancy in the learned models, re-tweaking the abstractions used, in particular those for managing channels, could alleviate this problem while also improving learning times. This in turn would facilitate learning using expanded alphabets instead of resorting to restricted alphabets.
of the mapper and concretization of the learned models (as defined in \cite{AJUV15}) would tackle this. The {\dmapper} also caused considerable redundancy in the learned models; tweaking the abstractions used, in particular those for managing channels, could alleviate this problem while also improving learning times. This in turn would facilitate learning using expanded alphabets instead of resorting to restricted alphabets.
Furthermore, the {\dmapper} abstraction could be refined, to give more
insight into the implementations. In particular, parameters,
such as the session identifier or data sent over channels, could be extracted from the {\dmapper} and potentially handled by existing Register Automata learners\cite{ralib2015,tomte2015}. These learners
can infer systems with parameterized alphabets, state variables and simple operations on data. Finally, we had to eliminate any timing-related behavior, as it could not be handled by the classical learners used. To that end, preliminary work on learning timed automata\cite{GrinchteinJL10} could be leveraged.
can infer systems with parameterized alphabets, state variables and simple operations on data. Finally, we ignored all timing-related behavior, as it could not be handled by the classical learners used; there is preliminary work on learning timed automata\cite{GrinchteinJL10} which could use timing behavior.
Despite these limitations, our work provides a compelling application of learning and model checking in a security setting, on a widely used protocol. We hope this lays
......
......@@ -14,16 +14,17 @@ rather than cryptographic flaws. Such logical flaws have occurred in
implementations of other security protocols, notably TLS, with Apple's
'goto fail' bug and the FREAK attack \cite{Beurdouche:2017}. For this
we use model learning (a.k.a.\ active automata learning)
\cite{Angluin1987Learning,Vaa17} to infer state machines of three SSH
\cite{Angluin1987Learning,PeVaYa02,Vaa17} to infer state machines of three SSH
implementations, which we then analyze by model checking for
conformance to both functional and security properties.
The properties we verify for the inferred state machines are based on
the RFCs that specify SSH \cite{rfc4251,rfc4252,rfc4253,rfc4254}.
These properties are formalized in LTL and verified using
NuSMV~\cite{NuSMV2}. Manually verifying these properties would be
difficult, as the learned models are complex. Moreover, formalizing
properties means we can also better assess and overcome vagueness or
NuSMV~\cite{NuSMV2}.
We use a model checker since the models are too complex for manual inspection (they are trivial for NuSMV).
Moreover, by formalizing the
properties we can better assess and overcome vagueness or
under-specification in the RFC standards.
Model learning obtains a state machine model of a black-box system by
......@@ -55,7 +56,7 @@ options. Our analysis only considers the protocol logic. However,
their rules were tied to routines in the code, so had to be slightly adapted
to fit the different implementations. In contrast, our properties are
defined at an abstract level so do not need such tailoring. Moreover,
our black box approach approach means we can analyze any implementation
our black box approach means we can analyze any implementation
of SSH, not just C implementations.
......@@ -67,7 +68,7 @@ testing of SSH implementations \cite{Boss2012}. All this research
only considered the SSH transport layer, and not the other SSH
protocol layers.
This paper is borne out of 2 recent theses \cite{Verleg2016,Toon2016}
This paper is born out of two recent theses \cite{Verleg2016,Toon2016},
and is to our knowledge the first combined application of model
learning and model checking in verifying SSH implementations, or more
generally, implementations of any network security protocol.
......
......@@ -2,10 +2,10 @@
\newcommand{\dk}{\emph{k}}
We use the setup described in Section~\ref{sec:setup} to learn models for OpenSSH, BitVise and DropBear SSH server implementations.
OpenSSH represents the focal point, as it is the most popular implementation of SSH (with over 80 percent of market share in 2008~\cite{Albrecht2009Plaintext})
and also, the default server for many UNIX-based systems. DropBear is an alternative to OpenSSH designed for low resource
systems. BitVise is a well known proprietary Windows-only SSH implementation.
and the default server for many UNIX-based systems. DropBear is an alternative to OpenSSH designed for low resource
systems. BitVise is a well-known proprietary Windows-only SSH implementation.
In our experimental setup, the {\dlearner} and {\dmapper} were running inside a Linux Virtual Machine. OpenSSH and DropBear were
In our experimental setup, {\dlearner} and {\dmapper} ran inside a Linux Virtual Machine. OpenSSH and DropBear were
learned over a localhost connection, whereas BitVise was learned over a virtual connection with the Windows host machine.
Certain arrangements had to be made including the setting of timing parameters to fit each implementation.
......@@ -15,15 +15,14 @@ Certain arrangements had to be made including the setting of timing parameters t
\caption{Model of the OpenSSH server. {\normalfont States are collected in 3 clusters,
indicated by the rectangles, where each cluster corresponds to
one of the protocol layers.
We eliminate redundant states and information induced by the {\dmapper}, as well as states present in successful rekey sequences. Wherever rekey was permitted, we replaced the rekey states and transitions by a single \textsl{REKEY SEQUENCE} transition. We also factor out edges common to states within a cluster. We replace common disconnecting edges, by one edge from the cluster to the disconnect state. Common self loop edges are colored, and the actual i/o information only appears on one edge. Transitions with similar start and end states are joined together on the same edge. Transition labels are kept short by regular expressions(UA\_* stands for all inputs starting with UA\_) or by factoring out common start strings. Green edges highlight the happy flow. }}
We eliminate redundant states and information induced by the {\dmapper}, as well as states present in successful rekeying sequences. Wherever rekeying was permitted, we replaced the rekeying states and transitions by a single \textsl{REKEY SEQUENCE} transition. We also factor out edges common to states within a cluster. We replace common disconnecting edges, by one edge from the cluster to the disconnect state. Common self loop edges are colored, and the actual i/o information only appears on one edge. Transitions with similar start and end states are joined together on the same edge. Transition labels are kept short by regular expressions(UA\_* stands for all inputs starting with UA\_) or by factoring out common start strings. Green edges highlight the happy flow. }}
\label{fig:sshserver}
\end{figure*}
OpenSSH was learned using a full alphabet, whereas DropBear and BitVise were learned using a restricted alphabet. Both versions of
the alphabets are described in Subsection~\ref{subsec:alphabet}. The primary reason for using a restricted alphabet was to reduce learning times.
OpenSSH was learned using a full alphabet, whereas DropBear and BitVise were learned using a restricted alphabet (as defined in Subsection~\ref{subsec:alphabet}). The primary reason for using a restricted alphabet was to reduce learning times.
Most inputs excluded were inputs that either didn't change behavior (like \textsc{debug} or \textsc{unimpl}), or that proved costly time-wise,
and were not critical to penetrating all layers. A concrete example is the user/password based authentication inputs (\textsc{ua\_pw\_ok} and
\textsc{ua\_pw\_nok}). It would take the system 2-3 seconds to respond to an invalid password, perhaps in an attempt by the designers to thwart
\textsc{ua\_pw\_nok}). It would take the system 2-3 seconds to respond to an invalid password, a typical countermeasure to slow down
brute force attacks. By contrast, public key authentication resulted in quick responses. The \textsc{disconnect} input presented similar
challenges, as it would take a varying amount of time until the system responded. This was particularly problematic for BitVise.
......@@ -34,9 +33,9 @@ challenges, as it would take a varying amount of time until the system responded
%with a public key was done quickly, authenticating with a username/password proved time consuming (it would take the system 2-3 seconds to respond to
%invalid credentials \textsl{ua\_pw\_ok}). The \textsl{disconnect} proved expensive in a similar way.
For testing, we used random and exhaustive variants of testing algorithm described in
For the test queries we used random and exhaustive variants of the testing algorithm described in
\cite{SMJV15}, which generate efficient test suites. Tests generated comprise an access sequence, a middle section of length {\dk} and a
distinguishing sequence. The exhaustive variant for a set {\dk}, generates tests for all possible middle sections and all states. Passing all tests provides some notion of confidence,
distinguishing sequence. The exhaustive variant for a set {\dk} generates tests for all possible middle sections and all states. Passing all tests then provides some notion of confidence,
namely, that the learned model is correct unless the (unknown) model of the implementation has at least {\dk} more states. The random variant produces tests
with randomly generated middle sections. No formal confidence is provided, but past experience shows this to be more effective at finding counterexamples since {\dk}
can be set to higher values. We executed a random test suite with {\dk} of 4 comprising 40000 tests for OpenSSH, and 20000 tests for BitVise and DropBear.
......@@ -44,12 +43,12 @@ We then ran an exhaustive test suite with {\dk} of 2 for all implementations.
Table~\ref{tab:experiments} describes the exact versions of the systems analyzed together with statistics on learning and testing, namely:
(1) the number of states in the learned model, (2) the number of hypotheses built during the learning process and (3) the total number of learning and test queries run. For test queries, we only consider those run on the last hypothesis. All learned models along with the specifications checked can be found at~\cite{SSHResults}.
(1) the number of states in the learned model, (2) the number of hypotheses built during the learning process and (3) the total number of learning and test queries run. For test queries, we only consider those run on the last hypothesis. All learned models along with the specifications checked can be found at \url{https://gitlab.science.ru.nl/pfiteraubrostean/Learning-SSH-Paper/tree/master/models}.
%BitVise: MemQ: 24996 TestQ: 58423
%Dropbear: MemQ: 3561 TestQ: 30629
%OpenSSH: MemQ: 19836 TestQ: 76418
\begin{table}[!ht]
\begin{table}[t] % was !ht
\centering
\small
\begin{tabular}{|l|l|l|l|l|l|l|}
......@@ -64,19 +63,21 @@ DropBear v2014.65 & 17 & 2 & 19863
\end{table}
The large number of states is down to several reasons. First of all, some systems exhibited buffering behavior. In particular, BitVise would queue
responses for higher layer inputs sent during key re-exchange, and would deliver them all at once after the exchange was done. Rekeying was also
responses for higher layer inputs sent during key re-exchange, and would deliver them all at once after rekeying was done. Rekeying was also
a major contributor to the number of states. For each state where rekeying
is possible, the sequence of transitions constituting the complete
rekeying process should lead back to tha state. This
rekeying process should lead back to that state. This
leads to two additional rekeying states for each state where rekeying
is possible . A considerable number of states were also added due to {\dmapper} generated outputs such as \textsc{ch\_none} or \textsc{ch\_max}, outputs which signal that no channel is open or that the maximum number of channels have been opened.
is possible. Many states were also added due to {\dmapper} generated outputs such as \textsc{ch\_none} or \textsc{ch\_max}, outputs which signal that no channel is open or that the maximum number of channels have been opened.
Figure~\ref{fig:sshserver} shows the model learned for OpenSSH, with various changes applied to improve readability. The happy flow, contoured in green, is fully explored in the model and mostly matches our earlier description of it\footnote{The only exception is in the transport layer, where unlike in our happy flow definition, the server is the first to send the \textsc{newkeys} message. This is also accepted behavior, as the protocol does not specify which side should send \textsc{newkeys} first.}. Also explored is what happens when a rekey sequence is attempted. We notice that rekey is only allowed in states of the Connection layer. Strangely, for these states, rekey is not state preserving, as the generated output on receiving a \textsc{sr\_auth}, \textsc{sr\_conn} or \textsc{kex30} changes from \textsc{unimpl} to \textsc{no\_resp}. This leads to two sub-clusters of states, one before the first rekey, the other afterward. In all other states, the first step of a rekey (\textsc{kexinit}) yields (\textsc{unimpl}), while the last step (\textsc{newkeys}) causes the system to disconnect.
Figure~\ref{fig:sshserver} shows the model learned for OpenSSH, with some edits to improve readability. The happy flow, in green, is fully explored in the model and mostly matches our earlier description of it\footnote{The only exception is in the transport layer, where unlike in our happy flow definition, the server is the first to send the \textsc{newkeys} message. This is also accepted behavior, as the protocol does not specify which side should send \textsc{newkeys} first.}. Also explored is what happens when a rekeying sequence is attempted. We notice that rekeying is only allowed in states of the Connection layer. Strangely, for these states, rekeying is not state preserving, as the generated output on receiving a \textsc{sr\_auth}, \textsc{sr\_conn} or \textsc{kex30} changes from \textsc{unimpl} to \textsc{no\_resp}. This leads to two sub-clusters of states, one before the first rekey, the other afterward. In all other states, the first step of a
rekeying (\textsc{kexinit}) yields (\textsc{unimpl}), while the last step (\textsc{newkeys}) causes the system to disconnect.
We also found strange how systems reacted to \textsc{sr\_conn}, request for services of the Connection layer. These services can be accessed once the user had authenticated, without the need of an explicit service request. That in itself was not strange, because authentication messages already mention that connection services should start after authentication \footnote{This is a technical detail, the message format of authentication messages requires a field which says the service started after authentication. The only option is to start Connection layer services. }.
What was strange was that, making this explicit request either lead to \textsc{unimpl}/\textsc{no\_resp} with no state change, as in the case of OpenSSH, or termination of the connection, as in the case of BitVise. The latter was particularly strange, as in theory, once authenticated, the user should always have access to the service, and not be disconnected on requesting this service. Only DropBear seems to respond positively (\textsc{sr\_accept}) to \textsc{sr\_conn} after authentication.
We found it strange how systems reacted to \textsc{sr\_conn}, the request for services of the Connection layer. These services can be accessed once the user had authenticated, without the need of an explicit service request. That in itself was not strange, because authentication messages already mention that connection services should start after authentication \footnote{This is a technical detail, the message format of authentication messages requires a field which says the service started after authentication. The only option is to start Connection layer services. }.
What was strange was that an explicit request either lead to \textsc{unimpl}/\textsc{no\_resp} with no state change, as in the case of OpenSSH, or termination of the connection, as in the case of BitVise. The latter was particularly strange, as in theory, once authenticated, the user should always have access to the service, and not be disconnected on requesting this service. Only DropBear seems to respond positively (\textsc{sr\_accept}) to \textsc{sr\_conn} after authentication.
We also notice the intricate authentication behavior, it seems that password authentication is the only form that allows you to authenticate after an unsuccessful attempt. Finally, of all implementations tested, only BitVise allowed multiple terminals to be requested over the same channel. As depicted in the model, OpenSSH abruptly terminates on requesting a second terminal. DropBear exhibits a similar behavior.
We also notice the intricate authentication behavior:
after an unsuccessful authentication attempt the only authentication method still allowed is password authentication. Finally, only BitVise allowed multiple terminals to be requested over the same channel. As depicted in the model, OpenSSH abruptly terminates on requesting a second terminal. DropBear exhibits a similar behavior.
......
......@@ -19,7 +19,6 @@ The {\dlearner} uses LearnLib ~\cite{LearnLib2009}, a Java library implementing
SSH is a complex client-server protocol. In our work so far we concentrated on learning models of the implementation of the server, and not of the client.
We further restrict learning to only exploring the terminal service of the Connection layer, as we consider it to be the most interesting
from a security perspective. Algorithms for encryption, compression and hashing are left to default settings and are not purposefully explored. Also, the starting
state of the {\dsut} is one where a TCP connection has already been established and where SSH versions have been exchanged, which are prerequisites for starting the Transport layer protocol.
%figure
......@@ -40,11 +39,11 @@ intended to ever be sent to the server\footnote{This means we exclude
the messages \textsc{service\_accept}, \textsc{ua\_accept},
\textsc{ua\_failure}, \textsc{ua\_banner}, \textsc{ua\_pk\_ok},
\textsc{ua\_pw\_changereq}, \textsc{ch\_success} and
\textsc{ch\_failure} from our alphabet.} Furthermore, from the
\textsc{ch\_failure} from our alphabet.}. Furthermore, from the
Connection layer we only use messages for channel management and the
terminal functionality. Finally, because we will only explore
protocol behavior after SSH versions have been exchanged, we exclude
these messages for exchanging version numbers.
the messages for exchanging version numbers.
%\marginpar{\tiny Erik: I
%rephrased all this to make it simpler. Is it still ok?}
......@@ -57,7 +56,7 @@ models while only marginally limiting explored behavior. We discuss
this again in Section~\ref{sec:result}. Inputs included in the
restricted alphabet are marked by '*' in the tables below.
Table~\ref{trans-alphabet} introduces the Transport layer inputs. We include a version of the \textsc{kexinit} message with \texttt{first\_kex\_packet\_follows} disabled.
Table~\ref{trans-alphabet} lists the Transport layer inputs. We include a version of the \textsc{kexinit} message with \texttt{first\_kex\_packet\_follows} disabled.
This means no guess~\cite[p. 17]{rfc4253} is attempted on the {\dsut}'s parameter preferences. Consequently, the {\dsut} will have to send its own \textsc{kexinit} in order to
convey its own parameter preferences before key exchange can proceed. Also included are inputs for establishing new keys (\textsc{kex30}, \textsc{newkeys}), disconnecting (\textsc{disconnect}), as well as the special inputs \textsc{ignore}, \textsc{unimpl} and \textsc{debug}. The latter are not interesting, as they are normally ignored by implementations. Hence they are excluded from our restricted alphabet. \textsc{disconnect} proved costly time wise, so was also excluded.
%We include two versions of the \textsc{kexinit} message, one where \texttt{first\_kex\_packet\_follows} is disabled, the other when it is enabled, in which case, the message would make a guess on the security parameters~\cite[p. 17]{rfc4253}. Our mapper can only handle correct key guesses, so the wrong-guess procedure as described in ~\cite[p. 19]{rfc4253} was not supported.
......@@ -84,7 +83,7 @@ convey its own parameter preferences before key exchange can proceed. Also inclu
\label{trans-alphabet}
\end{table}
The Authentication layer defines one single client message type in the form of the authentication request~\cite[p. 4]{rfc4252}. Its parameters contain all information needed for authentication. Four authentication methods exist: none, password, public key and host-based. Our mapper supports all methods except the host-based authentication because various SUTs lack support for this feature. Both the public key and password methods have \textsc{ok} and \textsc{nok} variants, which provide respectively correct and incorrect credentials. Our restricted alphabet supports only public key authentication, as it was processed the fastest by implementations out of all authentication forms.
The Authentication layer defines one single client message type in the form of the authentication request~\cite[p. 4]{rfc4252}. Its parameters contain all information needed for authentication. Four authentication methods exist: none, password, public key and host-based. Our mapper supports all methods except the host-based authentication because some SUTs lack support for this feature. Both the public key and password methods have \textsc{ok} and \textsc{nok} variants, which provide respectively correct and incorrect credentials. Our restricted alphabet supports only public key authentication, as the implementations processed this faster than the other authentication methods.
\begin{table}[!ht]
\centering
......@@ -189,9 +188,9 @@ instead of being sent to the {\dsut} to find out its response:
responds with a \textsc{no\_conn} message, as sending further
messages to the {\dsut} is pointless in that case;
\end{enumerate}
Overall, we notice that in many ways, the {\dmapper} acts similarly to an SSH client. Hence it is unsurprising that it was built off an existing
implementation.
%
In many ways, the {\dmapper} acts similar to an SSH client, hence the
decision to built it by adapting an existing client implementation.
\subsection{Practical complications}
......
......@@ -89,11 +89,13 @@ tabsize=2
\begin{abstract}
We apply model learning on three SSH implementations to infer state machine models, which
are then verified by a model checker for functional and security properties. Our analysis showed that
all tested SSH server models satisfy the security properties. Nevertheless, we uncovered several minor standard violations.
The state machines of the implementations differ significantly, allowing them to be
effectively fingerprinted.
We apply model learning on three SSH implementations to infer state machine models, and then use model checking
to verify that these models satisfy basic security properties and conform to the RFCs. Our analysis showed that
all tested SSH server models satisfy the stated security properties.
However, our analysis uncovered several violations of the standard.
%Frits: I would say the fingerprinting is a detail, standard violations much more important.
%The state machines of the implementations differ significantly, allowing them to be
%effectively fingerprinted.
%Various shortcomings with regards to the RFCs were found. Opening multiple channels is not properly implemented on CiscoSSH and PowerShell. OpenSSH contains a bug which can result in connection closure after rekeying in some circumstances. Both Tectia and OpenSSH implement a liberal message acceptance policy in the first phase of the protocol. Such a liberal policy is unwise in this error-prone stage.
......@@ -145,7 +147,7 @@ effectively fingerprinted.
% We no longer use \terms command
%\terms{Theory}
\keywords{ACM proceedings, \LaTeX, text tagging}
\keywords{Model learning, model checking, SSH protocol}
\maketitle
......
......@@ -24,7 +24,7 @@
}
@misc{SSHResults,
title = {Learned SSH Models},
title = {Learned {SSH} Models},
year = {2017},
publisher = {GitLab},
journal = {GitLab repository},
......@@ -222,7 +222,8 @@ machine learning algorithms},
shorttitle = {{RALib}},
urldate = {2016-09-05},
booktitle = {{DIFTS}},
author = {Cassel, Sofia and Howar, Falk and Jonsson, Bengt},
author = {Cassel, S. and Howar, F. and Jonsson, B.},
url = {http://www.faculty.ece.vt.edu/chaowang/difts2015/papers/paper_5.pdf},
year = {2015},
keywords = {learning, ralib, register-automata},
}
......@@ -574,3 +575,13 @@ machine learning algorithms},
}
@ARTICLE{PeVaYa02,
AUTHOR = "Peled, D. and Vardi, M.Y. and Yannakakis, M.",
TITLE = "{Black Box Checking}",
JOURNAL = "Journal of Automata, Languages, and Combinatorics",
YEAR = "2002",
VOLUME = "7",
NUMBER = "2",
PAGES = "225--246"
}
......@@ -9,32 +9,33 @@ SSHv2 follows a client-server paradigm. The protocol consists of three layers (F
\item The \textit{connection protocol} (RFC 4254~\cite{rfc4254}) allows the encrypted channel to be multiplexed in different channels. These channels enable a user to run multiple applications, such as terminal emulation or file transfer, over a single SSH connection.
\end{enumerate}
Each layer has its own specific messages. The SSH protocol is interesting in that outer layers do not encapsulate inner layers. This means that different layers can interact. Hence, it makes sense to analyze SSH as a whole, instead of analyzing its constituent layers independently. We review each layer, outlining the relevant messages which are later used in learning, and characterising the so-called \textit{happy flow} that a normal protocol
\begin{figure}[t] % was !hb
\centering
\includegraphics[scale=0.35]{SSH_protocols.png}
\caption{SSH protocol layers}
\label{fig:sshcomponents}
\end{figure}
Each layer has its own specific messages. The SSH protocol is interesting in that outer layers do not encapsulate inner layers. This means that different layers can interact. Hence, it makes sense to analyze SSH as a whole, instead of analyzing its constituent layers independently. Below we discuss each layer, outlining the relevant messages which are later used in learning, and characterising the so-called \textit{happy flow} that a normal protocol
run follows.
At a high level, a typical SSH protocol run uses the three constituent protocols in the order given above: after the client establishes a TCP connection with the server, (1) the two sides use the transport layer protocol to negotiate key exchange and encryption algorithms, and use these to establish session keys, which are then used to secure further communication; (2) the client uses the user authentication protocol to authenticate to the server; (3) the client uses the connection protocol to access services on the server, for example the terminal service.
%Different layers are identified by their message numbers. These message numbers will form the basis of the state fuzzing. The SSH protocol is especially interesting because outer layers do not encapsulate inner layers. This means that different layers can interact. One could argue that this is a less systematic approach, in which a programmer is more likely to make state machine-related errors.
\begin{figure}[!hb]
\centering
\includegraphics[scale=0.35]{SSH_protocols.png}
\caption{SSH protocol components running on a TCP/IP stack.}
\label{fig:sshcomponents}
\end{figure}
\subsection{Transport layer}\label{ssh-run-trans}
SSH runs over TCP, and provides end-to-end confidentiality and integrity using session keys. Once a TCP connection has been established with the server, these session keys are securely negotiated using a \textsl{key exchange} algorithm, the first step of the protocol. The key exchange begins by the two sides exchanging their preferences for the key exchange algorithm to be used, as well as encryption, compression and hashing algorithms. Preferences are sent with a \textsc{kexinit} message.
%TODO How is the algorithm picked?
Subsequently, key exchange using the negotiated algorithm takes place. Following this algorithm, one-time session keys for encryption and hashing are generated by each side, together with an identifier for the session. The main key exchange algorithm is Diffie-Hellman, which is also the only one required by the RFC. For the Diffie-Hellman scheme, \textsc{kex30} and \textsc{kex31} are exchanged to establish fresh session keys. These keys are used from the moment the \textsc{newkeys} command has been issued by both parties. A subsequent \textsc{sr\_auth} requests the authentication service. The happy flow thus consists of the succession of the three steps comprising key exchange, followed up by a successful authentication service request. The sequence is shown in Figure~\ref{fig:hf-trans}.
\begin{figure}[!hb]
\begin{figure}[!hb]
\includegraphics[scale=0.285]{hf-trans.pdf}
\caption{The happy flow for the transport layer.}
\label{fig:hf-trans}
\end{figure}
\textsl{Key re-exchange}~\cite[p. 23]{rfc4253}, or \textsl{rekeying}, is a near identical process, with the difference being that instead of taking place at the beginning, it takes place once session keys are already in place. The purpose is to renew session keys so as to foil potential replay attacks~\cite[p. 17]{rfc4251}. It follows the same steps as key exchange. Messages exchanged as part of it are encrypted using the old set of keys, messages exchanged afterwards are encrypted using the new keys. A fundamental property of rekeying is that it should preserve the state; that is, after the rekeying procedure is completed, the protocol should be in the same state as
\textsl{Key re-exchange}~\cite[p. 23]{rfc4253}, or \textsl{rekeying}, is an almost identical process, the difference being that instead of taking place at the beginning, it takes place once session keys are already in place. The purpose is to renew session keys so as to foil potential replay attacks~\cite[p. 17]{rfc4251}. It follows the same steps as key exchange. A fundamental property of rekeying is that it should preserve the state; that is, after the rekeying procedure is completed, the protocol should be in the same state as
it was before the rekeying started, with as only difference that new keys are now in use. %Some implementations are known not support rekeying in certain states of the protocol.
%We consider an transport layer state machine secure if there is no path from the initial state to the point where the authentication service is invoked without exchanging and employing cryptographic keys.
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment