\section{The learning setup} \label{sec:setup}

The learning setup consists of three components: the {\dlearner}, the {\dmapper} and the {\dsut}. The {\dlearner} generates abstract inputs, representing SSH messages. The {\dmapper} transforms these messages into well-formed SSH packets and sends them to the {\dsut}. The {\dsut} sends response packets back to the {\dmapper}, which, in turn, translates these packets into abstract outputs. The {\dmapper} then sends the abstract outputs back to the {\dlearner}. The {\dlearner} uses LearnLib~\cite{LearnLib2009}, a Java library implementing $L^{\ast}$-based algorithms for learning Mealy machines. The {\dmapper} is based on Paramiko, an open-source SSH implementation written in Python\footnote{Paramiko is available at \url{http://www.paramiko.org/}}. We opted for Paramiko because its code is relatively well structured and documented. The {\dsut} can be any existing implementation of an SSH server. The three components communicate over sockets, as shown in Figure~\ref{fig:components}.

\begin{figure}
\centering
\includegraphics[scale=0.29]{components_cropped.pdf}
\caption{The SSH learning setup.}
\label{fig:components}
\end{figure}

SSH is a complex client-server protocol. In our work so far we have concentrated on learning models of server implementations, not of clients. We further restrict learning to the terminal service of the Connection layer, as we consider it the most interesting from a security perspective. Algorithms for encryption, compression and hashing are left at their default settings and are not purposefully explored. Also, the starting state of the {\dsut} is one in which a TCP connection has already been established and SSH versions have been exchanged, both of which are prerequisites for starting the Transport layer protocol.

\subsection{The learning alphabet}\label{subsec:alphabet}

The alphabet we use consists of inputs, which correspond to messages sent to the server, and outputs, which correspond to messages received from the server. We split \emph{the input alphabet} into three parts, one for each of the protocol layers.
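As a rough illustration, this split can be pictured as follows. The fragment below lists only a few representative abstract inputs per layer; the names match the tables later in this subsection, but the fragment itself is ours and not part of the learning setup.

\begin{verbatim}
# Illustrative only: a few representative abstract inputs per layer;
# the complete lists are given in the tables of this subsection.
INPUT_ALPHABET = {
    "transport":      ["kexinit", "kex30", "newkeys",
                       "sr_auth", "sr_conn"],
    "authentication": ["ua_pk_ok", "ua_pk_nok",
                       "ua_pw_ok", "ua_pw_nok"],
    "connection":     ["ch_open", "ch_close",
                       "ch_request_pty", "ch_data"],
}
\end{verbatim}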
Learning does not scale with a growing input alphabet, and since we are only learning models of servers, we remove those inputs that are never intended to be sent to the server\footnote{This means we exclude the messages \textsc{service\_accept}, \textsc{ua\_accept}, \textsc{ua\_failure}, \textsc{ua\_banner}, \textsc{ua\_pk\_ok}, \textsc{ua\_pw\_changereq}, \textsc{ch\_success} and \textsc{ch\_failure} from our alphabet.}. Furthermore, from the Connection layer we only use messages for channel management and the terminal functionality. Finally, because we only explore protocol behavior after SSH versions have been exchanged, we exclude the messages for exchanging version numbers.

The resulting lists of inputs for the three protocol layers are given in Tables~\ref{trans-alphabet}-\ref{conn-alphabet}. In some experiments, we used only a subset of the most essential inputs, to further speed up learning. This \textit{restricted alphabet} significantly decreases the number of queries needed to learn models while only marginally limiting the explored behavior. We return to this in Section~\ref{sec:result}. Inputs included in the restricted alphabet are marked by `*' in the tables below.

Table~\ref{trans-alphabet} lists the Transport layer inputs. We include a version of the \textsc{kexinit} message with \texttt{first\_kex\_packet\_follows} disabled. This means no guess~\cite[p. 17]{rfc4253} is attempted on the {\dsut}'s parameter preferences. Consequently, the {\dsut} will have to send its own \textsc{kexinit} to convey its parameter preferences before key exchange can proceed. Also included are inputs for establishing new keys (\textsc{kex30}, \textsc{newkeys}) and disconnecting (\textsc{disconnect}), as well as the special inputs \textsc{ignore}, \textsc{unimpl} and \textsc{debug}. The latter three are not interesting, as they are normally ignored by implementations; hence they are excluded from our restricted alphabet. \textsc{disconnect} proved costly time-wise, so it was excluded as well.
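As an illustration of what the {\dmapper} has to produce for the \textsc{kexinit} input, the sketch below assembles a \textsc{kexinit} payload with \texttt{first\_kex\_packet\_follows} disabled, following the message format of~\cite[p. 17]{rfc4253}. This is a simplified stand-in in plain Python: the actual {\dmapper} relies on Paramiko's packet routines, and the dictionary keys used for the name-lists are ours.

\begin{verbatim}
import os, struct

def name_list(names):
    # SSH name-list: uint32 length + comma-separated names
    data = ",".join(names).encode("ascii")
    return struct.pack(">I", len(data)) + data

def build_kexinit(prefs, first_kex_packet_follows=False):
    # Payload of SSH_MSG_KEXINIT (message number 20)
    payload = bytes([20]) + os.urandom(16)       # type + cookie
    for field in ("kex", "host_key",
                  "enc_c2s", "enc_s2c", "mac_c2s", "mac_s2c",
                  "comp_c2s", "comp_s2c", "lang_c2s", "lang_s2c"):
        payload += name_list(prefs.get(field, []))
    payload += bytes([int(first_kex_packet_follows)])  # no guess
    payload += struct.pack(">I", 0)                    # reserved
    return payload
\end{verbatim}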
The SSH version and comment string (described in Section~\ref{ssh-run-trans}) was not queried, as it does not follow the binary packet protocol.

\begin{table}[!ht]
\centering
\small
\begin{tabular}{ll}
\textbf{Message} & \textbf{Description} \\
\textsc{disconnect} & Terminates the current connection~\cite[p. 23]{rfc4253} \\
\textsc{ignore} & Has no intended effect~\cite[p. 24]{rfc4253} \\
\textsc{unimpl} & Intended response to unrecognized messages~\cite[p. 25]{rfc4253} \\
\textsc{debug} & Provides other party with debug information~\cite[p. 25]{rfc4253} \\
\textsc{kexinit}* & Sends parameter preferences~\cite[p. 17]{rfc4253} \\
\textsc{kex30}* & Initializes the Diffie-Hellman key exchange~\cite[p. 21]{rfc4253} \\
\textsc{newkeys}* & Requests to take new keys into use~\cite[p. 21]{rfc4253} \\
\textsc{sr\_auth}* & Requests the authentication protocol~\cite[p. 23]{rfc4253} \\
\textsc{sr\_conn}* & Requests the connection protocol~\cite[p. 23]{rfc4253}
\end{tabular}
\caption{Transport layer inputs}
\vspace{-7mm}
\label{trans-alphabet}
\end{table}

The Authentication layer defines a single client message type, the authentication request~\cite[p. 4]{rfc4252}. Its parameters contain all information needed for authentication. Four authentication methods exist: none, password, public key and host-based. Our mapper supports all methods except host-based authentication, as some SUTs do not support this feature. Both the public key and the password method have \textsc{ok} and \textsc{nok} variants, which provide correct and incorrect credentials, respectively. Our restricted alphabet supports only public key authentication, as the implementations processed it faster than the other authentication methods.

\begin{table}[!ht]
\centering
\small
\begin{tabular}{ll}
\textbf{Message} & \textbf{Description} \\
\textsc{ua\_none} & Authenticates with the ``none'' method~\cite[p. 7]{rfc4252} \\
\textsc{ua\_pk\_ok}* & Provides a valid name/key pair~\cite[p. 8]{rfc4252} \\
\textsc{ua\_pk\_nok}* & Provides an invalid name/key pair~\cite[p. 8]{rfc4252} \\
\textsc{ua\_pw\_ok} & Provides a valid name/password pair~\cite[p. 10]{rfc4252} \\
\textsc{ua\_pw\_nok} & Provides an invalid name/password pair~\cite[p. 10]{rfc4252} \\
\end{tabular}
\caption{Authentication layer inputs}
\vspace{-7mm}
\label{auth-alphabet}
\end{table}

The Connection layer allows the client to manage channels and to request or run services over them. In line with our learning goal, our mapper only supports inputs for requesting terminal emulation, plus inputs for channel management, as shown in Table~\ref{conn-alphabet}. The restricted alphabet only supports the most general channel management inputs; the excluded inputs are not expected to produce state changes.
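To give an impression of how such Connection layer inputs are concretized, the sketch below assembles the payloads of \textsc{ch\_open} and \textsc{ch\_request\_pty} as specified in~\cite[p. 5, 11]{rfc4254}. Again, this is only an illustration: the actual {\dmapper} uses Paramiko's routines, and the window size and terminal dimensions chosen here are arbitrary.

\begin{verbatim}
import struct

def ssh_string(data):
    # SSH 'string': uint32 length followed by the raw bytes
    return struct.pack(">I", len(data)) + data

def build_ch_open(sender_channel, window=2097152, max_packet=32768):
    # SSH_MSG_CHANNEL_OPEN (90) for a "session" channel
    return (bytes([90]) + ssh_string(b"session") +
            struct.pack(">III", sender_channel, window, max_packet))

def build_ch_request_pty(recipient_channel, term=b"vt100"):
    # SSH_MSG_CHANNEL_REQUEST (98) of type "pty-req"
    return (bytes([98]) + struct.pack(">I", recipient_channel) +
            ssh_string(b"pty-req") + bytes([1]) +   # want_reply
            ssh_string(term) +
            struct.pack(">IIII", 80, 24, 0, 0) +    # cols, rows, pixels
            ssh_string(b""))                        # terminal modes
\end{verbatim}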
\begin{table}[!ht]
\centering
\small
\begin{tabular}{ll}
\textbf{Message} & \textbf{Description} \\
\textsc{ch\_open}* & Opens a new channel~\cite[p. 5]{rfc4254} \\
\textsc{ch\_close}* & Closes a channel~\cite[p. 9]{rfc4254} \\
\textsc{ch\_eof}* & Indicates that no more data will be sent~\cite[p. 9]{rfc4254} \\
\textsc{ch\_data}* & Sends data over the channel~\cite[p. 7]{rfc4254} \\
\textsc{ch\_edata} & Sends typed data over the channel~\cite[p. 8]{rfc4254} \\
\textsc{ch\_window\_adjust} & Adjusts the window size~\cite[p. 7]{rfc4254} \\
\textsc{ch\_request\_pty}* & Requests terminal emulation~\cite[p. 11]{rfc4254} \\
\end{tabular}
\caption{Connection layer inputs}
\vspace{-7mm}
\label{conn-alphabet}
\end{table}

\emph{The output alphabet} comprises all messages an SSH server can generate, which may include, with identical meaning, any of the messages defined as inputs. It also includes responses to various requests: \textsc{kex31}~\cite[p. 21]{rfc4253} as reply to \textsc{kex30}, \textsc{sr\_success} in response to service requests (\textsc{sr\_auth} and \textsc{sr\_conn}), \textsc{ua\_success} and \textsc{ua\_failure}~\cite[p. 5,6]{rfc4252} in response to authentication requests, and \textsc{ch\_open\_success}~\cite[p. 6]{rfc4254} and \textsc{ch\_success}~\cite[p. 10]{rfc4254}, in positive response to \textsc{ch\_open} and \textsc{ch\_request\_pty}, respectively. To these outputs, we add \textsc{no\_resp} for when the {\dsut} generates no output, as well as the special outputs \textsc{ch\_none}, \textsc{ch\_max}, \textsc{no\_conn} and \textsc{buffered}, which we discuss in the following subsections.

\subsection{The mapper}\label{subsec:mapper}

The {\dmapper} must provide a translation between abstract messages and well-formed SSH messages: it has to translate the abstract inputs listed in Tables~\ref{trans-alphabet}-\ref{conn-alphabet} into actual SSH packets, and translate the SSH packets received in response into abstract outputs. If no answer is received to an input, the {\dmapper} must return an output indicating a timeout, which in our case is the \textsc{no\_resp} message.

The sheer complexity of the {\dmapper} meant that it was easier to adapt an existing SSH implementation than to construct the {\dmapper} from scratch. Paramiko already provides mechanisms for encryption and decryption, as well as routines for constructing, sending and receiving the different types of packets. These routines are called by control logic dictated by Paramiko's own state machine. The {\dmapper} was constructed by replacing this control logic with logic dictated by the messages received from the {\dlearner}.

The {\dmapper} maintains a set of state variables to record parameters of the ongoing session, including the server's preferences for the key exchange and encryption algorithms, parameters of these protocols, and, once it has been established, the session key. These parameters are updated on receiving messages from the server and are used to concretize abstract inputs into actual SSH messages for the server. For example, upon receiving a \textsc{kexinit} from the {\dsut}, the {\dmapper} saves the {\dsut}'s preferences for the key exchange, hashing and encryption algorithms.
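A simplified sketch of this bookkeeping is shown below. The class, its fields and the reduced negotiation rule (pick the first client-preferred algorithm the {\dsut} also supports) are illustrative only; the actual {\dmapper} relies on Paramiko for this bookkeeping.

\begin{verbatim}
class SessionState:
    # Hypothetical container for the session parameters the
    # mapper tracks; default values are illustrative.
    def __init__(self):
        self.kex_algorithm = "diffie-hellman-group14-sha1"
        self.cipher = "3des-cbc"
        self.mac = "hmac-sha1"
        self.session_key = None     # set after key exchange
        self.exchange_hash = None

    def update_from_kexinit(self, client_prefs, server_prefs):
        # Per name-list, pick the first client algorithm that
        # the SUT also supports (simplified negotiation).
        def negotiate(kind):
            return next(a for a in client_prefs[kind]
                        if a in server_prefs[kind])
        self.kex_algorithm = negotiate("kex")
        self.cipher = negotiate("enc_c2s")
        self.mac = negotiate("mac_c2s")
\end{verbatim}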
Initially, these parameters are all set to the defaults that any server should support, as required by the RFC. On receiving \textsc{kex31} in response to the \textsc{kex30} input, the {\dmapper} saves the hash, as well as the new keys. Finally, a \textsc{newkeys} response prompts the {\dmapper} to use the newly negotiated keys in place of the older ones, if such existed.

The {\dmapper} also contains a buffer for storing open channels, which is initially empty. On a \textsc{ch\_open} from the {\dlearner}, the {\dmapper} adds a channel with a randomly generated channel identifier to the buffer; on a \textsc{ch\_close}, it removes the channel (if there was one). The buffer size, that is, the maximum number of simultaneously open channels, is limited to one. The {\dmapper} also stores the sequence number of the last message received from the {\dsut}. This number is used when constructing \textsc{unimpl} inputs.

In the following cases, inputs are answered directly by the {\dmapper} instead of being sent to the {\dsut}: (1) on receiving a \textsc{ch\_open} input when the buffer has reached its size limit, the {\dmapper} directly responds with \textsc{ch\_max}; (2) on receiving any input operating on a channel (all Connection layer inputs other than \textsc{ch\_open}) when the buffer is empty, the {\dmapper} directly responds with \textsc{ch\_none}; (3) if the connection with the {\dsut} has been terminated, the {\dmapper} responds with a \textsc{no\_conn} message, as sending further messages to the {\dsut} is pointless in that case. In many ways, the {\dmapper} acts like an SSH client, hence the decision to build it by adapting an existing client implementation.

\subsection{Practical complications}

Even behind the {\dmapper} abstraction, SSH implementations may not behave like deterministic Mealy machines, which is a prerequisite for the learning algorithm to succeed. Sources of non-determinism are:
\begin{enumerate}
\item Underspecification in the SSH specification (for example, not fixing the order of certain messages) allows some non-deterministic behavior. Even if client and server implement a fixed order for the messages they send, the asynchronous nature of communication means that the interleaving of sent and received messages may vary. Moreover, client and server are free to intersperse \textsc{debug} and \textsc{ignore} messages at any given time.\footnote{The \textsc{ignore} messages are aimed at thwarting traffic analysis.}
\item Timing is another source of non-deterministic behavior.
For example, the {\dmapper} might time out before the {\dsut} has sent its response. Some {\dsuts} also behave unexpectedly when a new input is received too soon after the previous one. Hence, in our experiments we adjusted the time-out periods so that neither of these situations occurred and the {\dsut} behaved deterministically throughout.
\end{enumerate}

To detect non-determinism, the {\dmapper} caches all observations in an SQLite database and checks whether new observations are consistent with previous ones. If not, it raises a warning, which then needs to be investigated manually. We analyzed each warning until we found a setting under which behavior was deterministic. The cache also acts as a cheap source of responses for queries that have already been answered. Finally, by re-loading the cache from a previous experiment, we were able to start from where that experiment left off. This proved useful, as experiments could take several days.

Another practical problem besides non-determinism is that an SSH server may produce a sequence of outputs in response to a single input. This means it is not behaving as a Mealy machine, which allows only a single output per input. To deal with this, the {\dmapper} concatenates all outputs into one, and produces this concatenation as the single output to the {\dlearner}.

A final challenge is presented by forms of `buffering', which we encountered in two situations. Firstly, some implementations buffer incoming requests during rekeying; only once rekeying is complete are all these messages processed. This leads to a \textsc{newkeys} response (indicating rekeying has completed), directly followed by the responses to all buffered requests. This would lead to non-termination of the learning algorithm, as the response is different for every sequence of buffered messages. To prevent this, we treat the sequence of queued responses as the single output \textsc{buffered}. A different form of buffering occurs when opening and closing channels, since a {\dsut} can close only as many channels as have previously been opened. Learning this behavior would lead to an infinite state machine, as we would need a state `there are $n$ channels open' for every number $n$. For this reason, we restrict the number of simultaneously open channels to one. The {\dmapper} returns the custom response \textsc{ch\_max} to a \textsc{ch\_open} message whenever this limit is reached.
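The channel bookkeeping and the direct {\dmapper} responses described above can be summarized by the following sketch. It illustrates the rules rather than the actual Paramiko-based implementation; all names are ours.

\begin{verbatim}
import random

class ChannelHandler:
    # Sketch of the mapper's single-slot channel buffer and of
    # the inputs it answers itself instead of forwarding them.
    MAX_CHANNELS = 1

    def __init__(self):
        self.channels = []     # open channel ids (at most one)
        self.connected = True  # False once the SUT disconnects

    def handle(self, symbol, send_to_sut):
        if not self.connected:
            return "no_conn"                     # rule (3)
        if symbol == "ch_open":
            if len(self.channels) >= self.MAX_CHANNELS:
                return "ch_max"                  # rule (1)
            self.channels.append(random.getrandbits(32))
            return send_to_sut(symbol)
        if symbol.startswith("ch_"):
            if not self.channels:
                return "ch_none"                 # rule (2)
            if symbol == "ch_close":
                self.channels.pop()
        return send_to_sut(symbol)
\end{verbatim}

Answering these inputs locally avoids sending queries to the {\dsut} whose outcome is already known.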