learning_setup.tex 17.4 KB
Newer Older
Paul Fiterau Brostean's avatar
updates    
Paul Fiterau Brostean committed
1
\section{The learning setup} \label{sec:setup}
2
3
4
5
6
7
%This chapter will cover the setup used to infer the state machines. We provide a general setup outline in Section~\ref{components}. The tested SSH servers are described in Section~\ref{suts}, which were queried with the alphabet described in Section~\ref{alphabet}. Section~\ref{setup-handling} will cover the challenging SUT behaviour faced when implementing the mapper, and the adaptations that were made to overcome these challenges. Section~\ref{layers-individual} will discuss the relation between state machines for individual layers and the state machine of the complete SSH protocol. The conventions on visualisation of the inferred state machines are described in Section~\ref{visualisation}.

%Throughout this chapter, an individual SSH message to a SUT is denoted as a \textit{query}. A \textit{trace} is a sequence of multiple queries, starting from a SUT's initial state. Message names in this chapter are usually self-explanatory, but a mapping to the official RFC names is provided in Appendix~\ref{appendixa}.

%\section{Components}\label{components}

Paul Fiterau Brostean's avatar
Paul Fiterau Brostean committed
8
The learning setup consists of three components: the {\dlearner}, {\dmapper} and {\dsut}. The {\dlearner} generates abstract inputs, representing SSH messages. The {\dmapper} transforms these messages into well-formed SSH packets and sends them to the {\dsut}. The {\dsut} sends response packets back to the {\dmapper}, which in turn, translates these packets to abstract outputs. The {\dmapper} then sends the abstract outputs back to the {\dlearner}. 
9
10


Erik Poll's avatar
shit    
Erik Poll committed
11
The {\dlearner} uses LearnLib ~\cite{LearnLib2009}, a Java library implementing $L^{\ast}$ based algorithms for learning Mealy machines. The {\dmapper} is based on Paramiko, an open source SSH implementation written in Python\footnote{Paramiko is available at \url{http://www.paramiko.org/}}. We opted for Paramiko because its code is relatively well structured and documented. The {\dsut} can be any existing implementation of an SSH server. The three components communicate over sockets, as shown in Figure~\ref{fig:components}.
12
\begin{figure}
13
	\centering
14
  \includegraphics[scale=0.29]{components.pdf}
Paul Fiterau Brostean's avatar
updates    
Paul Fiterau Brostean committed
15
  \caption{The SSH learning setup.}
16
17
18
  \label{fig:components}
\end{figure}

Erik Poll's avatar
shit    
Erik Poll committed
19
20
21
22
23
SSH is a complex client-server protocol. In our work so far we concentrated on learning models of the implementation of the server, and not of the client.
We further restrict learning to only exploring the terminal service of the Connection layer, as we consider it to be the most interesting
from a security perspective. Algorithms for encryption, compression and hashing are left to default settings and are not purposefully explored. Also, the starting

state of the {\dsut} is one where a TCP connection has already been established and where SSH versions have been exchanged, which are prerequisites for starting the Transport layer protocol.
24

25
26
%figure
%It is therefore important to focus on messages for which interesting state-changing behaviour can be expected. 
Erik Poll's avatar
shit    
Erik Poll committed
27

28
\subsection{The learning alphabet}\label{subsec:alphabet}
29

Erik Poll's avatar
shit    
Erik Poll committed
30
31
32
33
34
35
The alphabet we use consists of inputs, which correspond to messages
sent to the server, and outputs, which correspond to messages received
from the server. We split the input alphabet into three parts, one
for rach of the protocol layers. 
\marginpar{\tiny Erik: the output alphabet is not discussed anywhere,
but for the discussion of the mapper in the next section it should be}
36

Erik Poll's avatar
shit    
Erik Poll committed
37
38
39
40
41
42
43
44
45
46
47
48
Learning does not scale with a growing alphabet, and since we are only
learning models of servers, we remove those inputs that are not
intended to ever be sent to the server\footnote{This means we exclude
the messages \textsc{service\_accept}, \textsc{ua\_accept},
\textsc{ua\_failure}, \textsc{ua\_banner}, \textsc{ua\_pk\_ok},
\textsc{ua\_pw\_changereq}, \textsc{ch\_success} and
\textsc{ch\_failure} from our alphabet.} Furthermore, from the
Connection layer we only use messages for channel management and the
terminal functionality.  Finally, because we will only explore
protocol behaviour after SSH versions have been exchanged, we exclude
these messages for exchaning version numbers.  \marginpar{\tiny Erik: I
rephrased all this to make it simpler. Is it still ok?}
49

Erik Poll's avatar
shit    
Erik Poll committed
50
51
52
53
54
55
56
57
The resulting lists of inputs for the three protocol layers are given
in tables~\ref{trans-alphabet}-\ref{conn-alphabet}.  In some
experiments, we used only a subset of the most essential inputs, to
further speed up experiments. This \textit{restricted alphabet}
significantly decreases the number of queries needed for learning
models while only marginally limiting explored behavior.  We discuss
this again in Section~\ref{sec:result}. Inputs included in the
restricted alphabet are marked by '*' in the tables below.
58

59
60
Table~\ref{trans-alphabet} introduces the Transport layer inputs. We include a version of the \textsc{kexinit} message with \texttt{first\_kex\_packet\_follows} disabled.
This means no guess~\cite[p. 17]{rfc4253} is attempted on the {\dsut}'s parameter preferences. Consequently, the {\dsut} will have to send its own \textsc{kexinit} in order to 
Paul Fiterau Brostean's avatar
Paul Fiterau Brostean committed
61
convey its own parameter preferences before key exchange can proceed. Also included are inputs for establishing new keys (\textsc{kex30}, \textsc{newkeys}), disconnecting (\textsc{disconnect}), as well as the special inputs \textsc{ignore}, \textsc{unimpl} and \textsc{debug}. The latter are not interesting, as they are normally ignored by implementations. Hence they are excluded from our restricted alphabet. \textsc{disconnect} proved costly time wise, so was also excluded.
62
63
%We include two versions of the \textsc{kexinit} message, one where \texttt{first\_kex\_packet\_follows} is disabled, the other when it is enabled, in which case, the message would make a guess on the security parameters~\cite[p. 17]{rfc4253}. Our mapper can only handle correct key guesses, so the wrong-guess procedure as described in ~\cite[p. 19]{rfc4253} was not supported. 
%\textsc{ignore}, \textsc{unimpl} and \textsc{debug} 
64
%When needed, SUTs were configured to make this guess work by altering their cipher preferences. The SSH version and comment string (described in Section~\ref{ssh-run-trans}) was not queried because it does not follow the binary packet protocol.
65
66
67

\begin{table}[!ht]
\centering
68
\small
69
70
\begin{tabular}{ll}
\textbf{Message} & \textbf{Description} \\
71
\textsc{disconnect} & Terminates the current connection~\cite[p. 23]{rfc4253} \\
72
73
74
\textsc{ignore} & Has no intended effect~\cite[p. 24]{rfc4253} \\
\textsc{unimpl} & Intended response to an unimplemented message~\cite[p. 25]{rfc4253} \\
\textsc{debug} & Provides other party with debug information~\cite[p. 25]{rfc4253} \\
75
\textsc{kexinit}* & Sends parameter preferences~\cite[p. 17]{rfc4253} \\
76
%\textsc{guessinit}* & A \textsc{kexinit} after which a guessed \textsc{kex30} follows~\cite[p. 19]{rfc4253} \\
77
78
79
80
\textsc{kex30}* & Initializes the Diffie-Hellman key exchange~\cite[p. 21]{rfc4253} \\
\textsc{newkeys}* & Requests to take new keys into use~\cite[p. 21]{rfc4253} \\
\textsc{sr\_auth}* & Requests the authentication protocol~\cite[p. 23]{rfc4253} \\
\textsc{sr\_conn}* & Requests the connection protocol~\cite[p. 23]{rfc4253}
81
\end{tabular}
82
\caption{Transport layer inputs}
83
84
85
\label{trans-alphabet}
\end{table}

Paul Fiterau Brostean's avatar
Paul Fiterau Brostean committed
86
The Authentication layer defines one single client message type in the form of the authentication request~\cite[p. 4]{rfc4252}. Its parameters contain all information needed for authentication. Four authentication methods exist: none, password, public key and host-based. Our mapper supports all methods except the host-based authentication because various SUTs lack support for this feature. Both the public key and password methods have  \textsc{ok} and \textsc{nok} variants, which provide respectively correct and incorrect credentials. Our restricted alphabet supports only public key authentication, as it was processed the fastest by implementations out of all authentication forms.
87
88
89

\begin{table}[!ht]
\centering
90
\small
91
92
93
\begin{tabular}{ll}
\textbf{Message} & \textbf{Description} \\
\textsc{ua\_none} & Authenticates with the ``none'' method~\cite[p. 7]{rfc4252} \\
Paul Fiterau Brostean's avatar
Paul Fiterau Brostean committed
94
95
\textsc{ua\_pk\_ok}* & Provides a valid name/key combination~\cite[p. 8]{rfc4252} \\
\textsc{ua\_pk\_nok}* & Provides an invalid name/key combination~\cite[p. 8]{rfc4252} \\
96
97
98
\textsc{ua\_pw\_ok} & Provides a valid name/password combination~\cite[p. 10]{rfc4252} \\
\textsc{ua\_pw\_nok} & Provides an invalid name/password combination~\cite[p. 10]{rfc4252} \\
\end{tabular}
99
\caption{Authentication layer inputs}
100
101
102
\label{auth-alphabet}
\end{table}

103
The Connection layer allows the client manage channels and to request/run services over them. In accordance with our learning goal,
104
our mapper only supports inputs for requesting terminal emulation, plus inputs for channel management as shown in Table~\ref{conn-alphabet}.
105
The restricted alphabet only supports the most general channel management inputs. Those excluded are not expected to produce state change.
106

107
108
109

\begin{table}[!ht]
\centering
110
\small
111
112
\begin{tabular}{ll}
\textbf{Message} & \textbf{Description} \\
Paul Fiterau Brostean's avatar
Paul Fiterau Brostean committed
113
114
115
116
\textsc{ch\_open}* & Opens a new channel~\cite[p. 5]{rfc4254} \\
\textsc{ch\_close}* & Closes a channel~\cite[p. 9]{rfc4254} \\
\textsc{ch\_eof}* & Indicates that no more data will be sent~\cite[p. 9]{rfc4254} \\
\textsc{ch\_data}* & Sends data over the channel~\cite[p. 7]{rfc4254} \\
117
118
\textsc{ch\_edata} & Sends typed data over the channel~\cite[p. 8]{rfc4254} \\
\textsc{ch\_window\_adjust} & Adjusts the window size~\cite[p. 7]{rfc4254} \\
Paul Fiterau Brostean's avatar
Paul Fiterau Brostean committed
119
\textsc{ch\_request\_pty}* & Requests terminal emulation~\cite[p. 11]{rfc4254} \\
120
\end{tabular}
121
\caption{Connection layer inputs}
122
123
\label{conn-alphabet}
\end{table}
124
%The learning alphabet comprises of input/output messages by which the {\dlearner} interfaces with the {\dmapper}. Section~\ref{sec:ssh} outlines essential inputs, while Table X provides a summary
Paul Fiterau Brostean's avatar
updates    
Paul Fiterau Brostean committed
125
%of all messages available at each layer. \textit{\textit{}}
126

127
%table
128

Erik Poll's avatar
shit    
Erik Poll committed
129

130
\subsection{The mapper}
Paul Fiterau Brostean's avatar
updates    
Paul Fiterau Brostean committed
131

Erik Poll's avatar
shit    
Erik Poll committed
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
The {\dmapper} must provide a translation between abstract messages
and well-formed SSH messages: it has to translate abstract inputs
listed in tables~\ref{trans-alphabet}-\ref{conn-alphabet} to actual
SSH packets, and translate the SSH packets received in answer
to our abstract outputs.

A special case here occurs when no output is received from the
{\dsut}; in that case the {\dmapper} gives back to the learner a
\textsc{no\_resp} message, to indicate that a time-out occurred.

The sheer complexity of the {\dmapper} meant that it was easier to
adapt an existing SSH implementation, rather than construct the
{\dmapper} from scratch. Paramiko already provides mechanisms for
encryption/decryption, as well as routines for constructing and
sending the different types of packets, and for receiving them. These
routines are called by control logic dictated by Paramiko's own state
machine.  The {\dmapper} was constructed by replacing this control
logic with one dictated by messages received from the {\dlearner}.
%over a socket connection

The {\dmapper} maintains a set of state variables to record parameters
of the ongoing session, including for example the server's preferences
for key exchange and encryption algorithm, parameters of these
protocols, and -- once is has been established -- the session key.
These parameters are updated when receiving messages from the server,
and are used to concretize inputs to actual SSH messages to the server.
Paul Fiterau Brostean's avatar
updates    
Paul Fiterau Brostean committed
158

Erik Poll's avatar
shit    
Erik Poll committed
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
For example, upon receiving a \textsc{kexinit}, the {\dmapper} saves
the {\dsut}'s preferences for key exchange, hashing and encryption
algorithms. Initially these parameters are all set to the defaults
that any server should support, as required by the RFC.  The
{\dmapper} supports Diffie-Hellman key exchange, which it will
initiate if it gets  a \textsc{kex30} input from the learner. 
After this, the {\dsut} responds with a \textsc{kex31} message
(assuming the protocol run so far is correct), and from this
message, the {\dmapper} saves the hash, as well as the new
keys. Receipt of the \textsc{newkeys} response from the {\dsut} will
make the {\dmapper} use the new keys earlier negotiated in place of
the older ones, if such existed.

The {\dmapper} contains two other
state variables, 
one is a buffer for storing open channels. It is initially empty and
is increased/decreased on \textsc{ch\_open} and \textsc{ch\_close}
inputs respectively. The other is initially set to 0, and stores the
sequence number of the last received message. This is then used when
constructing \textsc{unimpl} inputs. \marginpar{\tiny Erik: I don't get this
bit}

In the following cases, inputs are answered by the {\dmapper} directly
instead of being sent to the {\dsut} fo find out its response:
183
\begin{enumerate}
Erik Poll's avatar
shit    
Erik Poll committed
184
185
186
187
\item if connection with the {\dsut} was terminated, the {\dmapper}
     responds with a \textsc{no\_conn} message, as sending furtheer
     messages to the {\dsut} is pointless in that case;
\item if no channel has been opened (the buffer variable is empty) or the maximum number of channels was reached (in our experiments 1), cases which prompt the {\dmapper} 
Paul Fiterau Brostean's avatar
Files    
Paul Fiterau Brostean committed
188
to respond with \textsc{ch\_none}, or \textsc{ch\_max} respectively
Erik Poll's avatar
shit    
Erik Poll committed
189
190
191
    \marginpar{\tiny Erik: i don't get this 2nd bullet; something is
    missing?}

192
\end{enumerate}
Paul Fiterau Brostean's avatar
updates    
Paul Fiterau Brostean committed
193

194
195
Overall, we notice that in many ways, the {\dmapper} acts similarly to an SSH client. Hence it is unsurprising that it was built off an existing 
implementation.
196
197


198
\subsection{Compacting SSH into a small Mealy machine}
Erik Poll's avatar
shit    
Erik Poll committed
199
200
201
202
203
204
205
206
207
208
209
210
211

There are three practical complications in learning models of SSH
servers: (1) an SSH server may exhibit \emph{non-determistic} behaviour; (2)
a single input to the server can produce a \emph{sequence} of outputs
ratheer than just a single output, and (3) \emph{buffering} behaviour of the
server. These complication are discussed below.

SSH implementations can exhibit non-determistic behaviour.  The
learning algorithm cannot deal with non-determinism, so the {\dmapper}
not only provides abstraction, but it also ensures\marginpar{\tiny
Erik:'checks' instead of 'ensures'??} that the abstract representation
shown to the learner behaves a deterministic Mealy machine. 
There are a few sources of non-determinism:
212
\begin{enumerate}
Erik Poll's avatar
shit    
Erik Poll committed
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
\item Underspecification in the SSH specification (for example, by not
     specifying the order of certain messages) allows
     non-deterministic behaviour of implementations. Even if client
     and server do implement a fixed order for messages they sent, the
     asynchronous nature of comunication between them means the
     interleaving of messages sent and received may vary.  Moreover,
     client and server are free to intersperse \textsc{debug} and
     \textsc{ignore} messages at any given time; the \textsc{ignore}
     messages are aimed to thwart traffic analysis.
\item Timing is another source of non-deterministic behaviour. For
     example, the {\dmapper} might time-out before the {\dsut} had
     sent its response.
     
     In our experiments we tried to set time-out periods so that this
     did not occur.  However, other timing-related quirks can still
     cause non-deterministism. For example, {\dsuts} behave
     unexpectedly when a new query is received too shortly after the
     previous one.
231
232
%For example, a trace in which a valid user authentication is performed within five milliseconds after an authentication request on DropBear can cause the authentication to (wrongly) fail.  
\end{enumerate}
Erik Poll's avatar
shit    
Erik Poll committed
233
234
235
236
237
238
239
240
241
242
243
244
%
To detect non-determinism, we cache all new observations in an SQLite database and verify observations against this cache. 
\marginpar{\tiny Erik: the text does not explain what we do if non-determinism
is detected. Do we ignore the new trace and report the old one from the
database? And/or warn that non-determinism is happening?}
The cache also enables us to answer to queries answered before
without running inputs on the {\dsut}. This transcended consecutive
learning runs, as we could reload the cache from a previous run,
enabling quick response of all queries up to the point the previous
run ended.\marginpar{\tiny Erik: but the trace still has to be
re-run, right, if we want to extend it beyond the end of the previous
run?}
245

Paul Fiterau Brostean's avatar
Paul Fiterau Brostean committed
246
247

%A subsequent identical learning run can quickly resume from where the previous one was ended, as the cache from the previous run is used to quickly respond to all queries up to the point the previous run ended.
248

Erik Poll's avatar
shit    
Erik Poll committed
249
250
251
252
253
An SSH server may also produce a sequence of outputs in response to an
input. This means it is not behaving as a Mealy machines, which allow
allow for only one output. Dealing with this is simple: the {\dmapper}
concatenates all outputs into one, and it produces this sequence
as the single output to the {\dlearner}. 
254

Erik Poll's avatar
shit    
Erik Poll committed
255
256
257
258
259
260
261
262
263
264
265
266
Another challenge is presented by buffering, which leads to an
explosion of states in learning, as buffers cannot be described
succinctly by Mealy machines.
We have encountered buffering in two situations. Firstly, some
implementations buffer responses during a key re-exchange; only once
the rekeying is complete, are all these queued responses released, all
at once. This leads to a \textsc{newkeys} response (indicating
rekeying has completed), directly followed by all buffered responses.
This would lead to non-termination of the learning algorithm, as for
every sequence of buffered messages there is a different respone.  To
prevent this, we treat the sequence of queued responses
as a single output \textsc{buffered}.
267

Erik Poll's avatar
shit    
Erik Poll committed
268
269
270
271
272
273
Secondly, buffering happens when opening and closing channels,
since a SUT can close only as many channels as have previously been
opened. For this reason, we restrict the number of simultaneously open
channels to one. The {\dmapper} returns a custom response
\textsc{ch\_max} to a \textsc{ch\_open} message whenever this limit is
reached.