Commit 4d82876a authored by Paul Fiterau Brostean's avatar Paul Fiterau Brostean
Browse files

Changes to introduction. Added some references

parent b72aea67
\section{Introduction}\label{introduction}
The SSH protocol is used interact securely with remote machines. Alongside TLS and IPSec, SSH is amongst the most frequently used security suites~\cite{Albrecht2009Plaintext}. Due to its significant user base and sensitive nature, flaws in the protocol or its implementation could have major impact. It therefore comes as no surprise that SSH has attracted scrutiny from the security community. Problems with information leakage via unencrypted MACs were identified~\cite{Bellare2004Breaking}, as well as possibilities for plain text recovery when a block cipher in cipher block chaining mode was used~\cite{Albrecht2009Plaintext}. Other analysis found no serious issues~\cite{Williams2011Analysis}\cite{Paterson2010PlaintextDependent}. Academic researchers have so far focused more on the theoretical aspects than on implementations of the protocol.
The SSH protocol is used interact securely with remote machines. Alongside TLS and IPSec, SSH is amongst the most frequently used security suites~\cite{Albrecht2009Plaintext}. Due to its significant user base and sensitive nature, flaws in the protocol or its implementation could have major impact. It therefore comes as no surprise that SSH has attracted scrutiny from the security community.
The protocol specification has been subjected to various security analyses in \cite{Albrecht2009Plaintext,Bellare2004Breaking,Williams2011Analysis,Paterson2010PlaintextDependent}.
Formal methods have also been applied. Poll et. al. in \cite{Poll_rigorous_2011} formulate a thorough specification of SSH's Transport layer. They then use this specification to verify OpenSSH by manually inspecting the source code. The same specification is later used by Erik Boss\cite{Boss2012} in model based testing of the implementation. Udrea et al.\cite{Udrea_rule-based_2008} use static analysis to check two implementations of SSH against an extensive set of rules.
In this work, we use classical active automata learning, which we refer to as model learning, to infer state machines of 2 SSH implementations. Model learning has previously been applied to infer state machines of EMV bank cards~\cite{Aarts2013Formal}, electronic passports~\cite{Aarts2010Inference} and hand-held readers for online banking~\cite{Chalupar2014Automated}. More recently, it was used to learn implementations of TCP~\cite{TCP2016} and TLS~\cite{RuiterProtocol}. Model learning's goal is obtaining a state model of a black-box system by providing inputs and observing outputs. The learned state model corresponds to the observed behavior, and can be used in system analysis. Since model learning builds from a finite number of observations, we can never be sure that the learned model is correct. To that end, advanced conformance algorithms are employed, which yield some confidence that the system inferred is in fact correct. In the context of testing protocols, model learning can be seen as a form of protocol state fuzzing, whereby unexpected inputs are sent to a system under test in the hope of exposing hidden anomalies. In model learning, inputs are sent with no regard to the order imposed by the protocol. Inputs can therefore be sent disorderly. Any anomalies are then exposed in the learned model.
%In \cite{Boss2012}, model based testing was used to verify
%Academic researchers have so far focused more on the theoretical aspects than on implementations of the protocol.
Having obtained models, we use model checking to automatically verify their conformance to security properties. Such verification would be difficult manually, as the learned models are large. Moreover, the very satisfaction of properties can be used to refine the learned model. In cases where the learned model does not satisfy a security property, the proving sequence of inputs can either signal non-conformance of the system, or can be used to further refine the learned model. The security properties are drawn out of the RFC specifications\cite{rfc4251}\cite{rfc4252}\cite{rfc4253}\cite{rfc4254} and formalized in LTL. They are then checked for truth on the learned model using NuSMV~\cite{NuSMV}.
In this work, we use classical active automata learning, or simply model learning, to infer state machines of three SSH implementations, which we then verify by model checking.
Model learning has previously been applied to infer state machines of EMV bank cards~\cite{Aarts2013Formal}, electronic passports~\cite{Aarts2010Inference} and hand-held readers for online banking~\cite{Chalupar2014Automated}. More recently, it was used to learn implementations of TCP~\cite{TCP2016} and TLS~\cite{RuiterProtocol}. Model learning's goal is obtaining a state model of a black-box system by providing inputs and observing outputs. The learned state model corresponds to the observed behavior, and can be used in system analysis. Since model learning builds from a finite number of observations, we can never be sure that the learned model is correct. To that end, advanced conformance algorithms are employed, which yield some confidence that the system inferred is in fact correct. In the context of testing protocols, model learning can be seen as a form of protocol state fuzzing, whereby unexpected inputs are sent to a system under test in the hope of exposing hidden anomalies. In model learning, inputs are sent with no regard to the order imposed by the protocol. Inputs can therefore be sent disorderly. Any anomalies are then exposed in the learned model.
Our work is borne out of 2 recent theses \cite{Verleg2016}, \cite{Toon2016}. It most closely resembles work on TCP~\cite{TCP2016}, where they also combine classical learning and model checking to analyze various TCP implementations, with our work being more focused on security properties. Other case studies rely on
manual analysis of the learned models ~\cite{Aarts2013Formal},\cite{Chalupar2014Automated}, \cite{RuiterProtocol}. Our work differs,
since we use a model checker to automatically check specifications. %Moreover, the model checker is also used as means of testing the learned model.
Inference of protocols has also been done from observing network traffic ~\cite{Wang2011Inferring}. Such inference is limited to the traces that
occur over a network. Such inference is further hampered if the analyzed protocols communicate over encrypted channels, as this severely limits
Having obtained models, we use model checking to automatically verify their conformance to both functional and security properties. The security properties are drawn out of the RFC specifications\cite{rfc4251,rfc4252,rfc4253,rfc4254} and formalized in LTL. They are then checked for truth on the learned model using NuSMV~\cite{NuSMV}. Manually verifying these properties would be difficult, as the learned models are reasonably large. Moreover, formalizing properties means we can also better assess vagueness or under-specification in the RFC standards.
%Moreover, the very satisfaction of properties can be used to refine the learned model.
%In cases where the learned model does not satisfy a security property, the proving sequence of inputs can either signal non-conformance of the system, or can be used to further refine the learned model.
Our work is borne out of 2 recent theses \cite{Verleg2016,Toon2016} and to our knowledge, is the first combined application of model learning and model checking in verifying SSH implementations.
Previous applications of formal verification methods involve model based testing\cite{Boss2012} or static analysis\cite{Poll_rigorous_2011,Udrea_rule-based_2008}. The former limited analysis to the Transport layer. The latter required access to source code, which might not be available for proprietary implementations. \cite{Udrea_rule-based_2008} formulated an extensive set of rules, which were then used by a static analysis tool to verify C implementations. The set of rules covered all layers and it described both elements of message ordering and data flow, whereas we only analyze message ordering. However, rules were tied to routines in the code, so had to be slightly adapted to fit the different implementations. By contrast, our properties are defined at an abstract level so don't need to be adapted. Moreover employing a black box approach makes our approach easily portable and means we don't impose any constraints on the actual implementation of the system (like for example, that it is implemented in C).
Our approach, most closely resembles work on TCP~\cite{TCP2016}, where they also combine classical learning and model checking to analyze various TCP implementations, with our work being more focused on security properties. Other case studies involving model learning rely on manual analysis of the learned models ~\cite{Aarts2013Formal},\cite{Chalupar2014Automated}, \cite{RuiterProtocol}. Our work differs, since we use a model checker to automatically check specifications. Inference of protocols has also been done from observing network traffic ~\cite{Wang2011Inferring}. Such inference is limited to the traces that occur over a network. Such inference is further hampered if the analyzed protocols communicate over encrypted channels, as this severely limits
information available from traces without knowledge of the security key.
%Whereas the combined work of \cite{Poll_rigurous2011}, \cite{Boss2012} leverages model
%based testing on a specification model, we
%Moreover, the model checker is also used as means of testing the learned model.
......
......@@ -38,8 +38,46 @@
title = {Applying Automata Learning to Embedded Control Software}
}
@MastersThesis{Boss2012,
title={Evaluating implementations of SSH by means of model-based testing},
author={Lenaerts, T},
year={2012},
document_type = {Bachelor's Thesis},
type = {Bachelor's Thesis},
howpublished = {Online \url{https://pdfs.semanticscholar.org/8841/47071555f50b614c6af640cea5152fee10f2.pdf}},
journal = {Radboud University}
}
@article{Poll_rigorous_2011,
title = {Rigorous specifications of the {SSH} {Transport} {Layer}},
url = {http://www.cs.kun.nl/~erikpoll/publications/ssh.pdf},
urldate = {2017-02-13},
journal = {Radboud University Nijmegen, Tech. Rep. ICIS–R11004},
author = {Poll, Erik and Schubert, Aleksy},
year = {2011},
keywords = {ssh},
}
@article{Udrea_rule-based_2008,
title = {Rule-based static analysis of network protocol implementations},
volume = {206},
issn = {08905401},
url = {http://linkinghub.elsevier.com/retrieve/pii/S0890540107001332},
doi = {10.1016/j.ic.2007.05.007},
language = {en},
number = {2-4},
urldate = {2017-02-13},
journal = {Information and Computation},
author = {Udrea, Octavian and Lumezanu, Cristian and Foster, Jeffrey S.},
month = feb,
year = {2008},
keywords = {ssh},
pages = {130--157},
file = {1-s2.0-S0890540107001332-main.pdf:C\:\\Users\\Paul\\AppData\\Roaming\\Zotero\\Zotero\\Profiles\\tt1zn5x1.default\\zotero\\storage\\HIVN5QUU\\1-s2.0-S0890540107001332-main.pdf:application/pdf}
}
@MastersThesis{Toon2016,
title={Improving Protocol State Fuzzing of SSH},
title={Evaluating implementations of SSH by means of model-based testing},
author={Lenaerts, T},
year={2016},
document_type = {Bachelor's Thesis},
......@@ -48,6 +86,7 @@
journal = {Radboud University}
}
@phdthesis{Isberner2015,
author = {Malte Isberner},
year = 2015,
......@@ -165,11 +204,11 @@ machine learning algorithms},
@article{FutoranskyAttack,
author = {Futoransky, Ariel and Kargieman, Emiliano},
citeulike-article-id = {13837770},
%citeulike-article-id = {13837770},
edition = {oct. 1998},
year={1998},
howpublished = {Online. \url{http://www.coresecurity.com/files/attachments/CRC32.pdf}},
posted-at = {2015-11-13 16:33:04},
howpublished = {Online. \url{https://www.coresecurity.com/system/files/publications/2016/05/KargiemanPacettiRicharte_1998-CRC32.pdf}},
%posted-at = {2016-05},
priority = {2},
title = {An attack on {CRC}-32 integrity checks of encrypted channels using {CBC} and {CFB} modes}
}
......
Supports Markdown
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment