Commit ddfdd67f authored by Joshua Moerman
Updates section on conformance testing

parent b987aea1
......@@ -894,4 +894,14 @@ title="Formal Methods for Protocol Engineering and Distributed Systems",
year="1999",
publisher="Springer US",
doi="10.1007/978-0-387-35578-8_13"
}
@article{Fujiwara1991,
title={Test selection based on finite state models},
author={Fujiwara, Susumu and Bochmann, Gregor V and Khendek, Ferhat and Amalou, Mokhtar and Ghedamsi, Abderrazak},
journal={Software Engineering, IEEE Transactions on},
volume={17},
number={6},
pages={591--603},
year={1991},
publisher={IEEE}
}
\ No newline at end of file
......@@ -90,7 +90,7 @@ If the system's response is the same as the predicted response (by the hypothesi
Otherwise, if there is a test for which the target and the hypothesis produce different outputs, then this input sequence can be used as a counterexample.
One of the main advantages of using conformance testing is that it can efficiently identify the hypothesis from the set of all finite state machines of size at most $m$.
This means that if we know a bound $m$ for the size of the system we learn, we are guaranteed to find a counterexample.
This means that if we know a bound $m$ for the size of the system we learn, we are guaranteed to find a counterexample if there exists one.
Unfortunately, conformance testing has some notable drawbacks.
First, it is hard (or even impossible) in practice to determine an upper bound on the number of states of the system's target FSM.
Second, it is known that testing becomes exponentially more expensive for higher values of $m$ \citep{Vasilevskii1973}. %, as a complete test set should include each sequence in the so-called \emph{traversal set}, which contains all input sequences of length $l = m - n + 1$ .
......@@ -390,8 +390,12 @@ For a description of discriminator finalization, we refer to \cite{Isberner2014a
%\end{figure}
\subsection{Conformance testing} \label{sec:testing}
One of the main advantages of using conformance testing for finding counterexamples is that it can discover all counterexamples for a hypothesis $H = (I, O, Q_H, q_H, \delta_H, \lambda_H)$ with $n$ states, under the assumption that the target FSM $M = (I, O, Q_M, q_M, \delta_M, \lambda_M)$ has at most $m$ states, $m \geq n$.
Several different methods exist for constructing a so-called \emph{$m$-complete} test set (for a complete introduction and an overview of these methods, we refer to \citet{Dorofeeva2010}).
Conformance testing for FSMs is an efficient way of finding counterexamples.
Many methods exist that are guaranteed to find a counterexample to a hypothesis, if one exists, in a fairly efficient way.
Let $H = (I, O, Q_H, q_H, \delta_H, \lambda_H)$ be a hypothesis with $n$ states.
We call a conformance testing method $m$-complete if it can identify the hypothesis in the set of all FSMs with at most $m$ states.
Such $m$-complete methods are generally polynomial in the size of the hypothesis and exponential in $m - n$, which makes them far more efficient than an exhaustive search.
For an overview of some $m$-complete methods, we refer to \citet{Dorofeeva2010}.
All of these methods require the following information:
\begin{itemize}
\item A set of \emph{access sequences} $S = \{\lfloor q \rfloor_H | q \in Q_H\}$, possibly extended to a \emph{transition cover} set $S \cdot I$.
......@@ -399,12 +403,12 @@ All of these methods require the following information:
$.
\item A means of pairwise distinguishing all states of $H$, such as a set of \emph{discriminators} $E$ for all pairs of states in $H$.
\end{itemize}
A test set is then constructed by taking the product of these sets, or subsets of these sets, e.g.\ $S \cdot I^{l} \cdot E$.
A test suite is then constructed by combining these sets, or subsets of these sets, e.g.\ $S \cdot I^{l} \cdot E$.
The difference between different testing methods is how states are distinguished (i.e.\ the last part).
In the so-called \emph{partial W-method}, or \emph{Wp-method}, \citep{Fujiwara1991} states are distinguished based on the current state of the hypothesis:
In the so-called \emph{partial W-method}, or \emph{Wp-method}, \citep{Fujiwara1991} states are distinguished pairwise:
For each state $q \in Q_H$ a set $E_{q} \subset E$ of discriminators is constructed, such that for each state $q' \in Q_H \setminus \{q\}$ there is a sequence $w \in E_{q}$ that distinguishes $q$ and $q'$, i.e.\ $\lambda_H(q, w) \neq \lambda_H(q', w)$.
Then, each trace $uv, u \in S \cdot I, v \in I^{l}$ is concatenated with the set $E_q$ such that $q = \delta_H(q_H, uv)$.
Then, each trace $uv, u \in S \cdot I, v \in I^{l}$ is extended with the set $E_q$ where $q = \delta_H(q_H, uv)$.
%The method that uses a set of discriminators for all pairs of states is called the \emph{W-method} \cite{chow1978testing}.
......@@ -414,12 +418,12 @@ Then, each trace $uv, u \in S \cdot I, v \in I^{l}$ is concatenated with the set
%Recently, however, \citet{Smeenk2015} have described a test method that uses adaptive distinguishing sequences to construct a test sets that (in most cases) is smaller than one constructed by the W-method (although they have the same worst case complexity).
%We will refer to this method as \emph{ADS}.
Conformance testing is typically very expensive due to the exponential size of the traversal set.
Given a hypothesis $H$ with $n$ states and $k$ inputs, the worst-case length of a test set (i.e. the sum of the length of all sequences) is of order $\mathcal{O}(k^{l}n^3)$ (recall that $l = m - n + 1$, where $m$ is the upper bound on the number of states of $M$).
Conformance testing is typically expensive due to the exponential size of the traversal set.
Given a hypothesis $H$ with $n$ states and $k$ inputs, the worst-case length of a test suite (i.e.\ the sum of the lengths of all sequences) is of order $\mathcal{O}(k^{l}n^3)$ (recall that $l = m - n + 1$, where $m$ is the upper bound on the number of states of $M$).
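To get a feeling for this bound, consider the illustrative values $n = 10$, $k = 5$ and $m = 12$ (chosen here only as an example, and ignoring constant factors).
Then $l = m - n + 1 = 3$ and
\[
  k^{l} n^{3} = 5^{3} \cdot 10^{3} = 125\,000 ,
\]
whereas raising the bound to $m = 15$ gives $l = 6$ and $5^{6} \cdot 10^{3} = 15\,625\,000$.
In this estimate, each additional state allowed in $M$ multiplies the size by $k$.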
Moreover, it is hard to estimate an upper bound for $M$ in practice.
Therefore, instead of iterating over the traversal set $I^{l}$ (for each prefix in $S$), we sample the set $I^{\ast}$ by randomly generating sequences $v \in I^{\ast}$ according to a geometric distribution.
While generating a sequence, the probability of terminating the sequence is $p = 1/x$.
Therefore, the expected (mean) length of such a sequence is $E(l) = 1/p = x$, and the probability that a sequence is of length $l$ is $(1-p)^{l-1}p$.
Therefore, instead of exhausting these test suites, we randomly sample from them, where the length of each test is random as well.
To be precise, we first sample a prefix uniformly from $S$, then sample from $I^{\ast}$ by randomly generating a sequence $v \in I^{\ast}$ according to a geometric distribution, and finally sample a suffix uniformly from $E$.
By using a geometric distribution we are not bounding the length of the sequence.
For our tools the expected length can be set by a parameter.
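As a sketch of the sampling step, assume, for illustration only, that $v$ is generated one symbol at a time and that generation stops after each symbol with a fixed probability $p$.
Under this assumption the length $|v|$ satisfies
\[
  \Pr(|v| = j) = (1-p)^{j-1}\, p \quad (j \geq 1),
  \qquad
  \sum_{j \geq 1} j\, (1-p)^{j-1}\, p = \frac{1}{p} ,
\]
so the mean length is $1/p$.
For example, an expected length of $4$ corresponds to $p = 1/4$, and then $\Pr(|v| > 10) = (3/4)^{10} \approx 0.056$: long sequences remain possible, but become increasingly unlikely.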
%often a value close to $n$ is picked for $m$, in the hope that the test set contains at least some counterexamples.
......