hybrid-ads · Commit d44610e4, authored 9 years ago by Joshua Moerman

Moved the (outdated) docs to other repositories

parent 613dd459
Showing 2 changed files with 0 additions and 229 deletions:

docs/explanation.tex (+0, −168)
docs/test_selection.tex (+0, −61)
docs/explanation.tex deleted 100644 → 0 (+0, −168)
\documentclass[envcountsame]{llncs}

\usepackage{amsmath}
\usepackage[backgroundcolor=white]{todonotes}

\newcommand{\Def}[1]{\emph{#1}}
\newcommand{\bigO}{\mathcal{O}}
\begin{document}
\maketitle
\section{Introduction}
Recently, automata learning has gained popularity. Learning algorithms are
applied to real-world systems (systems under learning, or SULs for short),
and this exposes some problems.
In the classical active learning algorithms such as $L^\ast$ one supposes a
teacher to which the algorithm can pose \Def{membership queries} and
\Def{equivalence queries}. In the former case the algorithm asks the teacher
for the output (sequence) for a given input sequence. In the latter case the
algorithm provides the teacher with a hypothesis, and the teacher answers
either with an input sequence on which the hypothesis behaves differently
from the SUL, or affirmatively in case the machines are behaviorally
equivalent.
In real-world applications we have to implement the teacher ourselves, despite
the fact that we do not know all the details of the SUL. Membership queries
are easily implemented by resetting the machine and applying the input. The
equivalence queries, however, are often impossible to implement. Instead, we
have to resort to some sort of random testing. Doing random testing naively is
of course hopeless, as the state space is often too big. Luckily, we have a
hypothesis at hand, which we can use for model-based testing.
One standard framework for model-based testing was pioneered by Chow and
Vasilevskii. Briefly, the framework supposes prefix sequences, which allow us
to go from the initial state to a given state $s$ (in the model) or even a
given transition $t \to s$, and suffix sequences, which test whether the
machine actually is in state $s$. If we have the right suffixes and test every
transition of the model, we can ensure that the SUL is either equivalent or
has strictly more states. Such a test suite can be constructed with a size
polynomial in the number of states of the model. This is contrary to
exhaustive testing or (naive) random testing, where there are exponentially
many sequences.
For the prefixes we can use any single-source shortest-path algorithm. In
fact, if we restrict ourselves to the above framework, this is the best we can
do. This gives $n$ sequences of length at most $n - 1$ (in fact the total sum
is at most $\frac{1}{2} n (n - 1)$).
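To make this concrete, here is a minimal Python sketch of the construction of
the prefixes by breadth-first search; the dictionary representation of the
transition function (\texttt{delta}) is our own assumption, not that of any
particular tool.

\begin{verbatim}
from collections import deque

def access_sequences(delta, inputs, s0):
    # Shortest access sequence for every state reachable from the
    # initial state s0, computed by breadth-first search.
    # delta:  dict mapping (state, input) to the successor state
    # inputs: the input alphabet I
    prefixes = {s0: ()}              # the empty word reaches s0
    queue = deque([s0])
    while queue:
        s = queue.popleft()
        for i in inputs:
            t = delta[(s, i)]
            if t not in prefixes:    # first visit is a shortest one
                prefixes[t] = prefixes[s] + (i,)
                queue.append(t)
    return prefixes
\end{verbatim}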
For the suffixes, we can use the standard Hopcroft algorithm to generate
separating sequences. If we want to test a given state $s$, we take the set of
suffixes (we allow the set of suffixes to depend on the state we want to test)
to be all separating sequences for all other states $t$. This set has at most
$n - 1$ elements of length at most $n$; again the total sum is at most
$\frac{1}{2} n (n - 1)$.
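For illustration, the following sketch finds a shortest separating sequence
for a single pair of states by breadth-first search over pairs. This naive
pairwise search is not Hopcroft's partition-refinement algorithm (which
handles all pairs at once, more efficiently); it merely shows what a
separating sequence is. The dictionaries \texttt{delta} and \texttt{lam} are
assumed representations of $\delta$ and $\lambda$.

\begin{verbatim}
from collections import deque

def separating_sequence(delta, lam, inputs, s, t):
    # Shortest word on which states s and t give different outputs,
    # found by breadth-first search over pairs of states.
    queue = deque([((s, t), ())])
    seen = {(s, t)}
    while queue:
        (u, v), word = queue.popleft()
        for i in inputs:
            if lam[(u, i)] != lam[(v, i)]:
                return word + (i,)   # outputs differ: done
            pair = (delta[(u, i)], delta[(v, i)])
            if pair not in seen:
                seen.add(pair)
                queue.append((pair, word + (i,)))
    return None                      # s and t are equivalent
\end{verbatim}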
A natural question arises: can we do better?
In the presence of a distinguishing sequence, Lee and Yannakakis prove that
one can take a set of suffixes consisting of just one element, of length at
most $\frac{1}{2} n (n - 1)$. This does not provide an improvement in the
worst-case scenario. Even worse, such a sequence might not exist.
In this paper we propose a testing algorithm which combines the two methods
described above. The distinguishing sequence might not exist, but the tree
constructed during the Lee and Yannakakis algorithm provides a lot of
information, which we can complement with the classical Hopcroft approach.
Despite the fact that this is not an improvement in the worst-case scenario,
this hybrid method enabled us to learn an industrial-grade machine, which was
infeasible to learn with the standard methods provided by LearnLib.
\section{Preliminaries}
We restrict our attention to \Def{Mealy machines}. Let $I$ (resp. $O$) denote
the finite set of inputs (resp. outputs). A Mealy machine $M$ consists of a
set of states $S$ with an initial state $s_0$, together with a transition
function $\delta : I \times S \to S$ and an output function
$\lambda : I \times S \to O$. Note that we assume machines to be deterministic
and total. We also assume that our system under learning is a Mealy machine.
Both functions $\delta$ and $\lambda$ are extended to words in $I^\ast$.
We are in the context of learning, so we will generally denote the hypothesis
by $H$ and the system under learning by $SUL$. Note that we can assume $H$ to
be minimal and reachable. We assume that the alphabets $I$ and $O$ are fixed
in these notes.
\begin{definition}
A \Def{set of words $X$ (over $I$)} is a subset $X \subset I^\ast$.
Given a set of states $S$, a \Def{family of sets $X$ (over $I$)} is a
collection $X = \{X_s\}_{s \in S}$ where each $X_s$ is a set of words.
\end{definition}
All words we consider are over $I$, so we will refer to them simply as words.
The idea of a family of sets was also introduced by Fujiwara. Families are
used to collect sequences which are relevant for a certain state. We define
some operations on sets and families:

\newcommand{\tensor}{\otimes}
\begin{itemize}
  \item Let $X$ and $Y$ be two sets of words over $I$, then $X \cdot Y$ is
    the set of all concatenations:
    $X \cdot Y = \{ xy \,|\, x \in X, y \in Y \}$.
  \item Let $X^n = X \cdots X$ denote the iterated concatenation and
    $X^{\leq k} = \bigcup_{n \leq k} X^n$ all concatenations of length at
    most $k$. In particular $I^n$ is the set of all words of length precisely
    $n$ and $I^{\leq k}$ the set of all words of length at most $k$.
  \item Let $X = \{X_s\}_{s \in S}$ and $Y = \{Y_s\}_{s \in S}$ be two
    families of sets. We define a new family of words $X \tensor_H Y$ as
    $(X \tensor_H Y)_s = \{ xy \,|\, x \in X_s, y \in Y_{\delta(s, x)} \}$.
    Note that this depends on the transitions in the machine $H$ (see the
    sketch after this list).
  \item Let $X$ be a family and $Y$ just a set of words, then the usual
    concatenation is defined as $(X \cdot Y)_s = X_s \cdot Y$.
  \item Let $X$ be a family of sets, then the union $\bigcup X$ forms a set
    of words.
\end{itemize}
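A minimal Python sketch of the $\tensor_H$ operation, assuming words are
represented as tuples and the transition function as a dictionary:

\begin{verbatim}
def run(delta, s, word):
    # Extend the transition function delta to words.
    for i in word:
        s = delta[(s, i)]
    return s

def tensor(X, Y, delta):
    # (X tensor_H Y)_s = { xy | x in X_s, y in Y_{delta(s, x)} }
    # X, Y: families, i.e. dicts mapping a state to a set of words.
    return {s: {x + y for x in X[s] for y in Y[run(delta, s, x)]}
            for s in X}
\end{verbatim}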
Let $H$ be a fixed machine and let $\tensor$ denote $\tensor_H$. We define
some useful sets (which depend on $H$):
\begin{itemize}
  \item The set of prefixes
    $P_s = \{ x \,|\, \text{a shortest } x \text{ such that } \delta(s, x) = t,\ t \in S \}$.
    Note that $P_{s_0}$ is particularly interesting. These sets can be
    constructed by any shortest-path algorithm. Note that $P \cdot I$ is a
    set covering all transitions in $H$.
  \item The set
    $W_s = \{ x \,|\, x \text{ separates } s \text{ and } t,\ t \in S \}$.
    This can be constructed using Hopcroft's algorithm, or Gill's algorithm
    if one wants minimal separating sequences (a sketch follows this list).
  \item If $x$ is an adaptive distinguishing sequence in the sense of Lee and
    Yannakakis, and $x_s$ denotes the associated UIO for state $s$, we define
    $Z_s = \{ x_s \}$.
\end{itemize}
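For instance, the family $W$ could be assembled from the pairwise search
sketched in the introduction (again only a sketch, under the same assumed
dictionary representation, not the more efficient Hopcroft construction):

\begin{verbatim}
def separating_family(delta, lam, inputs, states):
    # W_s: for each state s, one word separating s from every other
    # state t.  Reuses separating_sequence from the earlier sketch;
    # assumes the machine is minimal, so such a word always exists.
    return {s: {separating_sequence(delta, lam, inputs, s, t)
                for t in states if t != s}
            for s in states}
\end{verbatim}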
We obtain different methods (note that all test suites are expressed as
families of sets $X$; the actual test suite is $X_{s_0}$):
\begin{itemize}
  \item The original Chow and Vasilevskii (W-method) test suite is given by
    $$ P \cdot I^{\leq k + 1} \cdot \bigcup W $$
    which distinguishes $H$ from any non-equivalent machine with at most
    $|S| + k$ states (a sketch of this construction follows the list).
  \item The Wp-method as described by Fujiwara:
    $$ (P \cdot I^{\leq k} \cdot \bigcup W) \cup (P \cdot I^{\leq k + 1} \tensor W) $$
    which is a smaller test suite than the W-method, but just as strong. Note
    that the original description by Fujiwara is more detailed in order to
    reduce redundancy.
  \item The method proposed by Lee and Yannakakis:
    $$ P \cdot I^{\leq k + 1} \tensor Z $$
    which is as big as the Wp-method in the worst case (if it even exists)
    and just as strong.
\end{itemize}
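As an illustration, the W-method suite can be assembled directly from the
definitions above; this is a sketch, not the construction used by any
particular tool.

\begin{verbatim}
from itertools import product

def words_up_to(inputs, k):
    # All words over the alphabet of length at most k, i.e. I^{<=k}.
    return [tuple(w) for n in range(k + 1)
                     for w in product(inputs, repeat=n)]

def w_method(P, inputs, W, k):
    # The test suite P . I^{<=k+1} . (union of W), as a set of words.
    # P: dict state -> shortest access sequence (a tuple)
    # W: family of separating sequences, dict state -> set of words
    suffixes = set().union(*W.values())
    return {p + m + w
            for p in P.values()
            for m in words_up_to(inputs, k + 1)
            for w in suffixes}
\end{verbatim}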
An important observation is that the sizes of $P$, $W$ and $Z$ are polynomial
in the number of states of $H$, but that the middle part $I^{\leq k + 1}$ is
exponential. If the number of states of $SUL$ is known, one can perform a
(big) exhaustive test. In practice this is not known, or has a very large
bound. To mitigate this we can exhaust $I^{\leq 1}$ and then resort to
randomly sampling $I^\ast$. It is in this sampling phase that we want $W$ and
$Z$ to contain the least number of elements, as every element contributes to
the exponential blowup.
Note that $W$ can also be constructed in different ways. For example, taking
$W_s = \{ u_s \}$, where $u_s$ is a UIO for state $s$ (assuming they all
exist), gives valid variants of the first two methods. Also, if an adaptive
distinguishing sequence exists, all states have UIOs and we can use the first
two methods. The third method, however, is slightly smaller, as we do not
need $\bigcup W$ in this case, because the UIOs constructed from an adaptive
distinguishing sequence share (non-empty) prefixes.
% fix from http://tex.stackexchange.com/questions/103735/list-of-todos-todonotes-is-empty-with-llncs
\setcounter{tocdepth}{1}
\listoftodos

\end{document}
docs/test_selection.tex deleted 100644 → 0 (+0, −61)
\subsubsection{Augmented DS-method}
\label{sec:randomPrefix}
In order to reduce the number of tests, Chow~\cite{Ch78} and
Vasilevskii~\cite{vasilevskii1973failure} pioneered the so-called W-method.
In their framework a test query consists of a prefix $p$ bringing the SUL to
a specific state, a (random) middle part $m$ and a suffix $s$ assuring that
the SUL is in the appropriate state. This results in a test suite of the form
$P I^{\leq k} W$, where $P$ is a set of (shortest) access sequences,
$I^{\leq k}$ the set of all sequences of length at most $k$, and $W$ is a
characterization set. Classically, this characterization set is constructed
by taking the set of all (pairwise) separating sequences. For $k = 1$ this
test suite is complete in the sense that if the SUL passes all tests, then
either the SUL is equivalent to the specification or the SUL has strictly
more states than the specification. By increasing $k$ we can check additional
states.
We tried using the W-method as implemented by LearnLib to find counterexamples.
The generated test suite, however, was still too big in our learning context.
Fujiwara et al.~\cite{FBKAG91} observed that it is possible to let the set
$W$ depend on the state the SUL is supposed to be in. This allows us to take
only the subset of $W$ which is relevant for a specific state. This slightly
reduces the test suite while remaining as powerful as the full test suite.
This method is known as the Wp-method. More importantly, this observation
allows for generalizations where we can carefully pick the suffixes.
In the presence of an (adaptive) distinguishing sequence one can take $W$ to
be a single suffix, greatly reducing the test suite. Lee and
Yannakakis~\cite{LYa94} describe an algorithm (which we will refer to as the
LY algorithm) to efficiently construct this sequence, if it exists. In our
case, unfortunately, most hypotheses did not admit an adaptive distinguishing
sequence. In these cases the incomplete result of the LY algorithm still
contained a lot of information, which we augmented with pairwise separating
sequences.
\begin{figure}
\centering
\includegraphics[width=\textwidth]{hyp_20_partial_ds.pdf}
\caption{A small part of an incomplete distinguishing sequence as produced by
the LY algorithm. Leaves contain a set of possible initial states, inner
nodes have input sequences and edges correspond to different output symbols
(of which we only drew some), where Q stands for quiescence.}
\label{fig:distinguishing-sequence}
\end{figure}
As an example we show an incomplete adaptive distinguishing sequence for one
of the hypotheses in Figure~\ref{fig:distinguishing-sequence}. When we apply
the input sequence I46 I6.0 I10 I19 I31.0 I37.3 I9.2 and observe the outputs
O9 O3.3 Q ... O28.0, we know for sure that the SUL was in state 788.
Unfortunately not all paths lead to a singleton set. When for instance we
apply the sequence I46 I6.0 I10 and observe the outputs O9 O3.14 Q, we know
for sure that the SUL was in one of the states 18, 133, 1287 or 1295. In
these cases we have to perform more experiments, and we resort to pairwise
separating sequences.
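A minimal sketch of this augmentation step, assuming the partial result of
the LY algorithm is available as one word per state plus the sets of states
that share a leaf (the names and the representation are ours, not those of
the actual implementation):

\begin{verbatim}
def augmented_suffixes(ads_word, leaves, pairwise_sep):
    # ads_word:     dict state -> word prescribed by the partial ADS
    # leaves:       list of sets of states sharing a leaf of the tree
    # pairwise_sep: function (s, t) -> a word separating s and t
    Z = {s: {ads_word[s]} for s in ads_word}
    for leaf in leaves:
        for s in leaf:
            for t in leaf:
                if s != t:           # the ADS cannot tell s from t
                    Z[s].add(pairwise_sep(s, t))
    return Z
\end{verbatim}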
We note that this augmented DS-method is, in the worst case, no better than
the classical Wp-method. In our case, however, it greatly reduced the test
suites.
Once we have our set of suffixes, which we now call $Z$, our test algorithm
works as follows. The algorithm first exhausts the set $P I^{\leq 1} Z$. If
this does not provide a counterexample, we will randomly pick test queries
from $P I^2 I^\ast Z$, where the algorithm samples uniformly from $P$, $I^2$
and $Z$ (if $Z$ contains more than $1$ sequence for the supposed state) and
with a geometric distribution on $I^\ast$.
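A Python sketch of this sampling scheme (the stop probability of the
geometric distribution is a parameter we chose for illustration, not a value
from the implementation):

\begin{verbatim}
import random

def random_test_query(P, inputs, Z, run, p_stop=0.5):
    # One random test query from P . I^2 . I^* . Z.
    # P:   dict state -> access sequence reaching that state
    # Z:   family of suffixes, dict state -> set of words
    # run: function (state, word) -> state reached in the hypothesis
    s, prefix = random.choice(list(P.items()))
    middle = tuple(random.choice(inputs) for _ in range(2))
    infix = ()
    while random.random() > p_stop:      # geometric length
        infix += (random.choice(inputs),)
    target = run(s, middle + infix)      # the supposed state
    suffix = random.choice(list(Z[target]))
    return prefix + middle + infix + suffix
\end{verbatim}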