Z3GI is a Python tool and library that uses the [Z3 SMT solver][z3] for learning minimal consistent state machine models from labeled strings or input/output taces.

[z3]:https://github.com/Z3Prover/z3

Introduction

------------

Grammatical inference is the research field that is concerned with learning the set of rules that describe the behavior of a system, such as a network protocol, a piece of software, or a (formal) language.

One of the best studied problems in grammatical inference is that of finding a deterministic finite automaton (DFA) of minimal size that accepts a given set of positive strings and rejects a given set of negative strings.

This problem is can be very hard, as it has been shown to be NP-complete.

Z3GI provides different ways of solving this (and similar) problem(s) using satisfiability modulo theories (SMT).

Installing with pip (recommended)

---------------------------------

The recommended way of installing Z3GI is with `pip`:

```

$ pip install z3gi

```

Installing from sources

-----------------------

Alternatively, you can install Z3GI by cloning this repository, and installing using `setuptools`:

Consider a deterministic finite automaton (DFA) that accepts strings of `0`s and `1`s in which the number of `0`s minus twice the number of `1`s is either 1 or 3 more than a multiple of 5 (such a DFA is described [here][dfa]).

[dfa]:http://abbadingo.cs.nuim.ie/dfa.html

A training file `train.txt` for this DFA could read (if you have the sources of this package, this file can be found at `docs/train.txt`):

```

16 2

1 4 1 0 0 0

1 4 0 1 0 0

1 4 0 0 1 0

1 5 1 0 1 1 1

1 6 1 1 1 1 0 1

1 6 0 1 0 0 0 0

1 6 1 0 0 0 0 0

1 7 0 0 0 1 1 0 1

1 7 0 0 0 0 1 0 1

0 3 1 0 1

0 4 0 0 0 0

0 4 1 1 0 1

0 5 0 0 0 0 0

0 5 0 0 1 0 1

0 6 0 1 0 1 1 1

0 7 1 0 0 0 1 1 1

```

In this [Abbadingo file][abbadingo], the first line is a header, giving the number of strings in the file (16) and the number of symbols (2).

Where label `1` means accepted, and the label `0` means rejected.

We can use Z3GI to learn a model for the strings in `train.txt` as follows:

```

$ python -m z3gi --model -f train.txt

```

This produces the following output:

```

Learned model:

[state3 = 3,

start = 0,

state0 = 0,

n = 5,

state2 = 2,

state4 = 4,

1 = INPUT!val!0,

state1 = 1,

0 = INPUT!val!1,

out = [3 -> True, 4 -> True, else -> False],

trans = [(0, INPUT!val!0) -> 3,

(0, INPUT!val!1) -> 4,

(4, INPUT!val!0) -> 2,

(3, INPUT!val!0) -> 4,

(3, INPUT!val!1) -> 2,

(2, INPUT!val!0) -> 1,

(1, INPUT!val!1) -> 3,

(4, INPUT!val!1) -> 1,

else -> 0]]

```

We can interpret this learned model as follows.

-`0 = INPUT!val!1` and `1 = INPUT!val!0` provide identifiers for `0` and `1` (notice that the values in the identifiers and the actual inputs are different!)

-`n = 5` indicates that the learned model has 5 states

-`state0 = 0` through `state4 = 4` provide the identifiers for these states

-`out` describes an output function for these states (`True` if it is accepting and `False` if it is rejecting)

-`trans` desribes a transition function for states and symbols to states (e.g. `(0, INPUT!val!0) -> 3)` describes a transition from `state0` with `1` to `state3`)

Using z3gi in Python

--------------------

Let's learn the same model (from `train.txt`) in Python:

1. Open your Python interpreter:

```

$ python

```

2. Let's use a different encoder this time:

```

>>> from z3gi.encoders import expressive

>>> encoder = expressive.Encoder()

```

3. Create a sample:

```

>>> from z3gi.sample import Sample

>>> sample = Sample(encoder)

```

4. Add constraints for strings in `train.txt` to the sample:

```

>>> from z3gi.parsers import abbadingo

>>> for string, label in abbadingo.read(open('train.txt', 'r'), header=1):