# Natural Computing Project

## Preliminaries

### Install Rust
See: [https://www.rust-lang.org/tools/install](https://www.rust-lang.org/tools/install)

### Install Anaconda
See: [https://docs.anaconda.com/anaconda/install/index.html](https://docs.anaconda.com/anaconda/install/index.html)

### (Recommended) Install mamba for faster environment creation
See: [https://github.com/mamba-org/mamba](https://github.com/mamba-org/mamba)

### Install and activate the virtual environment
Replace `<mamba/conda>` with `mamba` or `conda`, depending on which you installed:
```bash
<mamba/conda> env create -f conda_environment.yml && \
conda activate natural_computing
```

## The dataset

### Obtain the raw data
Run the following commands:
```bash
mkdir -p data
wget https://nilsgolembiewski.nl/public_files/uploads/2IGcXY8HeE69JlgFk1QLCvBh7NRxAV/full_export.txt.gz -O - | gunzip -c > data/raw_data.txt
```

### Generate dataset from raw data
```bash
cargo run --manifest-path=data_generation/Cargo.toml --release -- -d ./data/raw_data.txt -o ./data/dataset -l 20
```

### Dataset structure
Folder structure: `folder/<canvas_id>/<canvas_id>_<user_id>_<idx>_<label>_<info>.<data_type>`.
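
As a rough illustration of this naming scheme, the Python sketch below splits a file name into its fields. The helper `parse_dataset_filename` and the `DatasetFile` type are hypothetical and not part of this repository. It assumes that only the trailing `<info>` field may contain underscores (for example `mask_points`) and keeps every field as a string. The example path uses made-up values.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class DatasetFile(NamedTuple):
    """Fields of `<canvas_id>_<user_id>_<idx>_<label>_<info>.<data_type>`."""
    canvas_id: str
    user_id: str
    idx: str
    label: str
    info: str
    data_type: str


def parse_dataset_filename(path: str) -> DatasetFile:
    """Split a dataset file name into its fields (hypothetical helper)."""
    p = PurePosixPath(path)
    data_type = p.suffix.lstrip(".")
    # Assumption: only <info>, the last field, may itself contain underscores,
    # so split on the first four underscores only.
    parts = p.stem.split("_", 4)
    if len(parts) != 5:
        raise ValueError(f"unexpected file name: {p.name}")
    return DatasetFile(*parts, data_type)


# Made-up example path, for illustration only.
parsed = parse_dataset_filename("dataset/12/12_345_0_grief_mask_points.txt")
```

If other fields can also contain underscores, this split is ambiguous and the parser would need to be adapted.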

`mask_points.txt` columns: `x`, `y`. The y-axis points down: `y = 0` is the top row of the image.
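
Because `y = 0` is the top row, tools that place the origin at the bottom left will draw the mask upside down unless the y-axis is flipped. A minimal sketch of that conversion, assuming integer pixel coordinates and a known image height; `flip_y` is a hypothetical helper and the points are made-up values:

```python
def flip_y(points, height):
    """Convert (x, y) points from a top-left origin to a bottom-left origin."""
    # Row 0 (top) becomes row height - 1 (top, counted from the bottom).
    return [(x, height - 1 - y) for x, y in points]


# Made-up points on an image that is 5 pixels tall.
points = [(0, 0), (3, 1), (2, 4)]
flipped = flip_y(points, height=5)
```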

`sequence.txt` columns: `canvas_id`, `user_id`, `x`, `y`, `r`, `g`, `b`, `timestamp`, `is_grief`
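
One way to read `sequence.txt` in Python is sketched below. The delimiter and the absence of a header row are assumptions, since the file format is not specified here. For that reason `delimiter` is left as a parameter and all values are kept as strings. `read_sequence` is a hypothetical helper and the sample row is made up.

```python
import csv
import io

SEQUENCE_COLUMNS = [
    "canvas_id", "user_id", "x", "y", "r", "g", "b", "timestamp", "is_grief",
]


def read_sequence(lines, delimiter=","):
    """Yield one dict per row, keyed by the column names listed above."""
    # Assumption: no header row, one record per line, fields split by `delimiter`.
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        if len(row) != len(SEQUENCE_COLUMNS):
            raise ValueError(f"expected {len(SEQUENCE_COLUMNS)} fields, got {len(row)}")
        yield dict(zip(SEQUENCE_COLUMNS, row))


# Made-up sample row, for illustration only.
sample = io.StringIO("1,42,10,20,255,0,0,1500000000,0\n")
rows = list(read_sequence(sample))
```

For a real file, pass an open file handle instead of the `io.StringIO` sample, and adjust `delimiter` if the fields turn out to be separated by tabs or spaces.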

### Download
A pregenerated dataset can be downloaded here: [https://nilsgolembiewski.nl/public_files/uploads/fDhANiJtdVw7EZSoW3sFyunk6mRL9q/dataset.zip](https://nilsgolembiewski.nl/public_files/uploads/fDhANiJtdVw7EZSoW3sFyunk6mRL9q/dataset.zip).