# Instructions

## Step 1. Compile

Dependencies:

```
Nim (>= 2.0.6)
Rust & Cargo
sqlite3
```
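
Before compiling, it can help to confirm the dependencies are actually on your `PATH`. The snippet below is a minimal sketch, not part of the repo: `check_deps` is a hypothetical helper, and the executable names (`nim`, `nimble`, `cargo`, `sqlite3`) are assumptions. In particular, it assumes the `sqlite3` entry means the command-line tool; if only the library is needed, that check may be stricter than necessary.

```bash
# Hypothetical helper: report which of the given commands are missing from PATH.
# Prints one "missing: <name>" line per absent tool; returns non-zero if any is absent.
check_deps() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed executable names for the dependencies listed above.
check_deps nim nimble cargo sqlite3 || echo "Install the missing tools before running ./compile.sh"
```

This does not check the Nim version; `nim --version` shows it if you need to confirm it is at least 2.0.6.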

For nimble deps:

```
nimble install db_connector tiny_sqlite
```

To compile and initialize the databases:

```
./compile.sh
./initdbs.sh
```

## Download Wikipedia

Create a folder named `wikipedia` inside `../wikimedia` to hold the Wikipedia data. On my machine, it's symlinked to /mnt. I imagine many people will want to do the same, so the folder is not created by default.

```./downloadWikipedia.sh```
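
If you want to script the folder setup, something like the following could work. It is only a sketch: `setup_wiki_dir` is a hypothetical helper, and both the `../wikimedia/wikipedia` location and the `/mnt/wikipedia` target in the example are assumptions about the layout, so adjust them to your machine.

```bash
# Hypothetical helper: create the data folder, optionally as a symlink to a larger disk.
# Usage: setup_wiki_dir <link_path> [<target_dir>]
# Use an absolute <target_dir>; a relative one would be resolved from the link's directory.
setup_wiki_dir() {
  local link_path="$1" target="${2:-}"
  if [ -e "$link_path" ]; then
    echo "exists: $link_path"      # leave an existing folder or link alone
  elif [ -n "$target" ]; then
    mkdir -p "$target" "$(dirname "$link_path")"
    ln -s "$target" "$link_path"   # the data itself lives in $target
    echo "linked: $link_path -> $target"
  else
    mkdir -p "$link_path"
    echo "created: $link_path"
  fi
}

# Example (paths are assumptions):
# setup_wiki_dir ../wikimedia/wikipedia /mnt/wikipedia
```

Call it with one argument for a plain folder, or with two to keep the data on another disk behind a symlink.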

## Build data

```bash
./parquet_thing --test_data
./parquet_thing --words
./geneticTraining --iterations=-20 --output_words=50
./compile.sh
./scoring
```

This gets the character occurrences, the word occurrences, and the test data, trains the words, recompiles, and finally scores the result.
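
Since each step depends on the one before it, you may want the sequence to stop at the first failure instead of carrying on with stale data. A minimal sketch, assuming the binaries sit in the current directory; `run_steps` is a hypothetical helper, not something the repo provides:

```bash
# Hypothetical helper: run each command in order and stop at the first one that fails.
run_steps() {
  local step
  for step in "$@"; do
    echo ">> $step"
    if ! sh -c "$step"; then
      echo "failed: $step" >&2
      return 1
    fi
  done
}

# The build sequence from this section (uncomment to run from the repo root):
# run_steps \
#   "./parquet_thing --test_data" \
#   "./parquet_thing --words" \
#   "./geneticTraining --iterations=-20 --output_words=50" \
#   "./compile.sh" \
#   "./scoring"
```

Putting `set -e` at the top of a plain script gives similar fail-fast behavior without the helper.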