Zipfs-Law-Language-Detector/execenv/README.md
2024-10-13 15:18:16 -04:00

# Instructions
## Step 1. Compile
Dependencies:
```
Nim (>= 2.0.6)
Rust & Cargo
sqlite3
```
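Before compiling, it can help to confirm the tools are actually on your `PATH`. The sketch below is not part of this repo: `missing_tools` and `version_ge` are hypothetical helper names, it assumes `sqlite3` refers to the command-line tool, and it relies on `sort -V` (GNU coreutils) for the version comparison.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every tool from the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# version_ge A B: succeeds when version A >= version B (assumes `sort -V` is available).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

missing="$(missing_tools nim nimble cargo sqlite3)"
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All tools found"
fi

if command -v nim >/dev/null 2>&1; then
  # Assumes the first line of `nim --version` looks like
  # "Nim Compiler Version 2.0.8 [Linux: amd64]".
  nim_version="$(nim --version 2>/dev/null | head -n 1 | awk '{print $4}')"
  if version_ge "$nim_version" 2.0.6; then
    echo "Nim $nim_version is new enough"
  else
    echo "Nim $nim_version is older than 2.0.6"
  fi
fi
```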
For nimble deps:
```
nimble install db_connector/db_sqlite tiny_sqlite
```
To init:
```
./compile.sh
./initdbs.sh
```
## Step 2. Download Wikipedia
Create a folder for the Wikipedia data at `../wikimedia/wikipedia`. On my machine, it's symlinked to /mnt. I imagine many people will want to do the same, so the folder is not created by default.
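A minimal sketch of setting up that folder, either as a plain directory or as a symlink to another disk. `make_data_dir` is a hypothetical helper, not something shipped here, and both paths in the usage comment are assumptions (`/mnt/wikipedia` in particular is a made-up target).

```bash
# Hypothetical helper: create the data folder, or symlink it to a folder on another disk.
# Usage: make_data_dir <data_dir> [<storage_dir>]
# Pass <storage_dir> as an absolute path, otherwise the link resolves relative to its own location.
make_data_dir() {
  local data_dir="$1" storage_dir="${2:-}"
  if [ -e "$data_dir" ] || [ -L "$data_dir" ]; then
    echo "exists: $data_dir"          # leave an existing folder or link untouched
    return 0
  fi
  mkdir -p "$(dirname "$data_dir")"
  if [ -n "$storage_dir" ]; then
    mkdir -p "$storage_dir"
    ln -s "$storage_dir" "$data_dir"  # the data then lives on the other disk
  else
    mkdir -p "$data_dir"
  fi
}

# Example (both paths are assumptions):
# make_data_dir ../wikimedia/wikipedia /mnt/wikipedia
```

Calling it with one argument makes an ordinary folder; the second argument is only needed if the data should live elsewhere.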
```
./downloadWikipedia.sh
```
## Step 3. Build data
```bash
./parquet_thing --test_data
./parquet_thing --words
./geneticTraining --iterations=-20 --output_words=50
./compile.sh
./scoring
```
This collects the character occurrences, the word occurrences, and the test data, trains the words, recompiles, and finally scores the result.
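Since each step depends on the one before it, you may want the sequence to stop at the first failure instead of carrying on with stale data. The wrapper below is only a sketch: `run_steps` is a hypothetical name, and the commented-out call simply repeats the commands above in the same order.

```bash
# Hypothetical wrapper: run each command in order and stop at the first failure.
run_steps() {
  local step
  for step in "$@"; do
    echo ">> $step"
    # Word-splitting is intentional here: each step is a command plus its flags.
    $step || { echo "failed: $step" >&2; return 1; }
  done
}

# The build steps from above, unchanged (uncomment to run from this directory):
# run_steps \
#   "./parquet_thing --test_data" \
#   "./parquet_thing --words" \
#   "./geneticTraining --iterations=-20 --output_words=50" \
#   "./compile.sh" \
#   "./scoring"
```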