Instructions
Step 1. Compile
Dependencies
Nim (>= 2.0.6)
Rust & Cargo
sqlite3
For nimble deps:
nimble install db_connector tiny_sqlite
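Before compiling, it can help to confirm the tools listed above are on your PATH. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper, and it only checks that each command exists, not that Nim meets the 2.0.6 minimum.

```shell
# Print the name of every tool passed as an argument that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool names taken from the dependency list above (nimble ships with Nim).
missing=$(missing_tools nim nimble cargo sqlite3)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

If anything is reported missing, install it first and then run the nimble command above.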
To init:
./compile.sh
./initdbs.sh
Download Wikipedia
Make a folder in ../wikimedia wikipedia for the Wikipedia data. On my machine, it's symlinked to /mnt. I imagine many people will want to do the same, so the folder is not created by default.
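One way to set up the folder described above as a symlink to a larger disk is sketched below. `link_data_dir` is a hypothetical helper, and both paths in the example are assumptions: check downloadWikipedia.sh for the exact location it expects before using them.

```shell
# Create (or reuse) a storage directory and expose it at the path the scripts expect.
link_data_dir() {
  storage=$1   # where the data actually lives, e.g. a directory on a large disk
  link=$2      # where the scripts look for it
  mkdir -p "$storage"
  # -s: symbolic link, -f: replace an existing link, -n: treat an existing link as a file
  ln -sfn "$storage" "$link"
}

# Example with assumed paths (uncomment and adjust):
# link_data_dir /mnt/wikimedia ../wikimedia
```

If you have enough space next to the repository, a plain `mkdir -p` at the expected path works just as well and no symlink is needed.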
./downloadWikipedia.sh
Build data
./parquet_thing --test_data
./parquet_thing --words
./geneticTraining --iterations=-20 --output_words=50
./compile.sh
./scoring
These commands get the character occurrences, the word occurrences, and the test data, train the words, recompile, and finally score the result.
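Because each step depends on the output of the one before it, it may be convenient to chain the commands so that a failure stops the run. This is a sketch under that assumption; `step` and `build_data` are hypothetical names, and the commands and flags are copied unchanged from the list above.

```shell
# Echo a command, run it, and report failure so the chain stops.
step() {
  echo "==> $*"
  "$@" || { echo "step failed: $*" >&2; return 1; }
}

# Run the build steps in order; && skips the rest after the first failure.
build_data() {
  step ./parquet_thing --test_data &&
  step ./parquet_thing --words &&
  step ./geneticTraining --iterations=-20 --output_words=50 &&
  step ./compile.sh &&
  step ./scoring
}

# Run from the repository root, after compiling and downloading:
# build_data
```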