Caroline/ArlongParkScrapeAndInterface

This repository includes the code used to scrape data from the Arlong Park Forums and a library built to power an API for the scraped data. Written in HTML (JS + CSS), Nim, and SQLite, with an abandoned backend in C#.

Caroline e69750ad17 Delete src/nim/searching/cheaporm		2024-12-22 09:09:15 +02:00
execenv	forced the HTTP server to its own git	2024-12-02 16:49:48 -05:00
src	Delete src/nim/searching/cheaporm	2024-12-22 09:09:15 +02:00
README.md	corrected some typos	2024-11-28 09:00:15 -05:00

README.md

What is this project?

This project is a scrape of the Arlong Park Forums (https://forums.arlongpark.net/). The scraped data was used as the dataset for an API and a web-based tool. A hosted version of the tool is available here: https://albassort.github.io/CDN/ArlongPark/search.html

The API and the database itself are hosted at https://albassort.com/ArlongPark

DISCLAIMER

I request that you DO NOT use these tools to scrape the Arlong Park Forums. The site has limited bandwidth because it runs on a small Linode server, and the data has already been gathered.

Endpoint map

"/query"
    Parameters:
        // Unix timestamps
        startTime[int64]
        endTime[int64]

        startChapter[int]
        endChapter[int]

        submitter[string]
        query[string]

        inPostId[int64]
        limit[int]
    returns:
        queryTime[int] // the total time it took to query the database
        subposts[array[SubPost]]
        exception[bool]

All parameters are optional. If query is null, the first 100 posts are retrieved.
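As a rough illustration of how these parameters might be combined, the sketch below builds a "/query" URL in Python. It assumes the parameters are sent as a URL query string on a GET request under the hosted address above, which this README does not confirm. `BASE_URL`, `QUERY_PARAMS`, and `build_query_url` are names made up for the example, and no request is actually sent.

```python
from urllib.parse import urlencode

# Base address taken from this README; the exact routing under it is an assumption.
BASE_URL = "https://albassort.com/ArlongPark"

# Parameter names as listed in the endpoint map above.
QUERY_PARAMS = (
    "startTime", "endTime", "startChapter", "endChapter",
    "submitter", "query", "inPostId", "limit",
)


def build_query_url(**params):
    """Build a /query URL, leaving out every parameter that was not supplied."""
    unknown = set(params) - set(QUERY_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    # Keep the documented order so the resulting URL is predictable.
    supplied = [(name, params[name]) for name in QUERY_PARAMS
                if params.get(name) is not None]
    if not supplied:
        return f"{BASE_URL}/query"
    return f"{BASE_URL}/query?{urlencode(supplied)}"


url = build_query_url(query="Zoro", startChapter=900, endChapter=1000, limit=25)
print(url)
```

Because every parameter is optional, the helper simply drops anything that was not passed instead of sending empty values.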

"/getInfo"
    Parameters:
        none
    returns:
        // unix time of the first and last post
        firstPost[int64]
        lastPost[int64]
        // the min and max chapter
        firstChapter[int]
        lastChapter[int]

Gets the metadata of the tables for the frontend.
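A frontend would typically turn these values into date and chapter ranges. The sketch below shows one way to decode such a response in Python. It assumes the body is a JSON object carrying the four fields listed above and that the Unix times are in seconds; neither is stated in this README. `ForumInfo`, `parse_info`, and the sample payload are invented for illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ForumInfo:
    first_post: datetime
    last_post: datetime
    first_chapter: int
    last_chapter: int


def parse_info(body: str) -> ForumInfo:
    """Decode a /getInfo body, assuming a JSON object with the four documented fields."""
    raw = json.loads(body)
    return ForumInfo(
        # Assumption: timestamps are Unix seconds, interpreted as UTC.
        first_post=datetime.fromtimestamp(raw["firstPost"], tz=timezone.utc),
        last_post=datetime.fromtimestamp(raw["lastPost"], tz=timezone.utc),
        first_chapter=int(raw["firstChapter"]),
        last_chapter=int(raw["lastChapter"]),
    )


# Made-up values for illustration only; they are not real data from the API.
sample = '{"firstPost": 0, "lastPost": 86400, "firstChapter": 1, "lastChapter": 2}'
info = parse_info(sample)
print(info.first_post.isoformat(), info.last_post.isoformat())
```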

"/findUser"
    Parameters:
        username[string]
    returns:
        // string and similarity score
        array[(string, int64)]

Compares the username parameter against every username in the database and returns each username with its similarity score.
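A caller will usually want only the closest matches from that list. The sketch below ranks a "/findUser" result in Python. It assumes the tuples arrive as two-element JSON arrays, and because this README does not say whether a higher score means a closer match, the direction is left as a parameter. `rank_users` and the sample body are hypothetical.

```python
import json


def rank_users(body, higher_is_closer=True, top=5):
    """Return the `top` (username, score) pairs, best match first."""
    pairs = [(str(name), int(score)) for name, score in json.loads(body)]
    # Sort direction is an assumption; flip higher_is_closer if the score is a distance.
    pairs.sort(key=lambda pair: pair[1], reverse=higher_is_closer)
    return pairs[:top]


# Made-up response body for illustration only.
sample = '[["zoro_fan", 40], ["Zorro", 85], ["nami", 10]]'
best = rank_users(sample, top=2)
print(best)
```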

"/findPost"
    Parameters:
        query[string]
    returns:
        // string and postid
        array[(string, int64)]

Looks for the query substring in all posts and returns every post that contains it.
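The behaviour described here can be pictured with a small in-memory stand-in. This is not the server's Nim/SQLite code, only a Python sketch of a substring search that returns (text, postid) pairs in the shape listed above. It assumes a case-sensitive match, which this README does not specify; `find_posts` and the sample posts are made up.

```python
def find_posts(posts, query):
    """Return (text, post_id) for every post whose text contains `query`."""
    # Assumption: plain case-sensitive substring matching.
    return [(text, post_id) for post_id, text in posts.items() if query in text]


# Made-up posts keyed by a hypothetical post id.
posts = {
    101: "Chapter 1000 was incredible",
    102: "I still prefer the Arlong Park arc",
    103: "Arlong Park is where it all clicked",
}
matches = find_posts(posts, "Arlong Park")
print(matches)
```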

"/getPostId"
    Parameters:
        postid[string]
    returns:
        (query response)

Gets a specific post, and only that post.