ArlongParkScrapeAndInterface/README.md
2024-11-28 09:00:15 -05:00

What is this project?

This project is a scrape of the Arlong Park Forums (https://forums.arlongpark.net/). The scraped data serves as the dataset for an API and a web-based search tool. A hosted version of the tool is available here: https://albassort.github.io/CDN/ArlongPark/search.html

The API and the database itself are hosted at https://albassort.com/ArlongPark

DISCLAIMER

I request that you DO NOT use these tools to scrape the Arlong Park Forums. The forum runs on a small Linode server with limited bandwidth, and the data has already been gathered.

Endpoint map

"/query"
    Parameters:
        // Unix timestamps
        startTime[int64]
        endTime[int64]

        startChapter[int]
        endChapter[int]

        submitter[string]
        query[string]

        inPostId[int64]
        limit[int]
    returns:
        queryTime[int] // the total time it took to query the database
        subposts[array[SubPost]]
        exception[bool]

All parameters are optional. If query is null, the first 100 posts are retrieved.
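As a rough illustration of how the optional parameters combine, here is a minimal Python sketch that builds a "/query" URL and leaves out anything that was not set. It assumes the endpoint accepts its parameters as a GET query string under the base URL given above, which this README does not state. `build_query_url` is a hypothetical helper, not part of the project.

```python
from urllib.parse import urlencode

# Assumed base URL, taken from the hosting address given in this README.
BASE_URL = "https://albassort.com/ArlongPark"

# Parameter names as listed under "/query".
QUERY_PARAMS = (
    "startTime", "endTime", "startChapter", "endChapter",
    "submitter", "query", "inPostId", "limit",
)


def build_query_url(base_url=BASE_URL, **params):
    """Return a /query URL containing only the parameters that were set."""
    unknown = set(params) - set(QUERY_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    # Keep the documented order and drop anything left as None.
    pairs = [(name, params[name]) for name in QUERY_PARAMS
             if params.get(name) is not None]
    url = f"{base_url}/query"
    # Assumption: parameters are sent as a query string.
    return f"{url}?{urlencode(pairs)}" if pairs else url


print(build_query_url(query="Zoro", startChapter=1, endChapter=100, limit=10))
```

Calling `build_query_url()` with no arguments yields the bare "/query" URL, which corresponds to the default case described above.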

"/getInfo"
    Parameters:
        none
    returns:
        // Unix time of the first and last post
        firstPost[int64]
        lastPost[int64]
        // the min and max chapter
        firstChapter[int]
        lastChapter[int]

Gets the metadata of the tables for the frontend.
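A frontend would typically turn these values into date and chapter ranges. The sketch below shows one way to do that in Python, assuming the response body is JSON with the field names listed above and that the Unix times are in seconds. Neither is confirmed here, and the sample values are made up.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ForumInfo:
    first_post: datetime
    last_post: datetime
    first_chapter: int
    last_chapter: int


def parse_info(body: str) -> ForumInfo:
    """Parse a /getInfo body, assuming JSON with Unix times in seconds."""
    data = json.loads(body)
    return ForumInfo(
        first_post=datetime.fromtimestamp(data["firstPost"], tz=timezone.utc),
        last_post=datetime.fromtimestamp(data["lastPost"], tz=timezone.utc),
        first_chapter=int(data["firstChapter"]),
        last_chapter=int(data["lastChapter"]),
    )


# Made-up sample values, for illustration only.
sample = '{"firstPost": 0, "lastPost": 86400, "firstChapter": 1, "lastChapter": 2}'
info = parse_info(sample)
print(info.first_post.isoformat(), info.last_post.isoformat())
print(info.first_chapter, info.last_chapter)
```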

"/findUser"
    Parameters:
        username[string]
    returns:
        // username and similarity score
        array[(string, int64)]

Compares the username parameter against every username in the database and returns each candidate with its similarity score.
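To pick the closest match from such a result, a client could sort the pairs by score. This Python sketch assumes the array arrives as JSON in the form `[[username, score], ...]`. The README does not say whether a higher score means a closer match, so the direction is left as a parameter. `rank_users` and the sample data are hypothetical.

```python
import json


def rank_users(body: str, higher_is_closer: bool = True):
    """Parse [[username, score], ...] and return it sorted best match first."""
    pairs = [(str(name), int(score)) for name, score in json.loads(body)]
    # Assumption: the sort direction depends on how the server scores similarity.
    return sorted(pairs, key=lambda pair: pair[1], reverse=higher_is_closer)


# Made-up sample response, for illustration only.
sample = '[["Zoro", 3], ["ZoroFan", 7], ["Nami", 1]]'
print(rank_users(sample)[0])
print(rank_users(sample, higher_is_closer=False)[0])
```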

"/findPost"
    Parameters:
        query[string]
    returns:
        //string and postid
        array[(string, int64)]

Searches all posts for the query substring and returns every post that contains it.

"/getPostId"
    Parameters:
        postid[string]
    returns:
        (query response)

Gets a single post by its ID, and nothing else.
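The post IDs returned by "/findPost" can be fed into "/getPostId". The Python sketch below turns a "/findPost" result into a list of "/getPostId" URLs. It assumes the result is JSON shaped as `[[string, postid], ...]` and that `postid` is passed as a GET query-string parameter, neither of which is confirmed here. `post_urls` and the sample data are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed base URL, taken from the hosting address given in this README.
BASE_URL = "https://albassort.com/ArlongPark"


def post_urls(find_post_body: str, base_url: str = BASE_URL):
    """Build one /getPostId URL per (string, postid) pair in a /findPost result."""
    pairs = json.loads(find_post_body)
    return [
        f"{base_url}/getPostId?{urlencode({'postid': post_id})}"
        for _text, post_id in pairs
    ]


# Made-up sample response, for illustration only.
sample = '[["first matching post", 42], ["second matching post", 1337]]'
for url in post_urls(sample):
    print(url)
```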