Help:API
This page should be expanded: the MediaWiki API is vast and can even create edits. This page only has examples of how to get page contents, search the wiki, and get a list of pages.
Latest revision as of 22:10, 3 May 2023
Robowaifu.tech has an API your AI waifu can access. The site is on cheap hosting, so please cache results so it doesn't get hammered with too many requests. Once the wiki sufficiently grows I will provide a dataset download of the whole site.
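One simple way to honor the caching request above is a small disk cache keyed on the request URL. This is an illustrative sketch, not part of the site's API: `cached_get`, the `api_cache` directory, and the one-day TTL are all assumed names and values you can change.

```python
import hashlib
import json
import time
from pathlib import Path

import requests

CACHE_DIR = Path("api_cache")   # illustrative local cache directory
CACHE_TTL = 24 * 60 * 60        # seconds before a cached response goes stale

def cached_get(url, ttl=CACHE_TTL):
    """GET a JSON API URL, reusing a cached copy newer than ttl seconds."""
    CACHE_DIR.mkdir(exist_ok=True)
    key = hashlib.sha1(url.encode()).hexdigest()  # stable cache key per URL
    path = CACHE_DIR / f"{key}.json"
    if path.exists() and time.time() - path.stat().st_mtime < ttl:
        return json.loads(path.read_text())
    data = requests.get(url).json()
    path.write_text(json.dumps(data))
    return data
```

Every example below that calls `requests.get(...).json()` can be routed through `cached_get` instead, so repeated runs don't re-hit the server.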
Requirements
python -m pip install requests wikitextparser bs4
Get page contents
import requests
import wikitextparser as wtp
page = "Machine learning"
page = page.replace(' ', '_')
response = requests.get(f"https://robowaifu.tech/w/api.php?action=parse&page={page}&format=json&prop=wikitext&formatversion=2")
obj = response.json()["parse"]
plain_text = wtp.parse(obj["wikitext"]).plain_text()
print(plain_text)
Result:
Machine learning is a field of study on methods that allow computers to learn from data without explicit programming. Instead of using human coded variables to perform specific tasks...
For more information, see MediaWiki API:Parsing wikitext (https://www.mediawiki.org/wiki/API:Parsing_wikitext).
Search wiki
import requests
import urllib.parse
from bs4 import BeautifulSoup
search = "\"information theory\""
search = urllib.parse.quote(search)
response = requests.get(f"https://robowaifu.tech/w/api.php?action=query&list=search&srwhat=text&srsearch={search}&utf8=&format=json")
for result in response.json()["query"]["search"]:
    print(f"= {result['title']} =")
    # "html.parser" avoids a dependency on lxml, which isn't in the requirements
    print(BeautifulSoup(result["snippet"], "html.parser").text)
    print("---")
Result:
= Entropy =
...s. [[Claude Shannon]] was the first to introduce the concept of entropy in information theory, and his work laid the foundation for modern digital communication and cryp
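The search list also accepts srlimit (no more than 50 for ordinary clients) and sroffset for paging through results. Building the query string with urllib.parse.urlencode handles the quoting automatically; search_url below is an illustrative helper, not part of the site's API:

```python
import urllib.parse

def search_url(query, limit=20, offset=0):
    """Build a robowaifu.tech full-text search URL; urlencode does the quoting."""
    params = {
        "action": "query",
        "list": "search",
        "srwhat": "text",
        "srsearch": query,
        "srlimit": limit,
        "sroffset": offset,
        "format": "json",
    }
    return "https://robowaifu.tech/w/api.php?" + urllib.parse.urlencode(params)

print(search_url('"information theory"'))
```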
Get list of pages
import requests
def get_pages(apminsize=100):
    '''apminsize - minimum size in bytes pages must have to be included
    (the allpages list uses apminsize; there is no apmin parameter)'''
    pages = []
    response = requests.get(f"https://robowaifu.tech/w/api.php?action=query&format=json&list=allpages&apminsize={apminsize}")
    obj = response.json()
    pages += obj["query"]["allpages"]
    # Results arrive in batches; follow the continue token to get the rest
    while "continue" in obj:
        apcontinue = obj["continue"]["apcontinue"]
        response = requests.get(f"https://robowaifu.tech/w/api.php?action=query&format=json&list=allpages&apcontinue={apcontinue}&apminsize={apminsize}")
        obj = response.json()
        pages += obj["query"]["allpages"]
    return pages
pages = get_pages(apminsize=100)
for page in pages:
    print(page["title"])
Result:
3D printing
Animatronics
Anime
Arduino
Art and design
Artificial intelligence
...
For more information, see MediaWiki API:Allpages (https://www.mediawiki.org/wiki/API:Allpages).