From Robowaifu Institute of Technology
This page requires expansion!
This wiki has an API your AI waifu can access. The MediaWiki API is vast and can even create edits; this page only has examples of how to get page contents, search the wiki, and get a list of pages. The site is on cheap hosting, so please cache results so it doesn't get hammered with requests. Once the wiki has grown sufficiently, I will provide a dataset download of the whole site.
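Since caching is requested above, here is a minimal sketch of one way to do it: save each JSON result to disk, keyed by a hash of the request parameters, and reuse it while it is still fresh. The names `cached_call`, `CACHE_DIR`, and `MAX_AGE` are hypothetical and not part of any library, and the one-day freshness window is an assumption to tune.

```python
import hashlib
import json
import time
from pathlib import Path

CACHE_DIR = Path("wiki_cache")  # hypothetical cache location
MAX_AGE = 24 * 60 * 60          # assumed freshness window: one day, in seconds

def cached_call(params, fetch, cache_dir=CACHE_DIR, max_age=MAX_AGE):
    """Return fetch(params), reusing a saved JSON result when it is fresh enough."""
    # Stable key: the same parameters always map to the same file name.
    raw = json.dumps(params, sort_keys=True).encode("utf-8")
    path = Path(cache_dir) / f"{hashlib.sha256(raw).hexdigest()}.json"
    if path.exists() and time.time() - path.stat().st_mtime < max_age:
        return json.loads(path.read_text(encoding="utf-8"))
    result = fetch(params)  # only reached on a cache miss
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result), encoding="utf-8")
    return result
```

The `fetch` argument is any function that takes the parameter dict and returns the decoded JSON, for example `lambda p: requests.get(api_url, params=p).json()`, where `api_url` stands for the wiki's `api.php` address.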

Requirements

python -m pip install requests wikitextparser bs4

Get page contents

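import requests
import wikitextparser as wtp

# Placeholder: the endpoint URL is missing from this copy; use the wiki's api.php address.
API_URL = "https://example.org/api.php"

page = "Machine learning"
page = page.replace(' ', '_')
# action=parse with prop=wikitext returns the raw wiki markup of the page
response = requests.get(f"{API_URL}?action=parse&page={page}&format=json&prop=wikitext&formatversion=2")
obj = response.json()["parse"]
# Strip the markup and keep only the readable text
plain_text = wtp.parse(obj["wikitext"]).plain_text()
print(plain_text)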


Machine learning is a field of study on methods that allow computers to learn from data without explicit programming. Instead of using human coded variables to perform specific tasks...

For more information, see MediaWiki API:Parsing wikitext.
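The example above indexes `["parse"]` directly, which raises a bare `KeyError` when the page does not exist, because the MediaWiki API reports failures as an `error` object with `code` and `info` fields (for a missing page the code should be `missingtitle`). A small sketch of a more explicit check follows; `WikiApiError` and `extract_wikitext` are hypothetical names, and the response layout assumes `formatversion=2` as used above.

```python
class WikiApiError(Exception):
    """Raised when the API response carries an error object (hypothetical helper)."""

def extract_wikitext(obj):
    """Return the wikitext string from a decoded action=parse response."""
    if "error" in obj:
        err = obj["error"]
        raise WikiApiError(f"{err.get('code')}: {err.get('info')}")
    # With formatversion=2 the wikitext is a plain string
    return obj["parse"]["wikitext"]
```

It would be called as `extract_wikitext(response.json())` in place of the two indexing steps.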

Search wiki

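import requests
import urllib.parse
from bs4 import BeautifulSoup

# Placeholder: the endpoint URL is missing from this copy; use the wiki's api.php address.
API_URL = "https://example.org/api.php"

search = "\"information theory\""
search = urllib.parse.quote(search)
response = requests.get(f"{API_URL}?action=query&list=search&srsearch={search}&utf8=&format=json")
for result in response.json()["query"]["search"]:
    print(f"= {result['title']} =")
    # Snippets contain HTML highlight tags; html.parser needs no extra install
    print(BeautifulSoup(result["snippet"], "html.parser").text)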


= Entropy =
...s. [[Claude Shannon]] was the first to introduce the concept of entropy in information theory, and his work laid the foundation for modern digital communication and cryp

Get list of pages

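import requests

# Placeholder: the endpoint URL is missing from this copy; use the wiki's api.php address.
API_URL = "https://example.org/api.php"

def get_pages(apmin=100):
    '''apmin - minimum size in bytes pages must have to be included'''
    pages = []
    # aplimit=max asks for the largest batch allowed, which keeps the request count low
    base = f"{API_URL}?action=query&list=allpages&aplimit=max&format=json&apminsize={apmin}"
    response = requests.get(base)
    obj = response.json()
    pages += obj["query"]["allpages"]
    while "continue" in obj:
        # Pass the continuation token via params so requests URL-encodes it
        apcontinue = obj["continue"]["apcontinue"]
        response = requests.get(base, params={"apcontinue": apcontinue})
        obj = response.json()
        pages += obj["query"]["allpages"]
    return pages

pages = get_pages(apmin=100)
for page in pages:
    print(page["title"])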


3D printing
Art and design
Artificial intelligence

For more information, see MediaWiki API:Allpages.
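The loop in `get_pages` follows only `apcontinue`. The MediaWiki documentation recommends merging every value of the returned `continue` object into the next request, which also works for other list queries such as search. A rough sketch of that pattern is below; `query_all` is a hypothetical helper, and `fetch` is assumed to be any function that takes a parameter dict and returns the decoded JSON response.

```python
def query_all(params, fetch, list_name):
    """Collect every item of a list query by following 'continue' objects."""
    items = []
    request = dict(params)
    while True:
        obj = fetch(request)
        items += obj["query"][list_name]
        if "continue" not in obj:
            return items
        # Merge all continuation values into the original parameters
        request = {**params, **obj["continue"]}
```

For the page list this could be called as `query_all({"action": "query", "list": "allpages", "aplimit": "max", "apminsize": 100, "format": "json"}, fetch, "allpages")`.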