Help:API
Robowaifu.tech has an API your AI waifu can access. The site runs on cheap hosting, so please cache results to avoid hammering it with requests. Once the wiki has grown enough, I will provide a dataset download of the whole site.
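One way to follow the caching request is to keep API responses on disk and reuse them while they are still fresh. The following is only a minimal sketch: the function name `cached_fetch`, the `api_cache` directory, and the one-day `max_age` are assumptions for illustration, not part of the site's API.

```python
import hashlib
import json
import time
from pathlib import Path


def cached_fetch(url, fetch, cache_dir="api_cache", max_age=24 * 60 * 60):
    """Return fetch(url), reusing a copy saved on disk if it is recent enough.

    fetch is any callable that takes a URL and returns JSON-serializable data.
    cache_dir and max_age (seconds) are illustrative defaults.
    """
    cache_path = Path(cache_dir)
    cache_path.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it is safe to use as a file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    entry = cache_path / f"{key}.json"
    # Serve the saved copy if it exists and is younger than max_age.
    if entry.exists() and time.time() - entry.stat().st_mtime < max_age:
        return json.loads(entry.read_text(encoding="utf-8"))
    # Otherwise call the fetcher once and save the result for next time.
    data = fetch(url)
    entry.write_text(json.dumps(data), encoding="utf-8")
    return data
```

With requests installed, a call could look like `cached_fetch(url, lambda u: requests.get(u, timeout=30).json())`, so repeated lookups of the same URL within a day do not reach the server.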
Requirements
python -m pip install requests wikitextparser bs4 lxml
Get page contents
import requests
import wikitextparser as wtp

page = "Machine learning"
# Passing the query as params lets requests URL-encode the page title.
params = {"action": "parse", "page": page, "format": "json", "prop": "wikitext", "formatversion": 2}
response = requests.get("https://robowaifu.tech/w/api.php", params=params, timeout=30)
obj = response.json()["parse"]
# Convert the wikitext markup to plain text.
plain_text = wtp.parse(obj["wikitext"]).plain_text()
print(plain_text)
Result:
Machine learning is a field of study on methods that allow computers to learn from data without explicit programming. Instead of using human coded variables to perform specific tasks...
For more information, see MediaWiki API:Parsing wikitext.
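If the requested page does not exist, MediaWiki returns a JSON body with an "error" object instead of "parse", so indexing `["parse"]` as in the example above raises a `KeyError`. A small helper can turn that into a clearer exception. This is a sketch only: the name `extract_wikitext` and the choice of `LookupError` are assumptions for illustration.

```python
def extract_wikitext(data):
    """Return the wikitext from a decoded action=parse response.

    Expects the formatversion=2 layout used above, where
    data["parse"]["wikitext"] is a plain string.
    Raises LookupError if the API reported an error instead.
    """
    if "error" in data:
        err = data["error"]
        raise LookupError(f"{err.get('code', 'unknown')}: {err.get('info', '')}")
    return data["parse"]["wikitext"]
```

It would be called as `extract_wikitext(response.json())` before handing the text to wikitextparser.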
Search wiki
import requests
from bs4 import BeautifulSoup

search = '"information theory"'  # quoted phrase
# Passing the query as params lets requests URL-encode the search string.
params = {"action": "query", "list": "search", "srwhat": "text", "srsearch": search, "utf8": "", "format": "json"}
response = requests.get("https://robowaifu.tech/w/api.php", params=params, timeout=30)
for result in response.json()["query"]["search"]:
    print(f"= {result['title']} =")
    # Snippets contain HTML markup; keep only the text.
    print(BeautifulSoup(result["snippet"], "lxml").text)
    print("---")
Result:
= Entropy =
...s. [[Claude Shannon]] was the first to introduce the concept of entropy in information theory, and his work laid the foundation for modern digital communication and cryp
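The search example uses BeautifulSoup only to strip the HTML highlighting from each snippet. If you would rather avoid the extra dependency, the standard library can do the same job. The following is a rough sketch, and the names `_TextExtractor` and `snippet_to_text` are made up for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def snippet_to_text(snippet):
    """Return a search snippet with its HTML tags removed and entities decoded."""
    parser = _TextExtractor()
    parser.feed(snippet)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)
```

In the loop above, `snippet_to_text(result["snippet"])` would then take the place of the BeautifulSoup call.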
Get list of pages
import requests

def get_pages(apmin=100):
    '''apmin - minimum size in bytes a page must have to be included'''
    pages = []
    # The API parameter for the minimum page size is apminsize.
    params = {"action": "query", "format": "json", "list": "allpages", "apminsize": apmin}
    while True:
        response = requests.get("https://robowaifu.tech/w/api.php", params=params, timeout=30)
        obj = response.json()
        pages += obj["query"]["allpages"]
        if "continue" not in obj:
            break
        # Pass the continuation values back to fetch the next batch.
        params.update(obj["continue"])
    return pages

pages = get_pages(apmin=100)
for page in pages:
    print(page["title"])
Result:
3D printing
Animatronics
Anime
Arduino
Art and design
Artificial intelligence
...
For more information, see MediaWiki API:Allpages.
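When walking the page list to fetch every page, it helps to pause between requests so the cheap hosting is not overloaded. A minimal sketch of that idea follows. The name `fetch_politely`, the one-second default delay, and the `fetch_page` callable are assumptions for illustration; the site does not document a rate limit.

```python
import time


def fetch_politely(titles, fetch_page, delay=1.0, sleep=time.sleep):
    """Yield (title, content) pairs, waiting `delay` seconds between fetches.

    fetch_page is any callable that takes a page title and returns its content.
    sleep is injectable so the pause can be replaced in tests.
    """
    for index, title in enumerate(titles):
        if index > 0:
            # No need to wait before the very first request.
            sleep(delay)
        yield title, fetch_page(title)
```

Because it is a generator, each page is fetched only as the caller asks for it, for example `for title, text in fetch_politely(titles, my_fetch): ...`, where `titles` is a list of page titles and `my_fetch` is your own function that downloads one page.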