Help:API

From Robowaifu Institute of Technology
Latest revision as of 23:10, 3 May 2023

This page requires expansion!
The MediaWiki API is vast and can even create edits. This page only has examples of how to get page contents and a list of pages.

Robowaifu.tech has an API your AI waifu can access. The site is on cheap hosting, so please cache results so it doesn't get hammered with too many requests. Once the wiki grows sufficiently, I will provide a dataset download of the whole site.
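One way to honor the caching request above without extra dependencies is to memoize responses on disk, keyed by URL. This is a minimal sketch, not part of the wiki's API; `cached_get` and the stub `fetch` are hypothetical names, and against the real API the fetcher would be `lambda u: requests.get(u).text`.

```python
import hashlib
import pathlib
import tempfile

def cached_get(url, fetch, cache_dir):
    """Return the response body for url, calling fetch(url) only on a cache miss."""
    cache_dir = pathlib.Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it becomes a safe, fixed-length filename
    path = cache_dir / hashlib.sha256(url.encode()).hexdigest()
    if path.exists():
        return path.read_text()
    text = fetch(url)
    path.write_text(text)
    return text

# Demo with a stub fetcher that records how often the network would be hit
calls = []
def fetch(url):
    calls.append(url)
    return f"body of {url}"

cache = tempfile.mkdtemp()
url = "https://robowaifu.tech/w/api.php?action=query&list=allpages&format=json"
first = cached_get(url, fetch, cache)
second = cached_get(url, fetch, cache)  # served from disk, fetch not called again
```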

Requirements

python -m pip install requests wikitextparser beautifulsoup4

Get page contents

import requests
import wikitextparser as wtp

# MediaWiki page titles use underscores in place of spaces
page = "Machine learning"
page = page.replace(' ', '_')
response = requests.get(f"https://robowaifu.tech/w/api.php?action=parse&page={page}&format=json&prop=wikitext&formatversion=2")
obj = response.json()["parse"]
# Strip the wiki markup down to plain text
plain_text = wtp.parse(obj["wikitext"]).plain_text()
print(plain_text)

Result:

Machine learning is a field of study on methods that allow computers to learn from data without explicit programming. Instead of using human coded variables to perform specific tasks...

For more information, see MediaWiki API:Parsing wikitext.
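If the requested page does not exist, the API returns an "error" object instead of a "parse" object, so indexing `response.json()["parse"]` as above raises `KeyError`. A minimal guard might look like the sketch below; the payloads are illustrative (the exact "info" text can differ), and `extract_wikitext` is a hypothetical helper name.

```python
def extract_wikitext(payload):
    """Return the wikitext from an action=parse response, or None on an API error."""
    if "error" in payload:  # MediaWiki reports failures under an "error" key
        return None
    return payload["parse"]["wikitext"]

# Illustrative response shapes for a found page and a missing page
ok = {"parse": {"title": "Machine learning", "wikitext": "'''Machine learning''' is..."}}
bad = {"error": {"code": "missingtitle", "info": "The page you specified doesn't exist."}}
print(extract_wikitext(ok))
print(extract_wikitext(bad))
```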

Search wiki

import requests
import urllib.parse
from bs4 import BeautifulSoup

search = "\"information theory\""
search = urllib.parse.quote(search)
response = requests.get(f"https://robowaifu.tech/w/api.php?action=query&list=search&srwhat=text&srsearch={search}&utf8=&format=json")
for result in response.json()["query"]["search"]:
    print(f"= {result['title']} =")
    # Snippets come back as HTML; the stdlib html.parser strips the markup
    print(BeautifulSoup(result["snippet"], "html.parser").text)
    print("---")

Result:

= Entropy =
...s. [[Claude Shannon]] was the first to introduce the concept of entropy in information theory, and his work laid the foundation for modern digital communication and cryp
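As an alternative to quoting the search string by hand, the whole query string can be assembled with `urllib.parse.urlencode`, which percent-encodes every value. A sketch, mirroring the parameters of the request above:

```python
import urllib.parse

params = {
    "action": "query",
    "list": "search",
    "srwhat": "text",
    "srsearch": '"information theory"',
    "format": "json",
}
# urlencode percent-encodes the quotes and joins the pairs with &
url = "https://robowaifu.tech/w/api.php?" + urllib.parse.urlencode(params)
print(url)
```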

Get list of pages

import requests

def get_pages(apminsize=100):
    '''apminsize - minimum size in bytes a page must have to be included'''
    pages = []
    response = requests.get(f"https://robowaifu.tech/w/api.php?action=query&format=json&list=allpages&apminsize={apminsize}")
    obj = response.json()
    pages += obj["query"]["allpages"]
    # Results are paginated; follow the continuation token until exhausted
    while "continue" in obj:
        apcontinue = obj["continue"]["apcontinue"]
        response = requests.get(f"https://robowaifu.tech/w/api.php?action=query&format=json&list=allpages&apcontinue={apcontinue}&apminsize={apminsize}")
        obj = response.json()
        pages += obj["query"]["allpages"]
    return pages

pages = get_pages(apminsize=100)
for page in pages:
    print(page['title'])

Result:

3D printing
Animatronics
Anime
Arduino
Art and design
Artificial intelligence
...

For more information, see MediaWiki API:Allpages.
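The loop above is specific to allpages; the same continuation protocol works for any list= query by merging the whole "continue" object into the next request's parameters. A sketch with a stub fetcher standing in for `requests.get(...).json()` against the real API; `query_all` and the sample responses are illustrative.

```python
def query_all(fetch, params):
    """Follow MediaWiki 'continue' tokens, yielding each response object."""
    params = dict(params)
    while True:
        obj = fetch(params)
        yield obj
        if "continue" not in obj:
            break
        # Merge every continuation parameter into the next request
        params.update(obj["continue"])

# Stub responses imitating two pages of an allpages query
responses = [
    {"query": {"allpages": [{"pageid": 1, "ns": 0, "title": "3D printing"}]},
     "continue": {"apcontinue": "Animatronics", "continue": "-||"}},
    {"query": {"allpages": [{"pageid": 2, "ns": 0, "title": "Animatronics"}]}},
]
def fetch(params):
    return responses.pop(0)

titles = [p["title"]
          for obj in query_all(fetch, {"action": "query", "list": "allpages"})
          for p in obj["query"]["allpages"]]
print(titles)
```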