hyperglot Provide more "how to" documentation for Python library use

As per #28 and #86: the library works well through the CLI, but without documentation it is hard to use as a standalone Python library.

Jun 27 '22 08:06 kontur

And (for later inclusion in an aggregated list of examples) "How to get languages/language counts by validity":

from hyperglot import VALIDITYLEVELS
from hyperglot.languages import Languages

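# One list of ISO codes per validity level
counts = {level: [] for level in VALIDITYLEVELS}

# VALIDITYLEVELS[0] is the lowest level, so all validity levels are included
for iso, language in Languages(validity=VALIDITYLEVELS[0]).items():
    counts[language["validity"]].append(iso)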

print({level: len(isos) for level, isos in counts.items()})
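
For trying the tally without the Hyperglot data at hand, here is a minimal sketch of the same counting using `collections.Counter`. The `levels` list and the `sample` dict are made-up stand-ins (assumptions) for `VALIDITYLEVELS` and `Languages(...).items()`, and the ISO keys are placeholders:

```python
from collections import Counter

# Assumed level names, standing in for hyperglot.VALIDITYLEVELS.
levels = ["todo", "draft", "preliminary", "verified"]

# Made-up sample entries, standing in for the raw data in Languages(...).
sample = {
    "aaa": {"validity": "verified"},
    "bbb": {"validity": "draft"},
    "ccc": {"validity": "verified"},
    "ddd": {},  # entry without a validity key, skipped below
}

# Count how often each validity value occurs.
tally = Counter(
    data["validity"] for data in sample.values() if "validity" in data
)

# Report every level, including those with no languages.
counts = {level: tally[level] for level in levels}
print(counts)
```

Unlike the snippet above, this keeps only the counts and not the lists of ISO codes, which is enough when only the totals are needed.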

Jul 01 '22 08:07 kontur

And "How many scripts are in the Hyperglot data" (all validity levels, all orthographies):

from hyperglot import VALIDITYLEVELS
from hyperglot.languages import Languages
from hyperglot.language import Language

scripts = set()

for iso, language in Languages(validity=VALIDITYLEVELS[0]).items():
    lang = Language(language, iso)
    if "orthographies" in lang:
        scripts.update(o["script"] for o in lang["orthographies"])

print(len(scripts), sorted(scripts))

Jul 01 '22 09:07 kontur

To document: 0.4.2 now distinguishes between accessing the raw yaml data of a language and getting a ready-to-use Language object, e.g.:

from hyperglot.languages import Languages
hg = Languages()

# the raw yaml for 'eng'
hg["eng"]

# a ready to use hyperglot.language.Language object
hg.eng

This is a lot more convenient than having to initialize Language objects with Language(Languages()["xxx"], "xxx").
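
For the docs, it may help to illustrate the difference between the two access styles with a toy example. The sketch below is not Hyperglot's actual implementation; `LanguagesSketch` and `LanguageSketch` are hypothetical stand-ins, and the sample data is made up. It only shows one way item access can return raw data while attribute access returns a wrapped object:

```python
class LanguageSketch(dict):
    """Hypothetical stand-in for hyperglot.language.Language."""

    def __init__(self, data, iso):
        super().__init__(data)
        self.iso = iso


class LanguagesSketch(dict):
    """Hypothetical stand-in for hyperglot.languages.Languages."""

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails,
        # so dict methods such as .items() keep working.
        try:
            return LanguageSketch(self[name], name)
        except KeyError:
            raise AttributeError(name) from None


# Made-up sample data in place of the real yaml.
hg = LanguagesSketch({"eng": {"name": "English", "validity": "verified"}})

raw = hg["eng"]   # the plain dict, as stored
wrapped = hg.eng  # a wrapped object that also knows its ISO code
```

In this sketch an unknown code such as `hg.xxx` raises `AttributeError`, while `hg["xxx"]` raises `KeyError`; whether Hyperglot behaves the same way would need to be checked against the library.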

Nov 25 '22 14:11 kontur