Provide more "how to" documentation for python library use
As per #28 and #86 — the library is useful for the CLI, but without documentation it is not useful standalone.
And (for later inclusion in an aggregated list of examples) "How to get languages/language counts by validity":
from hyperglot import VALIDITYLEVELS
from hyperglot.languages import Languages
# One list of ISO codes per validity level
counts = {level: [] for level in VALIDITYLEVELS}
# Passing VALIDITYLEVELS[0] includes languages of all validity levels
for iso, language in Languages(validity=VALIDITYLEVELS[0]).items():
    counts[language["validity"]].append(iso)
print({level: len(isos) for level, isos in counts.items()})
And "How many scripts are in the Hyperglot data" (all validity levels, all orthographies):
from hyperglot import VALIDITYLEVELS
from hyperglot.languages import Languages
from hyperglot.language import Language
scripts = []
for iso, language in Languages(validity=VALIDITYLEVELS[0]).items():
    # Wrap the raw yaml data in a Language object
    lang = Language(language, iso)
    if "orthographies" in lang:
        scripts.extend([o["script"] for o in lang["orthographies"]])
# Number of distinct scripts, followed by their sorted names
print(len(set(scripts)), sorted(set(scripts)))
To document: 0.4.2 now distinguishes between accessing the raw yaml data of a language and getting a ready-to-use Language object, e.g.:
from hyperglot.languages import Languages
hg = Languages()
# the raw yaml for 'eng'
hg["eng"]
# a ready to use hyperglot.language.Language object
hg.eng
This is a lot more convenient than having to initialize Language objects with Language(Languages()["xxx"], "xxx").
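For the docs it might help to show why the two access styles can coexist. The following is only a minimal sketch of the general pattern, using stand-in classes and made-up sample data. It is not hyperglot's actual implementation, and the `scripts()` helper is hypothetical.

```python
# Stand-in classes for illustration only; NOT hyperglot's real code.

class Language(dict):
    """Hypothetical wrapper around one language's raw data."""

    def __init__(self, data, iso):
        super().__init__(data)
        self.iso = iso

    def scripts(self):
        # Hypothetical helper: distinct scripts across all orthographies.
        return sorted({o["script"] for o in self.get("orthographies", [])})


class Languages(dict):
    """Hypothetical container of raw language data keyed by ISO code."""

    def __getattr__(self, iso):
        # Only reached when normal attribute lookup fails, so dict
        # methods such as .items() keep working unchanged.
        try:
            return Language(self[iso], iso)
        except KeyError:
            raise AttributeError(iso) from None


# Made-up sample data, not taken from the Hyperglot database.
hg = Languages({"eng": {"name": "English", "orthographies": [{"script": "Latin"}]}})

raw = hg["eng"]  # item access: the raw dict
lang = hg.eng    # attribute access: a wrapped Language object
```

Item access stays a plain dictionary lookup, while attribute access wraps the same data on demand, so existing code that iterates over `.items()` is unaffected.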