mwparserfromhell
A Python parser for MediaWiki wikicode
- ~~`Comment`~~
- `ParserFunction` / `MagicWord` / `BehaviorSwitch`
- ~~`Link`~~
- ~~`Table`~~
- `Redirect`
The strings "[[]]" and "[[|]]" (empty "wikilinks") are not treated as wikilinks by MediaWiki and shouldn't be treated as such here either.
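As a rough illustration of the check described above, here is a minimal standalone sketch. It does not use mwparserfromhell's tokenizer; `has_link_target` is a hypothetical helper, and treating every candidate with an empty title (for example `[[|text]]`) as a non-link is an assumption that goes beyond the two strings named in the report.

```python
def has_link_target(candidate: str) -> bool:
    """Return True if a bracketed candidate has a non-empty link title.

    Hypothetical helper, not part of mwparserfromhell. It only looks at
    the text before the first pipe and ignores nesting entirely.
    """
    if not (candidate.startswith("[[") and candidate.endswith("]]")):
        return False
    inner = candidate[2:-2]
    # Everything before the first "|" is the title; the rest is display text.
    title, _, _display = inner.partition("|")
    return title != ""


for sample in ("[[]]", "[[|]]", "[[Foo]]", "[[Foo|bar]]"):
    print(sample, has_link_target(sample))
```

A parser could run a check like this before emitting a wikilink node and fall back to plain text when it returns `False`.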
```python
text = '''
{{hello
|foo={{A}, {B}}
|bar=123
}}
'''
import mwparserfromhell
print(mwparserfromhell.parse(text).filter_templates())
```

Expected output:

```
['{{hello\n|foo={{A}, {B}}\n|bar=123\n}}']
```

Received output:

```
['{{hello\n|foo={{A}, {B}}']
```
We've been parsing lots of historical Wikipedia revisions, so we are running into weird edge cases. One of these is the revision below that seems to somehow escape...
Would it be possible to add support for matching [Scribunto](https://www.mediawiki.org/wiki/Extension:Scribunto) modules that are invoked on a page? - Note: An invoked module looks like `{{#invoke:Foo|bar}}`, which _executes_ the function of...
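Until something like this exists in the library, a stopgap can be sketched with the standard library alone. The regex and the `find_invoked_modules` name below are assumptions made for illustration, not mwparserfromhell API, and the pattern only handles flat invocations whose module and function names contain no nested templates.

```python
import re

# Assumption: module and function names contain no pipes or braces.
INVOKE_RE = re.compile(
    r"\{\{\s*#invoke:\s*([^|{}]+?)\s*\|\s*([^|{}]+?)\s*(?:\||\}\})",
    re.IGNORECASE,
)


def find_invoked_modules(wikitext: str) -> list[tuple[str, str]]:
    """Return (module, function) pairs for each flat {{#invoke:...}} call."""
    return [(m.group(1), m.group(2)) for m in INVOKE_RE.finditer(wikitext)]


sample = "Intro {{#invoke:Foo|bar}} and {{#invoke:Baz|qux|arg=1}}"
print(find_invoked_modules(sample))
```

A regex cannot follow nested templates inside the arguments, which is the main reason parser-level support would be preferable.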
There seems to be something weird going on when parsing tables. I get attributes as Text nodes. Let's consider this example: ```python import mwparserfromhell text = """ {| class="wikitable" |-...
We can't load a pickled SmartList properly. This adds a failing test to demonstrate.
This is a first draft of the smart list. It passes all of the existing smartlist tests (except for full-list reversal, whose behavior has changed), but the rest of the...
I'm trying to parse the wikitext of a live Wikipedia page (https://vi.wikipedia.org/wiki/Apple_Inc.?action=raw) and get the infobox template, but it fails. After some inspection I find that the `` tags...
I followed https://coveralls-python.readthedocs.io/en/latest/usage/configuration.html#github-actions-support but I'm not exactly sure if it'll work. Maybe it needs a branch filter or something so it doesn't run for PRs that don't have access to...