Andrew Moore comments

Results: 13 comments of Andrew Moore

Better batch processing

Hi, I have created a batch utility around Stanza, https://github.com/apmoore1/stanza-batch, that might solve some of the problems raised here. It allows you to give it an Iterable of texts and...
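As a rough illustration of what batching an Iterable of texts involves, here is a minimal sketch in plain Python. It is not the stanza-batch API; the function name `batch_texts` and its `batch_size` parameter are hypothetical and exist only for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batch_texts(texts: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `batch_size` texts.

    Hypothetical helper for illustration only; it works lazily, so the
    input can be a generator that is too large to hold in memory.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(texts)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    for chunk in batch_texts(["first text", "second text", "third text"], 2):
        print(chunk)
```

Each yielded list could then be passed to a pipeline in one call rather than processing the texts one at a time.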

Multi Word Expressions

The first three tasks have been completed through pull request #32.

Speed and memory enhancement

This would perhaps be best suited as a replacement for the [LexiconCollection class](https://github.com/UCREL/pymusas/blob/main/pymusas/lexicon_collection.py).

Speed and memory enhancement

If any of the Lexicon Collections use the same string keys for their dictionaries, it might be useful to use a [String Store](https://spacy.io/api/stringstore) from the spaCy project, whereby the String...
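To make the idea of sharing string keys concrete, here is a minimal pure-Python sketch of a string store. It is not spaCy's `StringStore` (which differs in detail, for example by using hash values rather than sequential IDs); the class `SimpleStringStore`, the two lexicons, and the tag values are hypothetical placeholders.

```python
from typing import Dict, List


class SimpleStringStore:
    """Map each distinct string to a small integer ID, and back again.

    Hypothetical class for illustration; not part of spaCy or PyMUSAS.
    """

    def __init__(self) -> None:
        self._id_of: Dict[str, int] = {}
        self._strings: List[str] = []

    def add(self, text: str) -> int:
        """Return the ID for `text`, assigning a new ID on first sight."""
        existing = self._id_of.get(text)
        if existing is not None:
            return existing
        new_id = len(self._strings)
        self._id_of[text] = new_id
        self._strings.append(text)
        return new_id

    def lookup(self, string_id: int) -> str:
        """Return the string that was assigned `string_id`."""
        return self._strings[string_id]

    def __len__(self) -> int:
        return len(self._strings)


store = SimpleStringStore()

# Two hypothetical lexicons that share the key "bank". Each lexicon holds
# only integer keys; the string itself is stored once, in `store`.
lexicon_a = {store.add("bank"): ["TAG_A"], store.add("river"): ["TAG_B"]}
lexicon_b = {store.add("bank"): ["TAG_C"]}

bank_id = store.add("bank")
print(lexicon_a[bank_id], lexicon_b[bank_id], store.lookup(bank_id))
```

Whether this saves memory in practice depends on how many keys the collections actually share, so it would need measuring before adopting it.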

Finish the CONTRIBUTING Guidelines

Another good resource: https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/setting-guidelines-for-repository-contributors

Finish the CONTRIBUTING Guidelines

The contributing guidelines should also include a section on how to release a version of PyMUSAS, e.g. change the version number in the code, and in the CITATION.cff file and...

Add cross-reference code to the pydoc-markdown script; an example of this can be found in the [AllenNLP code base](https://github.com/allenai/allennlp/blob/main/scripts/py2md.py#L251). Having this cross-reference code will allow users...
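One way such cross-referencing could work is sketched below, assuming docstrings use Sphinx-style roles such as ``:class:`pymusas.module.Name` ``. This is not a copy of the AllenNLP script; the `API_ROOT` URL layout, the `BASE_MODULE` value, and the function name are assumptions made for the example.

```python
import re

# Matches Sphinx-style references such as :class:`pkg.module.Name`.
CROSS_REF_RE = re.compile(r":(class|func|mod):`~?([A-Za-z0-9_.]+)`")

BASE_MODULE = "pymusas"  # assumption: only this package gets links
API_ROOT = "/api"        # hypothetical URL prefix of the documentation site


def transform_cross_references(line: str) -> str:
    """Replace Sphinx-style cross references in `line` with markdown links."""

    def to_link(match: re.Match) -> str:
        kind, name = match.group(1), match.group(2)
        parts = name.split(".")
        if parts[0] != BASE_MODULE:
            # External names are left as inline code without a link.
            return f"`{name}`"
        if kind == "mod":
            href = API_ROOT + "/" + "/".join(parts[1:])
        else:
            # Classes and functions link to an anchor on the module page.
            href = API_ROOT + "/" + "/".join(parts[1:-1]) + "#" + parts[-1].lower()
        return f"[`{parts[-1]}`]({href})"

    return CROSS_REF_RE.sub(to_link, line)


if __name__ == "__main__":
    print(transform_cross_references(
        "See :class:`pymusas.lexicon_collection.LexiconCollection`."
    ))
```

The anchor format would need to match whatever the documentation site generates for headings, which is not specified here.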

Documentation

I think this would be the best plugin for adding search to the documentation site: [https://github.com/lelouch77/docusaurus-lunr-search](https://github.com/lelouch77/docusaurus-lunr-search)

Add code coverage reporting in CI pipeline

It might be useful to only run the reporting workflow if there is a particular comment in the pull request, e.g. `/report`; this could be done following this...

Neural and Hybrid Taggers

In the future, to make the models more efficient/faster, it would be good if we used either: 1. [ONNX](https://docs.pytorch.org/docs/2.9/onnx.html) - This could also reduce the size of PyMUSAS as we...