
WIP: Automate various maintainer tasks

Open JulianOrteil opened this issue 3 years ago • 0 comments

This is the first PR in the reworking of the library by @ystoll and me, as discussed in #839. Sticking with our plan in that discussion, this PR is invisible to end users. It should not affect anything they are doing, except perhaps removing the need to compile DBNet.

This PR automates various maintainer tasks such as building, testing, publishing, and linting EasyOCR. Some of these processes are not fully implemented yet (e.g. testing) and are added here in preparation for later use.

A near exhaustive list of changes this PR implements:

  • Added the automation tool tox. The available instructions are:
  1. tox - Invoke pytest to run automated tests against Python 3.7, 3.8, 3.9, and 3.10 (no tests, will fail)
  2. tox -e build - Build the package in isolation according to PEP517. Also builds DBNet.
  3. tox -e clean - Remove old distribution files and temporary build artifacts (./build and ./dist)
  4. tox -e cleanall - Same as tox -e clean but also clears out the build artifacts in ./DBNet (log.txt, dcn_compile_success, etc)
  5. tox -e docs - Compile documentation for ReadTheDocs (no docs, will fail)
  6. tox -e doctests - Compile documentation and verify interactive documentation runs as expected (no docs, will fail)
  7. tox -e linkcheck - Check for broken links in the documentation (no docs, will fail)
  8. tox -e lint - Perform static analysis and style checks on the code
  9. tox -e publish - Publish the package to testpypi or pypi, depending on the arguments
  • Fully building the library with DBNet is only possible on Linux. I have not yet been able to get it to complete on Windows, hence this PR being a draft.
  • python setup.py install and python setup.py develop fail to compile DBNet. I believe this is because these commands spawn a temporary build environment that does not contain all components required for the successful building of DBNet. I am not sure if this is intended, but it fails nonetheless. Because of this, I have replaced the cmdclass instructions.
  • The cmdclass instructions have been replaced by tox -e build. This instruction should handle everything necessary to compile EasyOCR in its entirety, including DBNet. Running it before pip install -e . or pip install . is recommended but not required. As an added bonus, thanks to tox -e build, DBNet should now be able to ship with base EasyOCR or as an extra (e.g. pip install easyocr[dbnet]). We can explore this option later.
  • The library should now be built and released by running the following, in order:
  1. tox -e lint
  2. git tag -a vX.X.X -m "Production release vX.X.X" (use vX.X.XaX for alpha builds, bX for beta, devX, and rcX)
  3. tox -e cleanall
  4. tox -e build
  5. tox -e publish <--- only for dev, alpha, beta, and release candidate builds of EasyOCR OR
  6. tox -e publish -- --repository pypi <--- only for production releases of EasyOCR
  • Building code via setup.py has been deprecated for some time now. As such, I have moved all configuration to setup.cfg in favor of tox instructions that call pytest, python -m build, and twine. Each of these uses setup.cfg for configuration. pyproject.toml would be better; however, some tools have yet to add support for that standard, so we'll stick with setup.cfg for now.
  • Specifying versioning for EasyOCR now only requires tagging the release using git tag as seen above. No need to statically set the version in __init__.py.
  • Removed potential security vulnerabilities by eliminating calls of the form subprocess.run(..., shell=True).
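
To illustrate the last item, here is a minimal sketch of the general pattern of calling subprocess.run with an argument list instead of shell=True. The helper name run_command is hypothetical and is not taken from this PR; the actual call sites in EasyOCR may look different.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments, without invoking a shell.

    Hypothetical helper for illustration only; not part of this PR.
    """
    # With the default shell=False, the arguments are passed to the OS as-is,
    # so characters such as ';' or '&&' are never interpreted by a shell.
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # The ';' stays a literal part of the argument instead of starting
    # a second command.
    print(run_command(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", "a; echo injected"]
    ))
```

Because each argument is passed as its own list element, untrusted values (file names, paths) cannot be expanded into additional shell commands.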

I will update this PR as time allows, but please feel free to give it a try in the meantime. I am open to any and all feedback.
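
For anyone trying the release steps above, here is a rough sketch of how a tag could map to the matching publish instruction. The function name publish_command and the tag pattern are assumptions made for illustration; they are not part of this PR, and the pattern only covers the tag forms listed above (vX.X.X with an optional aX, bX, rcX, or devX suffix).

```python
import re

# Assumed tag scheme: vX.X.X, optionally followed by aX, bX, rcX, or devX
# (a leading dot before the suffix is tolerated, as in v1.2.3.dev4).
_TAG_RE = re.compile(r"^v\d+\.\d+\.\d+(?:\.?(a|b|rc|dev)\d+)?$")


def publish_command(tag):
    """Return the tox invocation for a release tag (hypothetical helper)."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    if match.group(1):
        # dev, alpha, beta, and release candidate builds
        return ["tox", "-e", "publish"]
    # production releases
    return ["tox", "-e", "publish", "--", "--repository", "pypi"]


if __name__ == "__main__":
    for tag in ("v1.6.2", "v1.6.2a1", "v1.6.2.dev3"):
        print(tag, "->", " ".join(publish_command(tag)))
```

A helper like this only picks between the two publish instructions listed above; tagging, cleaning, and building still happen as separate steps.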

JulianOrteil, Sep 14 '22 03:09