EasyOCR
WIP: Automate various maintainer tasks
This is the first PR in @ystoll's and my reworking of the library, as discussed in #839. In keeping with the plan from that discussion, this PR is invisible to end users. It should not affect anything they are doing, except perhaps to remove the need to compile DBNet.
This PR is specifically related to automating various maintainer tasks such as building, testing, publishing, and linting EasyOCR. Some of these processes are not fully implemented yet (e.g. testing) and are added here in preparation for their use later.
A near-exhaustive list of the changes this PR implements:
- Added automation tool `tox`. Here is a description of the available instructions for this tool:
  - `tox` - Invoke pytest to run automated tests against Python 3.7, 3.8, 3.9, and 3.10 (no tests, will fail)
  - `tox -e build` - Build the package in isolation according to PEP 517. Also builds DBNet.
  - `tox -e clean` - Remove old distribution files and temporary build artifacts (./build and ./dist)
  - `tox -e cleanall` - Same as `tox -e clean` but also clears out the build artifacts in ./DBNet (log.txt, dcn_compile_success, etc)
  - `tox -e docs` - Compile documentation for ReadTheDocs (no docs, will fail)
  - `tox -e doctests` - Compile documentation and verify interactive documentation runs as expected (no docs, will fail)
  - `tox -e linkcheck` - Check for broken links in the documentation (no docs, will fail)
  - `tox -e lint` - Perform static analysis and style checks on the code
  - `tox -e publish` - Publish the package to `testpypi` or `pypi`, depending on the arguments
- Fully building the library with DBNet is only possible on Linux. I have not yet been able to get it to complete on Windows, hence this PR being a draft.
- `python setup.py install` and `python setup.py develop` fail to compile DBNet. I believe this is because these commands spawn a temporary build environment that does not contain all components required for the successful building of DBNet. I am not sure if this is intended, but it fails nonetheless. Because of this, I have replaced the `cmdclass` instructions.
- The `cmdclass` instructions have been replaced by `tox -e build`. This instruction should handle everything necessary to compile EasyOCR in its entirety, including DBNet. It should be run before `pip install -e .` or `pip install .`, but is not required. Either way, as an added bonus, due to `tox -e build`, DBNet should now be able to ship with base EasyOCR, or as an extra (i.e. `pip install easyocr[dbnet]`). We can explore this option later.
- The proper way to build and release the library is now to run the following, in order:
  - `tox -e lint`
  - `git tag -a vX.X.X -m "Production release vX.X.X"` (use `vX.X.XaX` for alpha builds, `bX` for beta, `devX`, and `rcX`)
  - `tox -e cleanall`
  - `tox -e build`
  - `tox -e publish` <--- only for dev, alpha, beta, and release candidate builds of EasyOCR, OR
  - `tox -e publish -- --repository pypi` <--- only for production releases of EasyOCR
- Building code via `setup.py` has been deprecated for some time now. As such, I have moved all configurations to `setup.cfg` in favor of using tox instructions which call `pytest`, `python -m build`, and `twine`. Each of these uses `setup.cfg` for configuration. `pyproject.toml` would be better; however, some tools have yet to add support for that standard, so we'll stick with `setup.cfg` for now.
- Specifying versioning for EasyOCR now only requires tagging the release using `git tag` as seen above. No need to statically set the version in `__init__.py`.
- Removed potential security vulnerabilities where `subprocess.run(..., shell=True)` exists.
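
For illustration only, here is a minimal Python sketch of the release sequence listed above, written in the style the last bullet points toward: each command is passed to `subprocess.run` as an argument list rather than as a string with `shell=True`. The helper names `release_commands` and `run_steps` are hypothetical and are not part of this PR, and the "Pre-release" tag message is an assumption (the list above only gives the production wording).

```python
import shlex
import subprocess
from typing import List


def release_commands(version: str, production: bool = False) -> List[List[str]]:
    """Return the release steps as argument lists (hypothetical helper)."""
    publish = ["tox", "-e", "publish"]
    if production:
        # Same arguments as `tox -e publish -- --repository pypi` above.
        publish += ["--", "--repository", "pypi"]
    # Assumption: the non-production tag message wording is made up here.
    label = "Production release" if production else "Pre-release"
    return [
        ["tox", "-e", "lint"],
        ["git", "tag", "-a", f"v{version}", "-m", f"{label} v{version}"],
        ["tox", "-e", "cleanall"],
        ["tox", "-e", "build"],
        publish,
    ]


def run_steps(commands: List[List[str]]) -> None:
    """Run each step in order, stopping at the first failure."""
    for argv in commands:
        # A list argument keeps the default shell=False, so no shell ever
        # re-parses the arguments; check=True raises on a non-zero exit.
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in release_commands("1.2.3a1"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Calling `run_steps(release_commands("1.2.3", production=True))` would execute the production sequence for real, so the sketch only prints the commands by default.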
I will update this PR as time allows, but please feel free to give it a try in the meantime. I am open to any and all feedback.