feat: modernize project tooling and add comprehensive type annotations

Open mlwelles opened this issue 4 months ago • 4 comments

Summary

Comprehensive modernization of the pydgraph project with improved tooling, strict type safety, enhanced CI/CD infrastructure, and better developer experience. This PR adds type annotations to the entire codebase, migrates to modern Python tooling (uv, ruff, ty), establishes robust testing infrastructure, and provides comprehensive contribution guidelines.

Review Notes: While the overall diff is large (23k+ additions and 20k+ deletions across 68 files), the 86 commits are organized logically around specific themes. The bulk of the changes fall into five categories: type annotations, tooling setup, CI/CD infrastructure, documentation, and code quality fixes. Each commit follows the conventional commit format with a clear description.

Related Issues

  • DGR-137 - Add type annotations to pydgraph
  • DGR-138 - Migrate to modern Python tooling (uv, ruff)
  • DGR-139 - Establish comprehensive CI/CD infrastructure
  • DGR-140 - Add Docker-based testing infrastructure
  • DGR-141 - Update Python version support (drop 3.7/3.8, add 3.13/3.14)
  • DGR-142 - Add comprehensive documentation (CONTRIBUTING.md)
  • DGR-143 - Fix code quality issues and linting errors
  • DGR-144 - Update protobuf generation with type stubs
  • DGR-151 - Deprecate Dgraph Cloud references

Key Changes

🛠️ Project Tooling & Setup

  • Migrated to uv - Modern Python package manager (10-100x faster than pip)
    • Replaced traditional pip/virtualenv workflow
    • Added uv.lock for reproducible builds
    • All scripts and workflows use uv run and uv sync
  • Added comprehensive Makefile with intuitive targets:
    • make setup - One-command project setup (installs tools, hooks, syncs deps)
    • make check - Runs all pre-commit hooks (ruff, mypy, ty, trunk, shellcheck)
    • make test - Runs test suite with Docker infrastructure
    • make protogen - Regenerates protobuf files with mypy stubs
    • make build / make publish - Build and publish releases
    • Automatic dependency checking with INSTALL_MISSING_TOOLS=true flag
    • Supports uv, trunk, and Docker installation
  • Pre-commit hooks - 24 hooks across 8 categories run on every commit:
    • File validation (large files, YAML/TOML/JSON syntax, EOF/whitespace)
    • Shell linting (shellcheck)
    • YAML formatting (yamlfmt)
    • Python quality (ruff lint + format, blanket noqa checks, type annotations enforced)
    • Type checking (mypy, ty)
    • Trunk integration (trunk fmt, trunk check)

✨ Type Safety & Code Quality

  • Added comprehensive type annotations to all Python files (23+ files, 300+ functions)
    • Core modules: client.py, async_client.py, txn.py, async_txn.py
    • Stub modules: client_stub.py, async_client_stub.py
    • Utilities: util.py, errors.py, convert.py
    • All test files (15 files)
    • Build scripts: protogen.py
  • Enabled strict mypy type checking
    • disallow_untyped_defs = true
    • disallow_incomplete_defs = true
    • check_untyped_defs = true
    • All functions have proper type signatures
  • Modern type annotation syntax
    • from __future__ import annotations for forward references
    • PEP 604 union syntax: X | None instead of Optional[X]
    • -> None for all void functions
  • Added ruff - Extremely fast Python linter/formatter (10-100x faster than Flake8/Black)
    • Comprehensive ruleset: 800+ rules across 20+ categories
    • Configured in pyproject.toml with sensible defaults
    • Automatically fixes issues in pre-commit
  • Added ty - Modern type checker with advanced error diagnostics
    • Runs alongside mypy for comprehensive type coverage
    • Better error messages than traditional type checkers
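As an illustration of the annotation style listed above (not code taken from the PR), here is a minimal sketch. `Node` and `find` are hypothetical names invented for the example; the point is only to show the `from __future__ import annotations` import, a forward reference, `X | None`, and an explicit `-> None`.

```python
from __future__ import annotations  # defer annotation evaluation (works on Python 3.9)


class Node:
    """Hypothetical linked node, used only to show a forward reference."""

    def __init__(self, name: str, parent: Node | None = None) -> None:
        # "Node" is referenced here before the class body has finished executing.
        self.name = name
        self.parent = parent

    def root(self) -> Node:
        """Walk up the parent links until the top-most node is reached."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


def find(nodes: list[Node], name: str) -> Node | None:
    """Return the first node with a matching name, or None if absent."""
    for node in nodes:
        if node.name == name:
            return node
    return None
```

Because the annotations are never evaluated at runtime, `Node | None` is accepted on Python 3.9 even though the `|` operator on types only arrived in 3.10.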

🧹 Code Quality Improvements

  • Fixed all ruff linting issues across source and test files:
    • Builtin shadowing (A001/A002/A004): Added strategic # noqa for intentional public API (open, ConnectionError)
    • Exception handling (TRY200/TRY300/TRY301): Refactored for correctness
      • Moved return statements outside try blocks
      • Use bare raise to preserve stack traces
      • Moved validation outside exception handlers
    • Security (S311): Replaced random.choice() → secrets.choice()
    • Code patterns (SIM105): Use contextlib.suppress() for clarity
  • Improved test assertions (PT011/PT017)
    • Added match parameters to 20+ pytest.raises calls
    • Replaced try-except-assert with proper pytest.raises
  • Removed Python 2 compatibility code
    • Removed urlparse fallback imports
    • Removed basestring and long type checks
    • Simplified string type checking
  • Fixed code quality issues
    • Fixed undefined variable references
    • Removed wildcard imports, added explicit __all__
    • Exported pydgraph.open function
    • Fixed variable shadowing in exception handlers
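The lint-driven patterns named above can be sketched as follows. This is an illustrative example under stated assumptions, not code from the pydgraph source: `remove_if_present`, `pick_server`, and `parse_port` are hypothetical helpers chosen to show SIM105, S311, and the TRY300/TRY301 style in isolation.

```python
import contextlib
import logging
import os
import secrets

logger = logging.getLogger(__name__)


def remove_if_present(path: str) -> None:
    """SIM105: suppress an expected error instead of try/except/pass."""
    with contextlib.suppress(FileNotFoundError):
        os.remove(path)


def pick_server(addresses: list[str]) -> str:
    """S311: draw from the OS CSPRNG rather than the random module."""
    return secrets.choice(addresses)


def parse_port(text: str) -> int:
    """TRY300/TRY301: validate outside the try, return from the else branch."""
    if not text.strip():
        raise ValueError("port must not be empty")
    try:
        port = int(text)
    except ValueError:
        logger.warning("invalid port %r", text)
        raise  # bare raise keeps the original exception and traceback
    else:
        return port
```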

🔄 CI/CD Infrastructure

Separated workflows for better organization:

  1. ci-pydgraph-tests.yml - Matrix testing across Python versions

    • Test (Python 3.9-3.14 / DGraph Latest) - 6 parallel jobs testing against latest Dgraph release
    • Test (Python 3.9-3.14 / DGraph HEAD) - 6 parallel jobs testing against Dgraph main branch
    • Uses setup-python-and-tooling composite action
    • Explicit make setup and make sync steps
    • Tests 12 Python/Dgraph combinations in parallel
  2. ci-pydgraph-code-quality.yml - Code quality checks

    • Runs on Python 3.13 (canonical development version)
    • Protobuf verification (ensures generated files are current)
    • make check - All pre-commit hooks (ruff, mypy, ty, trunk, shellcheck)
    • Runs with SKIP=trunk-check,trunk-fmt to avoid duplication with trunk workflow
  3. ci-pydgraph-trunk.yml - Trunk code quality checks

    • Uses dgraph-io/.github reusable trunk workflow
    • Provides inline PR comments for issues
    • Separate from other checks for clear separation of concerns
  4. cd-pydgraph.yml - Release workflow

    • Workflow dispatch for manual releases
    • Runs full test suite before publishing
    • PyPI publishing with UV_PUBLISH_USERNAME and UV_PUBLISH_PASSWORD
    • Uses uv version and uv publish commands
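For reference, the 12 combinations covered by the test workflow are simply the cross product of the six Python versions and the two Dgraph targets. A small sketch of that arithmetic, where the labels `latest` and `HEAD` are shorthand assumed for the example rather than values read from the workflow file:

```python
from itertools import product

PYTHON_VERSIONS = ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]
DGRAPH_TARGETS = ["latest", "HEAD"]  # assumed labels, for illustration only

# One entry per parallel CI job: every Python version against every Dgraph target.
matrix = [
    {"python": py, "dgraph": target}
    for py, target in product(PYTHON_VERSIONS, DGRAPH_TARGETS)
]

print(len(matrix))  # 12
```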

Shared infrastructure:

  • setup-python-and-tooling composite action (renamed from setup-runner)
    • Sets up specified Python version with caching
    • Installs uv package manager
    • Used by all workflows for consistency

🐳 Testing Infrastructure

  • Docker-based test setup via scripts/local-test.sh
    • Automatic Dgraph cluster startup/teardown
    • Dynamic port allocation prevents conflicts
    • Isolated test environments
    • Supports DGRAPH_IMAGE_TAG for testing different Dgraph versions
  • Matrix testing - All tests run on Python 3.9, 3.10, 3.11, 3.12, 3.13, 3.14
  • Test results: ✅ 125 passed
  • Docker dependency checking in Makefile
    • Validates Docker 20.10.0+ and Docker Compose v2
    • Auto-install support on macOS and Linux
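The description does not say how scripts/local-test.sh picks its ports. One common way to get a conflict-free port, shown here purely as an assumed sketch in Python, is to bind to port 0 and let the operating system choose. `find_free_port` is a hypothetical helper, not part of pydgraph.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 means "any free port"
        return sock.getsockname()[1]


port = find_free_port()
print(f"a test cluster could be mapped to port {port}")
```

The port is released when the socket closes, so another process could in principle claim it before the test cluster starts. In practice this window is small, but it is a known limitation of the approach.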

🔄 Dependency Updates

  • Flexible protobuf support: >=4.23.0,<7.0.0
    • Supports protobuf 4.x, 5.x, and 6.x
    • Users can pin to specific versions for compatibility
    • Modern environments (Python 3.13+) use 6.x by default
  • Updated grpcio: >=1.65.0,<2.0.0 (loosened from strict 1.65.1)
    • Addresses build issues on modern systems
    • Older versions fail to compile with recent Xcode/gcc
  • Added comprehensive dev dependencies:
    • Type checking: mypy>=1.14.1, ty>=0.0.8, grpc-stubs>=1.53.0.6
    • Type stubs: types-grpcio>=1.0.0, types-protobuf>=6.32.1
    • Linting: ruff>=0.8.4
    • Testing: pytest>=8.3.3, pytest-asyncio>=0.23.0
    • Tooling: pre-commit>=3.5.0, shellcheck-py>=0.10.0.1

☁️ Dgraph Cloud Deprecation (DGR-151)

  • Deprecated cloud-specific functionality (deprecated in 25.1.0, removal planned for 26.0.0):
    • from_cloud() static methods in DgraphClientStub and AsyncDgraphClientStub
    • parse_host() helper methods
    • Methods restored with deprecation warnings and migration guidance
  • Updated documentation:
    • Removed "Connecting To Dgraph Cloud" section from README.md
    • Removed apikey parameter from connection string documentation
    • Removed cloud-specific connection string examples
  • Simplified examples:
    • Updated examples/embeddings/computeEmbeddings.py to use standard connection
    • Updated RAG notebook to use standard Dgraph connection
    • Added deprecation warnings to cloud-dependent notebooks (dgraph-episode1.ipynb, dgraph-ai-classification.ipynb)
  • Rationale: Dgraph Cloud no longer exists, so cloud-specific features (Lambda functions, cerebro API, cloud authentication) are no longer applicable
  • Migration path: Users should migrate to standard grpc.ssl_channel_credentials() pattern (see method docstrings for examples)

🐍 Python Version Support

  • Added support: Python 3.13 and 3.14
  • Dropped support: Python 3.7 and 3.8
    • Both versions reached end of life
    • No longer supported by critical dependencies
    • Required for modern type annotation features
  • New minimum: Python 3.9
  • Development version: Python 3.13 (for protobuf generation)
  • CI testing: All versions 3.9-3.14 tested in parallel

📚 Documentation

  • Added CONTRIBUTING.md - Comprehensive contribution guide
    • Development setup with make setup
    • Code style and standards (SPDX headers, type hints, ruff formatting)
    • Testing procedures and infrastructure
    • PR requirements and conventional commits
    • Makefile command reference
    • Protobuf generation requirements
    • grpcio version compatibility notes
  • Added CODE_OF_CONDUCT.md - Contributor Covenant
  • Updated README.md
    • Simplified to focus on usage
    • Links to CONTRIBUTING.md for development
    • Updated examples to use uv run python
    • Removed duplicate development content
    • Removed Dgraph Cloud references
  • Updated example READMEs - All use modern uv run workflow
  • Updated PUBLISHING.md - New protogen command reference

📦 Project Metadata

  • Updated license to SPDX format: license = "Apache-2.0"
    • Fixes setuptools deprecation warning
    • Removed deprecated license classifier
  • Updated author/maintainer: Istari Digital, Inc.
    • Updated __author__ and __maintainer__ in 18+ Python files
    • Updated email to [email protected]
  • Updated Homepage URL: https://github.com/dgraph-io/pydgraph
  • Updated classifiers: Python 3.9-3.14
  • Updated requires-python: >=3.9

🔧 Protobuf Generation

  • Enhanced scripts/protogen.py:
    • Version validation (requires Python 3.13+, grpcio-tools 1.66.2+)
    • Generates mypy type stubs (.pyi files)
    • Explicit error messages for version mismatches
    • Documents canonical development environment
  • Generated files updated:
    • api_pb2.py, api_pb2_grpc.py - Protobuf implementations
    • api_pb2.pyi, api_pb2_grpc.pyi - Type stubs for better IDE support
  • CI verification: Code quality workflow ensures generated files are current
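A version gate of the kind described for scripts/protogen.py could look roughly like the sketch below. The minimums (Python 3.13, grpcio-tools 1.66.2) come from the list above; the function name `version_problems`, the message wording, and the hard-coded grpcio-tools version in the usage lines are assumptions made for the example and do not reflect the actual script.

```python
import sys

MIN_PYTHON = (3, 13)
MIN_GRPCIO_TOOLS = (1, 66, 2)


def version_problems(
    python: tuple[int, ...], grpcio_tools: tuple[int, ...]
) -> list[str]:
    """Return one explicit message per unmet minimum; an empty list means OK."""
    problems: list[str] = []
    if python[:2] < MIN_PYTHON:
        problems.append(
            f"Python {python[0]}.{python[1]} found, "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required"
        )
    if grpcio_tools < MIN_GRPCIO_TOOLS:
        found = ".".join(map(str, grpcio_tools))
        needed = ".".join(map(str, MIN_GRPCIO_TOOLS))
        problems.append(f"grpcio-tools {found} found, {needed}+ required")
    return problems


# Check the running interpreter against an assumed grpcio-tools version.
for message in version_problems(tuple(sys.version_info[:3]), (1, 66, 2)):
    print(f"error: {message}")
```

Returning a list of messages rather than raising on the first failure lets the caller report every mismatch at once.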

Breaking Changes

  • Minimum Python version increased from 3.7 to 3.9
    • Python 3.7 and 3.8 reached end of life
    • Required for dependency compatibility and modern typing features

Deprecations

  • Dgraph Cloud-specific functionality (DGR-151)
    • DgraphClientStub.from_cloud() method deprecated (removal planned for v26.0.0)
    • AsyncDgraphClientStub.from_cloud() method deprecated (removal planned for v26.0.0)
    • DgraphClientStub.parse_host() method deprecated (removal planned for v26.0.0)
    • AsyncDgraphClientStub.parse_host() method deprecated (removal planned for v26.0.0)
    • Methods still functional but emit DeprecationWarning
    • Migration guidance provided in method docstrings
    • Users should migrate to standard connection methods with grpc.ssl_channel_credentials()
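The "still functional but emits DeprecationWarning" behaviour follows a standard Python pattern, sketched below. `ExampleStub` is a hypothetical stand-in and not pydgraph's real class; the warning text and the `example.invalid:443` endpoint are likewise placeholders.

```python
import warnings


class ExampleStub:
    """Hypothetical stand-in for a client stub; not pydgraph's real class."""

    def __init__(self, addr: str) -> None:
        self.addr = addr

    @staticmethod
    def from_cloud(endpoint: str) -> "ExampleStub":
        # The method keeps working, but tells callers it is going away.
        warnings.warn(
            "from_cloud() is deprecated and planned for removal in v26.0.0; "
            "connect with grpc.ssl_channel_credentials() instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return ExampleStub(endpoint)


# Record warnings so the behaviour can be observed (and asserted in tests).
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    stub = ExampleStub.from_cloud("example.invalid:443")

print(caught[0].category.__name__)  # DeprecationWarning
```

Python hides `DeprecationWarning` by default outside `__main__` and test runners, so users may need `-W default` or a `warnings` filter to see it.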

Backwards Compatibility

  • Protobuf version flexibility: Wide version support (4.23.0 - 6.x) ensures compatibility
    • Users can pin to older versions if needed:
      pip install pydgraph "protobuf>=4.23.0,<5.0.0"  # protobuf 4.x
      pip install pydgraph "protobuf>=5.0.0,<6.0.0"   # protobuf 5.x
      
  • grpcio flexibility: Supports 1.65.0+ for broad compatibility
  • Deprecated methods remain functional: Cloud-specific methods still work but emit warnings
  • No other API changes: All existing public APIs remain unchanged

Benefits

  • Type Safety: Comprehensive annotations catch errors at development time
  • Better IDE Support: Full autocomplete and type hints in all editors
  • Maintainability: Type signatures serve as inline documentation
  • Code Quality: Strict linting and formatting ensure consistency
  • Modern Tooling: uv provides 10-100x faster dependency management
  • Test Isolation: Docker-based tests prevent port conflicts
  • Comprehensive CI/CD: Automated quality checks on every PR across all Python versions
  • Developer Experience: One-command setup (make setup), clear documentation
  • Dependency Flexibility: Wide protobuf/grpcio support ensures compatibility
  • Graceful Deprecation: Cloud functionality deprecated with clear migration path

Testing

  • ✅ All pre-commit hooks pass (ruff, mypy, ty, trunk, shellcheck, yamlfmt)
  • ✅ All tests pass: 125 passed in ~80s
  • ✅ Tests verified on Python 3.9, 3.10, 3.11, 3.12, 3.13, 3.14
  • ✅ Protobuf generation works with Python 3.13
  • ✅ Type stubs generated successfully
  • ✅ Docker test infrastructure verified on macOS and Ubuntu
  • ✅ Deprecation warnings verified for cloud methods
  • ✅ CI workflows passing:
    • CodeQL security analysis
    • Code quality checks (pre-commit hooks)
    • Matrix tests (12 Python/Dgraph combinations)
    • Trunk checks

Tool References

mlwelles, Dec 28 '25 20:12