Implements Token Federation for Python Driver
What type of PR is this?
- [ ] Refactor
- [x] Feature
- [ ] Bug Fix
- [ ] Other
Description
This PR adds token federation support to the Databricks SQL Python connector, which allows using external identity provider tokens (like GitHub Actions OIDC tokens) with Databricks SQL.
Key Changes
Core Implementation
- Added token federation as a new auth type with supporting classes and methods
- Implemented token exchange mechanism to convert external tokens to Databricks tokens
Code Architecture
- Added `DatabricksTokenFederationProvider` class to handle token federation
- Added `Token` class to manage token lifecycle and expiry
- Implemented timezone-aware datetime handling to prevent comparison issues
- Added IdP detection to support various identity providers (Azure AD, GitHub, Google, AWS)
API & Configuration
- Added `identity_federation_client_id` parameter for token federation
- Added proper OIDC discovery for finding token endpoints
- Added fallback mechanisms for error handling
Testing
- Added unit tests with mocking for token federation components
- Added end-to-end test for GitHub OIDC tokens
Future Improvements
- Token federation should be refactored as a feature that works with different auth types instead of being an auth type itself
- OAuthProvider should be integrated with token federation to allow token exchange for OAuth-acquired tokens
- Use a standardized approach for feature flags across the codebase
This PR enables Databricks SQL connector users to leverage external identity providers for authentication, particularly useful in CI/CD environments like GitHub Actions.
How is this tested?
- [x] Unit tests
- [x] E2E Tests
- [x] Manually (via CI/CD)
- [ ] N/A
Related Tickets & Documents
Notes for reviewers:
Token Federation Flow
1. Client Initialization
- User creates a SQL connection with `auth_type="token-federation"` and provides an external token
- Can be initialized either with `access_token` or a custom `credentials_provider`
- LIMITATION: Currently implemented as a standalone auth type, not a feature that can be combined with other auth types
- TODO: Refactor to make token federation a feature that works with any auth type via a `use_token_federation` flag
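As a rough illustration of the initialization step, the sketch below assembles the connection arguments described above. The helper name `build_federation_connect_kwargs` is hypothetical and not part of this PR; `server_hostname` and `http_path` are the connector's usual connection parameters, and the remaining keys are the ones named in this description.

```python
from typing import Any, Dict, Optional


def build_federation_connect_kwargs(
    server_hostname: str,
    http_path: str,
    external_token: str,
    identity_federation_client_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a token-federation connection.

    Hypothetical helper for illustration only; it performs no network I/O.
    """
    kwargs: Dict[str, Any] = {
        "server_hostname": server_hostname,
        "http_path": http_path,
        # Auth type introduced by this PR.
        "auth_type": "token-federation",
        # The external IdP token (for example a GitHub Actions OIDC token).
        "access_token": external_token,
    }
    if identity_federation_client_id is not None:
        kwargs["identity_federation_client_id"] = identity_federation_client_id
    return kwargs


# Intended use, assuming a connector build that includes this PR:
#   from databricks import sql
#   connection = sql.connect(**build_federation_connect_kwargs(...))
```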
2. Auth Provider Selection
- `get_auth_provider()` in `auth.py` detects token federation auth type
- Creates a `DatabricksTokenFederationProvider` wrapper around the credential source
- TODO: Remove `TOKEN_FEDERATION` as an auth_type while maintaining backward compatibility
- TODO: Allow wrapping of existing providers (`DatabricksOAuthProvider`, `AccessTokenAuthProvider`, etc.)
3. Token Evaluation
- When headers are requested, the federation provider:
  - Gets external token from underlying provider
  - Parses JWT claims to check token issuer
  - Determines if token needs exchange based on issuer comparison
- The token evaluation works with any valid JWT, regardless of how it was obtained
- TODO: Design interfaces to wrap any auth provider with token federation capability
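The issuer comparison in this step can be sketched as follows. This is a minimal illustration, not the PR's actual code: the function names `decode_jwt_claims` and `needs_exchange` are hypothetical, and the sketch assumes that comparing hostnames is sufficient to decide whether the issuer matches the Databricks host. The claims are read without signature verification, which is enough for routing but not for trusting the token.

```python
import base64
import json
from urllib.parse import urlparse


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def needs_exchange(token: str, databricks_host: str) -> bool:
    """Return True when the token's issuer host differs from the target host.

    Assumption: hostname equality is the comparison rule.
    """
    issuer = decode_jwt_claims(token).get("iss", "")
    issuer_host = urlparse(issuer).hostname or ""
    # Accept the host with or without a scheme.
    target = databricks_host if "://" in databricks_host else "https://" + databricks_host
    target_host = urlparse(target).hostname or ""
    return issuer_host != target_host
```

A token with no `iss` claim is treated as needing exchange, since its issuer cannot match the target host.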
4. Token Exchange
- If token is from a different issuer than the target Databricks host:
  - Uses OIDC discovery to find token endpoint
  - Exchanges external token for Databricks token via token exchange protocol
  - Stores exchanged token and original external token for future reference
- If token is from same issuer, uses original token without exchange
- This process works correctly for any token regardless of source
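The exchange request itself might look roughly like the sketch below. It assumes the exchange follows the OAuth 2.0 Token Exchange grant (RFC 8693) and that the discovery document exposes a standard `token_endpoint` field. The helper names are hypothetical, and whether `client_id` is sent this way is an assumption based on the `identity_federation_client_id` parameter. No network call is made here.

```python
from typing import Dict
from urllib.parse import urlencode

# Identifiers defined by RFC 8693 (OAuth 2.0 Token Exchange).
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"


def token_endpoint_from_discovery(discovery_doc: Dict[str, str]) -> str:
    """Pick the token endpoint out of an already-fetched discovery document."""
    endpoint = discovery_doc.get("token_endpoint")
    if not endpoint:
        raise ValueError("discovery document has no token_endpoint")
    return endpoint


def build_exchange_form(external_token: str, client_id: str = "") -> Dict[str, str]:
    """Build the form fields for a token exchange request."""
    form = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": external_token,
        "subject_token_type": JWT_TOKEN_TYPE,
    }
    if client_id:
        # Assumption: the federation client ID travels as a form field.
        form["client_id"] = client_id
    return form


def encode_exchange_body(form: Dict[str, str]) -> bytes:
    """Encode the form as application/x-www-form-urlencoded for an HTTP POST."""
    return urlencode(form).encode("ascii")
```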
5. Token Refresh
- Before token expiry (controlled by `TOKEN_REFRESH_BUFFER_SECONDS = 10`):
  - Requests fresh external token from underlying provider
  - Exchanges this fresh token for a new Databricks token
  - Updates stored tokens and headers
- LIMITATION: Relies on underlying provider for fresh tokens
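A minimal sketch of the expiry check with a refresh buffer and timezone-aware datetimes is shown below. The `Token` name and the buffer constant come from this description; the field names, the `needs_refresh` method, and the choice to treat naive datetimes as UTC are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

TOKEN_REFRESH_BUFFER_SECONDS = 10


@dataclass
class Token:
    access_token: str
    token_type: str
    expiry: datetime

    def __post_init__(self) -> None:
        # Treat naive datetimes as UTC so comparisons never mix
        # naive and aware values (which would raise TypeError).
        if self.expiry.tzinfo is None:
            self.expiry = self.expiry.replace(tzinfo=timezone.utc)

    def needs_refresh(self, now: Optional[datetime] = None) -> bool:
        """Return True once the current time is within the buffer of expiry."""
        now = now or datetime.now(timezone.utc)
        return now >= self.expiry - timedelta(seconds=TOKEN_REFRESH_BUFFER_SECONDS)
```

Accepting an explicit `now` keeps the check deterministic in unit tests.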
6. Fallback Handling
- If token exchange or refresh fails, falls back to original external token
- Logs appropriate warnings/errors
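The fallback behavior can be pictured with the short sketch below. The function name `token_with_fallback` and the `exchange` callable are hypothetical stand-ins for whatever performs the exchange in the provider.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def token_with_fallback(external_token: str, exchange: Callable[[str], str]) -> str:
    """Return the exchanged token, or the original external token on failure."""
    try:
        return exchange(external_token)
    except Exception as exc:
        # Log and fall back rather than failing the request outright.
        logger.warning("Token exchange failed, using external token: %s", exc)
        return external_token
```

One consequence of this design is that a failed exchange surfaces later as an authentication error from the server rather than at exchange time, so the warning log is the main diagnostic.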
Future Provider Integration Plan
To properly integrate token federation with all auth providers in authenticators.py:
- Decorator Pattern Implementation:
  - Create a wrapper class that can decorate any `AuthProvider` with token federation capabilities
  - Allow wrapping of `DatabricksOAuthProvider`, `AccessTokenAuthProvider`, etc.
- Configuration Changes:
  - Add a `use_token_federation` boolean flag to connection parameters
  - Modify `get_auth_provider()` to apply token federation wrapper when flag is set
- Provider Interface Enhancement:
  - Update `CredentialsProvider` interface to expose necessary token information
  - Ensure `DatabricksOAuthProvider` properly implements this interface for token access
- Backward Compatibility:
  - Maintain support for existing `auth_type="token-federation"` during transition
  - Add deprecation warnings and migration guidance
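One possible shape for the proposed decorator is sketched below. Everything here is hypothetical: the `AuthProvider` protocol with a single `add_headers` method, the `StaticTokenProvider` stand-in, the `TokenFederationWrapper` class, and the `wrap_if_federated` helper are illustrations of the plan, not code from this PR.

```python
from typing import Callable, Dict, Protocol


class AuthProvider(Protocol):
    """Assumed provider interface: mutate the request headers in place."""

    def add_headers(self, request_headers: Dict[str, str]) -> None: ...


class StaticTokenProvider:
    """Stand-in for any existing provider that sets a Bearer token."""

    def __init__(self, token: str) -> None:
        self._token = token

    def add_headers(self, request_headers: Dict[str, str]) -> None:
        request_headers["Authorization"] = f"Bearer {self._token}"


class TokenFederationWrapper:
    """Decorates a provider, exchanging whatever Bearer token it emits."""

    def __init__(self, inner: AuthProvider, exchange: Callable[[str], str]) -> None:
        self._inner = inner
        self._exchange = exchange

    def add_headers(self, request_headers: Dict[str, str]) -> None:
        self._inner.add_headers(request_headers)
        scheme, _, token = request_headers.get("Authorization", "").partition(" ")
        if scheme == "Bearer" and token:
            request_headers["Authorization"] = f"Bearer {self._exchange(token)}"


def wrap_if_federated(
    inner: AuthProvider, use_token_federation: bool, exchange: Callable[[str], str]
) -> AuthProvider:
    """Apply the wrapper only when the (proposed) flag is set."""
    return TokenFederationWrapper(inner, exchange) if use_token_federation else inner
```

Because the wrapper only relies on `add_headers`, the inner provider needs no changes, which is the main appeal of the decorator approach for OAuth-based providers.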
The core token exchange functionality works well for any token, but the current architecture limits token federation to being a separate auth type. The primary improvement needed is architectural - enabling token federation to work with other auth types (including OAuth-based ones) while maintaining backward compatibility.