[DSIP-88][Auth] Enhancing Apache DolphinScheduler with Generalized OIDC Authentication

Open Gallardot opened this issue 8 months ago • 10 comments

Search before asking

  • [x] I had searched in the DSIP and found no similar DSIP.

Motivation

From GSoC2025 https://issues.apache.org/jira/browse/GSOC-284

Apache DolphinScheduler is a distributed and extensible workflow scheduler platform designed to orchestrate complex data processing tasks. It provides a user-friendly interface for defining, scheduling, and monitoring workflows, making it easier to manage and automate data pipelines. DolphinScheduler supports various types of tasks, including shell scripts, SQL queries, and custom scripts, and integrates seamlessly with popular big data ecosystems.

Currently, Apache DolphinScheduler supports user login via Password, LDAP, Casdoor SSO, and OAuth. However, as a data platform, it frequently needs to integrate with enterprise-internal user accounts to achieve unified identity authentication, which is crucial for system security and unified user account management. The existing Casdoor implementation depends heavily on the Casdoor project, and the OAuth implementation lacks universality and flexibility.

Our objective is to implement a more generalized OIDC (OpenID Connect) login authentication mechanism. This will enable users to make better use of unified login authentication. Moreover, popular open source login authentication projects like Dexidp, Keycloak, and OAuthProxy all support OIDC. By supporting OIDC, users can integrate with both internal and third-party login authentication methods, such as Feishu Login and WeChat Work Login.

cc: @tusaryan

Design Detail

No response

Compatibility, Deprecation, and Migration Plan

No response

Test Plan

No response

Code of Conduct

Gallardot avatar May 11 '25 14:05 Gallardot

What is a gsoc project? How can I contribute?

lancemao avatar May 18 '25 07:05 lancemao

1. Design / Description of Work

Implement generalized OIDC support by leveraging existing libraries and adhering to DolphinScheduler standards.

1.1 Abbreviations Used:

  • IdP: Identity Provider
  • DS role: DolphinScheduler roles

1.2 Backend Configuration ( api-server/conf/application.yaml ) - [Essential]:

  • Add OIDC as a valid value for security.authentication.type.

  • Add a top-level security.authentication.oidc.enable: true/false flag.

  • Define a structure for multiple OIDC provider configurations under security.authentication.oidc.providers:

    security:
      authentication:
        type: oidc # Add OIDC here
        oidc:
          enable: true
          providers:
            #registrationId used in URLs and internal mapping
            keycloak:
              display-name: "Login with Keycloak" # Text for UI Button
              issuer-uri: https://my-keycloak.example.com/auth/realms/myrealm
              client-id: dolphinscheduler-client
              client-secret: 7dgvJS724_saj9$VHVsb9_Very_Secret
              # Optional: Specify client auth method (e.g., client_secret_basic, client_secret_post). Defaults if omitted.
              # client-authentication-method: client_secret_basic
              scope: openid, profile, email, groups # Default: openid, profile, email
              user-name-attribute: preferred_username #Claim to use as username (e.g., sub, email, preferred_username)
              groups-claim: groups # Optional: Claim containing user groups/roles
            # Add more providers here (e.g., okta, azure, google)
            # okta: ....
          user: # Settings for auto-provisioning OIDC users (sibling of providers, i.e. security.authentication.oidc.user)
            auto-create: true # Create DS user if not found? Default: false
            default-tenant-code: "default" # Tenant code for auto-created users
            default-queue: "default" # Queue for auto-created users (if needed by permissions)
            # default-roles: [USER, DEVELOPER] # Optional: Default internal DS roles (if needed, needs design discussion)
    

    (Schema includes essential OIDC client details and user provisioning controls)
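As a rough illustration of how one provider entry from the schema above might be held in memory, here is a minimal plain-Java sketch. The class name `OidcProviderSketch`, the choice to treat `issuer-uri` and `client-id` as mandatory, and the omission of the client secret are assumptions made for this example only; a real implementation would more likely be bound by Spring from application.yaml.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical plain-Java model of one entry under security.authentication.oidc.providers. */
final class OidcProviderSketch {

    /** Default taken from the YAML comment: openid, profile, email. */
    static final List<String> DEFAULT_SCOPES =
            Collections.unmodifiableList(Arrays.asList("openid", "profile", "email"));

    final String registrationId;
    final String displayName;
    final String issuerUri;
    final String clientId;
    final List<String> scopes;
    final String userNameAttribute;

    OidcProviderSketch(String registrationId, String displayName, String issuerUri,
                       String clientId, String rawScope, String userNameAttribute) {
        // Assumption of this sketch: issuer-uri and client-id must be present.
        if (issuerUri == null || issuerUri.trim().isEmpty()) {
            throw new IllegalArgumentException("issuer-uri is required for " + registrationId);
        }
        if (clientId == null || clientId.trim().isEmpty()) {
            throw new IllegalArgumentException("client-id is required for " + registrationId);
        }
        this.registrationId = registrationId;
        this.displayName = displayName;
        this.issuerUri = issuerUri;
        this.clientId = clientId;
        this.scopes = parseScopes(rawScope);
        this.userNameAttribute = userNameAttribute;
    }

    /** Splits "openid, profile, email" into a list, falling back to the default scopes. */
    static List<String> parseScopes(String rawScope) {
        List<String> parsed = new ArrayList<>();
        if (rawScope != null) {
            for (String part : rawScope.split(",")) {
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    parsed.add(trimmed);
                }
            }
        }
        return parsed.isEmpty() ? DEFAULT_SCOPES : Collections.unmodifiableList(parsed);
    }
}
```

Failing fast on a missing issuer or client ID at startup is one possible design choice; the proposal itself does not say how invalid provider entries should be handled.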

1.3 Backend OIDC Implementation ( dolphinscheduler-api ) - [Essential]:

  • Leverage Existing Libraries: Utilize com.nimbusds:oauth2-oidc-sdk for OIDC protocol interactions (discovery, token exchange, userinfo endpoint). Ensure Apache License v2 headers are included in all new files and adhere to coding specs and Spotless formatting. Check Nimbus SDK license compatibility (the SDK is published under the Apache License 2.0).

  • Configuration Loading: Implement a configuration properties class (OidcConfigProperties) to load the security.authentication.oidc structure from application.yaml.

  • Spring Security Integration:

    • Create a custom Filter (e.g., DynamicOidcAuthenticationFilter) registered in the Spring Security SecurityFilterChain. This filter will intercept requests potentially related to OIDC callbacks (e.g., /login/oauth2/code/*).
    • Inside the filter/associated components:
      • Dynamically resolve the OIDC provider configuration based on the request (e.g., using the registrationId from the path).
      • Use Nimbus SDK classes (OIDCProviderMetadata.resolve, AuthorizationCodeGrant, TokenRequest, UserInfoRequest, etc.) to handle the OIDC Authorization Code Flow:
        • Generate the OIDC Authorization Request URI, including client_id, scope, response_type=code, redirect_uri, a securely generated state parameter for CSRF protection, and potentially a nonce parameter for replay protection within the ID Token.
        • Upon receiving the callback, validate the received state parameter against the value associated with the user's session.
        • Exchange the authorization code for tokens (ID Token, Access Token).
        • Validate the ID Token (signature, issuer, audience, expiry, and the nonce value if used).
        • Fetch user information from the UserInfo endpoint using the access token.
  • User Mapping/Provisioning Service:

    • Implement a service (e.g., OidcUserProcessingService) responsible for handling the authenticated OIDC user details.
    • Input: Takes the validated ID Token claims and UserInfo response.
    • Logic:
      • Extract the unique identifier based on the configured user-name-attribute.
      • Extract email and potentially group information (from groups-claim).
      • Interact with the existing UserService/DAO (Mybatis Plus based) to:
        • Find a t_ds_user record by querying the user_name column with the value derived from the configured OIDC user-name-attribute.
        • If a user is not found and oidc.user.auto-create is enabled, create a new user record by populating existing fields (user_name, email, default tenant_id, default queue, etc.) using the OIDC claims and configured defaults.
      • (Design Decision Required): Determine how to map OIDC groups (if provided via groups-claim) to internal DolphinScheduler authorities/roles. This might involve configuration or a predefined mapping strategy.
      • Create an internal Authentication object representing the logged-in user.
      • Crucially, this approach requires no modifications to the existing t_ds_user database schema, i.e. no ALTER TABLE is required. User mapping relies on querying existing fields, and user provisioning populates the existing table structure using a standard INSERT operation.
  • Authentication Success:

    • On successful OIDC authentication and user mapping, use Spring Security's SecurityContextHolder to set the Authentication object.
    • This integrates with Spring Security's standard HTTP Session management, ensuring the user's login state is maintained across subsequent requests, consistent with how Password and LDAP logins are handled. Downstream components can then access user details via standard Spring Security mechanisms (e.g., SecurityContextHolder, potentially session attributes).
    // Conceptual snippet - Showing Nimbus usage idea within a Filter/Service
    // dolphinscheduler-api/src/main/java/.../security/oidc/
    // Adheres to Java coding standards (ref: Code of Conduct)
    
    // @Component or configured in SecurityFilterChain
    public class DynamicOidcAuthenticationFilter extends OncePerRequestFilter { // Or similar mechanism
    
        // @Autowired OidcConfigProperties oidcConfig;
        // @Autowired OidcUserProcessingService userProcessingService;
    
        @Override
        protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain filterChain) throws ServletException, IOException {
    
            // 1. Check if request matches OIDC callback pattern (e.g., /login/oauth2/code/{registrationId})
            // 2. Extract registrationId and authorization code
            // 3. Get provider config: OidcProviderConfig providerConf = oidcConfig.getProviders().get(registrationId);
            // 4. Use Nimbus SDK:
            //  - OIDCProviderMetadata metadata = OIDCProviderMetadata.resolve(new Issuer(providerConf.getIssuerUri()));
            //  - ClientID clientID = new ClientID(providerConf.getClientId());
            //  - Secret clientSecret = new Secret(providerConf.getClientSecret());
            //  - AuthorizationCode code = new AuthorizationCode(request.getParameter("code"));
            //  - URI redirectURI = ... // Construct callback URI
            //  - AuthorizationCodeGrant codeGrant = new AuthorizationCodeGrant(code, redirectURI);
            //  - TokenRequest tokenRequest = new TokenRequest(metadata.getTokenEndpointURI(), new ClientSecretBasic(clientID, clientSecret), codeGrant); // Simplified
            //  - TokenResponse tokenResponse = OIDCTokenResponseParser.parse(tokenRequest.toHTTPRequest().send()); // send first, then parse the HTTP response
            //  - Validate ID Token (Nonce, Signature, Claims - using Nimbus `IDTokenValidator`)
            //  - Fetch UserInfo
            //  - UserInfoResponse userInfoResponse = UserInfoResponse.parse(userInfoRequest.toHTTPRequest().send()); // Simplified
            // 5. Process user: Authentication auth = userProcessingService.processUser(idTokenClaims, userInfo, registrationId);
            // 6. Set security context: SecurityContextHolder.getContext().setAuthentication(auth);
            // 7. Redirect user (e.g., to UI homepage)
    
            // else: pass request down the chain
            filterChain.doFilter(request, response);
        }
    }
    

    (Conceptual Code Snippet: Using Nimbus SDK)
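To make the ID Token claim checks listed above more concrete, the following is a minimal sketch using only the JDK. It assumes the token's signature has already been verified and the claims decoded into a `Map`; in the proposal that work would fall to the Nimbus `IDTokenValidator`, which this sketch does not replace. The class name `IdTokenClaimChecks` and the exception type are hypothetical.

```java
import java.time.Instant;
import java.util.Collection;
import java.util.Map;

/** Hypothetical claim checks; signature verification is deliberately out of scope here. */
final class IdTokenClaimChecks {

    /**
     * Throws IllegalArgumentException when a claim check fails.
     * expectedNonce may be null when no nonce was sent in the authorization request.
     */
    static void validate(Map<String, Object> claims, String expectedIssuer,
                         String clientId, String expectedNonce, Instant now) {
        // iss must match the configured issuer-uri exactly.
        if (!expectedIssuer.equals(claims.get("iss"))) {
            throw new IllegalArgumentException("unexpected issuer");
        }
        // aud may be a single string or a list of strings; it must contain our client_id.
        Object aud = claims.get("aud");
        boolean audienceOk = aud instanceof Collection
                ? ((Collection<?>) aud).contains(clientId)
                : clientId.equals(aud);
        if (!audienceOk) {
            throw new IllegalArgumentException("client_id not in audience");
        }
        // exp is expressed in seconds since the epoch; the token must not be expired.
        Object exp = claims.get("exp");
        if (!(exp instanceof Number)
                || !now.isBefore(Instant.ofEpochSecond(((Number) exp).longValue()))) {
            throw new IllegalArgumentException("token expired or exp missing");
        }
        // The nonce is compared only when one was issued for this login attempt.
        if (expectedNonce != null && !expectedNonce.equals(claims.get("nonce"))) {
            throw new IllegalArgumentException("nonce mismatch");
        }
    }
}
```

Passing `now` in as a parameter keeps the expiry check deterministic in unit tests; a production validator would typically also allow a small clock-skew tolerance.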

Proposed OIDC Components within API Architecture:

The following diagrams illustrate how the proposed OIDC components integrate within the DolphinScheduler API server architecture:

Image Fig 1.

Image Fig 2.

Fig 1 & 2: Complete workflow of the proposed OIDC components within the API architecture and their interaction with existing services (Click here for link)
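The find-or-create logic described for the user mapping service in section 1.3 could look roughly like the sketch below. An in-memory map stands in for the t_ds_user table, and the names `OidcUserProvisioningSketch`, `DsUser`, and `resolve` are hypothetical; the real service would go through the existing UserService/DAO instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical find-or-create flow; the map below is a stand-in for t_ds_user. */
final class OidcUserProvisioningSketch {

    static final class DsUser {
        final String userName;
        final String email;
        final String tenantCode;
        final String queue;

        DsUser(String userName, String email, String tenantCode, String queue) {
            this.userName = userName;
            this.email = email;
            this.tenantCode = tenantCode;
            this.queue = queue;
        }
    }

    private final Map<String, DsUser> usersByName = new HashMap<>();
    private final boolean autoCreate;
    private final String defaultTenantCode;
    private final String defaultQueue;

    OidcUserProvisioningSketch(boolean autoCreate, String defaultTenantCode, String defaultQueue) {
        this.autoCreate = autoCreate;
        this.defaultTenantCode = defaultTenantCode;
        this.defaultQueue = defaultQueue;
    }

    /** Returns the mapped user, or empty when no user exists and auto-create is off. */
    Optional<DsUser> resolve(Map<String, Object> claims, String userNameAttribute) {
        Object rawName = claims.get(userNameAttribute);
        if (rawName == null) {
            return Optional.empty(); // the configured claim is missing from the token
        }
        String userName = rawName.toString();
        DsUser existing = usersByName.get(userName);
        if (existing != null) {
            return Optional.of(existing);
        }
        if (!autoCreate) {
            return Optional.empty();
        }
        // Populate only existing fields, using OIDC claims plus configured defaults.
        Object email = claims.get("email");
        DsUser created = new DsUser(userName, email == null ? null : email.toString(),
                defaultTenantCode, defaultQueue);
        usersByName.put(userName, created);
        return Optional.of(created);
    }

    int userCount() {
        return usersByName.size();
    }
}
```

Returning an empty `Optional` rather than throwing leaves it to the caller to decide how a rejected login is reported; that choice is not specified in the proposal.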

1.4 Frontend UI Changes ( dolphinscheduler-ui - Vue.js) - [Essential]:

  • API Endpoint: Add a simple backend API endpoint (e.g., /api/auth/oidc-providers) that returns a list of enabled OIDC providers configured in application.yaml (specifically their registrationId and display-name). Follow REST standards.
  • Login Page Modification: Modify the Vue.js login component (Login.vue or similar):
    • Fetch the list of enabled OIDC providers from the new /api/auth/oidc-providers endpoint when the login page loads.
    • Use v-for to dynamically render login buttons for each enabled OIDC provider.
    • Each button should link/redirect the user to the OIDC Authorization Endpoint. The backend needs to handle constructing the correct authorization URL (including client_id, scope, redirect_uri, state, nonce) for the selected provider. A common pattern is linking the button to an intermediate backend endpoint like /oauth2/authorization/{registrationId} which then performs the redirect to the IdP.
  • Adhere to documented frontend standards [ref: Frontend Dev Guide].
Image

Fig 3: Illustration of Tentative Modified UI (Click here to View) (dynamically renders based on configuration).
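For the provider-listing endpoint described in this section, the payload could be assembled along these lines. This is only a sketch: the JSON field names `registrationId` and `displayName`, the class name, and the method signature are assumptions, and the web layer (controller, serialization) is left out. The point it illustrates is that only non-sensitive fields are exposed to the login page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical builder for the /api/auth/oidc-providers response body. */
final class OidcProviderListingSketch {

    /**
     * @param oidcEnabled value of security.authentication.oidc.enable
     * @param displayNamesByRegistrationId display-name per configured provider key
     * @return one entry per provider, containing no client secrets or endpoints
     */
    static List<Map<String, String>> listProviders(
            boolean oidcEnabled, Map<String, String> displayNamesByRegistrationId) {
        List<Map<String, String>> result = new ArrayList<>();
        if (!oidcEnabled) {
            return result; // the login page renders no OIDC buttons
        }
        for (Map.Entry<String, String> entry : displayNamesByRegistrationId.entrySet()) {
            Map<String, String> item = new LinkedHashMap<>();
            item.put("registrationId", entry.getKey());
            item.put("displayName", entry.getValue());
            result.add(item);
        }
        return result;
    }
}
```

The login page would then render one button per returned entry and link it to the backend redirect endpoint for that `registrationId`.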

1.5 Testing (Unit & E2E) - [Essential]:

  • Unit Tests: Use JUnit/Mockito to test:
    • Configuration loading (OidcConfigProperties).
    • Core logic within the custom Filter/Services (e.g., correct Nimbus SDK usage mocks, claim extraction).
    • User mapping and provisioning logic (OidcUserProcessingService interaction with mocked UserService/DAO, verifying no schema changes needed).
    • Follow the AIR (Automatic, Independent, Repeatable) and BCDE (Border, Correct, Design, Error) principles, and aim for >60% delta coverage.
  • E2E Tests: Enhance dolphinscheduler-e2e using Selenium/Page Object Model and Testcontainers:
    • Spin up the DolphinScheduler stack (API, DB, ZK).
    • Add a Keycloak container (or mock OIDC server) as the IdP. Configure DS and Keycloak to communicate.
    • Test the full login flow:
      • User clicks the OIDC provider button on the login page.
      • User is redirected to Keycloak login.
      • User logs into Keycloak.
      • User is redirected back to DolphinScheduler.
      • Verify user is logged into DolphinScheduler UI.
      • Verify user auto-provisioning (if enabled, checking correct fields in existing schema).
      • Verify session persistence across requests.
    • Follow the E2E guide. Ensure tests run via ./mvnw test -P e2e.

Image

Fig 4: Demonstrate E2E Test Architecture for OIDC Login

1.6 Documentation ( dolphinscheduler website ) - [Essential]:

  • Create a new documentation page (docs/docs/en/guide/security/oidc.md) explaining:
    • How to enable the OIDC feature.
    • Detailed explanation of all application.yaml configuration parameters under security.authentication.oidc, including the optional client-authentication-method parameter (e.g., client_secret_basic, client_secret_post) which specifies how DolphinScheduler authenticates to the token endpoint.
    • How user provisioning works (populating existing fields, no schema changes).
    • How group/role mapping is handled (based on final design).
  • Provide clear configuration examples for popular OIDC providers (e.g., Keycloak, Okta, Google, Azure AD).
  • Follow the contribution process for the dolphinscheduler repository (Node.js v10+, docsite build, PR to dev).

(Optional Deliverable / Future Work Consideration):

  • Document or briefly investigate how OIDC Bearer Access Tokens obtained via this flow could potentially be used for direct stateless API authentication in the future (as an alternative to session cookies or existing API tokens).
Image

Fig 5: OIDC Authorization Code Flow for DolphinScheduler Login (Link)
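The first leg of the Authorization Code Flow, building the redirect to the IdP and later checking the returned `state`, can be sketched with the JDK alone. The class and method names are hypothetical, the authorization endpoint is assumed to come from provider discovery and to carry no query string of its own, and session storage of `state` and `nonce` is left to the caller.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical helper for the authorization request and the state check on callback. */
final class OidcAuthorizationRequestSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    /** 32 random bytes, base64url-encoded without padding; usable as state or nonce. */
    static String randomToken() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Builds the URI the browser is redirected to; scope is a space-separated string. */
    static String buildAuthorizationUri(String authorizationEndpoint, String clientId,
                                        String redirectUri, String scope,
                                        String state, String nonce) {
        return authorizationEndpoint
                + "?response_type=code"
                + "&client_id=" + encode(clientId)
                + "&redirect_uri=" + encode(redirectUri)
                + "&scope=" + encode(scope)
                + "&state=" + encode(state)
                + "&nonce=" + encode(nonce);
    }

    /** Compares the callback's state with the value stored in the user's session. */
    static boolean stateMatches(String expectedFromSession, String receivedFromCallback) {
        if (expectedFromSession == null || receivedFromCallback == null) {
            return false;
        }
        // MessageDigest.isEqual is used here as a time-constant comparison.
        return MessageDigest.isEqual(
                expectedFromSession.getBytes(StandardCharsets.UTF_8),
                receivedFromCallback.getBytes(StandardCharsets.UTF_8));
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available on the JVM
        }
    }
}
```

A backend redirect endpoint would call `randomToken()` twice, store both values in the session, and send the browser to the result of `buildAuthorizationUri(...)`; the callback handler would reject the request whenever `stateMatches(...)` returns false.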

tusaryan avatar May 18 '25 10:05 tusaryan

Hi @tusaryan , Thanks for such a detailed design. Before PR, there are still some points that need to be discussed.

Proposed OIDC Components within API Architecture

What is the function of the alert module in the architecture diagram?

Frontend UI Changes ( dolphinscheduler-ui - Vue.js) - [Essential]

I think we should keep the style of the current login page and avoid over-design. Just add the authentication channel under the login button and avoid unnecessary changes.

Testing (Unit & E2E) - [Essential]

  1. UT is a necessity for every PR.
  2. E2E test or API test is required in the first PR. You can choose any way to realize it, just aim at one of the authentication channels. API test will be simpler, I suggest you choose this one.

E2E-Test API-Test

SbloodyS avatar May 18 '25 11:05 SbloodyS

What is a gsoc project?

Google Summer of Code.

How can I contribute?

As I said in https://github.com/apache/dolphinscheduler/pull/17193#issuecomment-2888361160, if you want to contribute, please find another issue labeled good first issue or help wanted.

SbloodyS avatar May 18 '25 11:05 SbloodyS

Hi @SbloodyS, first of all, thank you very much for taking the time to review my proposal and provide this valuable feedback. I really appreciate your insights and suggestions.

Proposed OIDC Components within API Architecture

What is the function of the alert module in the architecture diagram?

This is part of the existing architecture; you can find the reference in these docs: https://dolphinscheduler.apache.org/en-us/docs/3.1.2/architecture/design, https://dolphinscheduler.apache.org/en-us/docs/3.1.2/introduction-to-functions_menu/alert_menu

The Alert module (representing the AlertServer/Alert component) in the architecture diagram provides alarm services for workflow and task executions within DolphinScheduler. Whether an alert is sent depends on evaluating the task or workflow execution status against the configured alarm policy, and that status data is stored in the database. This is probably why the diagram shows a relationship between the two.

tusaryan avatar May 19 '25 07:05 tusaryan

The Alert module (representing the AlertServer/Alert component) in the architecture diagram provides alarm services for workflow and task executions within DolphinScheduler. Whether an alert is sent depends on evaluating the task or workflow execution status against the configured alarm policy, and that status data is stored in the database. This is probably why the diagram shows a relationship between the two.

Yes. I know what this module is. What I mean is OIDC doesn't seem to have any interaction with this module. Feel free to correct me if I'm wrong.

SbloodyS avatar May 19 '25 08:05 SbloodyS

Yes. I know what this module is. What I mean is OIDC doesn't seem to have any interaction with this module. Feel free to correct me if I'm wrong.

You are absolutely right @SbloodyS ; the proposed OIDC authentication flow does not have a direct functional interaction with the existing Alert module in this design. My apologies if the diagram implied otherwise.

tusaryan avatar May 19 '25 08:05 tusaryan

Thanks for your clarification. I'm ok with it. @tusaryan

SbloodyS avatar May 19 '25 08:05 SbloodyS

Frontend UI Changes ( dolphinscheduler-ui - Vue.js) - [Essential]

I think we should keep the style of the current login page and avoid over-design. Just add the authentication channel under the login button and avoid unnecessary changes.

✅ I will follow your suggestion and simply add the OIDC authentication options under the existing login button, avoiding unnecessary changes.

Testing (Unit & E2E) - [Essential]

  1. UT is a necessity for every PR.

✅ I fully understand and will provide unit tests in every PR.

  2. E2E test or API test is required in the first PR. You can choose any way to realize it, just aim at one of the authentication channels. API test will be simpler, I suggest you choose this one. E2E-Test API-Test

✅ Agree with your approach and will proceed as you suggested. I will implement API tests, focusing on one authentication channel as you advised.

tusaryan avatar May 19 '25 08:05 tusaryan

@SbloodyS Thank you again, I really appreciate your suggestions! Please feel free to share any other feedback or thoughts on the proposed OIDC design or related areas.

tusaryan avatar May 19 '25 08:05 tusaryan