flytekit icon indicating copy to clipboard operation
flytekit copied to clipboard

Extend flytekit version hash calculation to be pluggable

Open ddl-rliu opened this issue 1 year ago • 3 comments

Tracking issue

Closes https://github.com/flyteorg/flyte/issues/5364

Why are the changes needed?

Extends https://github.com/flyteorg/flytekit/pull/2039/files – that PR gives the minimum API surface for configuring the API via FlytekitPlugin.

This PR increases how pyflyte is pluggable by external libraries, specifically for controlling the additional context used to generate the version string hash.

What changes were proposed in this pull request?

This can be used with a plugin like:

class MyPlugin(FlytekitPlugin):
    @staticmethod
    def get_additional_context(entity: Union[PythonAutoContainerTask, WorkflowBase]) -> List[str]:
        """Get additional context to be used for calculating the version hash."""
        if isinstance(entity, PythonTask):
            return [str(entity.task_config)]
        if isinstance(entity, WorkflowBase):
            task_configs = []
            for n in entity.nodes:
                task_configs.extend(DominoJobPlugin.get_additional_context(n.flyte_entity))
            return task_configs
        return []

This will add the task config to the calculation of the version hash.

How was this patch tested?

Setup process

Screenshots

Check all the applicable boxes

  • [ ] I updated the documentation accordingly.
  • [ ] All new and existing tests passed.
  • [x] All commits are signed-off.

Related PRs

Docs link

ddl-rliu avatar May 16 '24 23:05 ddl-rliu

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 72.09%. Comparing base (e6e08f9) to head (a744848). Report is 15 commits behind head on master.

:exclamation: Current head a744848 differs from pull request most recent head d4cda94

Please upload reports for the commit d4cda94 to get more accurate results.

Files Patch % Lines
flytekit/remote/remote.py 0.00% 2 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2428       +/-   ##
===========================================
+ Coverage   42.77%   72.09%   +29.31%     
===========================================
  Files         185      181        -4     
  Lines       18677    18397      -280     
  Branches     2665     3601      +936     
===========================================
+ Hits         7990    13264     +5274     
+ Misses      10599     4508     -6091     
- Partials       88      625      +537     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar May 17 '24 02:05 codecov[bot]

@pingsutw this is a similar "extension" as #2661

Should we try to get these 2 PRs lined up for the new release of flytekit @ddl-rliu ?

ddl-ebrown avatar Aug 16 '24 19:08 ddl-ebrown

@thomasjpfan Thanks, it's an interesting idea, and I never thought about that possibility! With a modest refactor of _version_from_hash, particularly to accept entity as a parameter, I think it will work exactly like you say. Proof of concept refactor: https://github.com/flyteorg/flytekit/pull/2688/files

To explain more – the parent issue deals with adding entity.task_config to the version hash calculation, therefore, the refactor would expose entity via a parameter to _version_from_hash.

ddl-rliu avatar Aug 16 '24 23:08 ddl-rliu