flyte icon indicating copy to clipboard operation
flyte copied to clipboard

[Housekeeping] Decouple `flytekit.remote` to `flytekit.remote` and `flytekit.remote_utils`

Open austin362667 opened this issue 1 year ago • 5 comments

Describe the issue

We have the needs of porting FlytekitRemote into RustFlyteRemote with same high level behaviours recently.

We might have to decouple flytekit.remote into two separated modules flytekit.remote and flytekit.remote_utils, because these imports requiring the initialization of remote.py from flytekit.remote.

remote.py uses grpc clients. To completely remove Python grpc packages. We need refactoring remote_utils in advance. So RustFlyteRemote can only import flytekit.remote_utils without triggering flytekit/remote/__init__.py stuffs.

We should be able only import remote_utils like from flytekit.remote.backfill import create_backfill_workflow, etc, without the needs of importing flytekit.remote.remote.

For example, the following file system layout defines a top level parent package with three subpackages:

parent/
    __init__.py
    one/
        __init__.py
    two/
        __init__.py
    three/
        __init__.py

Importing parent.one will implicitly execute parent/__init__.py and parent/one/__init__.py.

FYI

What if we do not do this?

remote.py depends on grpc in SynchronousFlyteClient. So if we pip uninstall grpcio grpcio-status and only use RustFlyteRemote still causes issues like following screenshot.

Related component(s)

Screenshot 2024-04-22 at 9 17 11 PM

Are you sure this issue hasn't been raised already?

  • [X] Yes

Have you read the Code of Conduct?

  • [X] Yes

austin362667 avatar Apr 22 '24 13:04 austin362667

I don't like the current structure of FlyteRemote at all. It definitely needs to be refactored, heavily. But I'm not sure I understand this ticket. Why is raw.py still here? That file should be deleted, and all auth handling (which is the tricky part) moved over to the rust core.

wild-endeavor avatar Apr 22 '24 17:04 wild-endeavor

That makes sense to me the Python raw.py should be deleted eventually. @wild-endeavor I'm curious about your thoughts on the interim.

Why is raw.py still here?

Will there be no transition stage during which two raw.py files exist at the same time? We can choose which to use, and will help us test whether high-level behaviors are the same.

austin362667 avatar Apr 23 '24 06:04 austin362667

My initial thoughts on the way of refactoring are,

flytekit/
    remote_utils/
          backfill.py
          data.py
          entities.py
          executions.py
          interface.py
          lazy_entity.py
          remote_callable.py
          remote_fs.py
    remote/
          remote.py

Both of

flytekit/
    remote/
          remote.py

and

flyrs/
    remote/
          remote.py

can use flytekit.remote_utils

cc @pingsutw

austin362667 avatar Apr 23 '24 06:04 austin362667

We can have an interim period if we think it's warranted, but during that time, if we have it, we'll just keep the grpc library around.

I'd rather not have both of them at the same time, but @eapolinario had some concerns about our ability to test all the various oauth setups out there, so that's a valid concern. But I still think refactoring remote is a different (needed, but different) issue from removing grpc and eventually protobuf dependencies.

wild-endeavor avatar Apr 23 '24 19:04 wild-endeavor

I would think that we should only keep rust and let people add the python one back as an extra. I want us to test the rust one well before we release / merge.

Are we also supporting all auth implementations in the rust Grpc channel interceptor?

kumare3 avatar Apr 24 '24 03:04 kumare3