Add local mode for SQLFlow

Open lhw362950217 opened this issue 5 years ago • 2 comments

As discussed before, we should add a local mode for SQLFlow, because:

  1. Users may want a more lightweight installation, just to try out SQLFlow on their own PC.
  2. Developers may want a simpler way to debug the compiler and the generated code.

Some thoughts about the local mode:

  1. We can drop Kubernetes from the framework and just generate a Python file that runs Docker locally.
  2. We need a mechanism to keep track of jobs and their logs, as in workflow mode:
    • The user submits a SQL program from Jupyter Notebook or the CLI.
    • The SQLFlow server compiles the program and generates a Python file.
    • The SQLFlow server starts a process to run the Python file.
    • The SQLFlow server returns a process ID or a log file name as an identifier of the current task (the same way as in workflow mode).
    • The client polls the log using the identifier returned by the SQLFlow server.
  3. Taking local mode to the extreme, we could provide only a compiler binary that generates a Python file; users would then run the Python code themselves to do ML tasks, as discussed in this issue.
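
To make the flow in item 2 concrete, here is a minimal Python sketch of the server side (start a process, hand back an identifier) and the client side (poll the log). It is only an illustration under assumptions, not SQLFlow code: `submit_job`, `fetch_log`, `wait_for_job`, and the in-memory `_JOBS` table are hypothetical names, and the "generated Python file" is a stand-in string rather than real compiler output.

```python
import os
import subprocess
import sys
import tempfile
import time
import uuid

# Hypothetical in-memory job table: job id -> (process handle, log file path).
_JOBS = {}


def submit_job(generated_code, work_dir):
    """Server side: save the generated program, run it, return a job id."""
    job_id = uuid.uuid4().hex
    script_path = os.path.join(work_dir, job_id + ".py")
    log_path = os.path.join(work_dir, job_id + ".log")
    with open(script_path, "w") as f:
        f.write(generated_code)
    # Send stdout and stderr of the child process to the log file.
    with open(log_path, "wb") as log_file:
        proc = subprocess.Popen(
            [sys.executable, "-u", script_path],
            stdout=log_file,
            stderr=subprocess.STDOUT,
        )
    _JOBS[job_id] = (proc, log_path)
    return job_id


def fetch_log(job_id, offset=0):
    """Server side: return (new log text, next offset, finished flag)."""
    proc, log_path = _JOBS[job_id]
    # Check the process state before reading, so that a finished job
    # always has its complete output included in this read.
    finished = proc.poll() is not None
    with open(log_path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    return chunk.decode("utf-8", errors="replace"), offset + len(chunk), finished


def wait_for_job(job_id, interval=0.1):
    """Client side: poll the log with the identifier until the job ends."""
    offset, parts = 0, []
    while True:
        text, offset, finished = fetch_log(job_id, offset)
        parts.append(text)
        if finished:
            return "".join(parts)
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in for the Python file the compiler would generate.
    program = "print('training...')\nprint('done')\n"
    job = submit_job(program, tempfile.mkdtemp())
    print(wait_for_job(job))
```

Using a generated job ID rather than the raw process ID keeps the identifier unique even if the OS reuses PIDs; a real server would also need to persist the job table and clean up old logs.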

Oct 09 '20 06:10 lhw362950217

  1. We can drop Kubernetes from the framework and just generate a Python file that runs Docker locally.

Do we need Minikube here?

  2. We need a mechanism to keep track of jobs and their logs, as in workflow mode:
    • The user submits a SQL program from Jupyter Notebook or the CLI.
    • The SQLFlow server compiles the program and generates a Python file.
    • The SQLFlow server starts a process to run the Python file.
    • The SQLFlow server returns a process ID or a log file name as an identifier of the current task (the same way as in workflow mode).
    • The client polls the log using the identifier returned by the SQLFlow server.

We need a sqlflowserver here; I believe it runs in a pod in Minikube.

Oct 09 '20 23:10 brightcoder01

  1. We can drop Kubernetes from the framework and just generate a Python file that runs Docker locally.

Do we need Minikube here?

  2. We need a mechanism to keep track of jobs and their logs, as in workflow mode:
    • The user submits a SQL program from Jupyter Notebook or the CLI.
    • The SQLFlow server compiles the program and generates a Python file.
    • The SQLFlow server starts a process to run the Python file.
    • The SQLFlow server returns a process ID or a log file name as an identifier of the current task (the same way as in workflow mode).
    • The client polls the log using the identifier returned by the SQLFlow server.

We need a sqlflowserver here; I believe it runs in a pod in Minikube.

The SQLFlow server can run directly on the host machine or in a Docker container, so we can do without Minikube.

Oct 10 '20 07:10 lhw362950217