Add local mode for SQLFlow
As discussed before, we should add a local mode for SQLFlow, because:
- Users may want a more lightweight installation, just to try out SQLFlow on their own PC.
- Developers may want a simpler way to debug the compiler and the generated code.
Some thoughts about the local mode:
- We can exclude Kubernetes from the framework and just generate a Python file that runs Docker locally.
- We need to provide a mechanism to keep track of the jobs and their logs, as in workflow mode:
  - The user submits a SQL program from Jupyter Notebook or the CLI.
  - The SQLFlow server compiles the program and generates a Python file.
  - The SQLFlow server starts a process to run the Python file.
  - The SQLFlow server returns a process ID or a log file name as an identifier of the current task (the same way as in workflow mode).
  - The client polls the log using the identifier returned by the SQLFlow server.
- If we take the local mode to an extreme, we may provide only a compiler binary that generates a Python file; users can then run the Python code to do ML tasks, as discussed in this issue.
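
To make the job-tracking steps above more concrete, here is a minimal sketch of how a server process might start a generated Python file, hand back a log file name as the task identifier, and let a client poll the log incrementally. The names `submit_job`, `fetch_log`, and `run_demo` are hypothetical and are not part of SQLFlow; the "generated program" is a two-line stand-in, and everything runs in one process rather than over gRPC. The sketch uses a log file name rather than a process ID as the identifier, since process IDs can be reused by the operating system.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path


def submit_job(program_path, log_dir):
    """Start the generated Python file in a child process (hypothetical helper).

    Returns (job_id, process), where job_id is the log file name.
    """
    log_path = Path(log_dir) / f"job-{time.time_ns()}.log"
    with open(log_path, "wb") as log_file:
        proc = subprocess.Popen(
            [sys.executable, "-u", str(program_path)],  # -u: unbuffered output
            stdout=log_file,
            stderr=subprocess.STDOUT,
        )
    # The child keeps its own handle to the log file after we close ours.
    return log_path.name, proc


def fetch_log(log_dir, job_id, offset=0):
    """Return (new_text, next_offset) for log content written since offset."""
    with open(Path(log_dir) / job_id, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data.decode("utf-8", errors="replace"), offset + len(data)


def run_demo():
    """Submit a stand-in generated program and poll its log until it exits."""
    with tempfile.TemporaryDirectory() as work_dir:
        program = Path(work_dir) / "generated_program.py"
        program.write_text("print('step 1')\nprint('step 2')\n")

        job_id, proc = submit_job(program, work_dir)

        offset, chunks = 0, []
        while True:
            # Check completion first, then read, so the last read after exit
            # picks up everything the job wrote.
            finished = proc.poll() is not None
            text, offset = fetch_log(work_dir, job_id, offset)
            chunks.append(text)
            if finished:
                break
            time.sleep(0.1)
        return "".join(chunks), proc.returncode


if __name__ == "__main__":
    log_text, exit_code = run_demo()
    print(log_text, end="")
    print("exit code:", exit_code)
```

In a real server, `fetch_log` would sit behind whatever RPC the client already uses to poll in workflow mode, and the client would pass back the offset it received from the previous call.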
- We can exclude Kubernetes from the framework and just generate a Python file that runs Docker locally.
Do we need Minikube here?
We need to provide a mechanism to keep track of the jobs and their logs, as in workflow mode
The user submits a SQL program from Jupyter Notebook or the CLI.
The SQLFlow server compiles the program and generates a Python file.
The SQLFlow server starts a process to run the Python file.
The SQLFlow server returns a process ID or a log file name as an identifier of the current task (the same way as in workflow mode).
The client polls the log using the identifier returned by the SQLFlow server.
We need a sqlflowserver here; I believe it runs in a pod in Minikube.
SQLFlow server can run directly on a native machine, or in a Docker container. So, we may get rid of Minikube.