Question - Cluster execution on a unique cell in a .py file

Open KiDayz opened this issue 1 year ago • 9 comments

Hey, I'm using the databricks extension in its pre-release version (v2.0.1)

I'm trying to execute a single cell in a .py file (not the whole file) on a Databricks cluster (so no local kernel).

It seems that I have correctly set up the connection, my cluster is running, and I see 'Databricks Connect enabled' in VS Code. Reading kartikgupta-db's comment on this thread: https://github.com/databricks/databricks-vscode/issues/472, I thought I could run individual cells on my cluster, but it seems that the only options available are 'Upload & Run File' and 'Run File as Workflow'.

Am I doing something wrong here, or is the option not available yet? Many thanks!

KiDayz avatar Apr 03 '24 10:04 KiDayz

If you have Databricks Connect enabled, then notebooks should already be sending all Spark and dbutils commands to a Databricks cluster. We do not support executing a full cell on Databricks yet.
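To illustrate the split described above, here is a minimal sketch of cell code under Databricks Connect, assuming the databricks-connect package is installed and a "DEFAULT" auth profile is configured (both are assumptions, not part of the original comment). The Spark calls are shipped to the cluster; the plain-Python line runs on the local machine:

```python
# Sketch only: requires databricks-connect and a configured "DEFAULT"
# profile, which are assumptions made for this example.

def run_example():
    from databricks.connect import DatabricksSession

    # Spark commands are forwarded to the remote cluster by Databricks Connect.
    spark = DatabricksSession.builder.profile("DEFAULT").getOrCreate()
    df = spark.range(10)       # evaluated on the cluster
    remote_count = df.count()  # result is returned to the local process

    # Plain Python like this still executes locally, not on the cluster.
    local_value = sum(range(10))
    return remote_count, local_value
```

This is why cell-by-cell execution "works" for Spark-heavy code but not for code that depends on the cluster's local filesystem or environment.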

kartikgupta-db avatar Apr 03 '24 14:04 kartikgupta-db

Hey, thanks for your response. Do you by any chance have a roadmap or an ETA for supporting cell-by-cell Python execution on a Databricks cluster?

KiDayz avatar Apr 03 '24 16:04 KiDayz

I want this feature as well. E.g., when using native Python to read files from UC Volumes, it fails when running in VS Code because the code runs locally, while it works in the Databricks workspace because it runs on the cluster. We have other examples as well, leading our developers to prefer working in Databricks as opposed to working from VS Code.
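The failure mode above is easy to reproduce with a quick check (the Volume path below is a hypothetical example, not a real Volume):

```python
import os

# Hypothetical UC Volumes path, used only for illustration.
volume_path = "/Volumes/main/default/my_volume/data.csv"

# On the cluster this path exists as a mounted Volume, so plain Python
# (open(), pandas, etc.) can read it. Locally in VS Code the path does
# not exist: Databricks Connect forwards only Spark/dbutils calls to the
# cluster, while native file access stays on the local machine.
print(os.path.exists(volume_path))  # prints False when run locally
```

A common workaround is to read such files through Spark (e.g. `spark.read.csv(volume_path)`), which does execute on the cluster under Databricks Connect.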

pernilak avatar Apr 15 '24 10:04 pernilak

+1 on this. This is actually one of the few blockers I have to developing solely in VSCode rather than the UI.

@kartikgupta-db - If you have a rough understanding of what would need to change for this to be implemented and would accept a PR, I'd be willing to have a go. Just need some guidance on getting started

MrTeale avatar May 02 '24 22:05 MrTeale

Plus 1 on this one!

We struggle with MLflow API calls that start tracking locally. Also, when Python runs locally it can't find paths to UC Volumes (of course).

It would be very nice to get this feature, and an ETA as well!

antonlindahl-sb1u avatar May 07 '24 15:05 antonlindahl-sb1u

Hello everyone! Hope you all are well :)

Any news on an ETA for this feature from the Databricks dev team?

Thanks

KiDayz avatar Jul 01 '24 12:07 KiDayz

plus one here to keep an eye on it!

kupalinka-lis avatar Aug 26 '24 08:08 kupalinka-lis

> If you have Databricks Connect enabled, then the notebooks should already be sending all the spark and dbutil commands to a Databricks cluster. We do not support executing full cell on Databricks yet.

@kartikgupta-db any update?

pernilak avatar Sep 05 '24 11:09 pernilak

Hey @kartikgupta-db, any news on the development of this feature yet?

Many thanks !

KiDayz avatar Oct 01 '24 14:10 KiDayz

Is this feature on the roadmap? The whole point of writing a Databricks notebook is that I want to run it cell by cell and see the output. It's kinda inconvenient to run the full notebook as a job (which is what I have to do today with the extension) every time I make a change to a single cell.

I know I can write and run databricks notebooks using the databricks UI on the browser but as an engineer, I'd really like to do my databricks notebook development in the IDE.

I would greatly benefit from this feature, thanks

deep-kotadia2-zocdoc avatar Jan 24 '25 22:01 deep-kotadia2-zocdoc

Hey, we currently don't have remote cell-by-cell execution on the roadmap.

You can execute notebooks cell by cell with Databricks Connect (where Spark code is executed remotely, but everything else locally), but that can indeed still be inconvenient in some cases.

ilia-db avatar Mar 04 '25 09:03 ilia-db

Closing this in favor of https://github.com/databricks/databricks-vscode/issues/472

ilia-db avatar Mar 04 '25 09:03 ilia-db