sqlmesh icon indicating copy to clipboard operation
sqlmesh copied to clipboard

Add support for Microsoft Fabric

Open mattiasthalen opened this issue 1 year ago • 18 comments

Microsoft Fabric is becoming more and more popular with our clients. And most of the time it's new projects or migrations. I.e., the perfect time to implement SQLMesh.

I know that there are some issues with their lakehouse entity and creating schemas / tables. And that's why dbt only work on their warehouse entity.

Anyway, I hope this is something you'd be willing to look into!

mattiasthalen avatar Nov 14 '24 12:11 mattiasthalen

Hello and thanks for the suggestion. Unfortunately, Fabric seems to have a number of limitations relative to SQL Server, and some of those are used in core SQLMesh operations. If there is sufficient demand, it may be possible to add Fabric once those limitations are addressed.

treysp avatar Nov 14 '24 15:11 treysp

@treysp which limitations do you think are critical? it seems like if the user doesn't do certain operations, we won't hit these limitations?

tobymao avatar Nov 14 '24 16:11 tobymao

The limits on ALTER TABLE ADD/ALTER/DROP COLUMN were the main ones that concerned me.

But you're right that those limitations would be feature-specific (primarily on forward-only), so there's no inherent problem if we just tell people the affected features aren't supported.

Sorry for the premature closing @mattiasthalen!

treysp avatar Nov 14 '24 16:11 treysp

@treysp No worries!

I'm curious though, are those there for the warehouse object as well? I was under the impression that it only was on the lakehouse.

mattiasthalen avatar Nov 14 '24 16:11 mattiasthalen

Totally possible I'm misinterpreting that page - this text made me think it was platform-wide:

Applies to: ✅ SQL analytics endpoint and Warehouse in Microsoft Fabric

This article covers the T-SQL language syntax capabilities of Microsoft Fabric, when querying the SQL analytics endpoint or Warehouse.

treysp avatar Nov 14 '24 16:11 treysp

@treysp you are absolutely right, I missed that part!

It sure does have it quirks, e.g., supporting UUID as a data type in the lakehouse/warehouse, but not in PowerBI. So much for consistency 😅

mattiasthalen avatar Nov 14 '24 16:11 mattiasthalen

+1

our company has to use fabric, and if a connector for fabric is created we will use sqlmesh.

pycal avatar Dec 31 '24 15:12 pycal

we're planning to work on this soon after the holidays

tobymao avatar Dec 31 '24 17:12 tobymao

Update on this:

SQLMesh needs some fundamental constructs in order to work properly, namely - schemas, tables and views.

It turns out the Lakehouse entity in Fabric does not support schemas by default. However, if we enable the preview feature for schemas - it stops supporting views. This is documented as a limitation in the public preview of schemas.

We need schemas for the physical layer and views for the virtual layer, so SQLMesh cannot work with Lakehouse until Microsoft resolves this. According to their documentation:

They'll be resolved in the coming releases before General Availability.

So we can revisit this after schema-enabled Lakehouses becomes generally available and theoretically support views.

Of course, the Warehouse endpoints that are TSQL-only are unaffected.

I'm curious - is there any demand for Warehouse / TSQL or are people mostly wanting to use Lakehouse / Spark ?

erindru avatar Jan 08 '25 20:01 erindru

IIRC, dbt uses Warehouse for this reason. And you can reference the lakehouse tables as sources (external table)...

So our workflow is that the bronze layer is the lakehouse, where the EL lands. Then we do all our silver/gold (transform) in the warehouse.

That's how I'd want to use SQLMesh as well.

And tbh, while being able to use a lakehouse would be nice, it's not a deal breaker. A deal breaker would be not to be able to use SQLMesh at all 😉

So to answer your question, (pretty) please proceed with supporting the warehouse endpoint! ❤️

mattiasthalen avatar Jan 08 '25 20:01 mattiasthalen

+1 agree with @mattiasthalen above

We also would like to use SQLMesh with Warehouse. Lakehouse is a nice to have. Our workflow is also using Lakehouse as a landing zone, and Warehouse for transforms.

cseHdz avatar Jan 09 '25 02:01 cseHdz

And for reference, you can target the lakehouse via the warehouse endpoint, it's "just" a different database.

E.g.: lakehouse__landing_zone.dbo.raw__table warehouse.silver.stg__table

mattiasthalen avatar Jan 09 '25 07:01 mattiasthalen

+1 also agree with @mattiasthalen above.

we’ll be transitioning to fabric architecture for primary data products in the coming year, and would like to take advantage of sqlmesh across our multi-cloud enterprise. Will be fine to start with warehouse endpoint. As Microsoft irons out the limitations you mention @erindru , support for Lakehouse/Spark will be great long term!

odikia avatar Jan 12 '25 21:01 odikia

+1 with @mattiasthalen here!

For our clients we offer either a Databricks or Fabric centric approach often with DBT. It would be great if SQLMesh would support the warehouse endpoint, as you advertise to solve some of the limitations/annoyances we faced with DBT.

Fabric (Microsoft/Azure) has quite a dominant presence here in the Netherlands for data engineering capabilities. I figure it would be an interesting userbase.

With regards to the lakehouse vs. warehouse discussion, ideally SQLMesh would support both eventually (when technically feasible). However, the warehouse endpoint is currently by far the most resource efficient transform solution within Fabric, and would always have my preference for cost reasons.

mhofmeester avatar Jan 13 '25 06:01 mhofmeester

+1 for warehouse

toninator avatar Jan 17 '25 18:01 toninator

our implementation is, lakehouse for bronze, warehouse for silver and gold

we could definitely live with the limitation

pycal avatar Jan 17 '25 18:01 pycal

Any news yet? ☺️

mattiasthalen avatar Feb 06 '25 19:02 mattiasthalen

@mattiasthalen this is not a priority for us at the moment. however, if you're interested, we do offer feature prioritization as a part of our enterprise / paid offering.

tobymao avatar Feb 06 '25 19:02 tobymao

It recently dawned on me, the SQL Database object in Fabric is a regular mssql server. I've previously been talking about it in regards to storing state, but why not use it for the full warehouse?

Said and done, I spent the day to add pyodbc support to the mssql engine and have a draft PR for that.

I just tried it out in fabric, and it works!

Image

mattiasthalen avatar May 08 '25 21:05 mattiasthalen

+1 for Fabric Data Warehouse as well!

crazy-treyn avatar May 30 '25 14:05 crazy-treyn

We can now use sqlmesh with the sql db object: https://github.com/TobikoData/sqlmesh/pull/4686

I've tried it with a warehouse object, but since it's not pure tsql, there are still some stuff to take care of. Now it's complaining about merge.

I'll probably have a look soonish.

mattiasthalen avatar Jun 12 '25 06:06 mattiasthalen

@mattiasthalen @tobymao how about orchestrating sqlmesh with fabric python notebooks?

you could save state (sqlmesh/ducklake) in "sql database" and warehouse it in duckdb/ducklake

I can do %init successfully but %context fails at

Creating new DuckDB adapter for data files: {db.db} os._exit called with value 0. The interpreter will be restarted.

jojayaro avatar Jun 17 '25 23:06 jojayaro

We can now use sqlmesh with the sql db object: https://github.com/TobikoData/sqlmesh/pull/4686

I've tried it with a warehouse object, but since it's not pure tsql, there are still some stuff to take care of. Now it's complaining about merge.

I'll probably have a look soonish.

For what it's worth, MERGE isn't currently supported with the Fabric Warehouse but it's estimated to land sometime in the next 3 months: Fabric Warehouse Roadmap

crazy-treyn avatar Jun 21 '25 14:06 crazy-treyn

Linked to PR in #4751

fresioAS avatar Jun 26 '25 07:06 fresioAS

We'd also consider Fabric for future implementations, though the sqlmesh component is more important in a warehouse/sql database context more than the lakehouse.

That said there are issues with sqlmesh compatibility with MS SQL, and Azure SQL, and therefore presumably Fabric DWH + Fabric SQL DB. Specifically relating to macro syntax compatibility (all) and multi-part object naming (Azure SQL).

travispaice avatar Jul 27 '25 11:07 travispaice

We'd also consider Fabric for future implementations, though the sqlmesh component is more important in a warehouse/sql database context more than the lakehouse.

That said there are issues with sqlmesh compatibility with MS SQL, and Azure SQL, and therefore presumably Fabric DWH + Fabric SQL DB. Specifically relating to macro syntax compatibility (all) and multi-part object naming (Azure SQL).

Me and @fresioAS are working on an engine adapter specifically using he sql endpoint for the warehouse and have come quite far, just some integration tests that needs to pass. And those are mostly because we added some new config inputs (workspace id IIRC) and that's not setup in the CI env. But I think @erindru was to take a look next week or the week after.

I'm also vibe coding a spark adapter for the Lakehouse using the Livy API.

I'm curious, what are the macro issues? Could you perhaps log an issue and tag me in it?

The multipart thing is wonky in fabric. Db switching doesn't really work, MS recommends three part identifiers. bBut drop schema and create view doesn't support multi part identifiers. 🙈

mattiasthalen avatar Jul 27 '25 11:07 mattiasthalen

Hey @mattiasthalen , I've linked you into https://github.com/TobikoData/sqlmesh/issues/5034 for more context on the macro syntax issue.

We're not in a rush for Fabric in production due to the numerous bugs and limitations we have encountered in PoC's, but believe these will be resolved in 6-12 months.

Awesome you are working towards it though, are you considering Fabric SQL DB as well or just Fabric DWH? In theory it should be the same end point, but the limitations/features of SQL DB may be different than DWH.

travispaice avatar Jul 27 '25 22:07 travispaice

The SQL DB already works via mssql-odbc ☺️

mattiasthalen avatar Jul 28 '25 06:07 mattiasthalen

Initial support has been merged in https://github.com/TobikoData/sqlmesh/pull/4751 and should make it to a release soon.

Thanks to @mattiasthalen and @fresioAS !

erindru avatar Aug 20 '25 02:08 erindru