Python Language Generation
What kind of change does this PR introduce?
Feature
What is the current behavior?
Does not generate Python types.
What is the new behavior?
Practicing TDD, designing tests first and asking the community for feedback.
Additional Comments
I'd like everyone's feedback on this format for the Python types. I'm using pydantic and mimicking the Go structure, since it's closest to Python dataclasses (basically structs).
I'm not really sure what a lot of these things other than "Tables" are: materialized views? CompositeTypes? And why are there different types for different operations?
Anyway, if we agree on the TDD output, I'll make it work!
I'd also like to do this again, but for rust.
Man I'm looking forward to that :)
happy to be early tester!
I've dropped my work on it atm; if someone wants to take it the final mile, or wants to answer my questions on this PR, that would be great! Eventually I'll get back to working on Supabase.
This is really cool @ryanpeach! Not a supabase team member, just a supabase user who'd love to have python types generated by the CLI. Just wondering, what questions or todos remain before making this an open PR (instead of draft)?
I need someone to check the tests. I know I've not covered all cases, but I don't understand some of the test cases in the other code generators. Basically I need a code review.
Hey, y'all! Just wanted to join the convo here.
I'm working on a project with a Next.js frontend, Python FastAPI backend, and Supabase, and recently realized there was no python type gen support :(
I came across #795 when I was Googling around and then saw this PR here. Hoping to spark the convo and get the ball rolling again here! I did a quick look through, and the changes seem on the right track to me.
I'm def a little out of my domain here, but I saw @sweatybridge you commented on the original #795 saying you'd be more than happy to review a PR for this issue. Seems like @ryanpeach would love to get your or someone else from Supabase's eyes on this!
Hi @ryanpeach, thanks a bunch for all the hard work you put into this PR!
I'd love to chat about this and figure out how we want to approach Python type generation. The way you implemented it, as raw Python models, sounds great, but we might want to take it a step further and try something more aligned with what we have in JS.
In postgrest-js, the generated types are integrated into the PostgrestClient as you can see on https://github.com/supabase/postgrest-js/blob/master/src/PostgrestClient.ts#L18-L24.
This makes types available during runtime usage of the library.
I'm not a Python expert, so I'm not sure how far Python types can go.
@grdsdev I did see that, but I think we should go for language availability before we go for complexity. That's why I emulated the go structs. Maybe all that can go in a V2.
In fact, I think most languages can extend your Go examples. Rust, for instance, as it has no inheritance and no complex typing, as far as I remember. I'd like to do a Rust PR next, as I'm a Python/Rust-primary dev.
I think the JS way would be very unpythonic, but I understand the reasons for it and we should try to emulate those reasons in a pythonic way in a V2.
Either way, I think the best solution is this:
- Create a canonical "schema" for the database in TDD. Release a README describing all the system features it covers.
- Create an agreed upon "test" for the language generation in a PR. Evaluate it and vote on it.
- Write the code to generate the test (the easiest part). Release.
Right now I don't understand all the system features described in the database schema, so I have a hard time implementing a code generator for the ones I don't understand.
> we should go for language availability before we go for complexity. That's why I emulated the go structs
I'll have to disagree on this - the typegen needs to be well integrated into and narrowly scoped into the use case of the SDK. I don't see how generating structs/interfaces on its own would improve the Supabase DX.
I'd actually use the Go types as a reason against adding this, because to this day supabase-go still doesn't use the Go types (cmiiw), and it's not clear how you're meant to use the two together.
To elaborate further: some of the reasons we added typings to supabase-js are because it affords us:
- autocompletion (table name in `.from()`, column name in filters)
- static type checking for missing columns in `.select()`
- typed result based on the query in `.select()`
I'm not convinced of the benefits of adding typings if it doesn't afford us these things.
@soedirgo you aren't convinced that having structs which match your tables, and having those structs autogenerated, is useful?
I'm sorry but I can't even begin to see the logic in that perspective... It's a basic serialization/deserialization paradigm. It's 1000x better than having an untyped object without data validation, static type checking, or IDE autocompletion.
And I think the interest in this issue makes it clear that people care about it. How many of the ad-hoc tools developers have built to fix this issue just for Python go beyond struct generation?
The more advanced options of deriving certain types from certain parameterizations of `from` or `select`, I'm not sure are even possible in Python typing. If so, it's very complicated, using typing.overload and typing.Literal. But it's definitely a v2 IMO.
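For a sense of what that would look like, here is a rough, self-contained sketch of the `typing.overload` + `typing.Literal` approach. The `select` function, its columns, and its data are hypothetical stand-ins, not the supabase-py API; the point is only that per-column typed results require one overload per column, which gets verbose fast.

```python
from typing import Literal, overload

# Hypothetical sketch: per-column result types via overloads on Literal
# column names. A real generator would emit one overload per column.
@overload
def select(column: Literal["id"]) -> list[int]: ...
@overload
def select(column: Literal["name"]) -> list[str]: ...
def select(column: str) -> list:
    # Stand-in data; a real client would run the query here.
    rows = [{"id": 1, "name": "ada"}, {"id": 2, "name": "grace"}]
    return [row[column] for row in rows]

ids = select("id")      # a type checker infers list[int]
names = select("name")  # a type checker infers list[str]
```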
I see where you're coming from; if Python's typing doesn't support these features then it is what it is. I'll leave it to the supabase-py maintainers to decide if we want to add this in, because if we do want to add these features in v2 they'd be responsible for integrating the types with supabase-py. cc @juancarlospaco @silentworks @grdsdev
What would also help is if you could show an example of how you'd use these types in your code, because certain approaches end up making the code much more verbose and it gets unclear whether it's a DX improvement or not (example here)
I'd just use pydantic's deserialization and from-dict capabilities after running a query, just to associate a type with the return and get data validation.
Any future complexity would need this capability first anyway.
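The shape of that deserialize-after-query pattern, sketched dependency-free below: dataclasses stand in for pydantic's `BaseModel` (with pydantic you'd call `User(**row)` on a model and get real validation), and `rows` stands in for `.execute().data`.

```python
from dataclasses import dataclass

# Dependency-free sketch of the pattern: take raw dicts from a query
# result and turn them into typed objects. `rows` is a stand-in for
# supabase.table("users").select("*").execute().data.
@dataclass
class User:
    id: int
    name: str

rows = [{"id": 1, "name": "ada"}, {"id": 2, "name": "grace"}]
users = [User(**row) for row in rows]
```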
Great work here @ryanpeach. Can you provide a code example of how this would be used in a Python project?
```python
from generated_table_classes import PublicUsersSelect, PublicUsersInsert, PublicUsersUpdate
import supabase

# Select
selected = [PublicUsersSelect(**x) for x in supabase.table("users").select("*").execute().data]

# Insert
to_insert = PublicUsersInsert(name="foo", status="bar")
inserted = [PublicUsersInsert(**x) for x in supabase.table("users").insert(to_insert.as_dict()).execute().data]

# Update
to_update = selected[0].as_dict()
id = to_update.pop("id")
updated = [PublicUsersUpdate(**x) for x in supabase.table("users").update(to_update).eq("id", id).execute().data]
```
Things I don't understand are like:
- Why is that table called public? Remember I copied from Go.
- Could we combine "PublicUsers*" responses into one?
From this exercise I think this DX is ideal:
```python
from generated_table_classes import User, Users, UsersUpdate, UserInsert
import supabase

# Select
selected = Users(supabase.table("users").select("*").execute().data)

# Insert
to_insert = UserInsert(name="foo", status="bar")
inserted = Users(supabase.table("users").insert(to_insert.as_dict()).execute().data)

# Update
to_update = UsersUpdate(name="bar")
updated = Users(supabase.table("users").update(to_update.as_dict()).eq("id", selected[0].id).execute().data)
```
The difference between User, UsersUpdate, and UserInsert would be:
```python
from typing import Annotated
from pydantic import BaseModel, Field

# UserStatus is a generated enum, defined elsewhere

# Nothing is optional
class User(BaseModel):
    id: Annotated[int, Field(alias="id")]
    name: Annotated[str, Field(alias="name")]
    status: Annotated[UserStatus, Field(alias="status")]

# Easily handle data lists
Users = lambda users: [User(**x) for x in users]  # For now

# The id field is missing but all the others are non-optional
class UserInsert(BaseModel):
    name: Annotated[str, Field(alias="name")]
    status: Annotated[UserStatus, Field(alias="status")]

# Every field is optional, no id field
class UsersUpdate(BaseModel):
    name: Annotated[str | None, Field(alias="name")] = None
    status: Annotated[UserStatus | None, Field(alias="status")] = None
```
From the above comment, adding these functions to each object would make the DX even better:
```python
from generated_table_classes import User, Users, UsersUpdate, UserInsert
# import supabase not needed

# Select
selected = Users(Users.select("*").execute().data)

# Insert
to_insert = UserInsert(name="foo", status="bar")
inserted = Users(to_insert.insert().execute().data)

# Update
to_update = UsersUpdate(name="bar")
updated = Users(to_update.update().eq("id", selected[0].id).execute().data)
```
Very roughly pseudocoded:
```python
import supabase
from typing import Annotated
from pydantic import BaseModel, Field

# UserStatus is a generated enum, defined elsewhere

# Nothing is optional
class User(BaseModel):
    id: Annotated[int, Field(alias="id")]
    name: Annotated[str, Field(alias="name")]
    status: Annotated[UserStatus, Field(alias="status")]

# Easily handle data lists
class Users(list):
    def __init__(self, users: list[dict]):
        super().__init__(User(**x) for x in users)

    @staticmethod
    def select(*args, **kwargs):
        return supabase.table("users").select(*args, **kwargs)

# The id field is missing but all the others are non-optional
class UserInsert(BaseModel):
    name: Annotated[str, Field(alias="name")]
    status: Annotated[UserStatus, Field(alias="status")]

    def insert(self, *args, **kwargs):
        return supabase.table("users").insert(self.as_dict(), *args, **kwargs)

# Every field is optional, no id field
class UsersUpdate(BaseModel):
    name: Annotated[str | None, Field(alias="name")] = None
    status: Annotated[UserStatus | None, Field(alias="status")] = None

    def update(self, *args, **kwargs):
        return supabase.table("users").update(self.as_dict(), *args, **kwargs)
```
Much further and I think it would be rather complicated, basically rewriting the features of the python library.
I think this would be a great addition to the CLI for sure.
Yeah I think this looks great! In terms of what's useful/valuable and would improve DX, I'd have to agree with @ryanpeach.
Essentially, even just simple auto-generated pydantic model classes that help you automatically stay up to date with your Supabase schema and give you static type-checked, validated data to work with are enough of a DX improvement to make this effort worth it, in my opinion!
From your last example, maybe we could do this to make it even cleaner and clearer? Let me know your thoughts and whether you think something like this would be doable! cc: @silentworks
Usage:
```python
from generated_table_classes import Users, UsersInsert, UsersUpdate

insert_user_1 = Users.insert(UsersInsert(name="Ryan", email="[email protected]"))[0]
insert_user_2 = Users.insert(UsersInsert(name="Mike"))[0]

select_user_1 = Users.select("*", filters={"id": insert_user_1.id})[0]
select_all_users = Users.select("*")

update_user_1 = Users.update(
    UsersUpdate(email="[email protected]"), filters={"id": insert_user_1.id}
)[0]
update_user_2 = Users.update(
    UsersUpdate(name="Michael"), filters={"id": insert_user_2.id}
)[0]
```
Implementation:
```python
# generated_table_classes.py
from typing import Annotated, Any, Dict, List
from pydantic import BaseModel, Field
import supabase

# User schema, required name, optional email
class User(BaseModel):
    id: Annotated[int, Field(alias="id")]
    name: Annotated[str, Field(alias="name")]
    email: Annotated[str | None, Field(alias="email")] = None

# UsersInsert schema, required name, optional email
class UsersInsert(BaseModel):
    name: Annotated[str, Field(alias="name")]
    email: Annotated[str | None, Field(alias="email")] = None

# UsersUpdate schema, optional name, optional email
class UsersUpdate(BaseModel):
    name: Annotated[str | None, Field(alias="name")] = None
    email: Annotated[str | None, Field(alias="email")] = None

# Single Users class used to interact with the users table
class Users:
    def __init__(self, users: list[dict]):
        self.users = [User(**x) for x in users]

    def __iter__(self):
        return iter(self.users)

    @staticmethod
    def select(
        *columns: str,
        filters: Dict[str, Any] | None = None,
        limit: int | None = None,
        order_by: str | None = None,
        descending: bool = True,
    ) -> List[User]:
        query = supabase.table("users").select(*columns)
        if filters:
            for column, value in filters.items():
                if column == "or":
                    query = query.or_(value)
                else:
                    query = query.eq(column, value)
        if order_by:
            query = query.order(order_by, desc=descending)
        if limit:
            query = query.limit(limit)
        response = query.execute().data
        return [User(**x) for x in response]

    @staticmethod
    def insert(user: UsersInsert) -> List[User]:
        response = supabase.table("users").insert(user.model_dump()).execute().data
        return [User(**x) for x in response]

    @staticmethod
    def update(
        user: UsersUpdate,
        filters: Dict[str, Any] | None = None,
    ) -> List[User]:
        query = supabase.table("users").update(user.model_dump())
        if filters:
            for column, value in filters.items():
                if column == "or":
                    query = query.or_(value)
                else:
                    query = query.eq(column, value)
        response = query.execute().data
        return [User(**x) for x in response]
```
^ from testing locally, this seems to get the static type checking.
However, I will say that I have the "Mypy Type Checker" VSCode extension installed, and I don't believe Pydantic by default can statically catch the type error of a missing required field in the UsersInsert class. The Mypy extension picks it up, but if I disable the extension the red squiggly goes away.
Everything else seems to work well, with IntelliSense picking everything up.
I think you need to keep it returning the Insert object and the Select object; otherwise you can't follow up with filters and modifiers, which would be complicated to reimplement.
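A toy illustration of that point, using a hypothetical minimal builder rather than the real supabase-py client: each filter or modifier returns another builder, and only `.execute()` materializes rows, so wrapping results into typed models too early discards the builder and with it the ability to chain.

```python
# Hypothetical query builder (not the supabase-py API) showing why
# .select()/.insert() should keep returning the builder: chaining
# stays available until .execute() materializes the rows.
class Query:
    def __init__(self, rows):
        self._rows = rows

    def eq(self, column, value):
        return Query([r for r in self._rows if r[column] == value])

    def limit(self, n):
        return Query(self._rows[:n])

    def execute(self):
        return self._rows

rows = [{"id": 1, "status": "active"}, {"id": 2, "status": "banned"}]
active = Query(rows).eq("status", "active").limit(5).execute()
print(active)  # [{'id': 1, 'status': 'active'}]
```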
Curious if there's a reason this hasn't been merged yet?
Just need to mark this PR as ready for review and a +1 from @silentworks, and then we can merge this.
@ryanpeach thanks for the contribution! what help do you need to get this PR merged?
It's been a while. I'll work on it again if I need a Supabase project in the future, but I never got anyone to comment on the unit testing.
`test/server/typegen.ts`
We just need to finalize what we want the code generation to look like. Implementation is not difficult. I'd need a maintainer to greenlight a comment in this thread as to the exact implementation we are looking at, then I suppose I could wrap it up.
@silentworks -- per @soedirgo, any chance you could approve or request changes?
We're really hurting for Python type support.
EDIT: It looks like ryan needs feedback on his implementation.
@ryanpeach -- any specific points you're concerned about with impl that you'd want to discuss?