Consider Datalog-like logic variable based JOINs
I don't have the time to write a detailed proposal at the moment, but I think you should seriously consider added Datalog-like logic variables as an alternative syntax for JOIN. The datalog style can present a substantial simplification of graph traversal use-case compared to SQL.
There's an example of a traversal in Datalog, versus the equivalent SQL (SQLite dialect):
(Generated by my toy Datalog to SQL compiler [github])
Related concepts on the subject:
- The existing SPARQL query language tries to mash up SQL-like syntax with Datalog's join syntax
- Logica is a Datalog-like language from Google that compiles to several SQL dialects: https://logica.dev/
- Mentat is a Datalog -> SQLite compiler for a Datomic-like Datalog diaglect. It was originally developed by Mozilla, now independent: https://github.com/qpdb/mentat
- Well-publicized tutorial on the Clojure/Datomic dialect: http://www.learndatalogtoday.org/
Thanks for the issue! I'm a big fan of Datalog and have followed logica for a while.
How would you see this working for simple joins? Do you think it's possible to design something that is familiar enough for users with only relational experience? What would a design without access to WITH RECURSIVE look like?
My sense is that analytical tables are increasingly de-normalized, and so highly complex joins ("find 1st cousins given parent-child relationships") are less important than making more standard joins easy, and allowing for more complex join conditions ("join orders & addresses, without exploding on duplicate addresses").
I'll check out Percival — that looks really cool. I just saw the page was live — awesome!
I don't know Datalog, but this seems quite concise way of expressing joins?
@justjake Could you help me understand by expressing your example about in a form of a function that takes a table an input?
In pseudo-python:
edge = { 'a': [...], 'b': [...] }
def join_path(table):
?
join_path(edge)
We need this, because ultimately, our join must be defined as function that is applied to a whole table.