sqlite_orm icon indicating copy to clipboard operation
sqlite_orm copied to clipboard

The WITH clause

Open neevek opened this issue 7 years ago • 19 comments

This library has very cool API design, thank you @fnc12 for the hard work.

I am looking for the WITH clause support in sqlite_orm so I can put subqueries aside and reuse them in my main SELECT statement. What I am trying to do is to calculate user retention of a user_activity table as the following:

WITH
register_user AS (
  SELECT
    uid, MIN(DATE(user_activity.timestamp)) register_date
  FROM user_activity
  GROUP BY uid
  HAVING register_date >= DATE('now', '-7 days')
),
register_user_count AS (
  SELECT
    R.register_date,
    COUNT(DISTINCT R.uid) AS user_count
  FROM register_user R
  GROUP BY R.register_date
)
SELECT
  R.register_date,
  CAST(julianday(DATE(A.timestamp)) AS INT) - CAST(julianday(R.register_date) AS INT) AS ndays,
  COUNT(DISTINCT A.uid) AS retention,
  C.user_count
FROM user_activity A
LEFT JOIN register_user R ON A.uid = R.uid
LEFT JOIN register_user_count C ON R.register_date = C.register_date
GROUP BY R.register_date, ndays
HAVING DATE(A.timestamp) >= DATE('now', '-7 days');

with the following sample output:

register_date  ndays       retention   user_count
-------------  ----------  ----------  ----------
2018-12-02     0           1           1
2018-12-02     6           1           1
2018-12-06     0           4           4
2018-12-06     5           3           4
2018-12-06     6           3           4
2018-12-10     0           2           2
2018-12-10     1           1           2
2018-12-10     2           1           2
2018-12-11     0           3           3
2018-12-11     1           1           3

I think queries of this complexity may not need to be constructed in the ORM way, raw SQL statements may be clearer and easier to understand in such cases.

neevek avatar Dec 14 '18 06:12 neevek

Hello. WITH is a great feature, I was thinking about it before. It is available to add it as an ORM API. I need to think how to perform it better. The main advantage of sqlite_orm is not to uses raw string queries but use some kind of compiled SQL in C++. So let me think. Thanks

fnc12 avatar Dec 15 '18 06:12 fnc12

@neevek your example is perfect. I'd like to reproduce it with sqlite_orm in pure c++. But I need one thing from you - I need your schema and some data to reproduce it with sqlite3 shell first and with C++ later. Thanks in advance

fnc12 avatar Dec 15 '18 17:12 fnc12

@fnc12 Hello, the table user_activity is quite simple, it only has 3 columns, CREATE statements is as following:

CREAT TABLE user_activity (
id integer primary key,
uid integer,
timestamp datetime
);

The MIN datetime of multiple activity records for a uid is treated as the register time of that user, I am sorry I can't provide sample data because I am typing on a phone now, you can just insert a few activity records for each user with different timestamp.

neevek avatar Dec 16 '18 03:12 neevek

first of all I need to add column alias feature. I suppose it to have a syntax like this:

//    SELECT id userId
//    FROM users
auto rows = storage.select(make_alias(&User::id, "userId"));

that returns the same as

//    SELECT id
//    FROM users
auto rows = storage.select(&User::id);

fnc12 avatar Dec 17 '18 07:12 fnc12

please check out column aliases in https://github.com/fnc12/sqlite_orm/pull/226 in dev branch

fnc12 avatar Dec 18 '18 18:12 fnc12

It's great to see the progress update, but there's one thing, usage of custom alias, is too verbose to me, could it be possible to name and reference the alias with simple raw string, I am not sure if that would contradict with the design philosophy of the library, but that would make the query less verbose to some extent.

neevek avatar Dec 19 '18 06:12 neevek

@neevek I've thought about it. I was able to make syntax like this:

auto rowsWithColumnAliases = storage.select(columns(as(&Employee::id, "EMP_ID"),
                                                    &Employee::name,
                                                    &Employee::age,
                                                    &Department::dept),
                                                  where(is_equal(alias("EMP_ID"), &Department::empId)));

but this options has string repeats which makes API vulnerable for literal errors. One of the main advantages of the lib is a static linkage with all query elements so you cannot make any literal misspelling errors. Other options is not to use same literal but declare constant before query and use it:

auto myAlias = make_alias(&Employee::id, EMP_ID");
auto rowsWithColumnAliases = storage.select(columns(myAlias.as(),
                                                    &Employee::name,
                                                    &Employee::age,
                                                    &Department::dept),
                                                  where(is_equal(myAlias.get(), &Department::empId)));

This option is not bad but requires two statements for one query. So I decided to move every string to a dedicated class and users can be sure that there can not be any misspelling mistake.

Probably I forgot something and there is better way to perform aliases for sqlite_orm API. Please tell me that is your opinion in this case. Thanks

fnc12 avatar Dec 19 '18 13:12 fnc12

@fnc12 I am OK with option one. I am no professional on this issue, one thing I can think of to avoid spelling errors is to perform string literal comparison at compile time when referencing aliases, if an alias doesn't exist, referencing it yields compile time error. For option two, I think it is not elegant for this task as you said, it requires two statements, also it is not intuitive that one has to call as() to name the alias and get() to reference it. Maybe a single overloaded alias function is enough, one with (const char *alias, member_fun_ptr), another with (const char *alias).

neevek avatar Dec 19 '18 14:12 neevek

  1. One cannot compare string literals at compile time if literals are not declared as static explicitly. That is why I map every alias to a dedicated class. Of course compile time string comparison would be great. I miss it. But it is what it is(
  2. Overload is not an option cause both cases take template argument T. So what you give that will be taken. I think as and get ok cause in raw SQL statements are different: in the first case you write COLUMN as COLUMN_ALIAS, in the second you write COLUMN_ALIAS only

fnc12 avatar Dec 19 '18 14:12 fnc12

I'm trying to reproduce your initial query. Looks like you have an error here:

  SELECT
    uid, MIN(DATE(user_activity.timestamp)) register_date
  FROM user_activity
  GROUP BY uid
  HAVING register_date >= DATE('now', '-7 days')

user_activity.timestamp doesn't exist cause the schema your provided looks like

CREAT TABLE user_activity (
id integer primary key,
uid integer,
timestamp datetime
);

So I guess you wanted to say user_activity. datetime

fnc12 avatar Dec 30 '18 10:12 fnc12

What's the error? I cannot run SQL on phone now. I added a raw_query method to the library and used the SQL in my code, it run fine and had expected result.

-- edit Oh sorry, I must have changed de coloum name when pasting the schema.

neevek avatar Dec 30 '18 10:12 neevek

Anyway I'm still working on adding with feature one of main goals of sqlite_orm is that raw query is nor required at all

fnc12 avatar Dec 30 '18 10:12 fnc12

I totally understand that, but I needed to get the work done immediately, and that was the easiest and not so invasive way, that's why I didn't send pull request.

I am expecting the with clause, thanks for the hard work.

neevek avatar Dec 30 '18 10:12 neevek

@neevek it's ok. This lib is totally open source and has open license so you're free to modify it as you want. First of all sqlite_orm must make working with SQLite easier for developers like you so it is ok.

fnc12 avatar Dec 30 '18 10:12 fnc12

I've added CAST feature https://github.com/fnc12/sqlite_orm/pull/233

fnc12 avatar Dec 31 '18 10:12 fnc12

I've added julianday function in https://github.com/fnc12/sqlite_orm/pull/234

fnc12 avatar Dec 31 '18 12:12 fnc12

@fnc12 Has it been implemented? I don't see it in the TODO, but I also don't find it in the source code. So I assume that it's not a priority at the moment.

Anyways, I just wanted to pop in and say thanks for the great library!

In case you were wondering, I wanted to use the with clause for a recursive self-join.

Something like this:

WITH    q AS 
        (
        SELECT  *
        FROM   TASKS
        WHERE   PARENT_ID = 7
        UNION ALL
        SELECT  m.*
        FROM    TASKS m
        JOIN    q
        ON      m.PARENT_ID = q.ID
        )
SELECT  *
FROM    q
ORDER by  q.PARENT_ID

with a table like this:

CREATE TABLE 'TASKS' ( 'ID' INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL , 'DESCRIPTION' TEXT NOT NULL , 'STATUS' INTEGER NOT NULL , 'MODIFICATION' TEXT NOT NULL , PARENT_ID INTEGER)

EricTheMagician avatar Jan 18 '22 06:01 EricTheMagician

@thejinx0r it has not been implemented. It doesn't exist in TODO list but it doesn't meant that it will not be implemented cause issue list is a TODO list #2. Thanks for a helpful example. I'll try to implement WITH as fast as I can

fnc12 avatar Jan 18 '22 08:01 fnc12

@neevek @thejinx0r Perhaps you would like to try https://github.com/FireDaemon/sqlite_orm/tree/CTEs, for which there's also a draft PR.

There is an example there specifically for @neevek's query, along with a few others. I didn't have time to create good test data, but the query should work as expected.

trueqbit avatar Feb 08 '23 13:02 trueqbit