[#3282] improvement: Support sort order when creating a Doris table.

Open yuqi1129 opened this issue 1 year ago • 2 comments

What changes were proposed in this pull request?

  1. Move the sort-order check out of JdbcTableOperation and into the corresponding implementations.
  2. Support SortOrder when creating a Doris table.
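
As a rough illustration of item 1, the check can sit behind an overridable hook so that each dialect decides whether it accepts sort orders. The class and method names below (`BaseTableOperationSketch`, `GenericTableOperationSketch`, `DorisTableOperationSketch`, `validateSortOrders`) are hypothetical and are not taken from the Gravitino code base; the validation rule in the Doris subclass is likewise an assumption made for the example.

```java
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT the real Gravitino classes.
abstract class BaseTableOperationSketch {
  // Default behaviour: a dialect that cannot express sort orders rejects them.
  protected void validateSortOrders(List<String> sortColumns) {
    if (!sortColumns.isEmpty()) {
      throw new UnsupportedOperationException(
          getClass().getSimpleName() + " does not support sort orders");
    }
  }

  // The shared create path only calls the hook; the check itself lives in subclasses.
  String create(String table, List<String> sortColumns) {
    validateSortOrders(sortColumns);
    return "created " + table;
  }
}

// A dialect that keeps the default (rejecting) behaviour.
class GenericTableOperationSketch extends BaseTableOperationSketch {}

// A dialect that overrides the hook to accept sort orders.
class DorisTableOperationSketch extends BaseTableOperationSketch {
  @Override
  protected void validateSortOrders(List<String> sortColumns) {
    // Assumed rule for this sketch: any non-blank column name is accepted.
    for (String column : sortColumns) {
      if (column == null || column.isBlank()) {
        throw new IllegalArgumentException("Sort column name must not be blank");
      }
    }
  }
}
```

With this shape, the shared code path no longer needs to know which catalogs support sorting; a new dialect opts in by overriding the hook.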

Why are the changes needed?

Doris supports creating sorted tables, so we need to support this in the Gravitino API.

Close: #3282

Does this PR introduce any user-facing change?

N/A.

How was this patch tested?

Add test testSortOrderTable

yuqi1129 avatar May 06 '24 09:05 yuqi1129

@zhoukangcn Could you kindly review it for me?

yuqi1129 avatar May 09 '24 07:05 yuqi1129

After discussion with @zhoukangcn, this PR should be put on hold until we confirm that the DUPLICATE KEY syntax is equivalent to a sort order. It seems that SR does not have this issue.

yuqi1129 avatar May 10 '24 01:05 yuqi1129

@zhoukangcn According to the 2.0 documentation, the sort order syntax in Doris is:

image

https://doris.apache.org/docs/2.0/table-design/index/prefix-index/

I'm wondering if we should proceed with this PR.

yuqi1129 avatar May 27 '24 01:05 yuqi1129

@yuqi1129 I had noticed this Doris document when we discussed this last time.

I believe this document only describes that data is sorted when it is stored under the Aggregate, Unique, and Duplicate data models.

In this document https://doris.apache.org/docs/1.2/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE

- keys_type

Data model.

key_type(col1, col2, ...)

key_type supports the following models:

DUPLICATE KEY (default): The subsequent specified column is the sorting column.

AGGREGATE KEY: The specified column is the dimension column.

UNIQUE KEY: The subsequent specified column is the primary key column.

So, I think DUPLICATE, AGGREGATE, UNIQUE should be regarded as data models, not sort keys.
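
To make the ambiguity concrete, here is a minimal Java sketch of how the `keys_type` clause quoted above could be rendered. The class `DorisKeysClauseSketch` and its `render` method are hypothetical helpers for this discussion, not Gravitino or Doris code; only the `key_type(col1, col2, ...)` shape and the three model names come from the documentation, and the non-empty column requirement is an assumption of the sketch.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only.
class DorisKeysClauseSketch {
  // The three data models listed in the CREATE TABLE documentation.
  enum KeysType { DUPLICATE, AGGREGATE, UNIQUE }

  // Renders a clause such as DUPLICATE KEY(`k1`, `k2`),
  // following the key_type(col1, col2, ...) shape.
  static String render(KeysType type, List<String> columns) {
    // Assumption of this sketch: the caller always passes at least one key column.
    if (columns.isEmpty()) {
      throw new IllegalArgumentException("At least one key column is required");
    }
    String cols = columns.stream()
        .map(c -> "`" + c + "`")
        .collect(Collectors.joining(", "));
    return type.name() + " KEY(" + cols + ")";
  }
}
```

In this sketch the same column list produces a clause of the same shape for all three models, differing only in the keyword. A bare sort order carries the columns but not the model, which is one way to read the concern raised here.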

zhoukangcn avatar May 27 '24 15:05 zhoukangcn

I see. There seems to be a big difference between versions 1.2 and 2.x. In 1.2, DUPLICATE, AGGREGATE, and UNIQUE are regarded as data models; in 2.x, however, as the documentation describes, they also affect the data storage format and can be regarded as sort keys.

In short, we will postpone this PR until we fully understand the difference.

yuqi1129 avatar May 28 '24 01:05 yuqi1129