Support `map_keys` for MAP type

Open dharanad opened this issue 1 year ago • 2 comments

Which issue does this PR close?

Closes #12147

Rationale for this change

What changes are included in this PR?

  • Added the `map_keys` scalar function. Ref: https://duckdb.org/docs/sql/functions/map.html#map_valuesmap
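For readers unfamiliar with the function, here is a minimal sketch of the intended semantics in plain Rust. It is a toy model, not the DataFusion implementation: the map is assumed to be an ordered slice of key/value pairs, and the standalone `map_keys` helper below is hypothetical, written only for illustration.

```rust
/// Toy model of `map_keys`: returns the keys of a map as a list.
/// Assumption: the map is an ordered slice of (key, value) entries,
/// and keys are returned in the order the entries are stored.
pub fn map_keys<K: Clone, V>(entries: &[(K, V)]) -> Vec<K> {
    entries.iter().map(|(key, _value)| key.clone()).collect()
}
```

Under this model, a map built from keys `[1, 2, 3]` and values `[1, 2, 3]` yields the key list `[1, 2, 3]`.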

Are these changes tested?

Are there any user-facing changes?

dharanad avatar Aug 27 '24 10:08 dharanad

Encountered a bug. Fixed it by changing the field name to `item`. But why?

Internal error: Failed due to a difference in schemas, original schema: DFSchema { inner: Schema { fields: [Field { name: "map_keys(map(make_array(Int64(1),Int64(2),Int64(3)),make_array(Int64(1),Int64(2),Int64(3))))", data_type: List(Field { name: "keys", data_type: Int64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, field_qualifiers: [None], functional_dependencies: FunctionalDependencies { deps: [] } }, new schema: DFSchema { inner: Schema { fields: [Field { name: "map_keys(map(make_array(Int64(1),Int64(2),Int64(3)),make_array(Int64(1),Int64(2),Int64(3))))", data_type: List(Field { name: "item", data_type: Int64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, field_qualifiers: [None], functional_dependencies: FunctionalDependencies { deps: [] } }.

dharanad avatar Aug 27 '24 10:08 dharanad

I think it is caused by the pre-defined field name in make_array.

https://github.com/apache/datafusion/blob/7e9ea3ad59071d56093c197c5ecd5c50021deb94/datafusion/functions-nested/src/map.rs#L33-L39
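To illustrate why the child field name alone can trigger a schema mismatch, here is a simplified sketch in plain Rust. The `Field` and `DataType` types below are stand-ins written for this example, not the real Arrow types, and `schemas_match` is a hypothetical helper. The assumption being modeled is that a list type carries its full child field, so structural equality also compares the child's name.

```rust
/// Simplified stand-in for an Arrow data type (NOT the real arrow-rs enum).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    /// A list type carries its whole child field, including the field name.
    List(Box<Field>),
}

/// Simplified stand-in for an Arrow field (name, type, nullability only).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    pub nullable: bool,
}

impl Field {
    pub fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field { name: name.to_string(), data_type, nullable }
    }
}

/// Builds a `List<Int64>` type whose non-nullable child field has the given name.
pub fn list_of_int64(child_name: &str) -> DataType {
    DataType::List(Box::new(Field::new(child_name, DataType::Int64, false)))
}

/// Hypothetical check: the declared return type must be structurally
/// equal to the type that is actually produced.
pub fn schemas_match(declared: &DataType, produced: &DataType) -> bool {
    declared == produced
}
```

In this model, `list_of_int64("keys")` and `list_of_int64("item")` compare as different types even though both are lists of `Int64`, which matches the `"keys"` versus `"item"` difference in the error above. Note that the logged schemas also differ in the nullability of the outer field, which this sketch does not model.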

Weijun-H avatar Aug 27 '24 12:08 Weijun-H

@dharanad could you add some doc for this function?

Weijun-H avatar Aug 29 '24 14:08 Weijun-H

Sorry for the mess. I thought that since the change was similar for both map_keys and map_values, why not combine them.

dharanad avatar Aug 29 '24 20:08 dharanad

After fixing CI, it is ready to go

Weijun-H avatar Aug 31 '24 08:08 Weijun-H

@Weijun-H Any idea how I can fix the docs formatting issue?

dharanad avatar Aug 31 '24 08:08 dharanad

You could check here: https://datafusion.apache.org/contributor-guide/howtos.html#how-to-format-md-document

Weijun-H avatar Aug 31 '24 08:08 Weijun-H

ship it!

Weijun-H avatar Aug 31 '24 09:08 Weijun-H

Thanks @dharanad for the contribution, and @Blizzara and @jayzhan211 for reviewing.

Weijun-H avatar Sep 01 '24 08:09 Weijun-H

Thank you @Weijun-H @jayzhan211 @Blizzara

dharanad avatar Sep 03 '24 06:09 dharanad