Blog post with DataFusion Jan - June 2024

Open alamb opened this issue 1 year ago • 5 comments

Is your feature request related to a problem or challenge?

We have had good luck writing up quarterly updates for DataFusion, most recently: https://arrow.apache.org/blog/2024/01/19/datafusion-34.0.0/

(see https://github.com/apache/arrow-datafusion/issues/6780)

Describe the solution you'd like

Write a blog post

Describe alternatives you've considered

No response

Additional context

No response

alamb avatar Mar 13 '24 17:03 alamb

I am starting to collect a list of things to highlight here

  • New trait based APIs
  • Pluggable handler for CREATE FUNCTION: https://github.com/apache/arrow-datafusion/pull/9333 (thanks @milenkovicm)
  • DataFusion Comet blog published: https://arrow.apache.org/blog/2024/03/06/comet-donation/
  • Move of DataFusion to a top level Apache project was approved by the community: https://github.com/apache/arrow-datafusion/discussions/6475
  • Large scale "extract scalar functions from the core" continues at a good pace: https://github.com/apache/arrow-datafusion/issues/9285

Performance improvements:

  • specialized group values for strings/binary https://github.com/apache/arrow-datafusion/pull/8827

Meetup:

  • Agenda for DataFusion meetup 2024 is looking good https://github.com/apache/arrow-datafusion/discussions/8522

  • SIGMOD paper about DataFusion was accepted: https://github.com/apache/arrow-datafusion/issues/8373#issuecomment-1925913783

SQL to String features from @devinjdangelo / @backkem

  • https://github.com/apache/arrow-datafusion/issues/9494

Maybe Recursive CTEs:

  • Hardening Recursive CTEs https://github.com/apache/arrow-datafusion/issues/462 with @matthewgapp and @jonahgao

alamb avatar Mar 13 '24 17:03 alamb

CTE support: https://github.com/apache/arrow-datafusion/pull/9619

alamb avatar Mar 16 '24 11:03 alamb

Just a few things that come to mind:

Functions:

  • New: nvl, nvl2
  • Improvements: unnest, null handling improvements for lead/lag

Omega359 avatar Mar 18 '24 13:03 Omega359

WASM from @waynexia https://github.com/apache/arrow-datafusion/discussions/9834

alamb avatar Apr 01 '24 19:04 alamb

I am now officially out of time and excuses -- I need to write this post soon

alamb avatar Jul 01 '24 09:07 alamb

Started gathering ideas https://github.com/apache/datafusion-site/pull/6

alamb avatar Jul 09 '24 12:07 alamb

I plan one more round of copyediting and then posting in 2 days. Please leave comments if you have any: https://github.com/apache/datafusion-site/pull/6

alamb avatar Jul 22 '24 17:07 alamb

Blog post is live: https://datafusion.apache.org/blog/2024/07/24/datafusion-40.0.0/

alamb avatar Jul 24 '24 11:07 alamb

Filed https://github.com/apache/datafusion/issues/11631 to track the next one

alamb avatar Jul 24 '24 11:07 alamb