Cleanup unused members

Open pavibhai opened this issue 2 years ago • 4 comments

Looking through the ORC code, we seem to have some members that are not used within the ORC project itself. Some of these are likely being used externally.

We will have to go through this list and decide on the following:

  • Which ones can be safely deleted because they are not used anywhere?
  • Which ones should be migrated to an extension/injection point, so that the functionality can move into the consuming project?
  • For the ones that remain within ORC, we should add unit tests.

Here are some examples that come readily to mind. We will build a comprehensive list of these as part of this task.

  • PhysicalFSWriter.getOptions
  • RecordReaderImpl.encodeTranslatedSargColumn
  • RecordReaderImpl.mapTranslatedSargColumns
  • RecordReaderUtils.readRanges
  • SchemaEvolution.getPositionalColumns
  • SerializationUtils.parseDateFromString

pavibhai avatar May 24 '23 17:05 pavibhai

Thank you, @pavibhai !

dongjoon-hyun avatar May 27 '23 16:05 dongjoon-hyun

BTW, I want to propose broadening the definition of "unused".

At a minimum, we need to check a subset of the popular downstream projects, for example:

  • Apache Spark
  • Apache Arrow
  • Apache Hive
  • Apache Flink

dongjoon-hyun avatar May 27 '23 21:05 dongjoon-hyun

Started looking into it. Running the inspection in IntelliJ gave 184 warnings, probably with a lot of noise among them. I'm trying to work out the best way to create the list and validate it against the repos mentioned above.

Suggestions?

paliwalashish avatar Nov 27 '23 05:11 paliwalashish

This is removed from Milestone 2.0.0.

dongjoon-hyun avatar Dec 27 '23 19:12 dongjoon-hyun