remove `org.eclipse.birt.me.prettyprint.hector` from runtime
The runtime-libraries include classes from:
- org.slf4j:slf4j-api
- org.slf4j:slf4j-nop
- commons-lang:commons-lang
- org.apache.cassandra:cassandra-thrift
- org.apache.thrift:libthrift
- com.google.guava:guava
These libraries originate from org.eclipse.birt.me.prettyprint.hector.
They cannot be excluded and can cause runtime-errors like:
SLF4J: Class path contains multiple SLF4J bindings.
Furthermore, if Maven dependencies are used, these bundled libraries cannot be checked by the maven-enforcer-plugin.
It would be desirable to resolve those dependencies transitively via Maven, so that they can be checked with the maven-enforcer-plugin and, in general, be handled the Maven way.
On top of that, hector-core is dead. See Hector-Repo
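To check whether a given runtime jar really bundles classes from the libraries listed above, something like the following could be used. This is only a minimal sketch: the class name `BundledClassScanner` is made up for illustration, and the prefix list is an assumption based on the usual package roots of those artifacts (for example `org/slf4j/` for slf4j-api), not something taken from the BIRT build.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Counts .class entries in a jar that live under the given package path prefixes. */
final class BundledClassScanner {

    // ASSUMPTION: usual package roots of the libraries named in this issue.
    // Adjust the list if the jar you inspect uses different packages.
    static final List<String> SUSPECT_PREFIXES = List.of(
            "org/slf4j/",
            "org/apache/commons/lang/",
            "org/apache/cassandra/",
            "org/apache/thrift/",
            "com/google/common/");

    /** Returns, for each prefix, how many .class entries of the jar start with it. */
    static Map<String, Integer> countByPrefix(Path jar, List<String> prefixes) throws IOException {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String prefix : prefixes) {
            counts.put(prefix, 0);
        }
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            zip.stream()
                    .map(ZipEntry::getName)
                    .filter(name -> name.endsWith(".class"))
                    .forEach(name -> {
                        for (String prefix : prefixes) {
                            if (name.startsWith(prefix)) {
                                counts.merge(prefix, 1, Integer::sum);
                            }
                        }
                    });
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BundledClassScanner.java <path-to-jar>");
            return;
        }
        countByPrefix(Path.of(args[0]), SUSPECT_PREFIXES)
                .forEach((prefix, n) -> System.out.println(prefix + " -> " + n + " classes"));
    }
}
```

A non-zero count for a prefix means the jar ships its own copy of that library, which a Maven-level exclusion cannot remove.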
I maintain a slightly different fork of BIRT at https://github.com/triestram-partner/birt. Yesterday I created a branch ora_only where I removed (hopefully) everything related to other databases except Apache Derby and Oracle (because we only need those two). In particular, I removed Apache Cassandra / Hector.
This branch is based on the tup_main branch, not on the master branch. But there are only two commits in there, both from 2023-02-13.
Thus, by looking at the changes from these commits, it should be possible to see what's necessary to remove Cassandra/Hector.
The main work was to remove lines from various *.xml files (feature.xml, pom.xml and others). Apart from that, only minor changes to *.java files were necessary.
Note that the two commits cannot be copied as-is to the master branch, because I removed support for more than just Cassandra: Hive, SQLite (?), PostgreSQL, MySQL, Sybase, Informix, DB2, ...
I integrated the ora_only branch into tup_main and added another commit to remove DB projects for MongoDB and Hive to my tup_main branch.
Why would you remove support and not just put an option to hide it?
Because I have no idea how to do that, and because our clients just don't need the other DBs.
And I doubt that it would help to use an option: Security scanners scan the files on disk. They don't know if a vulnerable library is actually used. So the goal is to avoid delivering these libs at all. If we managed to use a build option to not include this or that lib, then the number of artifacts to build would increase exponentially.
Thus I see only two options for birt/eclipse:
- The full build as we do now, with built-in support for several different database systems, which seems anachronistic these days from a security point of view.
- A minimal build (just like mine, but also with Oracle helper libs removed).
For option 2), all those DB support libs should be built as separate artifacts, such that if I want to use BIRT with e.g. Oracle, I need to download a BIRT-Oracle-support.jar artifact which contains those plugins etc. which are needed for Oracle.
It would be nice if we could get drivers dynamically as they do in DBeaver.
BTW: In my fork, I removed several helper libraries for various databases. I have no idea what their purpose is. Does anybody know more about it? I'm concerned about BIRT's dependency on the DataTools Platform (DTP) project which seems abandoned. Luckily, it's mature and does its job...
+1 for removing DTP (see also https://github.com/eclipse/birt/issues/762#issuecomment-1083325128)
Any new insights on whether Hector can/will be removed in 4.14? In my case, we use BIRT only with the XML data source, so could I remove the Hector plugin (remove the folder and the entry in bundles.info) from the report engine? As long as I don't use a scripted data source etc., my understanding is that it should still work?!
I'm not set up to push to BIRT's repo, but here is a patch file that removes the Cassandra integration. It's fairly straightforward. You could also remove the Hector jar, then open the runtime .jar and prune the relevant classes if you don't want to build it yourself.
Naive_removal_of_Hector_project_which_provided_cassandra_integration__2_TODOs_added_in_-_D.patch
commit notes: Naive removal of Hector project which provided cassandra integration
2 TODOs added in
- DataSourceSelectionPage.java
Didn't investigate whether further cleanup is possible, e.g. whether ScriptDataSourceAdapter.java / DataSourceSelectionPage.java can be cleaned up further.
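For the alternative route mentioned above (pruning classes from the runtime .jar instead of rebuilding), a rough sketch could look like this. The class name `JarPruner` and the idea of passing package path prefixes are made up for illustration; which prefixes are safe to drop is an assumption you have to verify yourself, since removing classes that other plugins still load will break them at runtime. Signed jars are also not handled here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

/** Copies a jar while leaving out every entry below the given path prefixes. */
final class JarPruner {

    /**
     * Writes a copy of source to target, skipping entries whose name starts
     * with one of the prefixes. Returns the number of skipped entries.
     */
    static int prune(Path source, Path target, List<String> prefixes) throws IOException {
        int skipped = 0;
        try (ZipFile in = new ZipFile(source.toFile());
             ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(target))) {
            Enumeration<? extends ZipEntry> entries = in.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                if (prefixes.stream().anyMatch(name::startsWith)) {
                    skipped++;
                    continue;
                }
                // Use a fresh entry so sizes and CRC are recomputed on write.
                out.putNextEntry(new ZipEntry(name));
                try (InputStream data = in.getInputStream(entry)) {
                    data.transferTo(out);
                }
                out.closeEntry();
            }
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: java JarPruner.java <in.jar> <out.jar> <prefix>...");
            return;
        }
        List<String> prefixes = List.of(args).subList(2, args.length);
        int skipped = prune(Path.of(args[0]), Path.of(args[1]), prefixes);
        System.out.println("skipped " + skipped + " entries");
    }
}
```

Writing to a new file rather than editing in place keeps the original jar around, so the result can be compared or rolled back if the report engine no longer starts.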
The Cassandra Scripted Data Source incl. the Hector runtime was removed with PR https://github.com/eclipse-birt/birt/pull/1419; the change will be available with BIRT 4.14.