Document how to configure RocksDB as Flink state backend
Is your feature request related to a problem? Please describe. From Oleg Myagkov @OBenner on Gitter: "Hi! I use the RocksDB stateBackend, which stores state in HDFS. I get an error - Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies. Is it necessary to extend the docker image of cloudflow with the necessary libraries?"
Is your feature request related to a specific runtime of cloudflow or applicable for all runtimes? Only applies to Flink runtime
Describe the solution you'd like
In the legacy single-image setup, Hadoop libraries are excluded from the Flink classpath in the config.sh script to avoid conflicts with Spark. In the new multi-image setup, this exclusion should no longer be necessary.
This seems similar to allowing, for instance, Azure Blob Storage as a state backend; is that correct @blublinsky? Maybe it is now possible since we support Flink configuration in config files?
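For reference, a minimal sketch of what such a config file could look like, assuming the Flink runtime section accepts standard flink-conf.yaml keys (the exact nesting under `cloudflow.runtimes.flink.config` is an assumption here, and the HDFS path is illustrative):

```hocon
# Sketch: configure RocksDB as the Flink state backend via a Cloudflow
# config file. Keys below are standard Flink configuration options;
# the surrounding nesting is assumed, not verified.
cloudflow.runtimes.flink.config {
  flink {
    state.backend = "rocksdb"
    state.backend.incremental = true
    # Illustrative HDFS location for checkpoints
    state.checkpoints.dir = "hdfs://namenode:8020/flink/checkpoints"
  }
}
```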
I think this is as easy as ensuring that your project includes the libraries. We tested this approach with Azure Blob Storage and it works. On the other hand, Chaoran's proposal of adding them to an image works as well. I would rather use the first approach, so that the base image does not carry additional libraries and they are added only for the applications that require them.
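As a sketch of the first approach, the streamlet project's build could declare the RocksDB state backend and a Hadoop filesystem implementation as ordinary dependencies (versions below are illustrative, not prescribed):

```scala
// build.sbt (sketch): pull the RocksDB state backend and HDFS support
// into the application image instead of the base image.
val flinkVersion = "1.10.0" // illustrative; match your Cloudflow Flink version

libraryDependencies ++= Seq(
  // RocksDB state backend for Flink
  "org.apache.flink" %% "flink-statebackend-rocksdb" % flinkVersion,
  // Puts the hdfs:// filesystem scheme on the classpath; an alternative
  // is the flink-shaded-hadoop-2-uber artifact.
  "org.apache.hadoop" % "hadoop-client" % "2.8.5"
)
```

With these on the classpath, the `UnsupportedFileSystemSchemeException` for `hdfs://` paths should no longer occur, and no change to the base image is needed.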
I agree, would be good to document how to do this.
Changed issue name