[flinkx with InfluxDB][metrics reporter] java.lang.NoSuchMethodError: org.apache.commons.math3.stat.descriptive.rank.Percentile.withNaNStrategy

Open tencentemr opened this issue 3 years ago • 0 comments

Search before asking

  • [X] I had searched in the issues and found no similar question.

  • [X] I had googled my question but I didn't get any help.

  • [X] I had read the documentation: ChunJun doc but it didn't help me.

Description

Base environment: flinkx on a YARN cluster. Jobs are submitted through Oozie scheduling and ultimately run on YARN.

Parameters for the InfluxDB reporter (or, alternatively, for the Prometheus PushGateway) were added to the flinkx YAML file:

metrics.reporter.influxdb.class: org.apache.flink.metrics.influxdb.InfluxdbReporter
metrics.reporter.influxdb.host: xxx
metrics.reporter.influxdb.port: xxx
metrics.reporter.influxdb.db: flink_metrics

Every newly submitted Flink job now hits the following error while running. Everything worked normally before the reporter parameters were added.

java.lang.NoSuchMethodError: org.apache.commons.math3.stat.descriptive.rank.Percentile.withNaNStrategy(Lorg/apache/commons/math3/stat/ranking/NaNStrategy;)Lorg/apache/commons/math3/stat/descriptive/rank/Percentile;

Full error message:

2022-09-19 16:13:17.830 [Flink-MetricRegistry-thread-1] WARN org.apache.flink.runtime.metrics.MetricRegistryImpl - Error while reporting metrics
java.lang.NoSuchMethodError: org.apache.commons.math3.stat.descriptive.rank.Percentile.withNaNStrategy(Lorg/apache/commons/math3/stat/ranking/NaNStrategy;)Lorg/apache/commons/math3/stat/descriptive/rank/Percentile;
    at org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogramStatistics$CommonMetricsSnapshot.<init>(DescriptiveStatisticsHistogramStatistics.java:96)
    at org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogramStatistics$CommonMetricsSnapshot.<init>(DescriptiveStatisticsHistogramStatistics.java:90)
    at org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogramStatistics.<init>(DescriptiveStatisticsHistogramStatistics.java:40)
    at org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogram.getStatistics(DescriptiveStatisticsHistogram.java:49)
    at org.apache.flink.metrics.influxdb.MetricMapper.map(MetricMapper.java:50)
    at org.apache.flink.metrics.influxdb.InfluxdbReporter.buildReport(InfluxdbReporter.java:149)
    at org.apache.flink.metrics.influxdb.InfluxdbReporter.report(InfluxdbReporter.java:127)
    at org.apache.flink.runtime.metrics.MetricRegistryImpl$ReporterTask.run(MetricRegistryImpl.java:494)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
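A NoSuchMethodError of this kind usually means the Percentile class resolved at runtime comes from a different commons-math3 jar than the one Flink was compiled against. As a rough diagnostic sketch (not part of flinkx or Flink; the class name ClasspathProbe and its helpers are hypothetical), something like the following could be run with the same classpath as the job to see which jar supplies the class and whether it exposes withNaNStrategy:

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

public class ClasspathProbe {

    /** Returns the location a class was loaded from, or a marker string if unavailable. */
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader report no code source.
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    /** Returns true if the class is loadable and has a public method with the given name. */
    static boolean hasMethod(String className, String methodName) {
        try {
            for (Method m : Class.forName(className).getMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and method names are taken from the stack trace above.
        String cls = "org.apache.commons.math3.stat.descriptive.rank.Percentile";
        System.out.println("loaded from: " + locationOf(cls));
        System.out.println("has withNaNStrategy: " + hasMethod(cls, "withNaNStrategy"));
    }
}
```

If the reported jar is not the one shipped with Flink, that jar is the likely source of the conflict. The result only reflects the classpath the probe itself is started with, so it would need to be launched the same way as the failing job.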

In our tests so far, the community releases of Flink 1.12 and 1.15 have not shown this problem.

The fixes suggested in similar known cases are exclude, relocation, or removing the jar, but none of them suits our situation.

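Exclusion or relocation requires changing the applications, and we have a large number of production apps that cannot be adjusted one by one. Removing the jar is not an option either, because other services may still use it.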

How can this be resolved?

Code of Conduct

tencentemr, Sep 19 '22 08:09