
Fix oceanbasev10writer failing on OceanBase versions that include a minor version number

Open jeromexx1 opened this issue 3 years ago • 2 comments

Line 136 of com.alibaba.datax.plugin.writer.oceanbasev10writer.task.ConcurrentTableWriterTask throws an exception when the OceanBase version carries an extra minor version component, for example 3.2.3.1. Suggested change: String version = config.getString(Config.OB_VERSION); String[] versionArray = version.split("\\."); String majorVersion = versionArray[0] + versionArray[1]; if ((Integer.valueOf(majorVersion) >= 21)) { isOb2 = true; }
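The suggestion above can be sketched as a standalone check. This is a minimal sketch, not the DataX code: the class `ObVersionCheck` and the method `isOb21OrLater` are hypothetical names, and the sketch assumes the version string consists of dot-separated integers. It compares the major and minor components numerically rather than concatenating them, because concatenation would turn a hypothetical version such as 1.30 into 130, which passes a `>= 21` test.

```java
// Hypothetical helper, not part of DataX: decides whether an OceanBase
// version string denotes version 2.1 or later.
final class ObVersionCheck {

    static boolean isOb21OrLater(String version) {
        // String.split takes a regular expression, so the dot must be escaped.
        String[] parts = version.trim().split("\\.");
        int major = Integer.parseInt(parts[0]);
        // Treat a missing minor component as 0 (for example "3").
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        // Compare the components numerically instead of concatenating them.
        return major > 2 || (major == 2 && minor >= 1);
    }

    public static void main(String[] args) {
        System.out.println(isOb21OrLater("3.2.3.1"));
        System.out.println(isOb21OrLater("2.0.0"));
        System.out.println(isOb21OrLater("1.30"));
    }
}
```

Any further components (the third and fourth parts of 3.2.3.1) are ignored here, since only the 2.1 threshold matters for the flag.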

jeromexx1 avatar Sep 05 '22 04:09 jeromexx1

This issue has already been fixed; the latest DataX supports four-part OceanBase version numbers.

johnrobbet avatar Oct 11 '22 12:10 johnrobbet

Error: writing data to OceanBase 3.2.3.0 with the oceanbasev10writer plugin in DataX fails with: java.lang.NumberFormatException: multiple points

Version used: DataX-datax_v202209

Configuration file used: job.json

{
    "job": {
        "setting": {
            "speed": {
                "byte":10485760
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "column" : [
                            {
                                "value": "DataX",
                                "type": "string"
                            },
                            {
                                "value": 19890604,
                                "type": "long"
                            },
                            {
                                "value": "1989-06-04 00:00:00",
                                "type": "date"
                            },
                            {
                                "value": true,
                                "type": "bool"
                            },
                            {
                                "value": "test",
                                "type": "bytes"
                            }
                        ],
                        "sliceRecordCount": 100000
                    }
                },
                "writer": {
                    "name": "oceanbasev10writer",
                    "parameter": {
                        "obWriteMode": "insert",
                        "column": [
                            "*"
                        ],
                        "preSql": [
                            "truncate table testfordatax"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": "||_dsc_ob10_dsc_||obcluster:test_tenant||_dsc_ob10_dsc_||jdbc:oceanbase://192.168.100.15:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                                "table": [
                                    "testfordatax"
                                ]
                            }
                        ],
                        "username": "test",
                        "password":"test#",
                        "writerThreadCount":10,
                        "batchSize": 1000,
                        "memstoreThreshold": "0.9"
                    }
                }
            }
        ]
    }
}

Error log:

Class:com.alibaba.datax.plugin.writer.oceanbasev10writer.task.ConcurrentTableWriterTask
16:57:02.080 [0-0-1-reader] DEBUG c.a.d.c.t.runner.ReaderRunner - task reader starts to do prepare ...
16:57:02.082 [0-0-1-reader] DEBUG c.a.d.c.t.runner.ReaderRunner - task reader starts to read ...
16:57:02.085 [0-0-1-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - configure url is unavailable, use obclient for connections.
16:57:02.208 [0-0-4-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - this is oracle compatible mode, change database to SHORTINSAPP, table to TESTFORDATAX
16:57:02.209 [0-0-4-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - Disable partition calculation feature.
16:57:02.327 [0-0-1-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - this is oracle compatible mode, change database to SHORTINSAPP, table to TESTFORDATAX
16:57:02.329 [0-0-1-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - Disable partition calculation feature.
16:57:02.434 [0-0-3-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - this is oracle compatible mode, change database to SHORTINSAPP, table to TESTFORDATAX
16:57:02.434 [0-0-3-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - Disable partition calculation feature.
16:57:02.535 [0-0-2-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - this is oracle compatible mode, change database to SHORTINSAPP, table to TESTFORDATAX
16:57:02.536 [0-0-2-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - Disable partition calculation feature.
16:57:03.291 [0-0-0-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - this is oracle compatible mode, change database to SHORTINSAPP, table to TESTFORDATAX
16:57:03.292 [0-0-0-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - Disable partition calculation feature.
16:57:03.701 [0-0-2-writer] INFO  c.a.d.p.r.w.CommonRdbmsWriter$Task - write mode: insert
16:57:03.701 [0-0-2-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - writeRecordSql :INSERT INTO TESTFORDATAX (toolname,appid,OPERATEDATE,isfinished) VALUES(?,?,?,?)
16:57:03.711 [0-0-2-writer] ERROR c.a.d.c.t.runner.WriterRunner - Writer Runner Received Exceptions:
java.lang.NumberFormatException: multiple points
	at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1914) ~[na:1.8.0_282]
	at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122) ~[na:1.8.0_282]
	at java.lang.Float.parseFloat(Float.java:451) ~[na:1.8.0_282]
	at java.lang.Float.valueOf(Float.java:416) ~[na:1.8.0_282]
	at com.alibaba.datax.plugin.writer.oceanbasev10writer.task.ConcurrentTableWriterTask.init(ConcurrentTableWriterTask.java:136) ~[oceanbasev10writer-0.0.1-SNAPSHOT.jar:na]
	at com.alibaba.datax.plugin.writer.oceanbasev10writer.OceanBaseV10Writer$Task.init(OceanBaseV10Writer.java:222) ~[oceanbasev10writer-0.0.1-SNAPSHOT.jar:na]
	at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:44) ~[classes/:na]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_282]
16:57:03.711 [0-0-2-writer] DEBUG c.a.d.c.t.runner.WriterRunner - task writer starts to do destroy ...
Exception in thread "taskGroup-0" com.alibaba.datax.common.exception.DataXException: Code:[Framework-13], Description:[DataX插件运行时出错, 具体原因请参看DataX运行结束时的错误诊断信息 .].  - java.lang.NumberFormatException: multiple points
	at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1914)
	at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
	at java.lang.Float.parseFloat(Float.java:451)
	at java.lang.Float.valueOf(Float.java:416)
	at com.alibaba.datax.plugin.writer.oceanbasev10writer.task.ConcurrentTableWriterTask.init(ConcurrentTableWriterTask.java:136)
	at com.alibaba.datax.plugin.writer.oceanbasev10writer.OceanBaseV10Writer$Task.init(OceanBaseV10Writer.java:222)
	at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:44)
	at java.lang.Thread.run(Thread.java:748)
 - java.lang.NumberFormatException: multiple points
	at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1914)
	at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
	at java.lang.Float.parseFloat(Float.java:451)
	at java.lang.Float.valueOf(Float.java:416)
	at com.alibaba.datax.plugin.writer.oceanbasev10writer.task.ConcurrentTableWriterTask.init(ConcurrentTableWriterTask.java:136)
	at com.alibaba.datax.plugin.writer.oceanbasev10writer.OceanBaseV10Writer$Task.init(OceanBaseV10Writer.java:222)
	at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:44)
	at java.lang.Thread.run(Thread.java:748)

	at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:48)
	at com.alibaba.datax.core.taskgroup.TaskGroupContainer.start(TaskGroupContainer.java:195)
	at com.alibaba.datax.core.taskgroup.runner.TaskGroupContainerRunner.run(TaskGroupContainerRunner.java:24)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NumberFormatException: multiple points
	at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1914)
	at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
	at java.lang.Float.parseFloat(Float.java:451)
	at java.lang.Float.valueOf(Float.java:416)
	at com.alibaba.datax.plugin.writer.oceanbasev10writer.task.ConcurrentTableWriterTask.init(ConcurrentTableWriterTask.java:136)
	at com.alibaba.datax.plugin.writer.oceanbasev10writer.OceanBaseV10Writer$Task.init(OceanBaseV10Writer.java:222)
	at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:44)
	... 1 more

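Analysis / hypothesis:

Starting from the error log, I set breakpoints at the key locations WriterRunner.java:44 -> OceanBaseV10Writer.java:222 -> ConcurrentTableWriterTask.java:136 to trace the exception, and found that line 136 throws java.lang.NumberFormatException: multiple points.

Source code at line 136: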

if ((Float.valueOf(version.substring(0, pIdx)) >= 2.1f)) {
			isOb2 = true;
		}

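Judging from the surrounding code, this snippet is meant to check whether the OceanBase version is >= 2.1 and, if so, set the OB flag (isOb2) to true. My hypothesis is that Float.valueOf(String a) cannot convert a string containing more than one dot, such as 3.2.3.0, into a Float. The call here does not account for an argument with multiple dots, so it throws the "multiple points" exception and the OceanBase version comparison cannot be performed. (For correct usage, see the reference links at the end.)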
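The hypothesis can be checked in isolation with a short sketch. The class `FloatParseRepro` and the method `tryParse` are hypothetical names, and the sketch assumes that `pIdx` in the original code is the index of the last dot in the version string; the issue does not show how `pIdx` is computed. Under that assumption a three-part version leaves a parsable prefix with a single dot, while a four-part version leaves a prefix with two dots.

```java
// Hypothetical reproduction, not part of DataX.
final class FloatParseRepro {

    // Returns the exception message if parsing fails, or null if it succeeds.
    static String tryParse(String s) {
        try {
            Float.valueOf(s);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String version = "3.2.3.0";
        // Assumption: pIdx is the position of the last dot.
        int pIdx = version.lastIndexOf('.');
        // Prefix "3.2.3" still contains two dots, so parsing fails.
        System.out.println(tryParse(version.substring(0, pIdx)));
        // A prefix with a single dot parses without error.
        System.out.println(tryParse("2.2"));
    }
}
```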

image-20221024233124115

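Reproduction / verification:

To verify the hypothesis, I extracted the type-conversion expression from the if statement and reproduced the problem in isolation: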

image-20221024234737667

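Suggestion / fix:

Having confirmed that the problem lies in the argument type conversion, I tried fixing it with a custom version-number comparison method:

image-20221025000551634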

image-20221025000708567
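The fix itself is only available as screenshots, so the following is a minimal sketch of one common way to compare version strings, in the spirit of the referenced articles, and not the code from the screenshots. The class `VersionCompare` and its `compare` method are hypothetical names, and the sketch assumes every component is a non-negative integer.

```java
// Hypothetical comparator, not the code shown in the screenshots.
final class VersionCompare {

    // Compares dotted numeric versions component by component.
    // Missing components count as 0, so "3.2" equals "3.2.0.0".
    // Returns a negative value, zero, or a positive value.
    static int compare(String a, String b) {
        String[] pa = a.trim().split("\\.");
        String[] pb = b.trim().split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // The check from line 136, expressed with the comparator.
        boolean isOb2 = compare("3.2.3.0", "2.1") >= 0;
        System.out.println(isOb2);
    }
}
```

Comparing integers per component avoids both the floating-point parse and any dependence on how many parts the version string has.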

Fix verification results:

Algorithm verification:

image-20221025002155201

Verification after the fix:

18:55:29.647 [0-0-0-writer] INFO  c.a.d.p.w.o.task.ColumnMetaCache - fetch columnMeta of table TESTFORDATAX success
18:55:29.648 [0-0-0-writer] DEBUG c.a.d.p.w.o.t.ConcurrentTableWriterTask - fail to calculate parition id, just put into the default buffer.
18:55:29.810 [0-0-0-writer] INFO  c.a.d.p.r.w.CommonRdbmsWriter$Task - isMemstoreFull=false
18:55:29.810 [0-0-0-writer] INFO  c.a.d.p.w.o.t.ConcurrentTableWriterTask - ConcurrentTableWriter has put all task in queue, queueSize = 0,  total = 1, finished = 1
18:55:29.810 [0-0-0-writer] DEBUG c.a.d.p.w.o.t.ConcurrentTableWriterTask - wait all InsertTask finished ...
18:55:30.532 [0-insertTask-21] DEBUG c.a.d.p.w.o.task.InsertTask - not more task, thread exist ...
18:55:30.532 [0-insertTask-21] DEBUG c.a.d.p.w.o.task.InsertTask - Thread exist...
18:55:32.326 [0-0-0-writer] DEBUG c.a.d.c.t.runner.WriterRunner - task writer starts to do post ...
18:55:43.912 [0-0-0-writer] DEBUG c.a.d.c.t.runner.WriterRunner - task writer starts to do destroy ...
18:55:43.912 [job-0] DEBUG c.a.d.c.j.s.AbstractScheduler - com.alibaba.datax.core.statistics.communication.Communication@1725dc0f[
  counter={writeSucceedRecords=2, readSucceedRecords=1, totalErrorBytes=0, writeSucceedBytes=24, byteSpeed=0, totalErrorRecords=0, recordSpeed=0, waitReaderTime=0, writeReceivedBytes=24, waitWriterTime=50700, percentage=0.0, totalReadRecords=1, writeReceivedRecords=2, readSucceedBytes=24, totalReadBytes=24}
  state=RUNNING
  throwable=<null>
  timestamp=1666608943911
  message={}
]
18:55:43.912 [job-0] INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator - Total 1 records, 24 bytes | Speed 1B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
18:55:44.014 [taskGroup-0] INFO  c.a.d.c.taskgroup.TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[71243]ms
18:55:44.014 [taskGroup-0] INFO  c.a.d.c.taskgroup.TaskGroupContainer - taskGroup[0] completed it's tasks.
18:55:53.927 [job-0] DEBUG c.a.d.c.j.s.AbstractScheduler - com.alibaba.datax.core.statistics.communication.Communication@3911c2a7[
  counter={writeSucceedRecords=2, readSucceedRecords=1, totalErrorBytes=0, writeSucceedBytes=24, byteSpeed=0, totalErrorRecords=0, recordSpeed=0, waitReaderTime=0, writeReceivedBytes=24, stage=1, waitWriterTime=50700, percentage=1.0, totalReadRecords=1, writeReceivedRecords=2, readSucceedBytes=24, totalReadBytes=24}
  state=SUCCEEDED
  throwable=<null>
  timestamp=1666608953927
  message={}
]
18:55:53.929 [job-0] INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator - Total 1 records, 24 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 100.00%
18:55:53.929 [job-0] INFO  c.a.d.c.j.s.AbstractScheduler - Scheduler accomplished all tasks.
18:55:53.929 [job-0] DEBUG c.a.datax.core.job.JobContainer - jobContainer starts to do post ...
18:55:53.930 [job-0] INFO  c.a.datax.core.job.JobContainer - DataX Writer.Job [oceanbasev10writer] do post work.
18:55:53.931 [job-0] INFO  c.a.datax.core.job.JobContainer - DataX Reader.Job [oceanbasev10reader] do post work.
18:55:53.931 [job-0] DEBUG c.a.datax.core.job.JobContainer - jobContainer starts to do postHandle ...
18:55:53.932 [job-0] INFO  c.a.datax.core.job.JobContainer - DataX jobId [0] completed successfully.
18:55:53.936 [job-0] INFO  c.a.d.c.container.util.HookInvoker - No hook invoked, because base dir not exists or is a file: F:\Code\DataX-datax_v202209\target\datax\datax\hook
18:55:53.939 [job-0] INFO  c.a.datax.core.job.JobContainer - 
	 [total cpu info] => 
		averageCpu                     | maxDeltaCpu                    | minDeltaCpu                    
		-1.00%                         | -1.00%                         | -1.00%
                        

	 [total gc info] => 
		 NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime     
		 PS MarkSweep         | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s             
		 PS Scavenge          | 1                  | 1                  | 1                  | 0.005s             | 0.005s             | 0.005s             

18:55:53.939 [job-0] INFO  c.a.datax.core.job.JobContainer - PerfTrace not enable!
18:55:53.940 [job-0] INFO  c.a.d.c.s.c.c.j.StandAloneJobContainerCommunicator - Total 1 records, 24 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 100.00%
18:55:53.942 [job-0] INFO  c.a.datax.core.job.JobContainer - 
任务启动时刻                    : 2022-10-24 18:54:31
任务结束时刻                    : 2022-10-24 18:55:53
任务总计耗时                    :                 82s
任务平均流量                    :                0B/s
记录写入速度                    :              0rec/s
读出记录总数                    :                   1
读写失败总数                    :                   0

Disconnected from the target VM, address: '127.0.0.1:58873', transport: 'socket'

Process finished with exit code 0

References:

Usage of java.lang.Float.valueOf(String a) (in rt.jar):

https://www.runoob.com/java/number-valueof.html

http://www.manongjc.com/detail/30-wdowdwvkfxfqolx.html

Java version-number comparison algorithms:

https://stackoverflow.com/questions/198431/how-do-you-compare-two-version-strings-in-java#

https://www.baeldung.com/java-comparing-versions

wistwill avatar Oct 24 '22 17:10 wistwill