DataX

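DataX is the open-source version of Alibaba Cloud DataWorks Data Integration. Building on DataX, TAOS Data developed Writer and Reader plugins for TDengine, giving users a tool for ETL and data migration.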

Results 16 DataX issues

Postgres test data: ![image](https://github.com/taosdata/DataX/assets/163239463/89af778b-f6ec-40d1-9414-e933d3d67c78)

Configuration file: ![image](https://github.com/taosdata/DataX/assets/163239463/27012e2c-3175-4000-b894-ceab9a2429d6)

Error message: ![image](https://github.com/taosdata/DataX/assets/163239463/ffdcd9f5-3e6f-428b-873f-12822b59e9ac)

Error log (in the DataX output, 脏数据 is the "dirty data" label and 行数据 is the "row data" label):

2024-01-23 13:40:27.865 [0-0-0-writer] ERROR StdoutPluginCollector - 脏数据: {"exception":"TDengine ERROR (0x80003002): Invalid data format","record":[{"byteSize":8,"index":0,"rawData":1705770149000,"type":"DATE"},{"byteSize":1,"index":1,"rawData":2,"type":"LONG"},{"byteSize":10,"index":2,"rawData":1705770318,"type":"LONG"},{"byteSize":3,"index":3,"rawData":169,"type":"LONG"},{"byteSize":9,"index":4,"rawData":"4.7264E-4","type":"DOUBLE"},{"byteSize":10,"index":5,"rawData":"0.00571764","type":"DOUBLE"},{"byteSize":23,"index":6,"rawData":"id_15","type":"STRING"},{"byteSize":5,"index":7,"rawData":"15","type":"STRING"}],"type":"writer"}

行数据:[balanced_state,tname=id_15,device_id=15 battery_state=3,end_time=1705770742,duration=424,energy=6.4E-7f64,capacity=6.4E-7f64 1705770318000]

2024-01-23 13:40:27.865 [0-0-0-writer] ERROR DefaultDataHandler - TDengine ERROR (0x80003002): Invalid data format

Reading works without problems, but writing fails with a data format error. The cause is that the supertable had already been created in a 3.2.1.0 database. During schemaless insertion, a field such as battery_state=3 is interpreted as a double, whereas it should actually be inserted in the form battery_state=3u8, and this mismatch ultimately causes the format error.

Is there any solution? Presumably DataX converts the data to the Long type but cannot convert it back to the column types of the older taos database?
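The suffix mismatch described in this report can be illustrated with a small sketch. It assumes the TDengine schemaless line-protocol convention that a numeric field value carries a type suffix (`u8`, `i32`, `f64`, and so on) and that a bare number is read as a double. The class `SchemalessFieldFormatter` and its methods are hypothetical names introduced for illustration; they are not part of DataX or the TDengine writer plugin.

```java
public class SchemalessFieldFormatter {

    /**
     * Maps a TDengine column type name to the assumed line-protocol suffix.
     * Hypothetical helper: the mapping follows the suffix convention described
     * in the lead-in and is not taken from the DataX source.
     */
    public static String suffixFor(String tdType) {
        switch (tdType.trim().toUpperCase()) {
            case "TINYINT":           return "i8";
            case "TINYINT UNSIGNED":  return "u8";
            case "SMALLINT":          return "i16";
            case "SMALLINT UNSIGNED": return "u16";
            case "INT":               return "i32";
            case "INT UNSIGNED":      return "u32";
            case "BIGINT":            return "i64";
            case "BIGINT UNSIGNED":   return "u64";
            case "FLOAT":             return "f32";
            case "DOUBLE":            return "f64";
            default:
                throw new IllegalArgumentException("unsupported type: " + tdType);
        }
    }

    /** Renders one integer field as name=value<suffix> for a line-protocol row. */
    public static String formatField(String name, long value, String tdType) {
        return name + "=" + value + suffixFor(tdType);
    }

    public static void main(String[] args) {
        // The column type is an assumption inferred from the "3u8" in the report.
        System.out.println(formatField("battery_state", 3, "TINYINT UNSIGNED"));
    }
}
```

With a mapping like this, the writer would need to know the existing supertable's column types before building each row, which is the piece the report suggests is missing.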

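Describe the bug

When using DataX to import data from multiple MongoDB collections into the same TDengine supertable, DataX fails with an OOM. The reproducible behavior is that the first MongoDB collection imports successfully, and every import from the second collection onward fails with an OOM.

Database and DataX versions

Mongodb: 4.0.3
Tdengine: 3.0.5.0
Datax: mongodbreader, tdengine30writer

To Reproduce

1. Create a new database in TDengine.
2. MongoDB holds N collections to migrate, each with hundreds of millions of records.
3. While the target TDengine supertable is empty, migrating any single MongoDB collection into TDengine succeeds.
4. Once the target supertable already holds hundreds of millions of records, migrating any MongoDB collection through DataX (including one that previously migrated successfully) causes an OOM in DataX.

Troubleshooting steps

1. Increased the DataX memory to 6G; the OOM still occurred.
2. Ruled out DataX configuration problems.
3. While DataX was running out of memory, CPU usage on the TDengine database was high and the source MongoDB showed no export traffic, so the OOM happens in DataX before any MongoDB data is exported.

System monitoring screenshot: ![image](https://github.com/taosdata/DataX/assets/53860974/ef421de3-4a37-428c-b74f-a0ac44e10334)

Screenshot of the problematic DataX source code, located by tracing the hprof file: ![image](https://github.com/taosdata/DataX/assets/53860974/4f57a169-cf6f-4d47-bcc6-b0c2c3059879)

Running the SQL directly in TDengine reproduced the same problem, so the conclusion is that DataX loads all tag ids into its own memory, which causes the OOM: ![image](https://github.com/taosdata/DataX/assets/53860974/01b8cc12-bb96-44b5-9f40-61c9f8fae155)

Expected behavior

It is not clear why DataX needs to execute the code below; it seems to serve little purpose. Could it be disabled, or could it fetch only the tag id of each subtable instead of loading the detailed tag id data?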

[INFO] --------------------------------[ jar ]---------------------------------
Downloading from central: https://maven.aliyun.com/repository/central/com/alibaba/datax/tdenginewriter/tdenginewriter/0.0.1-SNAPSHOT/maven-metadata.xml
Downloading from central: https://maven.aliyun.com/repository/central/com/alibaba/datax/tdenginewriter/tdenginewriter/0.0.1-SNAPSHOT/tdenginewriter-0.0.1-SNAPSHOT.pom
[WARNING] The POM for com.alibaba.datax.tdenginewriter:tdenginewriter:jar:0.0.1-SNAPSHOT is missing, no dependency information available
Downloading from central: https://maven.aliyun.com/repository/central/com/alibaba/datax/tdenginewriter/tdenginewriter/0.0.1-SNAPSHOT/tdenginewriter-0.0.1-SNAPSHOT.jar
[INFO]
[INFO] ------------------------------------------------------------------------...

https://github.com/taosdata/DataX/blob/4c498354a166ff55e60e5da65f51e3a5cb6b449c/tdengine30writer/src/main/java/com/alibaba/datax/plugin/writer/tdengine30writer/SchemaManager.java#L115

https://github.com/taosdata/DataX/blob/4c498354a166ff55e60e5da65f51e3a5cb6b449c/tdengine30writer/src/main/java/com/alibaba/datax/plugin/writer/tdengine30writer/Schema3_0Manager.java#L146

Column names should be enclosed in backquotes at these two places so that uppercase column names are handled correctly.
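A minimal sketch of the quoting this issue asks for, assuming that a backquoted identifier in TDengine SQL keeps its case while an unquoted one does not. `ColumnQuoting`, `quote`, and `selectList` are hypothetical names for illustration only; the actual change would go into the two linked `SchemaManager` and `Schema3_0Manager` lines, whose surrounding code is not shown here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ColumnQuoting {

    /** Wraps one identifier in backquotes; rejects names that already contain one. */
    public static String quote(String identifier) {
        if (identifier.indexOf('`') >= 0) {
            throw new IllegalArgumentException("identifier contains a backquote: " + identifier);
        }
        return "`" + identifier + "`";
    }

    /** Builds a comma-separated, backquoted column list for use in a SQL statement. */
    public static String selectList(List<String> columns) {
        return columns.stream()
                .map(ColumnQuoting::quote)
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        // "DeviceId" is an illustrative mixed-case column name, not one from the issue.
        System.out.println("SELECT " + selectList(Arrays.asList("ts", "DeviceId")) + " FROM t");
    }
}
```

Rejecting names that contain a backquote keeps the sketch simple; a real implementation would need to decide how to handle such names.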

java.lang.UnsatisfiedLinkError: Native Library C:\Windows\System32\taos.dll already loaded in another classloader
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1900) ~[na:1.8.0_261]
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1850) ~[na:1.8.0_261]
at java.lang.Runtime.loadLibrary0(Runtime.java:871) ~[na:1.8.0_261]
at java.lang.System.loadLibrary(System.java:1122) ~[na:1.8.0_261]
at com.taosdata.jdbc.TSDBJNIConnector.(TSDBJNIConnector.java:28) ~[taos-jdbcdriver-2.0.42.jar:na]
at com.taosdata.jdbc.TSDBDriver.connect(TSDBDriver.java:162) ~[taos-jdbcdriver-2.0.42.jar:na]
at java.sql.DriverManager.getConnection(DriverManager.java:664) ~[na:1.8.0_261]...

### Corresponding error

2023-06-08 17:45:35.245 [job-0] INFO StandAloneJobContainerCommunicator - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s |...

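With DataX and TDengine, can multiple supertables be processed together? Can `column` be left out (otherwise there is a duplicate-field problem)? Or can an entire database be synchronized directly?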

com.alibaba.datax.common.exception.DataXException: Code:[TDengineWriter-02], Description:[runtime exception]. - cannot find col: ts in columns: [ts, i_a, i_b, i_c, i_sum, elc, u_a, u_b, u_c, power, corp_id, equipid, line_id]
at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:30) ~[datax-common-0.0.1-SNAPSHOT.jar:na]
at com.alibaba.datax.plugin.writer.tdenginewriter.DefaultDataHandler.indexOf(DefaultDataHandler.java:552) [tdenginewriter-0.0.1-SNAPSHOT.jar:na]...

![微信图片_20220928175703](https://user-images.githubusercontent.com/39253273/192750009-c68e755e-191d-466c-836c-ad67decd4711.png)

Exception in thread "main" java.lang.NoSuchMethodError: com.alibaba.fastjson.JSONArray.getTimestamp(I)Ljava/lang/Object;
at com.taosdata.jdbc.rs.RestfulResultSet.parseTimestampColumnData(RestfulResultSet.java:255)
at com.taosdata.jdbc.rs.RestfulResultSet.parseColumnData(RestfulResultSet.java:183)
at com.taosdata.jdbc.rs.RestfulResultSet.(RestfulResultSet.java:98)
at com.taosdata.jdbc.rs.RestfulStatement.execute(RestfulStatement.java:88)
at com.taosdata.jdbc.rs.RestfulStatement.executeQuery(RestfulStatement.java:37)
at Test.main(Test.java:16)
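A `NoSuchMethodError` like this one means the `JSONArray` class loaded at runtime does not contain the method the JDBC driver was compiled against, which commonly points to a different fastjson version on the classpath. As a diagnostic sketch only, the hypothetical helper `ClassOrigin.locate` below reports which jar or directory a class was loaded from; it is not part of DataX or the TDengine driver. Passing `com.alibaba.fastjson.JSONArray` as the argument assumes fastjson is on the classpath.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /**
     * Returns the location (jar or directory) a class was loaded from.
     * Classes loaded by the bootstrap loader have no code source.
     */
    public static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null
                ? "(bootstrap or unknown)"
                : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, e.g. com.alibaba.fastjson.JSONArray;
        // defaults to a JDK class so the sketch runs without extra dependencies.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " <- " + locate(Class.forName(name)));
    }
}
```

If the reported location is not the fastjson jar expected by the driver, the classpath likely contains a conflicting copy.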