[Bug] When the Hive table storage type is ORC, the sink fails to write data to Hive

Open gaotong521 opened this issue 2 years ago • 1 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

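When the Hive table's storage type is ORC, the job fails while sinking data to Hive. Some columns in the Hive table have no counterpart in the source and must be written as NULL, so I configured an SQL transform that selects null for each of those columns (null as region_code, null as pub_mode, and so on). With that SQL transform configured, the task fails with the error shown under Error Exception.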

SeaTunnel Version

2.3.4

SeaTunnel Config

{
    "env": {
        "parallelism": 3,
        "job.mode": "BATCH",
        "checkpoint.interval": 30000,
        "job.name": "seatunnel_1712823979630"
    },
    "source": [
        {
            "plugin_name": "Jdbc",
            "result_table_name": "table_source",
            "user": "postgres",
            "password": "C3kk4v5_b4f2Jr",
            "driver": "org.postgresql.Driver",
            "url": "jdbc:postgresql://10.188.15.91:5434/gis",
            "query": "select event_id,event_type,event_radius,event_source,start_time,end_time,priority,latitude,longitude,elevation,node_ids,create_time,update_time from ghcloud.gh_traffic_event_info"
        }
    ],
    "transform": [
        {
            "plugin_name": "FieldMapper",
            "source_table_name": "table_source",
            "result_table_name": "table_source_FieldMapper",
            "field_mapper": {
                "event_id": "event_id",
                "event_type": "event_type",
                "event_radius": "event_radius",
                "event_source": "event_source",
                "start_time": "start_time",
                "end_time": "end_time",
                "priority": "priority",
                "latitude": "latitude",
                "longitude": "longitude",
                "elevation": "elevation",
                "node_ids": "node_ids",
                "create_time": "create_time",
                "update_time": "update_time"
            }
        },
        {
            "plugin_name": "Sql",
            "source_table_name": "table_source_FieldMapper",
            "result_table_name": "table_source_FieldMapper_Sql",
            "query": "select event_id,event_type,event_radius,event_source,start_time,end_time,priority,latitude,longitude,elevation,node_ids,create_time,update_time,null as region_code,null as pub_mode,null as end_latitude,null as end_longitude,null as end_elevation,null as custom_event_name,null as comb_mode,null as ref_event_id,null as lane_id from table_source_FieldMapper"
        }
    ],
    "sink": [
        {
            "plugin_name": "Hive",
            "source_table_name": "table_source_FieldMapper_Sql",
            "table_name": "gh_cloud_data_model.dwd_pub_traffic_event",
            "metastore_uri": "thrift://cloudera-hadoop-61:9083"
        }
    ]
}

Running Command

Executed by DolphinScheduler.

Error Exception

SHUTDOWN
	2024-04-12 11:24:16,507 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed SeaTunnel client......
	2024-04-12 11:24:16,507 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed metrics executor service ......
	2024-04-12 11:24:16,507 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - 
	
	===============================================================================
	
	
	2024-04-12 11:24:16,507 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Fatal Error, 
	
	2024-04-12 11:24:16,507 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Please submit bug report in https://github.com/apache/seatunnel/issues
	
	2024-04-12 11:24:16,507 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Reason:SeaTunnel job executed failed 
	
	2024-04-12 11:24:16,508 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
		at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
		at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
		at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
	Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: java.lang.RuntimeException: org.apache.seatunnel.common.exception.SeaTunnelRuntimeException: ErrorCode:[COMMON-23], ErrorDescription:[FileConnector write SeaTunnelRow failed, the SeaTunnelRow value is 'SeaTunnelRow{tableId=default.default.default, kind=+I, fields=[121169, 20409913, null, 1, 2021-06-05T05:47:13.379, 2021-06-05T05:47:23.379, 7, 29.777297, 107.437978, 0.0, 337,342, 2021-06-05T05:46:40.505368, 2021-06-05T05:46:40.505368, null, null, null, null, null, null, null, null, null]}'.]
		at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:257)
		at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:66)
		at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
		at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
		at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:75)
		at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:50)
		at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:51)
		at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
		at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
		at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
		at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:648)
		at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:949)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
		at java.lang.Thread.run(Thread.java:748)
	Caused by: org.apache.seatunnel.common.exception.SeaTunnelRuntimeException: ErrorCode:[COMMON-23], ErrorDescription:[FileConnector write SeaTunnelRow failed, the SeaTunnelRow value is 'SeaTunnelRow{tableId=default.default.default, kind=+I, fields=[121169, 20409913, null, 1, 2021-06-05T05:47:13.379, 2021-06-05T05:47:23.379, 7, 29.777297, 107.437978, 0.0, 337,342, 2021-06-05T05:46:40.505368, 2021-06-05T05:46:40.505368, null, null, null, null, null, null, null, null, null]}'.]
		at org.apache.seatunnel.common.exception.CommonError.writeSeaTunnelRowFailed(CommonError.java:86)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:136)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:46)
		at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:247)
		... 16 more
	Caused by: org.apache.seatunnel.connectors.seatunnel.file.exception.FileConnectorException: ErrorCode:[COMMON-07], ErrorDescription:[Unsupported data type] - Orc file not support this type [NULL]
		at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.buildFieldWithRowType(OrcWriteStrategy.java:188)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.buildSchemaWithRowType(OrcWriteStrategy.java:196)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.getOrCreateWriter(OrcWriteStrategy.java:116)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.write(OrcWriteStrategy.java:75)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:134)
		... 18 more
	
		at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
		... 2 more
	 
	2024-04-12 11:24:16,509 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - 
	===============================================================================
	
	
	
	Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
		at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
		at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
		at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
	Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: java.lang.RuntimeException: org.apache.seatunnel.common.exception.SeaTunnelRuntimeException: ErrorCode:[COMMON-23], ErrorDescription:[FileConnector write SeaTunnelRow failed, the SeaTunnelRow value is 'SeaTunnelRow{tableId=default.default.default, kind=+I, fields=[121169, 20409913, null, 1, 2021-06-05T05:47:13.379, 2021-06-05T05:47:23.379, 7, 29.777297, 107.437978, 0.0, 337,342, 2021-06-05T05:46:40.505368, 2021-06-05T05:46:40.505368, null, null, null, null, null, null, null, null, null]}'.]
		at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:257)
		at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:66)
		at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
		at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
		at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:75)
		at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:50)
		at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:51)
		at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
		at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
		at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
		at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:648)
		at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:949)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
		at java.lang.Thread.run(Thread.java:748)
	Caused by: org.apache.seatunnel.common.exception.SeaTunnelRuntimeException: ErrorCode:[COMMON-23], ErrorDescription:[FileConnector write SeaTunnelRow failed, the SeaTunnelRow value is 'SeaTunnelRow{tableId=default.default.default, kind=+I, fields=[121169, 20409913, null, 1, 2021-06-05T05:47:13.379, 2021-06-05T05:47:23.379, 7, 29.777297, 107.437978, 0.0, 337,342, 2021-06-05T05:46:40.505368, 2021-06-05T05:46:40.505368, null, null, null, null, null, null, null, null, null]}'.]
		at org.apache.seatunnel.common.exception.CommonError.writeSeaTunnelRowFailed(CommonError.java:86)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:136)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:46)
		at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:247)
		... 16 more
	Caused by: org.apache.seatunnel.connectors.seatunnel.file.exception.FileConnectorException: ErrorCode:[COMMON-07], ErrorDescription:[Unsupported data type] - Orc file not support this type [NULL]
		at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.buildFieldWithRowType(OrcWriteStrategy.java:188)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.buildSchemaWithRowType(OrcWriteStrategy.java:196)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.getOrCreateWriter(OrcWriteStrategy.java:116)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.write(OrcWriteStrategy.java:75)
		at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:134)
		... 18 more
	
		at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
		... 2 more
	2024-04-12 11:24:16,510 INFO  [s.c.s.s.c.ClientExecuteCommand] [ForkJoinPool.commonPool-worker-2] - run shutdown hook because get close signal
[INFO] 2024-04-12 11:24:16.913 +0800 - FINALIZE_SESSION
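
The innermost cause in the trace is OrcWriteStrategy.buildFieldWithRowType rejecting a column of type NULL, which suggests that the bare null literals from the SQL transform reach the sink without a concrete type. One possible workaround, not verified against 2.3.4, is to give each literal an explicit type with CAST in the Sql transform. The sketch below regenerates the Sql transform block from the config with typed nulls. The helper name build_sql_transform and every type in NULL_COLUMNS are assumptions made for illustration; the real Hive column types are not given in this report and must be substituted.

```python
import json

# Columns copied from the source query in the job config above.
SOURCE_COLUMNS = [
    "event_id", "event_type", "event_radius", "event_source", "start_time",
    "end_time", "priority", "latitude", "longitude", "elevation", "node_ids",
    "create_time", "update_time",
]

# ASSUMPTION: placeholder types only. The report does not state the Hive
# column types, so replace each one with the type from the target table.
NULL_COLUMNS = {
    "region_code": "STRING",
    "pub_mode": "INT",
    "end_latitude": "DOUBLE",
    "end_longitude": "DOUBLE",
    "end_elevation": "DOUBLE",
    "custom_event_name": "STRING",
    "comb_mode": "INT",
    "ref_event_id": "STRING",
    "lane_id": "STRING",
}


def build_sql_transform(source_table, result_table):
    """Return a Sql transform block whose null columns carry explicit types.

    Hypothetical helper: it only builds the config dictionary and does not
    call any SeaTunnel API.
    """
    typed_nulls = [
        f"CAST(NULL AS {col_type}) AS {name}"
        for name, col_type in NULL_COLUMNS.items()
    ]
    query = (
        "select " + ",".join(SOURCE_COLUMNS + typed_nulls)
        + " from " + source_table
    )
    return {
        "plugin_name": "Sql",
        "source_table_name": source_table,
        "result_table_name": result_table,
        "query": query,
    }


if __name__ == "__main__":
    block = build_sql_transform(
        "table_source_FieldMapper", "table_source_FieldMapper_Sql"
    )
    # Paste the printed block over the existing Sql entry under "transform".
    print(json.dumps(block, indent=4))
```

Whether CAST(NULL AS ...) actually yields a typed column in this SeaTunnel version is an assumption; if the same COMMON-07 error still appears, the null columns are probably still typed as NULL when they reach the ORC writer.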

Zeta or Flink or Spark Version

No response

Java or Scala Version

No response

Screenshots

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

gaotong521, Apr 12 '24 03:04

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot], May 13 '24 00:05

This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.

github-actions[bot], May 21 '24 00:05