rickyhuo comments

Results: 11 comments of rickyhuo

Support flush_interval

This isn't supported yet; it's still in progress. If flush_interval is a must-have option, you can try https://github.com/RickyHuo/teleport

TabSeparated format: Nested format support issue

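@LyStormrage We don't use nested structures in production, so TabSeparated doesn't support them yet. Thanks for the feedback; I'll add support as soon as possible.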

TabSeparated format: Nested format support issue

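@LyStormrage We use Teleport for some high-concurrency JSON data and for cases where a single record is very large. Where regex parsing is needed, Teleport's current performance is not necessarily better than Hangout's. At the moment Hangout is used more than Teleport.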

TabSeparated format: Nested format support issue

Note, though, that Teleport doesn't support the nested format either.

TabSeparated format: Nested format support issue

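@LyStormrage If no new data comes in and the process doesn't exit, those last 10,000 rows may never make it into the database. I've had this requirement in mind for a long time, but it never landed on the master branch. Because of Hangout's architecture, adding a timer involves multithreading and locks, which hurts write throughput; in my tests the drop was fairly noticeable. The branch https://github.com/RickyHuo/hangout-output-clickhouse/tree/rickyhuo.fea.timer uses Vector instead of ArrayList for message storage.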
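To make the trade-off concrete, here is a minimal sketch of a buffer that flushes either when it reaches a bulk size or when a timer fires. It is not the code on the linked branch: the class name `TimedBuffer`, its parameters, and the `sink` callback are all hypothetical, and it uses `synchronized` methods around an `ArrayList` rather than a `Vector`. The lock shared by `add` and `flush` is the kind of contention that can slow writes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch; names and structure are assumptions, not Hangout code.
final class TimedBuffer implements AutoCloseable {
    private final int bulkSize;
    private final Consumer<List<String>> sink; // stands in for the ClickHouse write
    private final List<String> rows = new ArrayList<>();
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    TimedBuffer(int bulkSize, long flushIntervalMs, Consumer<List<String>> sink) {
        this.bulkSize = bulkSize;
        this.sink = sink;
        // The timer thread calls flush() periodically, so rows that never
        // fill a whole batch are still written out.
        timer.scheduleAtFixedRate(this::flush, flushIntervalMs, flushIntervalMs,
                TimeUnit.MILLISECONDS);
    }

    // Called by the writer thread; takes the same lock as the timer thread.
    public synchronized void add(String row) {
        rows.add(row);
        if (rows.size() >= bulkSize) {
            flush();
        }
    }

    // Hands a copy of the buffered rows to the sink, then clears the buffer.
    public synchronized void flush() {
        if (rows.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(rows));
        rows.clear();
    }

    // Stops the timer and writes whatever is left.
    @Override
    public void close() {
        timer.shutdown();
        flush();
    }
}
```

In this sketch the sink runs while the lock is held, so a slow write blocks `add` for its whole duration. Swapping the buffer out under the lock and writing outside it would shorten that window at the cost of more code.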

TabSeparated format: Nested format support issue

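Yes, we have also imported data from local files into ClickHouse before, roughly like this: `cat file.txt | ./bin/hangout -f conf.yaml`. When Hangout hits end-of-file it ends the process and automatically writes the remaining data to ClickHouse.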
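The end-of-file behavior described above can be illustrated with a small sketch: read lines until the input ends, write full batches as they fill, and write the partial batch once EOF is reached. This is an illustration under assumptions, not Hangout's actual implementation; `EofFlush`, `drain`, and the `sink` callback are hypothetical names.

```java
import java.io.BufferedReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; not taken from the Hangout source.
final class EofFlush {
    // Reads lines until EOF and returns the number of batches handed to sink.
    static int drain(BufferedReader in, int bulkSize, Consumer<List<String>> sink) {
        List<String> batch = new ArrayList<>();
        int batches = 0;
        Iterator<String> lines = in.lines().iterator();
        while (lines.hasNext()) {
            batch.add(lines.next());
            if (batch.size() >= bulkSize) {
                sink.accept(new ArrayList<>(batch));
                batch.clear();
                batches++;
            }
        }
        // EOF reached: write the leftover rows that never filled a batch.
        if (!batch.isEmpty()) {
            sink.accept(new ArrayList<>(batch));
            batches++;
        }
        return batches;
    }
}
```

Wrapping `System.in` in a `BufferedReader` and passing it to `drain` would mirror the piped-file case: once `cat` finishes, the stream ends and the final partial batch is written before the process exits.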

TabSeparated format: Nested format support issue

> Is the Teleport project still around?

There are currently no plans to open-source Teleport. For similar functionality, see [gohangout](https://github.com/childe/gohangout)

running with json configuration file not work

AFAIK, the following config works: ``` { "env" : { "spark.app.name" : "SeaTunnel", "spark.executor.instances" : 2, "spark.executor.cores" : 1, "spark.executor.memory" : "1g" }, "source" : { "fake": { "result_table_name":...

[Feature]support latest version of Spark 3.2.0

https://spark.apache.org/docs/latest/sql-migration-guide.html#upgrading-from-spark-sql-24-to-30

[Feature]support latest version of Spark 3.2.0

@ashmeet-kandhari Thanks for your contribution. I'd like to know about compatibility with lower Spark versions and with Scala, because I know Spark 3.X is built with Scala 2.12. If we can...