novisfff
And this is the code that builds the writeConfig for the writeClient; enableEmbeddedTimelineService is false: `public static HoodieWriteConfig getHoodieClientConfig(Configuration conf, boolean enableEmbeddedTimelineService, boolean loadFsViewStorageConfig) { ... .withEmbeddedTimelineServerEnabled(enableEmbeddedTimelineService) .withEmbeddedTimelineServerReuseEnabled(true) // make write...
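> Don't use lazy_tokenize false for multimodal models.
>
> You can refer to this to speed up training: https://github.com/modelscope/ms-swift/blob/main/examples/train/padding_free/sft.sh

But after switching to lazy_tokenize true, the map step over the whole dataset is also abnormally slow. Also, can the map step only run single-threaded? When I tried changing it to run with multiple workers, the map step hung and then failed with: `RuntimeError: One of the subprocesses has abruptly died during map operation.To debug the error, disable multiprocessing.`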
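The error text itself suggests disabling multiprocessing to see the underlying failure. Here is a minimal sketch of that debugging step, assuming the map step ends up as a Hugging Face `datasets` `Dataset.map` call (which accepts `num_proc`). The helper name `map_with_single_process_fallback` is hypothetical, and this may not match how ms-swift wires its preprocessing internally.

```python
def map_with_single_process_fallback(dataset, fn, num_proc=4, **map_kwargs):
    """Try a multi-process map; if it raises RuntimeError, rerun in a single process.

    `dataset` is assumed to expose a `map(fn, num_proc=..., **kwargs)` method,
    as `datasets.Dataset` does. The single-process rerun is slower, but an
    exception raised inside `fn` then surfaces with its real traceback instead
    of the generic "subprocesses has abruptly died" message.
    """
    try:
        return dataset.map(fn, num_proc=num_proc, **map_kwargs)
    except RuntimeError as err:
        print(f"multi-process map failed ({err}); retrying with num_proc=None")
        # num_proc=None runs the map in the current process.
        return dataset.map(fn, num_proc=None, **map_kwargs)


# Hypothetical usage, where `ds` is a datasets.Dataset and `preprocess` is your
# own per-example function:
# ds = map_with_single_process_fallback(ds, preprocess, num_proc=8)
```

If the single-process run succeeds, the worker was probably killed from outside (for example by running out of memory) rather than by a bug in the mapping function, so lowering `num_proc` may be worth trying.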