XingweiChen
XingweiChen
我在spark集群上提交了word2vec配置如下(spark-submit): spark.ps.instances=10 spark.ps.cores=2 spark.ps.memory=35g spark.driver.memory=35g spark.driver.memory=35g 集群核数为189,内存总量为2T+ 模型参数为batch_size=4096,negative=10,window=5,**embedding=20** 我的数据集大小为400+G,节点个数为1.4+亿。    增加了ps内存是根据文档介绍来设的,再加一些还是报这个错 请问,这个问题有啥建议吗
I am trying the code in https://github.com/Angel-ML/angel/blob/branch-2.2.0/docs/programmers_guide/spark_on_angel_programing_guide_en.md . I copy paste those code for LR example. I run it with command: spark-submit --master yarn-cluster --conf spark.ps.jars=$SONA_ANGEL_JARS --conf spark.ps.instances=1 --conf spark.ps.cores=1...
spark-submit --master yarn-cluster --conf spark.ps.jars=hdfs:///user/brook/sona-0.1.0-bin/lib/fastutil-7.1.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/htrace-core-2.05.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/sizeof-0.3.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/kryo-shaded-4.0.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/minlog-1.3.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/memory-0.8.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/commons-pool-1.6.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netty-all-4.1.17.Final.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/hll-1.6.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/jniloader-1.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/native_system-java-1.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/arpack_combined_all-0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/core-1.1.2.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_ref-linux-armhf-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_ref-linux-i686-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_ref-linux-x86_64-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_system-linux-armhf-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_system-linux-i686-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_system-linux-x86_64-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/jettison-1.4.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/json4s-native_2.11-3.2.11.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-format-0.1.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-mlcore-0.1.2.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-ps-core-3.0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-ps-mllib-3.0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-ps-psf-3.0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-math-0.1.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-ps-graph-3.0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/core-0.1.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angelml-0.1.0.jar,hdfs:///user/brook/angel-2.1.0-bin/lib/scala-library-2.11.8.jar \ --conf spark.ps.instances=2 --conf spark.ps.cores=3 --conf spark.ps.memory=5g \ --jars hdfs:///user/brook/sona-0.1.0-bin/lib/fastutil-7.1.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/htrace-core-2.05.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/sizeof-0.3.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/kryo-shaded-4.0.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/minlog-1.3.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/memory-0.8.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/commons-pool-1.6.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netty-all-4.1.17.Final.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/hll-1.6.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/jniloader-1.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/native_system-java-1.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/arpack_combined_all-0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/core-1.1.2.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_ref-linux-armhf-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_ref-linux-i686-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_ref-linux-x86_64-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_system-linux-armhf-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_system-linux-i686-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/netlib-native_system-linux-x86_64-1.1-natives.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/jettison-1.4.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/json4s-native_2.11-3.2.11.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-format-0.1.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-mlcore-0.1.2.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-ps-core-3.0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-ps-mllib-3.0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-ps-psf-3.0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-math-0.1.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angel-ps-graph-3.0.1.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/core-0.1.0.jar,hdfs:///user/brook/sona-0.1.0-bin/lib/angelml-0.1.0.jar,hdfs:///user/brook/angel-2.1.0-bin/lib/scala-library-2.11.8.jar \ --files ./deepfm.json --driver-memory 2g --num-executors 2 --executor-cores 3 --executor-memory 5g \ --class com.tencent.angel.sona.examples.JsonRunnerExamples \...