FEDERATED_ERROR when deploying FATE on Spark
Using Spark as the compute and storage engine, the service_conf.yaml in the fate-python container is configured as follows (everything else is left at the defaults):
fateflow:
  host: 192.167.0.100
  http_port: 9380
  grpc_port: 9360
  http_app_key:
  http_secret_key:
  proxy: nginx
  protocol: http
fate_on_spark:
  spark:
    # default use SPARK_HOME environment variable
    home: /data/project/common/spark-3.3.0-bin-hadoop3
    cores_per_node: 20
    nodes: 2
  linkis_spark:
    cores_per_node: 20
    nodes: 2
    host: 10.0.50.61
    port: 8088
    token_code: MLSS
    python_path: /opt/app-root/bin/python
  hive:
    host: 127.0.0.1
    port: 10000
    auth_mechanism:
    username:
    password:
  linkis_hive:
    host: 127.0.0.1
    port: 9001
  hdfs:
    name_node: hdfs://10.0.50.61:9870/
    # default /
    path_prefix:
  rabbitmq:
    host: 10.0.50.61
    mng_port: 15672
    port: 5672
    user: fate
    password: fate
    # default conf/rabbitmq_route_table.yaml
    route_table:
  pulsar:
    host: 192.168.0.5
    port: 6650
    mng_port: 8080
    cluster: standalone
    # all parties should use a same tenant
    tenant: fl-tenant
    # message ttl in minutes
    topic_ttl: 5
    # default conf/pulsar_route_table.yaml
    route_table:
  nginx:
    host: 10.0.50.61
    http_port: 80
    grpc_port: 9310
The /data/projects/fate/proxy/nginx/conf/route_table.yaml in the nginx container is configured as follows:
default:
  proxy:
    - host: 10.0.50.61
      http_port: 80
9999:
  proxy:
    - host: 10.0.50.61
      http_port: 80
  fateflow:
    - host: 10.0.50.61
      http_port: 9380
10000:
  proxy:
    - host: 10.0.50.63
      http_port: 80
  fateflow:
    - host: 127.0.0.1
      http_port: 9380
Uploading data works, but the following error occurs when submitting a job: {'retcode': <RetCode.FEDERATED_ERROR: 104>, 'retmeg': 'Federated schedule error, Expecting value: line 1 column 1 (char 0)'} What could be causing this? Is there any other configuration that still needs to be changed? Thanks!
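The trailing "Expecting value: line 1 column 1 (char 0)" is a Python JSON decode error, which generally means fate_flow received an empty or non-JSON body (for example an nginx error page) from the proxy while forwarding the federated scheduling request. A minimal way to see what the proxy actually answers, as a sketch assuming the nginx container is reachable from the fate-python container at the host/ports configured above:

# What does the proxy return on the port currently configured in service_conf.yaml (80 here)?
# Any HTML error page or empty reply would produce exactly this JSON parse failure in fate_flow.
curl -sv http://10.0.50.61:80/
# Compare with the default proxy port mentioned in the replies below:
curl -sv http://10.0.50.61:9300/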
This is an issue with the Nginx network proxy. If you have not changed the nginx & openresty configuration, http_port should be 9300. See #3817 for reference.
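In other words, with an unmodified FATE nginx/openresty proxy, the proxy http_port (set to 80 both in the fate_on_spark.nginx section of service_conf.yaml and in the route_table.yaml proxy entries above) would stay at the default 9300. One hedged way to confirm which ports the proxy is actually bound to, run inside the nginx container (ss/netstat availability depends on the image):

# Ports the proxy process is listening on; with the stock config this should include
# 9300 (http) plus the configured grpc port (9310 in the config above).
ss -lntp | grep -i nginx
# Fallback if ss is not installed:
netstat -lntp | grep -i nginx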
I changed the nginx port to 80, so I also changed 9300 to 80. Is that not allowed?
I haven't tried changing it to port 80, and I'm not familiar with the Nginx internals here. The proxy also uses Lua scripts (under the nginx/lua directory) and brings up a virtual host on port 9300. I'd suggest checking all the routing configuration involved, e.g. nginx/conf/nginx.conf (default 9128) and nginx/conf/vhost/coordination_http_proxy.conf (default 9300).
After making the changes, run /data/projects/fate/proxy/nginx/sbin/nginx -t and then restart the service.
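Putting the last two replies together, a hedged sequence of checks inside the nginx container could look like this (paths follow the layout mentioned above; adjust them to your deployment):

# List every listen directive declared by the proxy config (nginx.conf and vhost/*.conf):
grep -rn "listen" /data/projects/fate/proxy/nginx/conf/
# Validate the configuration, then reload it without dropping connections:
/data/projects/fate/proxy/nginx/sbin/nginx -t
/data/projects/fate/proxy/nginx/sbin/nginx -s reload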