magic-hya
I rebuilt the environment, but startup still hits the same problem: ``` Events: Type Reason Age From Message ---- ------ ---- ---- ------- Warning FailedScheduling 88s kuscia-scheduler 0/2 nodes are available: waiting for task resource. preemption: 0/2 nodes are available:...
```
[root@kuscia-master-546445d874-rbztj kuscia]# kubectl get kj -n cross-domain
NAME                             STARTTIME   COMPLETIONTIME   LASTRECONCILETIME   PHASE
secretflow-task-20240624162727   33m         5m31s            5m31s               Failed
secretflow-task-20240624165905   84s         1s               1s                  Failed
```
The task has now failed, and I don't know how to look up the logs.
> If the task has already been dispatched, check the logs under /home/kuscia/var/stdout/pods. For details, see [Job run failed](https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.8.0b0/reference/troubleshoot/runjobfailed).

kubectl get pod secretflow-task-20240624165905-single-psi-0 -o yaml -n alice ``` "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code...
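For anyone else unsure how to look at those logs, here is a minimal Python sketch that prints the last lines of each log file under the directory mentioned in the reply above. It assumes the log files end in `.log` and that the task name appears somewhere in the file path. Both are assumptions about the directory layout rather than something confirmed in this thread, and `tail_pod_logs` is a hypothetical helper, not part of Kuscia.

```python
from collections import deque
from pathlib import Path


def tail_pod_logs(root, keyword="", lines=20):
    """Return {log_path: last `lines` lines} for *.log files under `root`.

    Only files whose path contains `keyword` (for example a task name)
    are included. The *.log suffix is an assumption about the layout.
    """
    result = {}
    root_path = Path(root)
    if not root_path.is_dir():
        # Nothing to read, e.g. when run outside the node's container.
        return result
    for path in sorted(root_path.rglob("*.log")):
        if keyword and keyword not in str(path):
            continue
        with path.open(errors="replace") as fh:
            # A deque with maxlen keeps only the final `lines` lines.
            result[str(path)] = [ln.rstrip("\n") for ln in deque(fh, maxlen=lines)]
    return result


if __name__ == "__main__":
    # Directory taken from the reply above; the keyword is the failed task name.
    logs = tail_pod_logs("/home/kuscia/var/stdout/pods",
                         "secretflow-task-20240624165905")
    for log_path, tail in logs.items():
        print(f"==> {log_path}")
        print("\n".join(tail))
```

Run it inside the node container where the task pod was scheduled. If it prints nothing, the directory or the `.log` suffix assumption probably does not match your deployment.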
```
[root@kuscia-master-546445d874-rbztj kuscia]# kubectl get svc -n alice
NAME                                                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE
secretflow-task-20240624165905-single-psi-0-global   ClusterIP   None                       22151/TCP   90m
secretflow-task-20240624165905-single-psi-0-fed      ClusterIP   None                       22150/TCP   90m
secretflow-task-20240624165905-single-psi-0-spu      ClusterIP   None                       22149/TCP   90m
```
...
Latest debugging error; the image used is secretflow-lite-anolis8:1.6.0b0 ``` container[secretflow] terminated state reason "Error", message: "rgs) File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1078, in main rv = self.invoke(ctx) File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke return ctx.invoke(self.callback, **ctx.params) File "/usr/local/lib/python3.10/site-packages/click/core.py", line...
> > [root@kuscia-master-546445d874-rbztj kuscia]#
>
> It looks like you ran this inside the container. You need to go outside the container and run the command below. Also, please tell me your sf version.
>
> ```
> kubectl get svc -n alice
> ```

``` [root@k8s-master73 kuscia]# kubectl get svc -n lite-alice NAME TYPE CLUSTER-IP EXTERNAL-IP...
I ran it with the officially provided script, following the kuscia 8.0 docs. The script I executed: `scripts/user/create_example_job.sh`
[image_processing.py.txt](https://github.com/user-attachments/files/17458049/image_processing.py.txt) [test_image_processing.py.txt](https://github.com/user-attachments/files/17458050/test_image_processing.py.txt)
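> At which stage was this log printed?

Hi, that problem is solved: I added output_uris=[""] in the NodeEvalParam function. After that a new problem appeared; the error log is below. It looks like a setting somewhere is wrong, but I spent a while investigating and could not find where. ``` tests/component/oran/test_image_processing.py:48: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _...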
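> The output parameter can't be assigned directly like that. Please follow the example on the official site and try constructing a DistData again.

The current requirement is that, in the image-processing stage, each party processes its image data locally and then outputs the paths of the processed images. I looked at the other components in the source code, and for now I can only feel my way through it step by step.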