Occasional error "operation not permitted" for Code Blocks

Open sam-meng opened this issue 1 year ago • 19 comments

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.12.1

Cloud or Self Hosted

Cloud

Steps to reproduce

This error happens only occasionally so I will describe the situation with details.

In a Dify workflow that uses Code blocks, like the one shown below, I occasionally encounter an "error: operation not permitted" message.

image

The Code block itself is quite straightforward and doesn't rely on any third-party modules—just built-in json and re libraries, and it performs some simple processing, as shown here:

image

Here's a snapshot of what happens when the error occurs. It only happens about 10% of the time.

image

The input shown in the trace log appears valid, and I’ve tested the block with this particular input manually again—the code should execute without any issues. Also, the second input parameter is a constant that never changes. The first input parameter (in this particular case) has less than 100 characters, so it doesn't seem to be a size-related issue.

✔️ Expected Behavior

The Code block does not give "operation not permitted" error message.

❌ Actual Behavior

The Code block occasionally gives "operation not permitted" error message, when the input is valid and no 3rd party packages are used.

sam-meng avatar Nov 30 '24 05:11 sam-meng

@sam-meng Hello! I'm here to help you with bug fixes, answer questions, and assist you in becoming a contributor. If you need anything, feel free to ask.

I found a similar open issue that might be relevant to your problem. The error "Node {Code node name} run failed: error: operation not permitted" in Dify version 0.6.8 (Cloud) is likely due to the sandboxed execution environment that restricts certain functionalities for security reasons. Here are some suggested steps to diagnose and resolve the issue:

  1. Review Code for Restricted Operations: Ensure your code does not attempt to perform any restricted operations like direct file system access, network calls, or operating system-level commands.

  2. Check Input and Output Variable Handling: Verify that input and output variables are correctly defined and referenced in your code node configuration.

  3. Consider Timing and Dependency Issues: Since the issue occurs intermittently, there might be a dependency or initialization issue. Ensure the input block has completed its execution and the data is available before the code block starts.

  4. Use Debugging and Logging Tools: Utilize Dify's debugging and logging tools to gather more information about the error. Fetch workflow logs, run details, and node tracing lists to diagnose the issue.

  5. Implement Workflow Event Callbacks: Implement the WorkflowEventTriggerCallback class to handle various workflow events such as workflow run started, succeeded, or failed, and node execution started, succeeded, or failed. This can help in publishing detailed events to a queue manager for further analysis.

  6. View Logs and Traces in the UI: Use the Dify UI to view detailed logs and traces. By clicking "View Log—Details," you can see a comprehensive overview of the run, including input/output, metadata, and more. By clicking "View Log—Trace," you can review the input/output, token consumption, runtime duration, etc., of each node throughout the complete execution process of the workflow.

For more details, you can refer to the issue here [1].

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Nov 30 '24 05:11 dosubot[bot]

@sam-meng This error comes from the sandbox. You can refer to the FAQ here: https://github.com/langgenius/dify-sandbox/blob/f40de1f6bc5f87d0e847cbf52076280bf61c05d5/FAQ.md?plain=1#L3

yihong0618 avatar Dec 01 '24 01:12 yihong0618

@yihong0618 , thank you for your response! But I believe the issue requires further investigation on Dify support team's end. While I understand the reference to the FAQ, it primarily addresses self-hosted deployments and configurations (e.g., seccomp and python_lib_path adjustments). As a cloud user, I do not have control over these aspects.

To recap,

  • My code only uses json.loads and re.findall, which are standard Python operations that shouldn't trigger restricted system calls in any typical sandbox environment.
  • The error occurs only ~10% of the time. I retried the same input in the same cloud environment, and it executed without issue, suggesting the problem is not with my code or input but with transient conditions in your cloud sandbox environment (e.g., resource contention or sandbox initialization).
  • Since I am using the cloud version, the sandbox configuration and system call restrictions are managed entirely by Dify. I cannot debug or modify the underlying sandbox environment as suggested in the FAQ.

This behavior suggests that the issue lies with the cloud infrastructure itself, not my implementation. I rely on the cloud version for its stability and support for my products, and I’d appreciate your assistance in resolving this issue. Please let me know if additional details are needed from my side to help with your investigation. Thanks!

sam-meng avatar Dec 01 '24 03:12 sam-meng

Copy. It will take some effort to dig into this too.

yihong0618 avatar Dec 01 '24 03:12 yihong0618

Can you provide some test data and test code?

yihong0618 avatar Dec 02 '24 02:12 yihong0618

Sure! Please import the attached DSL. It’s pretty simple—just three code blocks:

  • The first two return a constant string.
  • The last one uses json.loads and re.findall (no 3rd-party Python modules involved).

I’ve attached the workflow diagram for quick reference:

Workflow Diagram

Once the backend API is enabled, you can write a script to call this workflow periodically. I encounter an "operation not permitted" error after about 20 calls. Oddly enough, it’s not even at the processing block but at the one that simply returns a constant string.

I’ve attached a screenshot of the error for you to check out:

Error Screenshot

Also, here’s the DSL file: Test.yml.zip
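
A minimal sketch of such a script is given here for reference. It assumes the workflow is published with API access and reachable through the blocking `workflows/run` endpoint. The URL, the `user` value, and the helper names `run_workflow_once` and `calls_until_first_failure` are assumptions for illustration, not part of the attached DSL. Only the `data.status == "failed"` check is taken from the error output reported in this thread.

```python
import json
import urllib.request
from typing import Callable, Dict

# Assumption: endpoint of the Dify workflow API; adjust it to your deployment.
API_URL = "https://api.dify.ai/v1/workflows/run"


def run_workflow_once(api_key: str, url: str = API_URL, timeout: float = 30.0) -> Dict:
    """Call the workflow once in blocking mode and return the parsed JSON body."""
    body = json.dumps(
        {"inputs": {}, "response_mode": "blocking", "user": "stress-test"}
    ).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def calls_until_first_failure(call: Callable[[], Dict], max_calls: int = 1000) -> int:
    """Return the 1-based index of the first failed run, or 0 if no run failed."""
    for attempt in range(1, max_calls + 1):
        result = call()
        if result.get("data", {}).get("status") == "failed":
            return attempt
    return 0
```

Usage would look like `calls_until_first_failure(lambda: run_workflow_once("YOUR_API_KEY"))`. The network call is passed in as a callable so that the counting loop can be checked without hitting the API.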

sam-meng avatar Dec 02 '24 11:12 sam-meng

thanks

yihong0618 avatar Dec 02 '24 11:12 yihong0618

I tried about 50 times, and all runs passed.

yihong0618 avatar Dec 02 '24 11:12 yihong0618

Thanks again! What I wanted to clarify is that this behavior is intermittent. For me, I hit the error very quickly.

What’s more important is that I used Dify Cloud, and the attached DSL is an exact export from there—so there should be minimal to no external factors affecting this issue on my side. If this is due to a transient issue with Dify Cloud, you might need to keep trying until the error shows up.

I believe it’s fair to say that I’m reporting this issue here because I did hit this error. To help further, I’ve attached another screenshot showing that I did encounter this error.

Thanks again for taking the time to look into this! Let me know if there’s anything else I can provide.

image

sam-meng avatar Dec 02 '24 11:12 sam-meng

@yihong0618 I just ran another test, and this time I hit the error after 291 calls. The screenshot shows that the failing block is, again, a code block that simply returns a constant string. The detailed error output is also attached.

Hope this helps. Thanks!

image
{
    "task_id": "65c8048a-d189-45e8-abff-03d2bed2127c",
    "workflow_run_id": "8cb24cac-7306-444e-8217-a26f3c782856",
    "data": {
        "id": "8cb24cac-7306-444e-8217-a26f3c782856",
        "workflow_id": "94472c56-2a16-4eb3-ba8a-79365b98e217",
        "status": "failed",
        "outputs": {},
        "error": "error: operation not permitted\n",
        "elapsed_time": 0.08821293897926807,
        "total_tokens": 0,
        "total_steps": 2,
        "created_at": 1733182453,
        "finished_at": 1733182453
    }
}
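
For anyone scripting against the API, a payload shaped like the one above can be checked for this specific failure with a small helper. This is a sketch only: the function name `is_sandbox_permission_failure` is hypothetical, and it assumes the response always has the `data.status` and `data.error` fields seen in this single sample.

```python
from typing import Any, Dict

# Substring taken from the "error" field in the response above.
SANDBOX_ERROR = "operation not permitted"


def is_sandbox_permission_failure(payload: Dict[str, Any]) -> bool:
    """Return True if the run failed with the sandbox permission error."""
    data = payload.get("data") or {}
    error = data.get("error") or ""
    return data.get("status") == "failed" and SANDBOX_ERROR in error
```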

sam-meng avatar Dec 02 '24 23:12 sam-meng

thanks

yihong0618 avatar Dec 03 '24 00:12 yihong0618

Hello @sam-meng @yihong0618, the same error occurs for me every time, and I haven't found any solution.

Any suggestions?

Elghandour-eng avatar Dec 03 '24 11:12 Elghandour-eng

@Elghandour-eng My guess is that this issue impacts all cloud users working with code blocks. If your primary use case involves calling the Dify workflow through the backend API, you could implement a retry mechanism on errors.
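
One possible shape for such a retry wrapper, as a sketch: it assumes the failure shows up as `data.status == "failed"` with "operation not permitted" in `data.error`, as in the error output posted earlier in this thread. The name `run_with_retry` and the backoff values are illustrative choices, not anything Dify provides.

```python
import time
from typing import Any, Callable, Dict


def run_with_retry(
    call: Callable[[], Dict[str, Any]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Re-run `call` while it fails with the sandbox permission error.

    Any other outcome (success or a different error) is returned immediately.
    The delay doubles after each failed attempt: base_delay, 2*base_delay, ...
    """
    last: Dict[str, Any] = {}
    for attempt in range(1, max_attempts + 1):
        last = call()
        data = last.get("data") or {}
        transient = (
            data.get("status") == "failed"
            and "operation not permitted" in (data.get("error") or "")
        )
        if not transient:
            return last
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    # All attempts hit the transient error; return the last response as-is.
    return last
```

Retrying this way is only safe if re-running the workflow has no unwanted side effects, since every retry executes the whole workflow again.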

sam-meng avatar Dec 03 '24 12:12 sam-meng

I confirmed the same issue.

I created a workflow that loops 100 times with code that simply returns the input variables, and ran it. Most of the time it succeeds, but sometimes it fails and ends with an error. I was able to easily confirm this by trying it a few times.

WF_Codeblock_Iteration.yml.txt

fujita-h avatar Dec 04 '24 09:12 fujita-h

@fujita-h Thanks for helping!

sam-meng avatar Dec 04 '24 09:12 sam-meng

Interesting. I ran this workflow about 100 times and it never failed.

My environment: 16 GB RAM, started with docker-compose, using Colima as the Docker backend.

yihong0618 avatar Dec 05 '24 01:12 yihong0618

@yihong0618 Thanks again! Just a quick note—it’s recommended to use the exact same setup when reproducing issues. As I mentioned, this problem happens on the cloud version. I ran two more tests, and the error showed up after 61 and 26 calls, so it’s pretty easy to observe.

Also, like I mentioned earlier, my guess is that this is likely caused by transient conditions in the cloud sandbox environment (e.g., resource contention or sandbox initialization). I also built Dify locally on my laptop and imported the test DSL I shared earlier. After 5000 calls, no errors occurred.

I hope this clears up the reproducibility question and helps guide debugging in the right direction. Thanks! :)

sam-meng avatar Dec 05 '24 12:12 sam-meng

I wonder if there are updates on this. Thanks!

sam-meng avatar Dec 16 '24 08:12 sam-meng

Just tested. This issue still exists in v0.14.0.

sam-meng avatar Dec 18 '24 01:12 sam-meng

Hi, @sam-meng. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • Intermittent "operation not permitted" error in a Dify workflow using Code blocks.
  • Error occurs about 10% of the time, likely due to the cloud sandbox environment.
  • Other users, including @yihong0618, @Elghandour-eng, and @fujita-h, have reported similar issues.
  • You have provided test data and a workflow diagram to help reproduce the issue.

Next Steps:

  • Please confirm if this issue is still relevant to the latest version of the Dify repository by commenting here.
  • If there is no further activity, this issue will be automatically closed in 15 days.

Thank you for your understanding and contribution!

dosubot[bot] avatar Jan 22 '25 16:01 dosubot[bot]

Hi, I am facing the error "error: operation not permitted". Currently, I am using the cloud service. After reading this thread, is the only way to avoid this error to use a self-hosted environment? Are there any solutions for cloud users?

ghost avatar Feb 27 '25 02:02 ghost

To the best of my knowledge, no, and it appears this issue is not being actively worked on.

sam-meng avatar Feb 27 '25 03:02 sam-meng

We still have not found the root cause, since it only happens in the cloud version.

yihong0618 avatar Feb 27 '25 03:02 yihong0618

I deployed Dify (version 1.3.1) locally on two nodes, one ARM and one x86, and ran stability tests on both. The error occurred multiple times on the ARM node but never appeared on the x86 node. Both nodes used the same configuration, with no additional dependencies introduced.

zhaobanzhuan avatar Jul 25 '25 06:07 zhaobanzhuan