bpmn-elements

Performance issues with bigger bpmn files

Open hvlwork opened this issue 1 year ago • 27 comments

We are currently experiencing performance issues in two places: transitions between serviceTasks (after the service's "next" function is called, it takes quite long until the next serviceTask becomes active) and ending a process (from the point where an end event is reached in the process until the 'end' event of the bpmnElements.Definition is triggered). I have found that the problem only occurs with bigger BPMN files. The BPMN file I tested with also includes some loops; I am not sure whether that is a factor. Do you have any idea what could cause this issue or how we could try to fix it? Any suggestion would be highly appreciated.

hvlwork avatar Aug 20 '24 13:08 hvlwork

A couple of things come to mind:

  • Are you using the latest version? There was an issue with sequential multi-instance tasks in previous versions
  • Use parallel gateways where feasible to join a split execution. Loop-back flows may cause unnecessary discard loops

paed01 avatar Aug 21 '24 04:08 paed01

Thank you for your quick response.

  • we were not using the latest version, but as it turns out, this issue also occurs with the latest version

  • for us parallel gateways do not make sense as we never execute more than one task at the same time

hvlwork avatar Aug 22 '24 07:08 hvlwork

Can you give some stats about the process, number of tasks, subprocesses, events, sequence flows, etc?

paed01 avatar Aug 23 '24 03:08 paed01

of course:

  • there are 42 sequence flows in the bpmn (most of them with a conditional expression)
  • 21 Tasks
  • 2 events (one start and one end event)
  • no subprocesses

hvlwork avatar Aug 23 '24 07:08 hvlwork

What is the execution time? Or what is slow in your implementation?

Your process seems rather reasonable, not even large at all. I have a client with a process of >30 tasks/services, with gateways, subprocesses, etc. It runs in less than 200ms.

paed01 avatar Aug 27 '24 04:08 paed01

We will run some further tests and afterwards we will get back to you.

hvlwork avatar Aug 29 '24 07:08 hvlwork

For example, ending a process in the current state of our BPMN takes around 2-3 seconds. We found that reducing the number of sequenceFlows with conditional expressions between the following two serviceTasks (after the one where the process is ended) to one each reduces the time needed to 0.3 to 0.7 seconds. So it seems like the conditional sequenceFlows are an important factor. (We often have several conditional sequence flows between the same two serviceTasks; not sure whether that matters.)

What has to be considered here is that these times were measured on a quite powerful laptop, but we are developing our application for quite weak hardware. There it can become really painfully slow.
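To compare timings like these across machines, it can help to measure the gap between two events directly. The following is a minimal sketch using only Node.js built-ins. The helper `timeBetween`, the stand-in emitter, and the event names `"activity.end"` and `"end"` are assumptions chosen for illustration, not confirmed bpmn-elements API.

```javascript
const { EventEmitter } = require("events");
const { performance } = require("perf_hooks");

// Report the elapsed milliseconds between the first `startEvent` and the
// next `endEvent` emitted afterwards. Event names are placeholders.
function timeBetween(emitter, startEvent, endEvent, onResult) {
  emitter.once(startEvent, () => {
    const startedAt = performance.now();
    emitter.once(endEvent, () => {
      onResult(performance.now() - startedAt);
    });
  });
}

// Stand-in for whatever object emits the two events being compared.
const source = new EventEmitter();
const samples = [];
timeBetween(source, "activity.end", "end", (ms) => samples.push(ms));

source.emit("activity.end");
source.emit("end");
console.log(`elapsed: ${samples[0].toFixed(3)} ms`);
```

Collecting several samples on the weaker target hardware, rather than on the development laptop, would give a more realistic picture.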

hvlwork avatar Sep 03 '24 14:09 hvlwork

[image: issue-42-discard]

Which of the three sequences is similar to your diagram?

paed01 avatar Sep 04 '24 03:09 paed01

[image] That is the BPMN file I am talking about.

hvlwork avatar Sep 04 '24 07:09 hvlwork

You will have a massive amount of discard loops with this design.

Each taken or discarded sequence flow will trigger the next task execution. Multiple outbound sequence flows are basically a parallel split. Hence, a discarded flow will discard all outbound flows of the subsequent tasks. Since you have multiple loopbacks, the discard sequence will continue until it reaches the first discarded sequence flow again - discard loop detected.
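The growth described here can be illustrated with a toy model. This is only a sketch of the idea, assuming that each path of a discard run keeps going until it meets a flow it has already discarded. It is not the library's actual implementation, and `countDiscards` plus the flow and task ids are hypothetical.

```javascript
// Toy model only: counts how often flows get discarded when every path of a
// discard run continues until it meets a flow it has already discarded.
function countDiscards(flows, firstFlowId) {
  // Index flows by their source task.
  const outbound = new Map();
  for (const flow of flows) {
    if (!outbound.has(flow.source)) outbound.set(flow.source, []);
    outbound.get(flow.source).push(flow);
  }

  let count = 0;
  function discard(flow, path) {
    count += 1;
    for (const next of outbound.get(flow.target) || []) {
      if (path.has(next.id)) continue; // discard loop detected on this path
      path.add(next.id);
      discard(next, path);
      path.delete(next.id);
    }
  }

  const first = flows.find((flow) => flow.id === firstFlowId);
  discard(first, new Set([first.id]));
  return count;
}

// One flow between each pair of tasks, plus a loopback.
const singleFlows = [
  { id: "f1", source: "A", target: "B" },
  { id: "f2", source: "B", target: "C" },
  { id: "f3", source: "C", target: "A" }, // loopback
];
// Two flows between each pair of tasks, same loopback.
const doubledFlows = [
  { id: "f1", source: "A", target: "B" },
  { id: "f2", source: "A", target: "B" },
  { id: "f3", source: "B", target: "C" },
  { id: "f4", source: "B", target: "C" },
  { id: "f5", source: "C", target: "A" }, // loopback
];
const singleCount = countDiscards(singleFlows, "f1");
const doubledCount = countDiscards(doubledFlows, "f1");
console.log({ singleCount, doubledCount });
```

In this toy model the single-flow loop records 3 discards and the doubled variant records 9, which hints at why several flows between the same two tasks combined with loopbacks get expensive quickly.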

paed01 avatar Sep 06 '24 05:09 paed01

If the end-event is really the end of execution, what if you make it a terminate end event? Then all other elements will be stopped.

paed01 avatar Sep 06 '24 05:09 paed01

So you would recommend that the beginning of the process looks like this? [image]

rakaposhi avatar Sep 09 '24 09:09 rakaposhi

I guess that would speed up execution. The terminate end event will stop all sequence flows and outstanding tasks (if any).

Did you notice any difference in execution time?

paed01 avatar Sep 09 '24 09:09 paed01

Yes. It's faster now. Thanks. But we still have to do some further testing to see whether the process is now fast enough.

rakaposhi avatar Sep 09 '24 10:09 rakaposhi

As it turns out, ending the process now takes half the time. It is still not as fast as we need it to be, but it is a noticeable improvement. Thanks again 🙂 👍

The thing is that if Flow 2 is taken and we still have more than one sequenceFlow leading from Task 2 to Task 3 and from Task 3 to Task 4, the transition is still as slow as before, with no measurable performance improvement. If we make sure that there is always just one flow going from one task to another, then it is really fast, but that would be a big limitation for our application. Can you think of any alternative?

hvlwork avatar Sep 10 '24 13:09 hvlwork

What is the reason behind having multiple conditional sequence flows between the tasks? Logging? Logic?

NB! A task that cannot take any outbound conditional flows will throw ActivityError.

paed01 avatar Sep 11 '24 06:09 paed01

With our application we want to enable the user to define a process with a given set of tasks, where every task has a given set of sequence flows. So sometimes a user wants several sequence flows to lead to the same task, and sometimes every sequence flow should lead to a different task (separate handling of cases). That should be up to the user.

hvlwork avatar Sep 11 '24 13:09 hvlwork

We have made an interesting finding. The order of a task's conditional sequence flows in the XML (the options for proceeding after a task) has a huge impact on performance. We are talking about 100 to 200 times faster if the option that should be taken is the first one in the XML. For example, if Flow 6 is taken between Task 2 and Task 3 and it is defined after Flow 5 in the XML, it is much slower than if Flow 6 were defined before Flow 5.

hvlwork avatar Sep 13 '24 09:09 hvlwork

Can you reproduce this issue?

hvlwork avatar Sep 18 '24 07:09 hvlwork

So, assuming that Flow 6 is used, it is really fast with this BPMN:

    <sequenceFlow id="Flow_0sv2vaj" name="Flow 6" sourceRef="Activity_0pec7dd" targetRef="Activity_1swl04q">
      <conditionExpression xsi:type="tFormalExpression">${environment.variables.flow6}</conditionExpression>
    </sequenceFlow>
    <sequenceFlow id="Flow_1fegb8x" name="Flow 5" sourceRef="Activity_0pec7dd" targetRef="Activity_1swl04q">
      <conditionExpression xsi:type="tFormalExpression">${environment.variables.flow5}</conditionExpression>
    </sequenceFlow>
    <serviceTask id="Activity_1swl04q" name="Task3" implementation="${environment.services.task3}">
      <incoming>Flow_1fegb8x</incoming>
      <incoming>Flow_0sv2vaj</incoming>
      <outgoing>Flow_0t5bjaq</outgoing>
      <outgoing>Flow_0ts20th</outgoing>
      <outgoing>Flow_1gc65aj</outgoing>
    </serviceTask>

But just by switching the order of the flows in the BPMN like this, following Flow 6 becomes roughly 100 times slower:

    <sequenceFlow id="Flow_1fegb8x" name="Flow 5" sourceRef="Activity_0pec7dd" targetRef="Activity_1swl04q">
      <conditionExpression xsi:type="tFormalExpression">${environment.variables.flow5}</conditionExpression>
    </sequenceFlow>
    <sequenceFlow id="Flow_0sv2vaj" name="Flow 6" sourceRef="Activity_0pec7dd" targetRef="Activity_1swl04q">
      <conditionExpression xsi:type="tFormalExpression">${environment.variables.flow6}</conditionExpression>
    </sequenceFlow>
    <serviceTask id="Activity_1swl04q" name="Task3" implementation="${environment.services.task3}">
      <incoming>Flow_1fegb8x</incoming>
      <incoming>Flow_0sv2vaj</incoming>
      <outgoing>Flow_0t5bjaq</outgoing>
      <outgoing>Flow_0ts20th</outgoing>
      <outgoing>Flow_1gc65aj</outgoing>
    </serviceTask>

hvlwork avatar Sep 19 '24 15:09 hvlwork

I stumbled over another performance issue. If a sequenceFlow is triggered immediately when a task becomes active, the transition is pretty slow. If the sequenceFlow is triggered after at least a short timeout, it is fast again.
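One way to experiment with this observation is to wrap a service so that its completion callback fires on a later event-loop turn. This is only a sketch: the `(context, next)` service signature, the wrapper `deferCompletion`, and the sample service `task3` are assumptions for illustration, and whether the delay helps in a given process is untested here.

```javascript
// Wrap a service function so its completion callback fires on a later
// event-loop turn instead of synchronously.
function deferCompletion(service, delayMs = 0) {
  return function deferredService(context, next) {
    service(context, (...args) => {
      setTimeout(() => next(...args), delayMs);
    });
  };
}

// Hypothetical service that completes immediately.
function task3(context, next) {
  next(null, { done: true });
}

const events = [];
const wrapped = deferCompletion(task3);
wrapped({}, (err, result) => events.push(result));
// Recorded before the deferred callback has had a chance to run.
events.push("returned");
```

Because the callback is scheduled with `setTimeout`, the caller regains control before the completion is reported.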

hvlwork avatar Sep 23 '24 13:09 hvlwork

I haven't encountered designs with multiple sequence flows to the same target. I have to think about how to handle that.

paed01 avatar Sep 24 '24 08:09 paed01

Can you test again with npm i bpmn-elements@rc (v16.2.0)?

It should be a little better, but no promises.

paed01 avatar Sep 28 '24 06:09 paed01

It does not look like it improved a lot.

hvlwork avatar Oct 03 '24 15:10 hvlwork

Sorry about that. The complexity of a flow will have a performance impact, as in programming in general. Loopback is a fantastic thing but comes with a price.

Instead of loopback, could you end execution and start a new execution immediately after?

Or attempt to join as many sequences as possible before doing the loopback:

[image: issue-42-example]

paed01 avatar Oct 04 '24 05:10 paed01

Ending the loop is not an option for us, as we want to be able to define the whole process in a single BPMN file. Also, restarting a process would mean that we have to share information between process runs, which we would like to avoid.

I am not sure I understand your second suggestion completely. We do not want to execute tasks in parallel: we always want to execute one task, follow the sequence flow with the matching condition, and then execute the next task. I think that with a parallel gateway the process would never finish, as this gateway waits for a token from all incoming flows.

hvlwork avatar Oct 11 '24 15:10 hvlwork

In the example above, which is a subset of your diagram, the parallel join gateway will wait for all inbound sequence flows to be taken or discarded before continuing. Hence, the parallel join gateway's outbound (loopback) sequence flow will be taken once. Or, if all inbound sequence flows are discarded, it will be discarded once. The effect will be that Task 1 is not bothered more than necessary. Just a thought.
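The join behaviour described here can be sketched as a toy model. It is not the library's code, and `createJoin` and the flow ids are hypothetical. The join collects one signal per inbound flow, taken or discarded, and fires its outbound flow exactly once.

```javascript
// Toy join: collects one signal ("take" or "discard") per inbound flow and
// reports a single outcome once every inbound flow has reported.
function createJoin(inboundIds, onComplete) {
  const pending = new Set(inboundIds);
  let anyTaken = false;
  return function signal(flowId, action) {
    if (!pending.delete(flowId)) return; // unknown or already reported
    if (action === "take") anyTaken = true;
    if (pending.size === 0) onComplete(anyTaken ? "take" : "discard");
  };
}

// One inbound flow discarded, one taken: outbound flow is taken once.
const mixedOutcomes = [];
const signalMixed = createJoin(["Flow_5", "Flow_6"], (o) => mixedOutcomes.push(o));
signalMixed("Flow_5", "discard");
signalMixed("Flow_6", "take");

// All inbound flows discarded: outbound flow is discarded once.
const discardedOutcomes = [];
const signalDiscarded = createJoin(["Flow_5", "Flow_6"], (o) => discardedOutcomes.push(o));
signalDiscarded("Flow_5", "discard");
signalDiscarded("Flow_6", "discard");

console.log({ mixedOutcomes, discardedOutcomes });
```

Since a discarded inbound flow also counts as a report, the join in this model does not wait forever when only one branch is actually taken.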

paed01 avatar Oct 11 '24 17:10 paed01