StaxEventItemReader.read.count skips a number [BATCH-2355]
Erwin Vervaet opened BATCH-2355 and commented
We're leveraging the "StaxEventItemReader.read.count" counter maintained by the StaxEventItemReader (ultimately by the AbstractItemCountingItemStreamItemReader) to learn about the index of the item being processed. We're only reading this value, not updating it. That is left to the Spring Batch code.
Our step looks like this:
<batch:step id="doc.doIntegration" next="doc.doAck">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk
reader="doc.indexReader" processor="doc.integrationProcessor" writer="doc.ackWriter"
commit-interval="1" retry-limit="3">
<batch:retryable-exception-classes>
<batch:include class="net...batches.support.TemporaryTechnicalException"/>
</batch:retryable-exception-classes>
</batch:chunk>
</batch:tasklet>
</batch:step>
The "doc.integrationProcessor" is an ItemProcessor that also logs the item index:
@Override
public final I process(I item) throws Exception {
...
activityDebug("Processing item " + getItemIndex() + " of batch [" + getBatchId() + "]");
...
}
The getItemIndex() method is implemented by simply looking up the "StaxEventItemReader.read.count" in the step execution context:
protected int getItemIndex() {
return stepExecution.getExecutionContext().getInt("StaxEventItemReader.read.count");
}
The strange thing is that from time to time (very rarely) we see the following:
Processing item 214 of batch [9208fa8a5e9645a1949d6db8d1e7041d]
Processing item 215 of batch [9208fa8a5e9645a1949d6db8d1e7041d]
Processing item 216 of batch [9208fa8a5e9645a1949d6db8d1e7041d]
Processing item 216 of batch [9208fa8a5e9645a1949d6db8d1e7041d]
Processing item 218 of batch [9208fa8a5e9645a1949d6db8d1e7041d]
Processing item 219 of batch [9208fa8a5e9645a1949d6db8d1e7041d]
Notice that item 217 is missing and 216 is mentioned twice.
We see no errors in our logs but item 216 is not committed to our database so it seems some transaction rolled back somewhere. Directly after that the counter is out of whack and item 217 is processed as if it were 216. Then things recover and the counter jumps directly to 218. After that, everything is back to normal.
I'm trying to understand how this can happen. Could this be a bug in Spring Batch? Or is this normal behavior in certain circumstances and should we be using another way to get the item index (probably ItemCountAware as mentioned in BATCH-1906)?
Affects: 2.2.7
Erwin Vervaet commented
One bit of extra info that popped up. The bit of logging shown above was produced by processing a small batch of 500 elements. For the "doc.doIntegration" step we have the following in the STEP_EXECUTION table:
- STATUS = COMPLETED
- EXIT_CODE = COMPLETED
- COMMIT_COUNT = 500
- ROLLBACK_COUNT = 1
- READ_COUNT = 500
- FILTER_COUNT = 0
- WRITE_COUNT = 499
- READ_SKIP_COUNT = 0
- WRITE_SKIP_COUNT = 0
- PROCESS_SKIP_COUNT = 0
In the STEP_EXECUTION_CONTEXT table we have:
- SHORT_CONTEXT = ``` {"map":{"entry":{"string":"StaxEventItemReader.read.count","int":501}}}
So this seems to suggest processing of one item got retried. The retry succeeded and the step finished normally.
Jimmy Praet commented
As you already noted yourself, you should be using ItemCountAware to obtain the item number.
StaxEventItemReader.read.count will only be updated in the ExecutionContext at the end of the chunk when the update() method of the reader is being called. So it's always one chunk behind. And if you would ever change the commit-interval of this step to be > 1, it definitely won't work.
The STEP_EXECUTION statistics do look a bit off though.. read count 500 and write count 499, without any filtered or skipped records?
Erwin Vervaet commented
Jimmy Praet, You're absolutely right that this does not feel like an ideal solution. I'll look into switching to ItemCountAware. One complexity I'll have to tackle here is that the items are JAXB generated beans (we're using Jaxb2Marshaller) so I'll have to see if I can add some ItemCountAware sauce to the generated code.
Still, I'm puzzled how this can happen. The fact that we have a commit-interval of 1 is actually a deliberate decision. So as far as I unserstand you would always have a ItemReader.read call followed by an ItemStream.update call, all of this in transaction processing the single item. So how can things get messed up like we see?
Erwin Vervaet commented
Using the JAXB2 commons I was able to have the generated classes extend a superclass that is ItemCountAware (using inheritance:extends in bindings.xjb). In other words, we're now leveraging ItemCountAware to get the item index.
Hopefully this will prevent the problem from happening again.
I'm going to keep this ticket open since there indeed seems to be something odd happening with the read and write counts, as mentioned above. Might be worth investigating that a bit deeper.
Erwin Vervaet commented
Jimmy Praet, One small tip: it might be useful to mention explicitly in the ItemCountAware JavaDoc that the count starts at 1 and consequently cannot be used directly when indexing Java arrays or collections.
Thank you for raising this issue. If there is an update to this issue please let us know by adding a comment else it will soon be closed.