Exceptions during chunk commit are silently ignored #1189
Comments
Michael Minella commented
|
Erwin Vervaet commented
|
Albert Strasser commented Hi Erwin, You might be correct that we are looking at two different issues here. My assumption was that only checking whether an exception is fatal (the last catch block of your example) is an incorrect implementation of skipping (which should default to false for any exception, not only for fatal ones). But you could be right that at this point we are already beyond the concept of skipping, which would mean that we are looking at a major bug in how the status of a chunk is handled. I can confirm that this issue persists in the 3.0.x branch: we are using 3.0.4 and experience exactly the same behaviour. |
Erwin Vervaet commented I can now also confirm that the problem is reproducible with Spring Batch 3.0.4. The logging is from two 50-item batches (documents_cuba_RLBTST17.zip and documents_cuba_RLBTST16.zip) that run concurrently. The system first picks up the documents_cuba_RLBTST17.zip batch and about 10 seconds later documents_cuba_RLBTST16.zip is also picked up. Both batches complete successfully. documents_cuba_RLBTST17.zip incorrectly skipped 2 items. The STEP_EXECUTION table contains the following (no skips):
documents_cuba_RLBTST16.zip incorrectly skipped 12 items. The STEP_EXECUTION table contains the following (no skips):
|
Erwin Vervaet commented As a temporary work-around, we're now forcing a transaction flush (using Obviously this is not a real fix since a transaction commit might still fail for other reasons. |
Erwin Vervaet commented Michael Minella, |
Andrei Amariei commented Also affected by this issue. |
Erwin Vervaet commented Thanks for that Andrei Amariei! |
Andrei Amariei commented One concern I have with flushing is that it can add side effects to an unsuccessful write phase. For example:
About the attached test, maybe it's wrong to expect the step to fail (even if all the items are skipped), and it would be better to instead have a field |
Erwin Vervaet commented Andrei Amariei, you're right, the flush is just a hack. In our application it effectively fixes the problem: we've not seen any new occurrences in our production system. But I fully agree that it's not a real solution that applies generally. |
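To illustrate why the flush work-around discussed above helps in some cases, here is a minimal, self-contained sketch. It is a toy model and not Spring Batch code: `FlushWorkaroundSketch`, `Chunk`, and `run` are hypothetical names, and it assumes the failure is one that a flush would surface and that it occurs only once. With the flush inside the writer, the failure happens before the chunk is marked complete, so the chunk is pushed back and retried. Without it, the failure happens at commit time and the chunk is dropped.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only; every name here is hypothetical, none is a Spring Batch API.
class FlushWorkaroundSketch {

    static class Chunk {
        final String name;
        boolean complete;
        Chunk(String name) { this.name = name; }
    }

    /**
     * @param failingChunk  chunk whose pending changes fail the first time they are flushed
     * @param flushInWriter true: the writer flushes itself; false: the flush happens at commit
     * @return names of the chunks whose "transaction" committed
     */
    static List<String> run(List<String> chunkNames, String failingChunk, boolean flushInWriter) {
        Deque<Chunk> queue = new ArrayDeque<>();
        for (String n : chunkNames) {
            queue.add(new Chunk(n));
        }
        List<String> committed = new ArrayList<>();
        boolean failurePending = true; // assumption: the failure is transient and occurs once

        while (!queue.isEmpty()) {
            Chunk chunk = queue.poll();
            try {
                // Write phase: with the work-around, the flush (and its failure) happens here.
                if (flushInWriter && failurePending && chunk.name.equals(failingChunk)) {
                    failurePending = false;
                    throw new RuntimeException("flush failed in writer");
                }
                chunk.complete = true;
                // Commit phase: without the work-around, the same failure only shows up here.
                if (failurePending && chunk.name.equals(failingChunk)) {
                    failurePending = false;
                    throw new RuntimeException("flush failed at commit");
                }
                committed.add(chunk.name);
            } catch (RuntimeException e) {
                if (!chunk.complete) {
                    queue.addFirst(chunk); // incomplete chunk: pushed back and retried
                }
                // non-fatal exception: ignored, the loop continues
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("c1", "c2", "c3");
        System.out.println(run(chunks, "c2", false)); // prints [c1, c3]
        System.out.println(run(chunks, "c2", true));  // prints [c1, c2, c3]
    }
}
```

As noted in the thread, this only covers failures that a flush can surface. A commit that fails for any other reason still takes the first path.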
Erwin Vervaet commented Michael Minella, still no input on this? |
Erwin Vervaet commented
|
Failing test with 5.0.3 (to add in
@Test
public void faultTolerantStepShouldFailWhenCommitFails() throws Exception {
    StepBuilder stepBuilder = new StepBuilder("step", jobRepository);
    FaultTolerantStepBuilder<String, String> faultTolerantStepBuilder = new FaultTolerantStepBuilder<>(stepBuilder);
    faultTolerantStepBuilder.transactionManager(transactionManager);
    faultTolerantStepBuilder.reader(getReader(new String[] { "a", "b", "c" }));
    faultTolerantStepBuilder.writer(data -> TransactionSynchronizationManager
        .registerSynchronization(new TransactionSynchronizationAdapter() {
            @Override
            public void beforeCommit(boolean readOnly) {
                throw new RuntimeException("Simulate commit failure");
            }
        }));
    step = faultTolerantStepBuilder.build();
    JobParameters jobParameters = new JobParameters();
    JobExecution jobExecution = jobRepository.createJobExecution(job.getName(), jobParameters);
    StepExecution stepExecution = new StepExecution(step.getName(), jobExecution);
    jobRepository.add(stepExecution);
    step.execute(stepExecution);
    Assert.assertEquals(BatchStatus.FAILED, stepExecution.getStatus());
}
Will be reviewed in #3950. |
Is there a workaround for this issue? |
Hello, I'm facing the same issue. Is there any workaround currently? |
Erwin Vervaet opened BATCH-2415 and commented
A TaskletStep uses various helpers to loop over all chunks to be processed and process each of them in a
separate transaction while managing things such as skippable and retryable exceptions.
In pseudo code this whole system can be summarized as follows:
This code exhibits the following type of behavior:
commits. Since the chunk is complete, it is not pushed back on the queue and the repeat loop ends normally.
will cause the chunk not to be marked complete and the transaction to roll back. Since the chunk
is not complete, it is pushed back onto the queue of chunks to process. Furthermore, since the
exception is not fatal it will be ignored by the RepeatTemplate and a new iteration will start, actually doing the retry.
So far so good. However, we're experiencing the following case:
Now an exception is thrown during transaction commit (point [B]). Since the chunk is already marked as
complete, the chunk is not pushed back on the queue and no retry is attempted. The commit failure
correctly rolled back the chunk transaction. However, since the exception is not deemed to be fatal
it is silently ignored by the RepeatTemplate! End result: the batch completes normally but certain chunks were silently skipped!
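The failure mode described above can be reproduced with a small, self-contained model. This is a simplified sketch and not the actual Spring Batch implementation: `ChunkLoopSketch`, `Chunk`, `isFatal`, and `run` are hypothetical names, and treating `IllegalStateException` as the "fatal" type is an arbitrary assumption made for the example. It shows only the ordering problem: the chunk is marked complete before the commit, so a commit failure is neither retried nor reported.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified model of the chunk loop; NOT Spring Batch code, all names are hypothetical.
class ChunkLoopSketch {

    static class Chunk {
        final String name;
        boolean complete;
        Chunk(String name) { this.name = name; }
    }

    /** Assumption for this sketch: only IllegalStateException counts as fatal. */
    static boolean isFatal(RuntimeException e) {
        return e instanceof IllegalStateException;
    }

    /**
     * Processes each chunk in its own "transaction".
     *
     * @param failsOnCommit name of the chunk whose commit throws
     * @return names of the chunks whose transaction actually committed
     */
    static List<String> run(List<String> chunkNames, String failsOnCommit) {
        Deque<Chunk> queue = new ArrayDeque<>();
        chunkNames.forEach(n -> queue.add(new Chunk(n)));
        List<String> committed = new ArrayList<>();

        while (!queue.isEmpty()) {
            Chunk chunk = queue.poll();
            try {
                // Process and write the chunk (always succeeds in this model),
                // then mark it complete.
                chunk.complete = true;
                // Commit point ([B] in the description above).
                if (chunk.name.equals(failsOnCommit)) {
                    throw new RuntimeException("commit failed for " + chunk.name);
                }
                committed.add(chunk.name);
            } catch (RuntimeException e) {
                if (!chunk.complete) {
                    queue.addFirst(chunk); // only incomplete chunks are retried
                }
                if (isFatal(e)) {
                    throw e; // fatal exceptions stop the loop
                }
                // Non-fatal: ignored. The chunk was already complete, so it is
                // neither retried nor reported.
            }
        }
        return committed; // the loop ends normally even though a commit failed
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("c1", "c2", "c3"), "c2")); // prints [c1, c3]
    }
}
```

In this model the run returns without error although the work for `c2` was rolled back, which matches the "completes normally but chunks were skipped" outcome reported here.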
The exception we get is (see also BATCH-2403):
I fail to see how that can be desired behavior. Or am I just missing something here?
Affects: 2.2.7, 3.0.4
Attachments:
4 votes, 9 watchers