MINIFICPP-1515 - Add integration tests testing different flowfile sizes in a simple flow
Issues with big flowfiles were reported on Stack Overflow.
This seems to be related to two issues: first, a narrowing exception is thrown when trying to determine the length of the file to be written into the content repository; second, we use _stat on Windows even though we should be using _stat64 to query files larger than 2GB.
To prevent these issues from reappearing, we should add some integration test coverage.
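To illustrate why the 2.1 GiB case matters, here is a minimal Python sketch of what happens when such a size is forced into a signed 32-bit integer. The helper names `fits_in_int32` and `narrow_to_int32` are hypothetical and only model the arithmetic; they are not part of MiNiFi or its test framework.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit size field can hold
INT32_MIN = -2**31


def fits_in_int32(size_in_bytes):
    """Return True if the size can be stored in a signed 32-bit integer."""
    return INT32_MIN <= size_in_bytes <= INT32_MAX


def narrow_to_int32(size_in_bytes):
    """Model an unchecked narrowing conversion (ctypes does not check overflow)."""
    return ctypes.c_int32(size_in_bytes).value


# 2.1 GiB expressed in bytes, using integer arithmetic to avoid float rounding
TWO_POINT_ONE_GIB = 21 * 1024**3 // 10

if __name__ == "__main__":
    print(fits_in_int32(TWO_POINT_ONE_GIB))    # the size exceeds INT32_MAX
    print(narrow_to_int32(TWO_POINT_ONE_GIB))  # wraps around to a negative value
```

A checked narrowing conversion would raise instead of wrapping, which matches the exception described above; a size field that is only 32 bits wide cannot represent the file length at all.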
This PR will stay in draft status until the mentioned issues are fixed. Also, integration test logs are currently bloated for scenario-outline-based testing; this is to be fixed as part of this task.
The check still shows that we do not handle the file size 2.1 GiB properly. Merge only once it is fixed.
Right -- #1028 needs to be merged first, and probably its successor (which will handle OutputStream::write), too.
I'm close to being done with the write refactor. I'll submit a PR next week against the read refactor branch (i.e. #1028) to keep the size manageable, and both can be merged together after they're reviewed and approved.
This still fails after merging #1028 and #1083. The main problem is that SingleFileContentHashValidator reports a hash mismatch on the 2.1 GiB test file:
INFO:root:Output file created: /tmp/.nifi-test-output.fe59da51-9bb0-4c9f-9966-876ec2ea543b/.4c3b8487-5767-4b29-830e-7e83d4e3e5ec.053203f8-ce84-11eb-82db-0242c0a81002
INFO:root:Output folder: /tmp/.nifi-test-output.fe59da51-9bb0-4c9f-9966-876ec2ea543b/
INFO:root:dir /tmp/.nifi-test-output.fe59da51-9bb0-4c9f-9966-876ec2ea543b/ -- name .4c3b8487-5767-4b29-830e-7e83d4e3e5ec.053203f8-ce84-11eb-82db-0242c0a81002
INFO:root:expected hash: 3531295d2dc5e038e4f4e684d020167b -- actual: 2073b6f0bc607551c97ed79b4e0de87e
I suspect this is a problem with the test and not with GetFile/PutFile, but it needs to be investigated.
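One way to rule out the validator itself is to hash the input and output files independently, reading in fixed-size chunks so a 2.1 GiB file never has to fit in memory. The sketch below assumes the digests in the log are MD5 (they are 32 hex characters long); `md5_of_file` is a hypothetical helper, not the actual SingleFileContentHashValidator code.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `md5_of_file(input_path)` with `md5_of_file(output_path)` after the flow has finished writing would show whether the mismatch comes from the transferred content or from when and how the test computes the hash.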
Another, minor issue is that the 20-second timeout needs to be increased, as running the test with the 2.1 GiB test file takes around 40 seconds on my computer.
Also, OutputEventHandler prints "Output file modified" more than 2 million times, which is annoying.
@hunyadi-dev do you want to fix this? I can do it if you don't feel like it.
After rebasing onto main, the tests seem to run successfully, thanks to #1131. I've rebased this onto the latest main, so I think we can merge it after a successful CI run. Edit: didn't work, will look into this.
This test passes for me locally after rebasing on top of main, so I think it's ready for merging.
To avoid increasing the running time of the CI job by too much, I suggest making the timeout configurable and lower, e.g. 1 second for the first 3 tests and 10 seconds for the last two.
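A minimal sketch of what a configurable timeout could look like, assuming the check polls for a condition until a deadline. `wait_for` and `SCENARIO_TIMEOUTS_SECONDS` are hypothetical names, not the existing test framework API; the values mirror the suggestion above.

```python
import time

# Hypothetical per-scenario timeouts: 1 second for the first three tests,
# 10 seconds for the last two.
SCENARIO_TIMEOUTS_SECONDS = [1, 1, 1, 10, 10]


def wait_for(predicate, timeout_seconds, poll_interval=0.05):
    """Poll predicate() until it returns True or the timeout expires.

    Returns True as soon as the predicate holds, False on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

Because `wait_for` returns as soon as the predicate holds, a generous timeout only costs CI time when a scenario actually fails.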
Rebased and fixed the runtime issues: the file observer tried to log the content of the file on every modification, which flooded the logger when large files were written and prevented the actual notifications from arriving in time. I think this could be merged if all tests pass.
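For illustration only, a handler that avoids flooding the logger could count modifications and log just the path the first time a file changes, never its content. `QuietOutputHandler` and its `on_modified(path)` signature are assumptions made for this sketch; this is not the actual OutputEventHandler change.

```python
import logging


class QuietOutputHandler:
    """Track file modifications without logging file content on each event."""

    def __init__(self, logger=None):
        self.logger = logger or logging.getLogger(__name__)
        self.modification_counts = {}

    def on_modified(self, path):
        # Count every event, but log only the first one per path
        count = self.modification_counts.get(path, 0) + 1
        self.modification_counts[path] = count
        if count == 1:
            self.logger.info("Output file modified: %s", path)
```

The per-path counts remain available if a test wants to assert that modifications happened, without paying for millions of log lines.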