Limit engine correctness check
Resolves issue #54. I assume the problem is caused by floating-point errors introduced during serialization, so I added a function that checks the subset relation within a floating-point error tolerance.
@ttt-77 Do you mean to add a check like this?
Yes. But can you refactor the code? Don't use three functions.
@ttt-77 Is this better?
I expanded the two outer layers into explicit loops to make it more readable, but kept the innermost all() call, since it doesn't hurt readability much and keeps the code simpler.
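For reference, here is a minimal sketch of the shape described above: two explicit outer loops with a single inner all(). It is not the code from this PR. The function name `is_subset_with_tolerance`, the row-of-floats data layout, and the default tolerances are all assumptions made for illustration.

```python
import math


def is_subset_with_tolerance(subset, superset, rel_tol=1e-9, abs_tol=1e-9):
    """Return True if every row in `subset` approximately matches a row in `superset`.

    Hypothetical helper: rows are assumed to be sequences of floats, and the
    tolerance defaults are illustrative, not values taken from the project.
    """
    for row in subset:
        matched = False
        for candidate in superset:
            # Innermost comparison kept as a single all(): rows match when
            # they have the same length and every pair of values is close.
            if len(row) == len(candidate) and all(
                math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
                for a, b in zip(row, candidate)
            ):
                matched = True
                break
        if not matched:
            return False
    return True
```

With this sketch, a row such as `(0.1 + 0.2, 1.0)` is treated as contained in a set that holds `(0.3, 1.0)`, which an exact comparison would reject.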
@ttt-77 Could you please review this? I also added the test mentioned in issue #55 to this PR. Since no real index is provided for the jackson dataset, I compared the inference counts obtained with the random proxy score and with the accurate proxy score generated from the ground-truth result.
Can you split the new commit into another PR?