spock Mock Argument mismatch is hidden by Exceptions

  def "NPE"() {
    given:
    List<String> list = Mock()

    when:
    def len = list.get(1).length()

    then:
    len == 5
    1 * list.get(0) >> "hello"
  }

Fails with java.lang.NullPointerException: Cannot invoke method length() on null object. We loose any information, that this was caused by a wrong argument to the mock.

If there are Unmatched invocations, then they should be reported as well.

Too few invocations for:

1 * list.get(0) >> "hello"   (0 invocations)

Unmatched invocations (ordered by similarity):

1 * list.get(1)
One or more arguments(s) didn't match:
0: argument == expected
   |        |  |
   1        |  0
            false

Jan 10 '20 15:01 leonard84

As someone who has been writing Groovy tests for a few years I still get tripped up by this semi-regularly. While debugging the issue is relatively simple by adding an extra interaction, it is just one more speed bump for newer developers. I think anything we can do to improve the messaging for users will improve the overall experience.

NOTE: Quick fix for anyone looking to understand how to debug these issues (see: Strict Mocking).

0 * list.get(_) >> "hello"

I would like to recommend a few different options for feedback from to others. Full disclosure I am new to the Spock Framework and don't understand the internal so I apologize in advance if any of these are non-viable.

Option 1 - Defer Exceptions (Recommended)

Spock already has the ability to intercept exception from the code under test and expose them through the thrown() method. The proposal would be to always hold onto the exception and then run through the interaction blocks to verify any missed interactions. The initial exception and missed interactions exception would be combined to output meaningful information for the users.

Advantages

This solution would allow users to still see the root cause error so that if the bug is unrelated to mismatched mocks it can still be detected.
This solution would provide the user clear direction that a specific interaction was missing which may also be the cause.

Disadvantages

May add additional overhead of both handling the exception and checking the interactions.
May confuse users by pointing them in the wrong direction based on missed interactions.
Will increase the verbosity of failures making them harder to understand.

This would result in an output like the following.

Test Failed with:
java.lang.NullPointerException: Cannot invoke method length() on null object

Possible Cause:
Too few invocations for:

1 * list.get(0) >> "hello"   (0 invocations)

Unmatched invocations (ordered by similarity):

1 * list.get(1)
One or more arguments(s) didn't match:
0: argument == expected
   |        |  |
   1        |  0
            false

Option 2 - Strict Mocks

This option is inspired by similar concepts in EasyMock and as mentioned in Spock's own Strict Mocking. Instead of allowing unmatched interactions to return default values they would instead throw assertion exceptions and detail all possible unmatched interactions for that mock. To avoid breaking existing tests this change would likely have to be made opt-in instead of enabled by default. A few possible options to enable this behavior would be:

A new option in the Spock Configuration File to enable this feature globally by default.

{ 
  mocks {
    strict = true
  }
}

A new extension to enable the feature per Specification @MockBehavior(STRICT)
A new Mocking API for StrictMock() to opt-in on a case by case basis.

Advantages

The solution can be granularly applied only where necessary.
The solution is significantly more explicit about the location of the problem.

Disadvantages

Being Opt-in means that newer users might not be aware this should be enabled.
Would make it harder to identify which expected interaction was missed.

This would result in an output like the following.

Test Failed with:
Unexpected invocation of List.get(int) at {location information}:
def len = list.get(1).length()

Possible Cause:
Too few invocations for:

1 * list.get(0) >> "hello"   (0 invocations)

Option 3 - Default Mocks

This option is the least viable in my opinion but might be most applicable for the simplest use cases where people are getting tripped up. If a mock method only has a single interaction mocked on it then add an implicit any _ matcher with the same side-effects. This would possibly allow the code to flow through the rest of the test without throwing any NPEs and reach the interaction blocks where the user would be informed of the mismatch.

Advantages

The solution is specifically targeted at simple use cases likely to affect newer users.
Requires minimal extra handling and processing within the Spock Framework.

Disadvantages

Does not cover all scenarios.
May potentially lead to other unexpected failures diverting attention from real cause.
No guarantee that it would actually ensure tests reach to the interactions block.

This would result in an output like the following.

Too few invocations for:

1 * list.get(0) >> "hello"   (0 invocations)

Unmatched invocations (ordered by similarity):

1 * list.get(1)
One or more arguments(s) didn't match:
0: argument == expected
   |        |  |
   1        |  0
            false

Mar 09 '24 18:03 akiesler

Let me give my opinion bottom up.

Option 3: I really dislike that. The disadvantages you posted are too significant, and it feels extremely hacky.

Option 2: As detailed in the "Lenient vs. Strict Mocking Frameworks" block at https://spockframework.org/spock/docs/2.3/all_in_one.html#_default_behavior_of_mock_objects, it is not only a matter of backwards compatibility to not enable strict mocking by default. This would be against the Spock philosophy as it very easily leads to overspecification and brittle tests. As you can already do strict mocking explicitly, it might well be an option to add a global option and / or extension that can enable strict mocking by default for the given scope, but anyway I think this is independent from this issue.

Option 1: This indeed seems to be the proper solution and should relatively easily be possible as far as I remember the code. But it first needs to be defined what we really want here.

There are basically 4 cases.

interactions defined before when/then/expect / exception during stimulus
interactions defined before when/then/expect / verification error
interactions defined in then / exception during stimulus
interactions defined in then / verification error

Currently, for 1 and 3 you get only the exception displayed for 2 you get only the verification error displayed for 4 you get only the unmatched invocation displayed

This is caused by the simplified structure:

things from before when/then/expect
stimulus
verification of then interactions
then verifications
verification of interactions from before when/then/expect

Addtionally to the above, there is to consider not only excpetions during stimulus, but also verification errors. That 2. and 4. are different is most probably also just an implementation detail and not intentionally.

So I think the question that need clarification are at least:

If there are unsatisfied mock interactions and an exception during stimulus, what should be shown, unsatisfied interactions, exception, or both?
If there are unsatisfied mock interactions and a verification failure, what should be shown, unsatisfied interactions, verification failure message, or both?
Should it behave differently when the interaction is from a then block or not?
What should happen if there are unsatisfied interactions from a then block and also from interactions before when/then/expect?
Should the unsatisified interactions only be showed alongside the exception and / or verification error if there are also unmatched invocations?

If for example the answer to the first two questions is "only the unsatisfied interactions", and to the third "yes", the solution is probably to simply catch the exception like is down with thrown() usage and re-throw it after the final leaveScope for the verifications of interactions from before when/then/expect.

If for example the answer to the first two questions is "both", and to the third "yes", it gets a bit more complicated.

I guess if someone is motivated enough to do it, the most appropriate answers should maybe be

both
both
no
not sure
only if unmatched ones

But defining the intended behavior with all variations here might indeed be the biggest problem. The implementation is most probably not that hard to do, no matter how the answers will be.

Mar 11 '24 00:03 Vampire