XPath2.Net icon indicating copy to clipboard operation
XPath2.Net copied to clipboard

XPathNodeIterator.Count returns one less that the actual count w/ XPath2SelectNodes

Open Todo18 opened this issue 3 years ago • 15 comments

We've swapped MS's built-in XPath 1.0 processor with the XPath2.Net library and are conducting a couple of (unit) tests, and have noticed something weird about the Count property (of the XPathNodeIterator). Logically, and according to MSDN (and the reference implementation by MS), Count returns the index/position of the last node in the set. Positions start counting at 1 according to the XPath specs, so this interpretation of Count makes sense when you have for example 3 nodes with indices 1 to 3.

However, when swapping libs, Count suddenly returns 2 for a set with 3 nodes, 0 for a set with 1 node, and -1 for the empty set. Could it be that indices start at 0 in the implementation, or are we missing something obvious?

Regardless, thanks for this amazing library!

Todo18 avatar Oct 07 '22 11:10 Todo18

@Todo18 Can you provide a unit-test for this?

StefH avatar Oct 16 '22 08:10 StefH

Sure. Here's a sample test:

using System.Diagnostics;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Xml.XPath;
using Wmhelp.XPath2;

namespace Test.Functional
{
    [TestClass]
    public class Compliancy
    {
        [TestMethod]
        public void XPath2SelectNodes_Count_Returns_Number_Of_Items_In_Expression()
        {
            var xIn = Resources.Provider.LoadXmlDocument("count_xml.xml");
            var navigator = xIn.CreateNavigator();

            var brandSet = navigator.XPath2SelectNodes("/report/brand");
            var brandCount = (int)navigator.XPath2Evaluate("count(/report/brand)");

            var highVolumeBrandSet = navigator.XPath2SelectNodes("/report/brand[units > 20000]");
            var highVolumeBrandCount = (int)navigator.XPath2Evaluate("count(/report/brand[units > 20000])");

            Debug.Assert(brandCount == 5);  // These are just here to assert the correct (expected) value
            Assert.AreEqual(brandCount, brandSet.Count);

            Debug.Assert(highVolumeBrandCount == 2);  // These are just here to assert the correct (expected) value
            Assert.AreEqual(highVolumeBrandCount, highVolumeBrandSet.Count);
        }
    }
}

We tested this with the following input file (count_xml.xml):

<?xml version="1.0" encoding="utf-8"?>
<!-- chocolate.xml -->
<report month="8" year="2006">
    <title>Chocolate bar sales</title>
    <brand>
        <name>Lindt</name>
        <units>27408</units>
    </brand>
    <brand>
        <name>Callebaut</name>
        <units>8203</units>
    </brand>
    <brand>
        <name>Valrhona</name>
        <units>22101</units>
    </brand>
    <brand>
        <name>Perugina</name>
        <units>14336</units>
    </brand>
    <brand>
        <name>Ghirardelli</name>
        <units>19268</units>
    </brand>
</report>

When we run the test, if returns 1 less (4 resp. 1) than the expected number of items in the node set of each expression (5 resp. 2), and one less than the count-function (which returns the correct results). I hope I'm being clear. If not, let me know. :)

Todo18 avatar Oct 18 '22 11:10 Todo18

@Todo18 I copied your code and created 2 unit tests:

  • https://github.com/StefH/XPath2.Net/blob/stef-issue-59/tests/XPath2.Tests/XPathNavigatorTests.cs#L366
  • https://github.com/StefH/XPath2.Net/blob/stef-issue-59/tests/XPath2.Tests/XPathNavigatorTests.cs#L391

And these run fine?

Can you double check what's the difference?

StefH avatar Nov 28 '22 19:11 StefH

@Todo18 Can you please take a look at the unit tests and check the difference with your code?

StefH avatar Dec 05 '22 07:12 StefH

Apologies, I'm short on time these days. I will inspect our integration and let you know my findings.

Todo18 avatar Dec 08 '22 11:12 Todo18

@Todo18 Did you have time to check?

StefH avatar Dec 21 '22 16:12 StefH

I'm checking it today. :)

Todo18 avatar Dec 22 '22 09:12 Todo18

I've taken the exact same code that you provided above, together with the v1.1.3 Nuget from the built-in Nuget manager in VS, and it fails the test. I have no idea what's going on. I'll look into it a bit more later today.

Todo18 avatar Dec 22 '22 11:12 Todo18

So I've had some more time to check. Before diving further into (proprietary) code, could you please try and rephrase your unit tests to conform to ours, and check the (new) results, because there seems to be a discrepancy between brandSet.Count (our code) and brandSet.Should().HaveCount(5) (your code, which translates to: brandSet.Cast<object>().Count(), which iterates the collection, and thus bypasses the Count property of the XPathNoteIterator altogether).

Todo18 avatar Jan 18 '23 15:01 Todo18

Please note that my unit tests are the same as your unit test. Please look at

  • https://github.com/StefH/XPath2.Net/blob/stef-issue-59/tests/XPath2.Tests/XPathNavigatorTests.cs#L366
  • https://github.com/StefH/XPath2.Net/blob/stef-issue-59/tests/XPath2.Tests/XPathNavigatorTests.cs#L391

StefH avatar Jan 18 '23 18:01 StefH

To my knowledge, they are not the same: Count (in our code) is a property of XPathNoteIterator (see here), while HaveCount calls the Count method of Enumerable (see here). The former returns the index of the last node of the set, while the latter iterates over the set, and never calls the property of XPathNoteIterator (which is the point of the unit test).

To illustrate my point, I've added a second assertion to our unit tests, which succeeds, while the original assertion still fails.

Assert.AreEqual(brandCount, brandSet.Cast<object>().Count()); // Succeeds
Assert.AreEqual(brandCount, brandSet.Count); // Fails

Todo18 avatar Jan 19 '23 09:01 Todo18

According to the documentation: https://learn.microsoft.com/en-us/dotnet/api/system.xml.xpath.xpathnodeiterator.count?view=net-7.0

Count => Gets the index of the last node in the selected set of nodes. So it's not the number of nodes, it's the index, which is 4 in this case because we have 5 items.

https://github.com/StefH/XPath2.Net/pull/60

StefH avatar Jan 20 '23 11:01 StefH

True, but positions start counting at 1 according to the XPath specs, so this interpretation of Count makes sense when you have for example 3 nodes with indices 1 to 3. (See here, for example: https://learn.microsoft.com/en-us/dotnet/api/system.xml.xpath.xpathnodeiterator.currentposition?view=net-7.0, or: https://www.w3.org/TR/2010/REC-xpath20-20101214/#eval_context)

So the indexing needs some tinkering, I guess.

Fiddle to illustrate: https://dotnetfiddle.net/QuErKT

Todo18 avatar Jan 20 '23 12:01 Todo18

But CurrentPosition is not the same as Count.

StefH avatar Jan 22 '23 12:01 StefH

I printed the first item of the set, not the last. It was simply to illustrate the numbering of indices behind the scenes. But you're right, I've adjusted the fiddle to illustrate my point better. ;)

https://dotnetfiddle.net/YdcE3V

Todo18 avatar Jan 22 '23 12:01 Todo18