january icon indicating copy to clipboard operation
january copied to clipboard

Gotcha: hasNext() modifies state

Open IanMayo opened this issue 8 years ago • 3 comments

Ooh, this just caught me out.

In Java the iterator.hasNext() method is always a pure function: it doesn't modify state. So, until iterator.next() is called, it will always return the same value.

However, in January iterator.hasNext() is used to advance the iterator, since the actual value is returned from a public int value.

Maybe the above strategy has been introduced as a performance measure. Does this outweigh the counter-intuitive implementation of iterator.hasNext()?

I propose that January iterators follow the Java standard of hasNext() for loop control and next() to both retrieve the current value and progress the iterator.

IanMayo avatar Apr 05 '17 10:04 IanMayo

Yes, IndexIterator was unwisely called an iterator but does not behave like Iterator<?>. As you guessed correctly, it is for performance to avoid boxing and extra calls (for access to it.index or it.getPos()).

Add Java 8 functional support would hopefully give a way for future users to avoid having to learn these quirks. See #118.

PeterC-DLS avatar Apr 05 '17 11:04 PeterC-DLS

Traditional coding patterns

Even if IndexIterator wasn't named as an iterator, the method name hasNext() would have thrown me.

I accept that the extensive use of these iterators in DAWN may unfortunately make it too resource-expensive to switch to the traditional model of Java iterator. This could be overcome by renaming hasNext() to something more explicit (hasNextAfterAdvance()).

But, my personal favourite would be to reflect the traditional iterator pattern. I'd also be prepared to invest effort to support the change.

Performance

I'm not a performance expert, but I'll throw in my tuppence-worth...

In moving through a collection we have to:

  1. check for more items
  2. advance to the next item
  3. retrieve the current value

Items 1. and 2. include processing.

The January iterator combines operations 1 and 2. The java iterator combines operations 2 and 3. No more or less processing seems to occur in either implementation.

IanMayo avatar Apr 05 '17 12:04 IanMayo

Renaming the classes would avoid future confusion so there would be no expectations on how it works.

As you note, these 'iterators' do not offer the contents of the datasets (only the index or position within a dataset). Mainly reduction tasks would benefit from a traditional iterator but this pattern is not so useful for a lot of dataset operations.

There is no reason you could not write a proper iterator that wraps the datasets...

PeterC-DLS avatar Apr 07 '17 08:04 PeterC-DLS