PolygonOps.jl icon indicating copy to clipboard operation
PolygonOps.jl copied to clipboard

API considerations

Open KlausC opened this issue 5 years ago • 1 comments

Hi, I am just reviewing this package and have some remarks about the API of inpolygon.

The untyped input argument p obiously expects a vector-like object, which contains the x and y coordinate of the point to be processed. I think, it would be a better idea to allow for multiple input points to be processed in one go.

The untyped input argument poly expects a vector of vectors of x-y- coordinates. Here I see a potential performance bottleneck, if that is compared to a nx2 matrix. There is also a restriction of generality, because describing the polygon only by a list of vertices does not allow unconnected components, as it is used. The sample data found in inpoly-bench-data are are not easily or not at all translated to this restricted format. Again I would propose toe use a mx2 matrix for the vertices and add an extra integer mx2 matrix to define the edges more liberately.

For the output data stat a vector of integer values in {-1,0,1} is returned. I would expect stat == 0 for on boundary, and abs(stat) == 1 for not on boundary, more intuitive than you have.

KlausC avatar Apr 01 '20 20:04 KlausC

Klaus,

Thank you for your review.

The untyped input argument p obiously expects a vector-like object, which contains the x and y coordinate of the point to be processed. I think, it would be a better idea to allow for multiple input points to be processed in one go.

I strongly agree. I want to benchmark the different approaches and try to provide some guidance. One additional consideration is that sampling multiple points at once will need to allocate whereas a serial approach won't require allocation (somewhat important for interactive uses).

The untyped input argument poly expects a vector of vectors of x-y- coordinates. Here I see a potential performance bottleneck, if that is compared to a nx2 matrix. There is also a restriction of generality, because describing the polygon only by a list of vertices does not allow unconnected components, as it is used. The sample data found in inpoly-bench-data are are not easily or not at all translated to this restricted format. Again I would propose toe use a mx2 matrix for the vertices and add an extra integer mx2 matrix to define the edges more liberately.

Adding support for matrix formats should be easy to do and shouldn't impact the existing API. We can also do an inpolygon(pt, vts, edges) as well without impact. It is always okay to special case the dispatch, especially so if the generic implementation would give a broken result.

For the output data stat a vector of integer values in {-1,0,1} is returned. I would expect stat == 0 for on boundary, and abs(stat) == 1 for not on boundary, more intuitive than you have.

To make things generic we might let the user specify a convention using keyword arguments so that the "on" case could be a different value. e.g. default inpolygon(p, poly, in=1, on=-1, out=0) so you could do something like inpolygon(p, poly, in=true, on=true, out=false) or inpolygon(p, poly, in=1,on=0, out=1). This shouldn't impact type stability either. We might also consider adding an onpolygon function as well.

My general idea for this package is to curate algorithms and APIs that can be integrated into larger packages without being super-opinionated on the polygon format. I really appreciate your feedback.

sjkelly avatar Apr 01 '20 21:04 sjkelly