raytracing.github.io AABB hit and NaNs & Infinities

(Disclaimer: my implementation of this project is in Rust, and my experience with C++ is limited. I am linking to reference docs just to make sure I have verified my own understanding - not to imply that the authors of this book & maintainers of this repo wouldn't know! Feel free to point out any inaccuracies and mistakes - I'm here to learn. Thanks!)

Looking at dev-major branch, to be most up to date.

Given the function

https://github.com/RayTracing/raytracing.github.io/blob/62ce2e6a1f08b4c103808e513dc1c8fc6a188776/src/common/aabb.h#L61-L73

and an example ray with an origin of [0.0, 0.0, -1.0] and direction [0.0, 0.0, 1.0] - or anything similar for that matter, where any component of the direction vector is zero - you will end up getting a NaN from the division by r.direction()[a].

On a quick look at the reference, fmin(NaN, other) and fmax(NaN, other) calls will return other, unless both operands are NaN, in which case a NaN is returned. This will mean that t0 and t1 will get set to NaN. Additionally, ray_t.min and ray_t.max will get set to what they already are, and the comparison of ray_t.max <= ray_t.min will be falsy so the loop continues to the next iteration.

This could potentially lead to unintended results, with false positives on whether the AABB can be hit with the ray (todo: are false negatives possible?)

Sidenote: also pondering whether this could potentially be one explaining factor for the open question of why one implementation is 10x faster compared to the other one - or not, this is just a random guess.

Note that even the other implementation below it is susceptible:

https://github.com/RayTracing/raytracing.github.io/blob/62ce2e6a1f08b4c103808e513dc1c8fc6a188776/src/common/aabb.h#L78-L88

In this function invD will be NaN due to the division by zero. The later multiplications will make t0 and t1 set to NaN. The invD < 0 comparison will always be falsy with a NaN so the values won't get swapped. The fmin and fmax function calls will return other, the resulting ray_t.max <= ray_t.min call will always be falsy so the loop continues.

There's also the version that currently exists in the published book:

    for (int a = 0; a < 3; a++) {
        auto invD = 1.0f / r.direction()[a];
        auto t0 = (min()[a] - r.origin()[a]) * invD;
        auto t1 = (max()[a] - r.origin()[a]) * invD;
        if (invD < 0.0f)
            std::swap(t0, t1);
        t_min = t0 > t_min ? t0 : t_min;
        t_max = t1 < t_max ? t1 : t_max;
        if (t_max <= t_min)
            return false;
    }
    return true;

This should be identical to the one above.

Sep 13 '21 17:09 Walther

I'm leaving the above message as original, including the mistakes, but here's some errata: division by zero causes an infinity, not a NaN. Operations with infinities work in different ways than operations with NaNs.

I managed to find at least one reproducible example case where the two methods produce different outcomes: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=48a920a2604206baab6f555e1759fe44

Note that in the old method, 0 * inf = NaN, and tmin will be set to epsilon due to NaN comparison semantics, and inf <= epsilon will be false, and the loop continues. Conversely, in the new method, inf <= inf will be true, so the code will do an early return false for the box hit.

I think this means that the new method can cause false negative hits for the AABB hit method.

Sep 14 '21 09:09 Walther

@trevordblack — I'm inclined to punt this to post v4.0.0.

Jun 19 '23 21:06 hollasch

Just wanted to mention that I hit this exact problem (and as it turns out I'm also using rust.) My code is slightly different from the book but it's based on the "optimized" example in section 3.5 of book 2:

            let ray_direction_in_axis_inv = 1.0 / ray_direction_in_axis;
            let mut t_min = (axis.min - ray_origin_in_axis) * ray_direction_in_axis_inv;
            let mut t_max = (axis.max - ray_origin_in_axis) * ray_direction_in_axis_inv;

ray_direction_in_axis_inv is +Inf which in theory is fine, but if the ray origin is exactly on axis.min or axis.max there will be a NaN, causing the comparison to fail and the AABB to incorrectly report that it was not hit by the ray.

I addressed this by changing the code to offset the axis bounds by an epsilon. If NaN occurs the function returns no hit, which would be correct in this case.

            const EPSILON: f32 = 1e-4;
            let mut t_min = (axis.min - EPSILON - ray_origin_in_axis) * ray_direction_in_axis_inv;
            let mut t_max = (axis.max + EPSILON - ray_origin_in_axis) * ray_direction_in_axis_inv;

Aug 19 '23 09:08 aclysma

BTW, I found it very helpful for debug purposes to hack the BvhNode to always check if the children return a hit, and if it does assert that the AABB for the node also reports a hit. This made it very easy to find/reproduce this issue. Don't know if it's worth mentioning this in the book though.

Aug 19 '23 09:08 aclysma

Ok, in reviewing the aabb::hit() function, I've rewritten and simplified it to use our new interval class, which makes things a bit more understandable (just as using points and vectors is preferable to managing a bunch of coordinates individually). As a bonus, this speeds things up about 13%. Still, all of the issues above with NaNs and infinities still apply, but I don't think these are a problem. To illustrate, I'll use the new (but not yet merged in) code:

bool hit(const ray& r, interval ray_t) const {
    for (int a = 0; a < 3; a++) {
        const interval& ax = axis(a);
        auto ao = r.origin()[a];
        auto ad = r.direction()[a];

        auto t_interval = span((ax.min - ao) / ad, (ax.max - ao) / ad);
        ray_t = ray_t.intersect(t_interval);

        if (ray_t.is_empty())
            return false;
    }
    return true;
}

As a quick explanation, the new span() function returns the interval that spans the values a and b, regardless of their ordering. The other new functions interval::intersect(const interval& other) and interval::is_empty() should be obvious.)

The key issue here is the interval t_interval.

$[0, X]$ or $[X, 0]$ — In this case, the ray origin lies on either of the two axis planes. This just yields an intersection solution for t = 0 for one of the planes, and everything just works normally.

$[±∞, ±∞]$ — This happens when ad = 0. This happens when the ray is parallel to the axis planes. There are three categories of this situation. In all cases, the ray never intersects either of the two planes. In two cases, the ray lies entirely outside the two planes, to one side or the other. In the other case, the ray lies between the two planes (but never intersects). If the ray lies fully outside, the interval will be either $[-∞, -∞]$ or $[+∞, +∞]$. The intersection of this interval with any finite interval will yield an empty interval, which means the ray-aabb intersection returns false, which is what we want.

In the case that the ray lies between the two planes, then t_interval == [-∞, +∞], which just means that the ray intersects that slab space everywhere (for any t). In this case, the is_empty() test fails, and the loop continues, judging based on whether the ray intersects in either of the other two dimensions. Since the ray direction cannot be $<0, 0, 0>$, it must have a legitimate test in at least one other dimension, so everything works fine.

$[\text{NaN}, X]$, $[X, \text{NaN}]$, or $[\text{NaN}, \text{NaN}]$ — This happens when the ray is parallel to the two axis planes, and the ray origin lies on either one plane or the other (or both, if fed an aabb of zero size in that dimension, which we take measures to avoid anyway). In this case, the intersection of an interval with another interval that contains a NaN in either or both of its bounds will yield a result that has a NaN in one or both of its bounds. I've rewritten the interval::is_empty() test to always return true if either bound is a NaN, so the ray will be judged to miss the aabb. This is fine, since we inflate the aabb's anyway, and it's a fine decision to just rays exactly on one of the aabb bounds to be a miss.

Also note that I tried being more explicit about the zero denominator, intercepting such values and treating them specially, but the result is more complicated code that runs a bit slower. Better to understand and use the defined IEEE-754 semantics instead of regarding infinities and NaNs as errors.

Sep 03 '23 20:09 hollasch

The question now is how much of this to add to our text.

Sep 03 '23 20:09 hollasch

Ref: #1270

Sep 27 '23 22:09 hollasch

Finally. Done.

Mar 19 '24 23:03 hollasch