arkouda Closes #3002: Add `ak.clip` functionality

Closes #3002, adding np.clip functionality.

Usage: a_clipped = ak.clip(a,lo_value,high_value)

Notes regarding np.clip anomalies are in the comments.

This passes unit testing, but I'm aware that my python may not match our standards. Please let me know if something needs to be changed.

Mar 18 '24 14:03 drculhane

Thanks. That falls squarely in the "oops" category. Fixed, and about to commit.

On Thu, Mar 21, 2024 at 9:58 AM ajpotts @.***> wrote:

@.**** commented on this pull request.

In arkouda/numeric.py https://github.com/Bears-R-Us/arkouda/pull/3043#discussion_r1533966744:

@@ -2001,3 +2027,76 @@ def value_counts( (array([0, 2, 4]), array([3, 2, 1])) """ return GroupBy(pda).count()

@.*** +def clip(

pda: pdarray,

lo: Union[numeric_scalars, pdarray],

hi: Union[numeric_scalars, pdarray], +) -> pdarray:

"""

Mimic the behavior of numpy.clip.

The example section is still not rendering correctly. If you replace
Examples:
--------
with
Examples
--------
it should work. Sphinx is pretty fussy unfortunately. Other thank that it looks good.

— Reply to this email directly, view it on GitHub https://github.com/Bears-R-Us/arkouda/pull/3043#discussion_r1533966744, or unsubscribe https://github.com/notifications/unsubscribe-auth/BGL3SKWIFRAWKKAYKMGZXILYZLRP7AVCNFSM6AAAAABE3WUDGCVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTSNJSGMYDMMJXGQ . You are receiving this because you were assigned.Message ID: @.***>

-- Andrew D. Culhane, Ph.D. Sagecor Solutions LLC

Mar 21 '24 14:03 drculhane

The prob-size is another Oops. I'll correct it, and make that change to clip.

--- Andy

On Thu, Mar 21, 2024 at 10:45 AM tess @.***> wrote:

@.**** commented on this pull request.

In PROTO_tests/tests/numeric_test.py https://github.com/Bears-R-Us/arkouda/pull/3043#discussion_r1534054223:

+#prob_size = 1000 +prob_size = 10

it looks like this was commented out for testing, but I think we should prob put it back to 1000

In arkouda/numeric.py https://github.com/Bears-R-Us/arkouda/pull/3043#discussion_r1534061717:

if lo is not None:
   pda1 = where(pda < lo, lo, pda)
else:
   pda1 = pda[:]
if hi is not None:
   pda1 = where(pda1 > hi, hi, pda1)
return pda1
so this could still result in a deep copy of pda when it's not necessary, since we know at least one of these cases has to be true, we can safely do this:

results in a shallow copy, but it doesn't matter because it will be overwrittenpda1 = pdaif lo is not None:
pda1 = where(pda < lo, lo, pda)if hi is not None:
pda1 = where(pda1 > hi, hi, pda1)return pda1
— Reply to this email directly, view it on GitHub https://github.com/Bears-R-Us/arkouda/pull/3043#pullrequestreview-1952441270, or unsubscribe https://github.com/notifications/unsubscribe-auth/BGL3SKQFQQRRH3SG2SOOIELYZLXCHAVCNFSM6AAAAABE3WUDGCVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTSNJSGQ2DCMRXGA . You are receiving this because you were assigned.Message ID: @.***>

-- Andrew D. Culhane, Ph.D. Sagecor Solutions LLC

Mar 21 '24 15:03 drculhane

It was a deliberate decision not to use more parameterization, because I felt that brevity would get in the way of clarity.

But I'll still look at it again anyway, and I definitely like your suggestion about combining the singleton and array cases.

--- Andy

On Mon, Mar 25, 2024 at 5:01 PM jaketrookman @.***> wrote:

@.**** commented on this pull request.

In PROTO_tests/tests/numeric_test.py https://github.com/Bears-R-Us/arkouda/pull/3043#discussion_r1538235366:

```
   ilo = np.full(ia.shape, ilo)
```

```
   for dtype1 in dtypes:
```
```
       for dtype2 in dtypes:
```
```
           for dtype3 in dtypes:
```

               hi = np.dtype(dtype1).type(ihi)

               nd_lo = ilo.astype(dtype2)

               nd_arry = ia.astype(dtype3)

```
               ak_lo = ak.array(nd_lo)
```

               ak_arry = ak.array(nd_arry)

```
               assert np.allclose(
```

                   np.clip(nd_arry, nd_lo, hi),

                   ak.clip(ak_arry, ak_lo, hi).to_ndarray(),

```
               )
```
```
   # test with arrays for hi
```
```
   ilo = 25
```
```
   ihi = np.full(ia.shape, ihi)
```
```
   for dtype1 in dtypes:
```
```
       for dtype2 in dtypes:
```
```
           for dtype3 in dtypes:
```

               lo = np.dtype(dtype2).type(ilo)

               nd_hi = ihi.astype(dtype1)

               nd_arry = ia.astype(dtype3)

```
               ak_hi = ak.array(nd_hi)
```

               ak_arry = ak.array(nd_arry)

```
               assert np.allclose(
```

                   np.clip(nd_arry, lo, nd_hi),

                   ak.clip(ak_arry, lo, ak_hi).to_ndarray(),

```
               )
```
```
   # test with arrays for both
```
```
   ilo = np.full(ia.shape, ilo)
```
```
   for dtype1 in dtypes:
```
```
       for dtype2 in dtypes:
```
```
           for dtype3 in dtypes:
```

               nd_hi = ihi.astype(dtype1)

               nd_lo = ilo.astype(dtype2)

               nd_arry = ia.astype(dtype3)

```
               ak_lo = ak.array(nd_lo)
```
```
               ak_hi = ak.array(nd_hi)
```

               ak_arry = ak.array(nd_arry)

```
               assert np.allclose(
```

                   np.clip(nd_arry, nd_lo, nd_hi),

                   ak.clip(ak_arry, ak_lo, ak_hi).to_ndarray(),

```
               )
```

Overall this all looks great. The one think that I would recommend is taking more advantage of the parameterization that PROTO_test offers. This may be able to reduce the amount of code that you have in these sections. You do not have very many parameters, so there may not be as much of an improvement as there are other places.

The other suggestion is to combine the logic of the singleton cases and the array cases into a singe for loop. If you generate the ilo and ihi arrays, you can then use ilo[0] and ihi[0] and combine several sections by doing the singleton and array tests in the same for loop. Similar things were done in other places in numeric_test.py in places like sqrt, power, and arctan2.

— Reply to this email directly, view it on GitHub https://github.com/Bears-R-Us/arkouda/pull/3043#pullrequestreview-1958777009, or unsubscribe https://github.com/notifications/unsubscribe-auth/BGL3SKRD534ORVJI6ZMYI6LY2CGC7AVCNFSM6AAAAABE3WUDGCVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTSNJYG43TOMBQHE . You are receiving this because you were assigned.Message ID: @.***>

-- Andrew D. Culhane, Ph.D. Sagecor Solutions LLC

Mar 26 '24 11:03 drculhane