
Fmpz mod mpoly and nmod mpoly

Open · Jake-Moss opened this issue 1 year ago · 13 comments

This PR aims to add support for the fmpz_mod_mpoly and nmod_mpoly flint objects. I'm using the recently added fmpz_mpoly as a basis, so there may be remnants of the large copy-paste that have escaped my changes.

I haven't given the docs any more attention than a quick search-and-replace as of yet, so while they (and all tests locally) may be passing, I'll have to give them a once-over before I start on nmod_mpoly.

Jake-Moss avatar Jul 18 '24 14:07 Jake-Moss

For now I've crammed the fmpz_mod_mpoly into the mpoly tests with both a prime and a non-prime modulus, though it's not quite as clean as I'd like. The division functions assume the modulus is prime, so I've set it to raise an exception on any division function with a non-prime modulus, even if the result would be within the domain. The same goes for sqrt and gcd. While the flint docs don't state that gcd requires a prime modulus, it was raising an impossible inversion error, so I added the exception to it as well.
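
In plain-Python terms the guard is just something like this (a sketch; the method and exception names are illustrative rather than the actual ones):

def check_prime_modulus(ctx, op_name):
    # division, sqrt and gcd all assume the modulus is prime (i.e. that the
    # ring is a field), so refuse them outright for composite moduli
    if not ctx.is_prime():
        raise ValueError(f"{op_name} requires a prime modulus")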

Jake-Moss avatar Jul 18 '24 14:07 Jake-Moss

I've set it to raise an exception on any division function with a non-prime modulus, even if the result would be within the domain. The same goes for sqrt and gcd.

That seems right to me.

Since nmod_mpoly has a context (unlike other nmod_* types; see gh-124) we could also check whether the modulus is prime when the context is created.

oscarbenjamin avatar Jul 18 '24 15:07 oscarbenjamin

Since nmod_mpoly has a context (unlike other nmod_* types; see gh-124) we could also check whether the modulus is prime when the context is created.

I think that's also reasonable; currently I'm just checking the first time, storing the result, and returning that for later calls.
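
Concretely, something like this (a plain-Python sketch of the Cython code; the trial division is just a stand-in for the real Flint primality test):

class ModulusCtx:
    def __init__(self, modulus):
        self.modulus = modulus
        self._is_prime = None  # unknown until first needed

    def is_prime(self):
        # compute on first use, then reuse the stored answer
        if self._is_prime is None:
            n = self.modulus
            self._is_prime = n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))
        return self._is_prime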

Jake-Moss avatar Jul 19 '24 05:07 Jake-Moss

I think that's also reasonable; currently I'm just checking the first time, storing the result, and returning that for later calls.

Either approach is reasonable, I think. Not doing it when the context is created saves the one-time cost of the check if it is never needed. In exchange we get a small amount of overhead each time the attribute needs to be checked. I don't know whether that overhead is even measurable though.

What would be measurable I think is actually computing the prime check every time division happens. Particularly in the case of nmod rather than nmod_mpoly I would expect that to bring a measurable slowdown. This is why we need a context for nmod so that we can store whether or not the modulus is prime and only do the prime check once.

oscarbenjamin avatar Jul 19 '24 07:07 oscarbenjamin

This is why we need a context for nmod so that we can store whether or not the modulus is prime and only do the prime check once.

That sounds reasonable, is that something we'd like in this PR?

I've brought the nmod_mpoly up to the same spec as fmpz_mod_mpoly. I've refrained from cramming them into the existing nmod_poly and fmpz_mod_poly tests respectively just because of the large API differences. Happy to have a crack at it, but it would require a little more than just adding a for loop + some lambdas.

Whoops, it looks like I missed that numpy isn't required here; I was just using it for 32-bit/64-bit dependent memoryviews in place of a malloc of ulongs.

Jake-Moss avatar Jul 28 '24 13:07 Jake-Moss

This is why we need a context for nmod so that we can store whether or not the modulus is prime and only do the prime check once.

That sounds reasonable, is that something we'd like in this PR?

That's up to you. If it makes things easier in this PR then go ahead. Otherwise we can leave that to a separate PR.

More generally I think that we want all types to have associated contexts so that there is a uniform interface like ZZ.poly([1,2]) or GF(3).poly([1,2]) and we can express operations like ZZ.poly([1,2]).set_domain(QQ). That means we want contexts for fmpz, fmpq, arb, ...
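
To make the shape concrete, a toy mock of that interface could look like this (none of these names exist in python-flint yet; this is purely illustrative):

class PolyCtx:
    def __init__(self, name):
        self.name = name

    def poly(self, coeffs):
        return Poly(coeffs, self)

class Poly:
    def __init__(self, coeffs, ctx):
        self.coeffs = coeffs
        self.ctx = ctx

    def set_domain(self, new_ctx):
        # reinterpret the same coefficients over a different domain
        return new_ctx.poly(self.coeffs)

ZZ = PolyCtx("ZZ")
QQ = PolyCtx("QQ")
p = ZZ.poly([1, 2])    # 1 + 2*x over ZZ
q = p.set_domain(QQ)   # the same polynomial over QQ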

We would also need contexts for all types for generic rings as well. I haven't looked enough into the generic rings interfaces to see if that should affect the overall design of contexts for the non-generic types.

For arb we need a context that can hold non-global precision. Likewise for the number of terms in a series. It isn't possible for SymPy, mpmath etc to make use of things like arf if the precision is global (e.g. SymPy and mpmath would both clobber each other). Even methods like __repr__ depend on global variables, so there needs to be some sort of printer context or something.

If you want to add contexts for nmod then I would suggest just keeping it simple so that it resembles fmpz_mod. I mention these other things here just to give a sense for where I would want to go in future.

oscarbenjamin avatar Jul 28 '24 14:07 oscarbenjamin

The test failure is

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/flint/test/__main__.py", line 36, in run_tests
    test()
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/flint/test/test_all.py", line 2888, in test_mpolys
    assert P(ctx=ctx).is_zero() is True
           ^^^^^^^^^^^^^^^^^^
AttributeError: 'flint.types.fmpz_mod_mpoly.fmpz_mod_mpoly' object has no attribute 'is_zero'

I think that is a test added in gh-168 so you may need to merge master to see the same failure locally.

oscarbenjamin avatar Jul 28 '24 14:07 oscarbenjamin

I think that is a test added in gh-168 so you may need to merge master to see the same failure locally.

Yep, that was it, thanks! Seems like CI was merging the PR into master before testing, which definitely threw me off for a minute.

If you want to add contexts for nmod then I would suggest just keeping it simple so that it resembles fmpz_mod. I mention these other things here just to give a sense for where I would want to go in future.

I think something like that might be best suited to its own PR; this one only touches nmod_mpoly and fmpz_mod_mpoly.

Jake-Moss avatar Jul 28 '24 14:07 Jake-Moss

More generally I think that we want all types to have associated contexts so that there is a uniform interface like ZZ.poly([1,2]) or GF(3).poly([1,2]) and we can express operations like ZZ.poly([1,2]).set_domain(QQ). That means we want contexts for fmpz, fmpq, arb, ...

This seems like it could be done as a system that builds upon the existing contexts. If something OO-driven is desired, it would have to be Python types rather than Cython types, as cdef classes don't support multiple inheritance.

We would also need contexts for all types for generic rings as well. I haven't looked enough into the generic rings interfaces to see if that should affect the overall design of contexts for the non-generic types.

Just from a quick skim they seem like quite the beast.

For arb we need a contexts that can hold non-global precision. Likewise for the number of terms in a series. It isn't possible for SymPy, mpmath etc to make use of things like arf if the precision is global (e.g. SymPy and mpmath would both clobber each other). Even methods like __repr__ depend on global variables so there needs to be some sort of printer context or something.

Would this be replacing the global FlintContext, moving it towards something instance-specific? Or at least swappable?

Jake-Moss avatar Jul 28 '24 14:07 Jake-Moss

Seems like CI was merging the PR into master before testing, which definitely threw me off for a minute.

That is usually how CI works, at least for Travis and GitHub Actions. We were using Cirrus as well before, which did not seem to work like that.

When you push to a PR (or close and open it) the CI system will make a temporary branch somewhere effectively like:

git checkout master
git checkout -b tmp_branch
git merge pr_branch # merge the PR into temporary copy of master
<run CI checks>

Sometimes you need to close and open a PR to get it to merge with latest master for CI (asking GA to "rerun" a job does not trigger a new merge).

There are also a bunch of complicated access rules in GitHub so if the PR modifies the GA workflow files then it might still run with the master version of the workflows depending on the event type that triggers the workflow.

oscarbenjamin avatar Jul 28 '24 15:07 oscarbenjamin

Looks like another possible memory error. I'll hunt that down tomorrow.

Jake-Moss avatar Jul 28 '24 15:07 Jake-Moss

Would this be replacing the global FlintContext? Instead moving it towards something instance specific?

These are details that need to be worked out. For arf for example I would like to use it for SymPy's RR domain: https://github.com/sympy/sympy/pull/26861. It also makes sense for mpmath to use arf at least as its internal representation for mpf. That would mean that both SymPy and mpmath start manipulating the global precision and would mess each other up. Likewise the global precision is not thread-safe even if SymPy is the only user of it.

The thread-safe part could be managed by using a thread-local variable in python-flint. I don't know what the overheads would be for that in the context of arf though. I still think SymPy and mpmath would clobber each other though because it is likely that SymPy wants to modify the precision exactly when it is doing arithmetic with mpf or calling mpmath functions. Using a thread-local but otherwise global precision would still require very careful usage of context managers like with ctx.extraprec() everywhere and that just seems fragile.
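
For example, with a single global context the clobbering looks like this:

import mpmath

mpmath.mp.dps = 50         # library A raises the global precision...
x = mpmath.mpf(1) / 3      # ...and computes at 50 digits

mpmath.mp.dps = 15         # library B resets it behind A's back
y = mpmath.mpf(1) / 3      # A's next computation silently runs at 15 digits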

The cleanest solution is to use separate contexts with context-local precision etc settings. Then you can do e.g.:

ctx = arb_ctx(53)
result = ctx.sin(1)

Then SymPy and mpmath can each make their own contexts and manipulate the precision separately. Of course if SymPy makes this context a global variable and modifies its precision then we're back to not thread-safe etc.

In the case of something like the RR domain though we can attach a separate context to each domain object and we never need to modify its precision. Ideally creating this arb_ctx would be a very cheap operation so that it can be say the first step in a call to something like .evalf(). Then everything is naturally thread-safe and robust since we've just avoided having any global variables to worry about.

The problem with having separate arb contexts everywhere is that you then need to do e.g. ctx.add(x, y) rather than x + y. For many purposes that is fine, but it's not really nice for end users and wouldn't work for SymPy's RR domain, where the elements need to work in generic code that uses elementary arithmetic.

What you need then is for each arb, arf etc to hold a reference to the arb_ctx. Then in arithmetic like x + y the x.__add__ method can check the context of the other object; if they have the same context then its precision can be used for the operation, and the returned result can also hold a reference to the same context.

Then we could ensure that any arb created as e.g. flint.arb(2) just ends up using the global context. That way users who just want to set the precision globally can do so but it doesn't mess up SymPy or mpmath who would never use the global context directly.

Now the problem is what are you going to do when you have x + y and they have different contexts with different precision or even different rounding modes? Let's see what mpmath does:

In [1]: import mpmath

In [2]: ctx1 = mpmath.MPContext()

In [3]: ctx2 = mpmath.MPContext()

In [4]: ctx2.dps = 50

In [5]: ctx1.mpf(1) / ctx2.mpf(3)
Out[5]: mpf('0.33333333333333331')

In [6]: ctx2.mpf(1) / ctx1.mpf(3)
Out[6]: mpf('0.33333333333333333333333333333333333333333333333333311')

It basically just takes the context from the left-hand operand. I think that it would be better to say something like:

  • If there are mixed context operands then take the one with higher precision (since dropping precision is irreversible)?
  • Definitely error if they have different rounding modes though.
  • Maybe actually just make all mixed context operations an error and require an explicit conversion or the use of context methods like x1+ctx1(x2) or ctx1.add(x1,x2).

There are likely two very different groups of users here:

  • End users who are probably happy using a global context. These users probably want implicit conversions.
  • Library users like SymPy, mpmath etc who are probably happy using explicit conversions. From SymPy's perspective it is nicer for debugging if mixed context operations give an immediate error so that we can fix it and add the explicit conversion in the right place.

Perhaps a way to satisfy both groups is to say that implicit coercions between contexts are allowed only when mixing the global context with a non-global context (the global context can coerce all others). Then SymPy can never use the global context but end users who do use it can take the output of SymPy functions and mix it into their own types.
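
As a toy sketch of those coercion rules (plain Python, with purely illustrative names):

class Ctx:
    def __init__(self, prec, is_global=False):
        self.prec = prec
        self.is_global = is_global

GLOBAL_CTX = Ctx(53, is_global=True)

class Val:
    def __init__(self, x, ctx=GLOBAL_CTX):
        self.x = x
        self.ctx = ctx

    def __add__(self, other):
        if self.ctx is other.ctx:
            ctx = self.ctx
        elif self.ctx.is_global or other.ctx.is_global:
            # the global context can coerce all others
            ctx = other.ctx if self.ctx.is_global else self.ctx
        else:
            raise TypeError("mixed non-global contexts: convert explicitly")
        return Val(self.x + other.x, ctx)  # real code would round at ctx.prec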

Similar considerations apply to all of the other operations, like the number of terms in a series and how exactly everything gets printed. The problem with printing is that you might have something like:

result = str([obj1, obj2])

Now the list's __str__ method is going to call obj1.__repr__(). Adding obj1.repr() doesn't help us here. This is basically the same problem as with x + y. One of two things is needed:

  • Either we have a printer object that takes care of printing lists and things as well so you do printer.str([obj1, obj2]). Now __repr__ and __str__ are not used at all.
  • Or otherwise each object needs to somehow hold a reference to a context that holds its printing settings but that adds to the memory footprint of every object.

Again the ideal solution here is probably different for different groups of users. SymPy for example already has its own complicated printing machinery so being able to call printer.str(...) or obj.repr() works fine. Many end users will find that inconvenient though and may want something more like the current global variables.
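
A toy version of the printer-object approach, again just to illustrate (nothing like this exists yet):

class Printer:
    def __init__(self, dps=15):
        self.dps = dps

    def str(self, obj):
        # the printer handles containers itself, so __repr__/__str__ and
        # their global settings are never involved
        if isinstance(obj, list):
            return "[" + ", ".join(self.str(o) for o in obj) + "]"
        if isinstance(obj, float):
            return f"{obj:.{self.dps}g}"
        return str(obj)

printer = Printer(dps=5)
print(printer.str([1/3, 2/3]))  # settings live on the printer, not in globals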

oscarbenjamin avatar Jul 28 '24 15:07 oscarbenjamin

There should be some handling of equality checks against scalars:

In [26]: flint.fmpz_poly([1]) == 1
Out[26]: True

In [27]: ctx = flint.fmpz_mpoly_ctx(2, flint.Ordering.lex, ['x', 'y'])

In [28]: ctx.from_dict({(0,0):1})
Out[28]: 1

In [29]: ctx.from_dict({(0,0):1}) == 1
Out[29]: False
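
Presumably the fix is for equality to coerce the scalar into a constant mpoly in the same context before comparing, roughly like this (a sketch only; the helper is hypothetical and the real comparison happens at the C level):

def mpoly_eq_scalar(poly, scalar, ctx, nvars):
    # promote the scalar to the constant polynomial in the same context,
    # then fall back to the ordinary mpoly == mpoly comparison
    constant = ctx.from_dict({(0,) * nvars: scalar})
    return poly == constant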

oscarbenjamin avatar Jul 28 '24 16:07 oscarbenjamin

Looks like another possible memory error. I'll hunt that down tomorrow.

I believe this is still lurking; I don't expect CI to pass on Windows. I'll try to get to this soon.

Jake-Moss avatar Aug 07 '24 16:08 Jake-Moss

This took a while to catch, but I believe it's been fixed. The ctypedef unsigned long ulong in flintlib/flint.pxd causes Cython to inline the definition of ulong in place of the ulong in the sizeof(ulong). This is the incorrect size on 64-bit Windows machines, causing the wrong amount of memory to be allocated. Because it was such a small amount (only 8 bytes), in most tests the failures were sporadic, mainly only showing up when the memory was freed.
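
The underlying LLP64 quirk is easy to demonstrate from Python:

import ctypes

# On 64-bit Windows (LLP64) unsigned long is 4 bytes, but Flint's ulong
# (mp_limb_t) is 8 bytes there, so a sizeof(ulong) that Cython expands to
# sizeof(unsigned long) allocates half the memory actually needed.
print(ctypes.sizeof(ctypes.c_ulong))      # 4 on Win64, 8 on 64-bit Linux
print(ctypes.sizeof(ctypes.c_ulonglong))  # 8 everywhere; matches Flint's ulong on Win64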

Jake-Moss avatar Aug 13 '24 13:08 Jake-Moss

@GiacomoPope just added a missing argument in https://github.com/flintlib/python-flint/pull/164/commits/f15b55a29f68d9493a13ca8990a157df4262294e. I don't think the fix is worth a separate PR.

Jake-Moss avatar Aug 13 '24 14:08 Jake-Moss

Thanks for fixing this!

GiacomoPope avatar Aug 13 '24 14:08 GiacomoPope

My local CI via act is not working, so hopefully that's all that's needed for the coverage.

Jake-Moss avatar Aug 13 '24 14:08 Jake-Moss

This took a while to catch, but I believe it's been fixed. The ctypedef unsigned long ulong in flintlib/flint.pxd causes Cython to inline the definition of ulong in place of the ulong in the sizeof(ulong)

Good work!

Should we remove the ctypedef unsigned long ulong line?

I guess it is not really accurate and so could maybe cause other problems as well...

At Cython time we do not know if we are building for Windows or not so I guess that ulong and slong should be treated as being opaque types. Maybe that would be awkward because it might prevent Cython from automatically casting e.g. ulong to Python int when returning from a function. I'd be happier with explicit casts than obscure bugs though...

We have this which fixed previous Windows bugs: https://github.com/flintlib/python-flint/blob/f5b07e958b6b22ea4ef4d7e40a3d6cf42a342bba/src/flint/flintlib/flint.pxd#L63-L76

oscarbenjamin avatar Aug 13 '24 14:08 oscarbenjamin

Should we remove the ctypedef unsigned long ulong line?

I guess it is not really accurate and so could maybe cause other problems as well...

At Cython time we do not know if we are building for Windows or not so I guess that ulong and slong should be treated as being opaque types. Maybe that would be awkward because it might prevent Cython from automatically casting e.g. ulong to Python int when returning from a function. I'd be happier with explicit casts than obscure bugs though...

I think it's preferable to remove it or use the true definition, though I'm unsure how that would be done. I'm sure removing it will cause the Cython compiler to complain. I'll try some things and see if anything sticks tomorrow morning; I'm currently away from my desktop.

Jake-Moss avatar Aug 13 '24 14:08 Jake-Moss

I think it's preferable to remove it or use the true definition, though I'm unsure how that would be done

I think the only way to use the true definition is to do the extern from '*' thing that I just showed. This is because, in principle, Cython runs to generate the C code and then the C code gets moved to another computer to be compiled. In practice that doesn't really happen much, but basically what it means is that we can't check the OS, Flint version etc at Cython compile time: only the C compiler can do that.

At least defining it as long long would be more correct on 64-bit systems. Maybe we should add a CI job somewhere to check that python-flint works on at least some 32-bit OS...

oscarbenjamin avatar Aug 13 '24 14:08 oscarbenjamin

It seems it's possible to declare the types as opaque structs, and Cython seems happy with this, but almost every single usage of them breaks. There are hundreds of compiler errors, possibly thousands, as there are around 3.3k occurrences of slong and ulong in the code base. Arithmetic doesn't work, they can't be used as default arguments, and I'm not sure you can even have them as an argument. I gave up fixing compiler errors before I got it building.

diff --git a/src/flint/flintlib/flint.pxd b/src/flint/flintlib/flint.pxd
index 993a61e..50cc58e 100644
--- a/src/flint/flintlib/flint.pxd
+++ b/src/flint/flintlib/flint.pxd
@@ -22,16 +22,27 @@ cdef enum:
 #
 
 cdef extern from "gmp.h":
-    ctypedef unsigned long ulong
-    ctypedef unsigned long mp_limb_t
-    ctypedef long mp_size_t
-    ctypedef long mp_exp_t
     ctypedef mp_limb_t* mp_ptr
     ctypedef mp_limb_t* mp_srcptr
     ctypedef unsigned long mp_bitcnt_t
 
+    # Cython cannot be trusted to know the actual types of these.
+    # `mp_limb_t` and `mp_limb_signed_t` will either be `unsigned int` and `int` or
+    # `unsigned long long int` and `long long int` respectively.
+    ctypedef struct mp_limb_t:
+        pass
+    ctypedef struct mp_limb_signed_t:
+        pass
+
+    # `mp_size_t` and `mp_exp_t` will either be an `int` or `long int`.
+    ctypedef struct mp_size_t:
+        pass
+    ctypedef struct mp_exp_t:
+        pass
+
 cdef extern from "flint/fmpz.h":
-    ctypedef long slong
+    ctypedef mp_limb_t ulong
+    ctypedef mp_limb_signed_t slong
     ctypedef ulong flint_bitcnt_t
 
 ctypedef slong fmpz_struct

At least defining it as long long would be more correct on 64 bit systems.

I think this would be the best course of action; there is almost certainly some hidden overflow issue somewhere.

Jake-Moss avatar Aug 14 '24 05:08 Jake-Moss

It seems it's possible to declare the types as opaque structs, and Cython seems happy with this, but almost every single usage of them breaks.

Okay that's not acceptable I think.

I've asked how to handle this on the Cython mailing list: https://groups.google.com/g/cython-users/c/sw1D4DdJI0U

Leaving aside the ctypedef ulong issue, where are we at with this?

Is it just waiting for review?

oscarbenjamin avatar Aug 14 '24 12:08 oscarbenjamin

It seems it's possible to declare the types as opaque structs, and Cython seems happy with this, but almost every single usage of them breaks.

Okay that's not acceptable I think.

Yep, I agree. Thanks, I'll keep an eye on that list; I'm interested to see if there's a blessed solution.

Leaving aside the ctypedef ulong issue, where are we at with this?

Is it just waiting for review?

I believe so; it should be very similar to the fmpz_mpoly and fmpq_mpoly PR. I've made no attempt to address this comment of yours https://github.com/flintlib/python-flint/pull/164#issuecomment-2254563573, though. It's certainly something that should be done, but I'm not sure where everything currently stands with the other recent changes.

Jake-Moss avatar Aug 14 '24 13:08 Jake-Moss

I've asked how to handle this on the Cython mailing list: https://groups.google.com/g/cython-users/c/sw1D4DdJI0U

The suggested solution from the Cython maintainer is to do:

cdef extern from *:
    """
    #define SIZEOF_ULONG sizeof(ulong)
    """
    int SIZEOF_ULONG

That is marginally better than using FLINT_BITS // 8.

It is also apparently considered to be a Cython bug that the code for sizeof(ulong) is not generated correctly. Apparently it should be okay to have an extern typedef like:

cdef extern from "flint/flint.h":
   ctypedef unsigned long ulong
   ctypedef long slong

Cython should be expected to respect the signed-ness of the types but otherwise not make any assumptions about the actual size or whether it really is typedef'd to unsigned long in the C code.

It doesn't look like we actually use sizeof(ulong) or sizeof(slong) anywhere else so hopefully there are no other problems lurking as a result of this.

oscarbenjamin avatar Aug 14 '24 17:08 oscarbenjamin

Tests are failing but only under Python 3.12. Specifically this test:

            assert p.subs({"x1": S(0), 0: S(0)}) == ctx.from_dict({(0, 0): 1})
>           assert p.compose(p.subs({"x1": 0}), ctx.from_dict({(0, 1): 1})) == mpoly({
                (2, 2): 36,
                (1, 2): 24,
                (1, 0): 9,
                (0, 2): 4,
                (0, 1): 2,
                (0, 0): 4
            })
E           assert 64*x0^4*x1^2 ...^2 + 2*x1 + 10 == 36*x0^2*x1^2 ...1^2 + 2*x1 + 4

The output is definitely quite different from that expected in the test:

(Pdb) type(p)
<class 'flint.types.fmpz_mod_mpoly.fmpz_mod_mpoly'>
(Pdb) p p.compose(p.subs({"x1": 0}), ctx.from_dict({(0, 1): 1}))
64*x0^4*x1^2 + 96*x0^3*x1^2 + 31*x0^2*x1^2 + 12*x0^2 + 72*x0*x1^2 + 9*x0 + 36*x1^2 + 2*x1 + 10

It looks like subs is the problem:

(Pdb) p p
4*x0^2*x1^2 + 3*x0 + 2*x1 + 1
(Pdb) p p.subs({"x1":0})
4*x0^2 + 3*x0 + 3

If we set x1 -> 0 then we should just get 3*x0 + 1 here. It seems to be setting x1 -> 1 instead.

The fmpz_mod_mpoly.subs method is here: https://github.com/flintlib/python-flint/blob/043ca9dd9d047c78af7fe4ca8a01ee0ffa9341de/src/flint/types/fmpz_mod_mpoly.pyx#L894-L925

oscarbenjamin avatar Aug 16 '24 18:08 oscarbenjamin

I had a couple of comments about the tests but generally I think this looks good and could be merged.

Is there anything outstanding to do?

oscarbenjamin avatar Aug 17 '24 16:08 oscarbenjamin

Is there anything outstanding to do?

I don't believe so; nothing comes to mind. I will, however, open another PR to remove the if isinstance ... elif typecheck ladders from fmpz_mpoly and fmpq_mpoly once this is merged.

Jake-Moss avatar Aug 18 '24 01:08 Jake-Moss

Okay, this looks good. Let's get it in!

oscarbenjamin avatar Aug 18 '24 11:08 oscarbenjamin

This is nice because that is most of the mpoly types from Flint now.

The only ones missing I think are gr_mpoly (all of gr is missing) and fq_nmod_mpoly (which makes less sense since we use fq_default rather than fq_nmod).

Certainly these are all of the mpoly types that SymPy would currently use until we add gr.

oscarbenjamin avatar Aug 18 '24 11:08 oscarbenjamin