Compute the sum of powers and the sum of divisors of an integer
This pull request was inspired by a recent Numberphile video.
The main goal of this PR is the method fmpz_sum_divisors_proper which computes the sum of all proper divisors of an integer g, i.e. all divisors including 1 but excluding g itself, without multiplicities (e.g. 4 counts the divisor 2 only once).
It does so by factorizing into primes and then counting multiplicities. Specifically, it uses the following well-known formula:
Let $k$ be a positive integer with prime factorization $k = p_{1}^{m_1} \dots p_{n}^{m_n}$, where the $p_i$ are pairwise distinct primes. Denote by $S(n, e)$ the sum of powers of $n$ up to and including $n^e$, i.e. $S(n, e) = \sum\limits_{i=1}^{e} n^{i}$. Then the sum of all divisors of $k$ (not necessarily proper) is equal to $\prod\limits_{i = 1}^{n} S(p_i, m_i)$ where the $p_i$ and $m_i$ are the primes and their multiplicities from the prime factor decomposition of $k$. If $k$ is negative, then this implementation performs the computation for $|k|$ and adds the sign in the end.
To compute this efficiently, this PR implements two methods to evaluate the function $S$. The first method is a Horner-style evaluation of the polynomial $h_e(x) \coloneqq S(x,e)$. The other method makes use of the identity $h_e(x) \cdot (x - 1) = x^{e+1} - 1$. Both methods have their benefits and their downsides. Assuming exponentiation is fast, the polynomial division trick executes in near constant time, but it requires a division. The Horner-style method only needs multiplications and small integer additions, but it takes $O(e)$ time. Hence for large values of $e$, the polynomial version will provide slightly better performance. Some quick and dirty testing on my machine shows that for $e \approx 100$ they are roughly equal. To "hide" this distinction from the user, an additional convenience method picks an algorithm automatically, based on the value of $e$.
To demonstrate the usefulness of this method, a new example program uses the function fmpz_sum_divisors_proper to compute the aliquot sequence of 138, which famously blows up to almost 180 billion before collapsing to 1.
To the best of my knowledge, I followed the code style guidelines. I added documentation, and tests are provided for all functions (except for fmpz_sum_divisors_proper because all it does is subtract $k$ from fmpz_sum_divisors).
An analogue of fmpz_sum_powers might also be of interest for other types, like fmpq_t or the arb types. Since the interfaces are pretty much identical, implementing similar functions in the other modules should be a trivial task (though it does lead to code duplication - tweaks in one module do not transfer to the other modules - which raises the question: Is there any plan on how to avoid such duplication? Something like templates in C++ or generics in C#. But that is a topic for another time).
As for the example, you also have to modify dev/check_examples.sh. See Makefile for how it is called.
It should check that the example gives a correct output. Rigorousness is not that important here.
There is already fmpz_divisor_sigma for computing the sum of divisors.
I don't think fmpz_sum_divisors_proper really warrants its own function.
There is already
fmpz_divisor_sigmafor computing the sum of divisors.I don't think
fmpz_sum_divisors_properreally warrants its own function.
I have the same exact opinion
Yup, you're totally right. I embarrassingly didn't see that in the docs, and it certainly makes both fmpz_sum_divisors*methods redundant.
Nonetheless, I think fmpz_sum_powers has some merit of its own, and in my testing the function fmpz_sum_divisors (which corresponds to setting $k = 1$ in fmpz_divisors_sigma) is consistently faster or comparable. Apart from different memory overhead, I can think of two sources of this difference.
On the one hand, I don't need to handle the case $k > 1$ and I use the horner-style method to avoid division for small exponents.
Perhaps it is warrented to migrate fmpz_divisors_sigma to use fmpz_sum_powers internally?
If you agree, I would
- remove both of the
fmpz_sum_divisors*functions - modify
fmpz_divisors_sigmato callfmpz_sum_powers - modify the
aliquotexample to callfmpz_divisors_sigmainstead
I don't think fmpz_sum_powers needs to be a public function, but you could put such a private function in fmpz_divisors_sigma to speed up the k = 1 case.
I'm a bit surprised that the exponent cutoff is as large as 100. One issue might be that you use fmpz_tdiv_q instead fmpz_divexact (though I'm not sure if that makes a difference here).
Anyway, the k = 1 case in fmpz_divisors_sigma could be sped up much further. To begin with, you could have separate cases for (a) sigma(n) < 2^64, (b) n < 2^64, (c) general n, and in cases (a) and (b) use ulong arithmetic internally which will be a lot faster than fmpz arithmetic.