Vectorize order of derivatization
I tried to vectorize the orders of derivatization in _eval_deriv_contractions. The end result ended up being slower than the nonvectorized version. I'm still making a pull request for this for record keeping.
What
- Handle multiple orders of derivatives at the same time.
- Add tests and everything
Results
Negligible difference in performance (<1%) for adding an extra dimension for the case of a single order of differentiation.
Decrease(!) in performance (both time and memory) in the case of multiple orders of differentiation. I'm honestly not sure why vectorization results in worse performance. Since the axis of the orders is linked with the axis for the coordinate (x, y, z), this vector for the orders coupled with many of the other vectors that uses the coordinate axis (e.g. coords, angmom_comps). This may have caused some difficulties in broadcasting, resulting in bad performance.
I've also checked that the fancy boolean indexing (for separating out the zeroth order derivative) is not the cause. Performance actually decreased when the fancy indexing is replaced with a more straight forward broadcasting
Following code was used to test:
- Single order calculation
import timeit
setup = '''
import numpy as np
from gbasis.deriv import _eval_deriv_contractions
coords = np.random.rand(10000, 3)
orders = np.array([0, 0, 0])
center = np.random.rand(3)
angmom_comps = np.array([(i, j, k) for i in range(5) for j in range(5) for k in range(5)])
alphas = np.random.rand(200)
coeffs = np.random.rand(200)
norm = np.random.rand(125, 200)
'''
print(timeit.timeit(
'''
_eval_deriv_contractions(coords, orders, center, angmom_comps, alphas, coeffs, norm, vectorized_orders=0)
''',
setup=setup,
number=20
) / 20)
print(timeit.timeit(
'''
_eval_deriv_contractions(coords, np.array([orders]), center, angmom_comps, alphas, coeffs, norm, vectorized_orders=1)
''',
setup=setup,
number=20
) / 20)
- Multiple order calculation
import timeit
setup = '''
import numpy as np
from gbasis.deriv import _eval_deriv_contractions
coords = np.random.rand(10, 3)
orders = np.array([(i, j, k) for i in range(5) for j in range(5) for k in range(5)])
center = np.random.rand(3)
angmom_comps = np.array([(i, j, k) for i in range(5) for j in range(5) for k in range(5)])
alphas = np.random.rand(200)
coeffs = np.random.rand(200)
norm = np.random.rand(125, 200)
'''
print(timeit.timeit(
'''
for order in orders:
_eval_deriv_contractions(coords, order, center, angmom_comps, alphas, coeffs, norm, vectorize_orders=0)
''',
setup=setup,
number=1
) / 1)
print(timeit.timeit(
'''
_eval_deriv_contractions(coords, orders, center, angmom_comps, alphas, coeffs, norm, vectorize_orders=1)
''',
setup=setup,
number=1
) / 1)
My guess is that the hermite polynomial evaluation is cheaper than the other parts that we don't gain much by reusing the evaluated hermite polynomials (besides, we are already reusing hermite polynomial evaluated in one dimension in the other dimensions). However, the cost associated with adding an extra axis and muddying up the orders of multidimensionary array may be greater than the gain.
Again, I'm not 100% sure about this. Maybe I screwed up something.
It seems this doesn't improve performance. So we closed it, but might end up being relevant for very large systems I guess.