Xiaojie Wu
Recently we found that `pytorch` pollutes `openmp` symbols, and DFTD3 returns wrong results after the pollution, while some other packages are not affected. I guess that is because the current PyPI version (v1.0.0)...
- [x] Use screening index to generate AO values
- [ ] Generate screening index for small grid blocks, e.g. 256*256 grids -> (256, 256) grids
- [ ] Implement...
- Reduce the GPU memory footprint of libXC
- Improved DFT screen_index kernel
Some of the [CUDA kernel functions](https://github.com/pyscf/gpu4pyscf/blob/master/gpu4pyscf/lib/gvhf-md/unrolled_md_j.cu#L5079) for the MD J engine use more than 64 KB of shared memory. Some NVIDIA GPUs (even modern ones, such as CC 7.5) still only have 64...
The future release of GPU4PySCF will adopt the Apache 2.0 license. This change was requested by a valued member of our community and aligns with our goal to foster broader...
### Description

cuTENSOR 2.2 introduces trinary tensor contraction, which contracts three tensors in a single function call. Can we expose this function in CuPy?

https://docs.nvidia.com/cuda/cutensor/latest/api/cutensor.html#cutensorcreatecontractiontrinary

### Additional Information

_No...
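To make the cuTENSOR request above concrete, here is a minimal plain-Python sketch of what contracting three tensors in one step means, compared with two pairwise contractions. It does not use the cuTENSOR or CuPy API; the function names `contract_pairwise` and `contract_trinary` and the small matrices are hypothetical, chosen only to illustrate the arithmetic.

```python
def contract_pairwise(a, b, c):
    """Contract three matrices as two binary contractions, storing the intermediate."""
    n_i, n_j, n_k, n_l = len(a), len(b), len(c), len(c[0])
    # Intermediate T[i][k] = sum_j A[i][j] * B[j][k]
    t = [[sum(a[i][j] * b[j][k] for j in range(n_j)) for k in range(n_k)]
         for i in range(n_i)]
    # Result D[i][l] = sum_k T[i][k] * C[k][l]
    return [[sum(t[i][k] * c[k][l] for k in range(n_k)) for l in range(n_l)]
            for i in range(n_i)]


def contract_trinary(a, b, c):
    """Contract three matrices in one pass, with no stored intermediate:
    D[i][l] = sum_{j,k} A[i][j] * B[j][k] * C[k][l]."""
    n_i, n_j, n_k, n_l = len(a), len(b), len(c), len(c[0])
    return [[sum(a[i][j] * b[j][k] * c[k][l]
                 for j in range(n_j) for k in range(n_k))
             for l in range(n_l)]
            for i in range(n_i)]


# Hypothetical example data
a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
c = [[2, 0], [0, 3]]

result_pairwise = contract_pairwise(a, b, c)
result_trinary = contract_trinary(a, b, c)
print(result_pairwise == result_trinary)
```

Both routes give the same result; the single-pass form simply avoids materializing the intermediate tensor, which is the kind of saving a fused three-tensor contraction can offer.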
This is a feature request from a friend. Currently, the analytical Hessian of the NLC functional is supported for RKS only.