[Feature Discussion] Quantum ESPRESSO support
Hi for the third time today,
(Slightly long issue since the purpose is discussion, see TL;DR at the bottom)
I recently moved back to a Quantum ESPRESSO (QE) dominant workflow after using VASP + Sumo + pymatgen for close to 2 years. I didn't want to let go of Sumo/pymatgen and didn't want to reinvent the wheel, so I have been working on an independent pymatgen namespace package (pymatgen.io.espresso) that attempts to solve this problem. I'm opening this issue to discuss the addition of QE support to sumo.
The package provides the PWxml class, whose public interface is fully compatible with Vasprun. Under the hood, it takes care of guessing all relevant file names from the XML (the quantities QE calculates are spread over many files produced by many executables, all with different formats), unit and coordinate conversion, and so on, so that if your code looks like this:
from pymatgen.io.vasp.outputs import Vasprun
calc = Vasprun('vasprun.xml', **kwargs)
# Some code that uses calc
it can be converted to:
from pymatgen.io.espresso.output import PWxml
calc = PWxml('my_calc.xml', **kwargs)
# Exact same code that uses calc, without modification
(Currently, the repo is private until I'm ready for public release in a few weeks, but I'm more than happy to discuss this further and give access to the repo.)
This has allowed me to port sumo-bandstats, sumo-bandplot, and sumo-dosplot to work with Quantum ESPRESSO with practically no effort: I just added an espresso option to the --code flag of the three entry points. Currently, all three offer full feature parity with VASP. This work can be seen on the espresso-bandplot branch of my fork; you'll notice the diffs are quite small.
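For readers unfamiliar with the pattern, here is a minimal sketch of adding an espresso choice to a --code flag with argparse. It is not the code from the fork: the build_parser helper, the help text, and the exact list of choices are assumptions made for illustration.

```python
import argparse

# Illustrative list of backends; the real entry points may accept other codes.
SUPPORTED_CODES = ("vasp", "castep", "espresso")


def build_parser():
    """Build a toy parser with a --code flag (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Toy band-structure plotter")
    parser.add_argument(
        "-c",
        "--code",
        default="vasp",
        choices=SUPPORTED_CODES,
        help="electronic structure code that produced the output files",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Reading {args.code} output")
```

Because argparse validates the value against choices, an unsupported code is rejected before any parsing of calculation output starts.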
(There is a minor quirk: QE uses the (L, J, Jz) basis for calculations with SOC while VASP uses (L, Lz), so you can't do lm-decomposed PDOS with QE, just s, p, d decomposition. I might implement the conversion at some point, but it's nontrivial, and I hope to implement the (L, J, Jz) basis in sumo when I get the chance.)
I have also implemented QE support in sumo-kgen, again with full feature parity with VASP. This required adding the --pwi flag for Quantum ESPRESSO input, but that's not a final design decision; I might piggyback off --poscar instead. This work can be seen on the espresso-kgen branch of my fork.
Currently, I don't anticipate much work needing to be done on either of those branches, besides documentation and adding some unit tests. Any bug fixes or issues will be handled on the pymatgen.io.espresso side.
The one thing you need to be aware of is that pymatgen.io.espresso requires pymatgen >= 2022.0.03 (which is almost 2.5 years old at this point, so nothing bleeding edge). I plan to open a pull request in a few weeks as soon as pymatgen-io-espresso is stable and publicly released on PyPI. Please let me know if you have any feedback or thoughts.
tl;dr: PR with support for Quantum ESPRESSO coming in a few weeks, with minimal changes to sumo itself. Need feedback/thoughts :)
Hi Omar,
that sounds very promising. In principle it is ok for Sumo to depend on an external package for an additional parser; we already do this for the CASTEP PDOS outputs with castepxbin. We would prefer if
- this can be implemented in a way that is optional (i.e. it is not imported until needed, and non-QE users can install Sumo without it)
- it doesn't come with too many extra dependencies of its own.
We could afford to bump the pymatgen dependency of Sumo a bit, too. To some extent this is forced on us by Pymatgen anyway because they don't make it very easy for downstream packages to support a wide range of software versions.
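The "not imported until needed" requirement above could be met with a lazy import along the following lines. This is a sketch under stated assumptions, not sumo's actual code: load_calc_class and BACKENDS are hypothetical names, and the espresso module path is copied from the issue text, so it may differ in the released package.

```python
import importlib

# Hypothetical registry mapping a --code value to (module path, class name).
# The espresso path is taken from the issue text and is an assumption.
BACKENDS = {
    "vasp": ("pymatgen.io.vasp.outputs", "Vasprun"),
    "espresso": ("pymatgen.io.espresso.output", "PWxml"),
}


def load_calc_class(code, backends=BACKENDS):
    """Import and return the parser class for `code` only when it is requested."""
    try:
        module_name, class_name = backends[code]
    except KeyError:
        raise ValueError(f"Unsupported code: {code!r}") from None
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Users of other codes never reach this branch, so the dependency
        # stays optional for them.
        raise ImportError(
            f"--code {code} needs the optional module {module_name!r}, "
            "which is not installed."
        ) from exc
    return getattr(module, class_name)
```

With this layout, nothing QE-related is imported at module load time, and a missing optional package produces an error only for users who actually select that backend.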
Hi Adam,
pymatgen-io-espresso's only dependencies (besides pymatgen, which sumo requires anyway) are f90nml and xmltodict. Both are super lightweight: it's a few hundred kilobytes extra if you already have pymatgen available.
I could certainly make this an extra package (i.e., pip install sumo[qe]) and use conditional imports, but I did intentionally build pymatgen-io-espresso to be as lightweight as possible (besides pymatgen, of course) so it can integrate with packages that already use pymatgen without any friction/bloat. Of course, it's up to you folks :).
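As a rough sketch of the extras approach, assuming the project is packaged with setuptools and that the extra is named qe as in the pip command above, the optional dependency could be declared like this. EXTRAS_REQUIRE is a hypothetical variable name, and no version pin is shown because none has been decided in this thread.

```python
# Fragment for a setuptools-based setup.py (illustrative, not sumo's actual file).
# The dict would be passed as setup(..., extras_require=EXTRAS_REQUIRE).
EXTRAS_REQUIRE = {
    # Installed only via `pip install sumo[qe]`; other users skip it.
    "qe": ["pymatgen-io-espresso"],
}
```

Combined with conditional imports, this keeps a plain pip install sumo free of the QE parser and its two dependencies.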
And yeah, I've noticed the difficulty with pymatgen :(