Corey Adams
Hi, I've got a package that uses pybind11 (it's awesome, by the way), and had a few users report the following crash. I've been able to reproduce it myself as...
## What does this PR do? This PR extends pytorch_lightning with support for Intel GPUs, as enabled with `intel_extension_for_pytorch`. With Intel's module, pytorch gains the `torch.xpu` module which is equivalent...
I've encountered a strange behavior when using mpi4jax that has, in retrospect, been plaguing me for months. I've managed to put it into a reproducer here: https://gist.github.com/coreyjadams/6640c00c3dc0f202989cc964b8995214 (UPDATE: Here is...
This PR is in response to the discussion on #19409. It does the following: - First, it adds an additional cluster environment for jax.distributed that is based on autodetection of...
`jax.distributed.initialize()` works, without arguments, on several but not all common MPI / Slurm parallel job launchers. Unfortunately, the environment variables used in the `ompi_cluster.py` class are not standardized. I'd like...
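To illustrate why non-standardized environment variables matter here, the following is a minimal sketch of launcher detection, not the code in `ompi_cluster.py`. The helper name `detect_rank_and_size` and the lookup order are assumptions made for illustration; the variable names are the ones commonly exported by Open MPI, MPICH-style PMI launchers, and Slurm.

```python
import os
from typing import Mapping, Optional, Tuple

# (rank variable, world-size variable) pairs, checked in order.
# The ordering is an assumption: a job started with srun may export
# Slurm variables alongside an MPI launcher's, so MPI pairs are tried first.
_LAUNCHER_VARS = (
    ("OMPI_COMM_WORLD_RANK", "OMPI_COMM_WORLD_SIZE"),  # Open MPI (mpirun)
    ("PMI_RANK", "PMI_SIZE"),                          # MPICH / Hydra-style PMI
    ("SLURM_PROCID", "SLURM_NTASKS"),                  # Slurm (srun)
)


def detect_rank_and_size(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Tuple[int, int]]:
    """Hypothetical helper: return (rank, world_size) or None if no launcher is found."""
    env = os.environ if env is None else env
    for rank_var, size_var in _LAUNCHER_VARS:
        if rank_var in env and size_var in env:
            return int(env[rank_var]), int(env[size_var])
    return None
```

A launcher that exports none of these pairs falls through to `None`, which is the case where automatic initialization cannot work without explicit arguments.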
### Describe the bug When I copy/paste the installation instructions for XPU, they fail: ``` ❯ python -m pip install torch==1.13.0a0+git6c9b55e intel_extension_for_pytorch==1.13.120+xpu -f https://developer.intel.com/ipex-whl-stable-xpu Looking in links: https://developer.intel.com/ipex-whl-stable-xpu ERROR: Could...
# PhysicsNeMo Pull Request This PR updates the DoMINO model with performance enhancements and model updates. This is not yet the domain parallelism update - that is queued for next....
## Description This PR is the initial version of domain parallelism for DoMINO. Currently, the forward pass (inference) is supported though there are some numerical instabilities to track down with...
This commit adds a small new utility to physicsnemo's version check tools that checks whether a package is installed. Then, the import of the datapipes leverages this tool to prevent...
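As a rough illustration of this kind of check, here is a minimal sketch using only the standard library. The function name `is_package_installed` is hypothetical and is not taken from physicsnemo; the actual tool may be implemented differently.

```python
import importlib.util


def is_package_installed(name: str) -> bool:
    """Hypothetical helper: True if `name` can be located without importing it."""
    try:
        # find_spec locates a top-level module without executing it.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names with a missing parent package,
        # or for modules whose __spec__ is unset.
        return False
```

A guard such as `if is_package_installed("some_optional_dependency"):` around an import lets a module load even when an optional dependency is absent.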
### Version 0.10.0 ### On which installation method(s) does this occur? _No response_ ### Describe the issue MeshGraphNet is approximately 2x faster in some scenarios using TELayerNorm instead of LayerNorm....