Shanbin Ke
Just in case anyone wants to run the code with Python 3.7+ and torch 0.4+ ...
@superbobry How do we make sure `test_sdpa_inference` has 4 GPUs? Also, could you share some info on the internal checks failure?
@superbobry Hi, I think all the issues are resolved now. Could you take another look and trigger the internal review?
> Sorry for the delay @Cjkkkk. `DotProductAttentionTest.test_sdpa_inference` seems to fail internally with
>
> ```
> Traceback (most recent call last):
>   File "[...]/jax/_src/test_util.py", line 456, in test_method_wrapper
>     return...
> ```
> Yeah, it seems likely. Can you, perhaps, skip your test if
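The reviewer's suggestion above (skip the test when the required number of devices isn't available) could be sketched roughly as follows. This is a minimal, self-contained illustration: `device_count()` is a hypothetical stand-in for `jax.device_count()`, and the actual JAX test suite uses its own skip helpers in `jax._src.test_util`, not plain `unittest`.

```python
import unittest


def device_count():
    # Hypothetical stand-in for jax.device_count(); here we pretend
    # the host only has a single device, so the test below is skipped.
    return 1


class DotProductAttentionTest(unittest.TestCase):
    def test_sdpa_inference(self):
        # Skip rather than fail on hosts with fewer than 4 GPUs.
        if device_count() < 4:
            self.skipTest("test_sdpa_inference requires at least 4 devices")
        # ... the actual SDPA inference checks would run here ...
```

Running this on a host reporting fewer than 4 devices records the test as skipped instead of failed, which is the behavior the internal CI needs.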
@MoFHeka, it is not correct to say it is implemented in TensorFlow; it is implemented in XLA, and there is a PR https://github.com/openxla/xla/pull/6872 pending to integrate the final piece...
Adding `set(CMAKE_CXX_COMPILER "clang-9")` to skeleton/CMakeList.txt solved my problem. If this line is removed, the default compiler is g++ in my case, which causes this problem. Hope it helps.
> > Compilation: TSL:XlaCompile:#module=pjit__wrapped_step_fn,program_id=24#: 3.754429084 (parallel + inline)
>
> What are the units, seconds?
>
> Mentioning both runtime and compile time in the bug description is a bit...
Updated the compilation results for some more models with these changes: https://docs.google.com/spreadsheets/d/1uIRf66UT9hOBOge3nvRZebDintgM0zmozNts0tOiXQA/edit?usp=sharing. It seems preserveLocals=False is not doing any better than parallel + inline, so I will just remove that.
@cheshire Hi, any updates on this?