Patrick Toulme
> @apoorvtintin I see this PR has been quite stale for some time. If there is no objection, I'd like to have @Ruixuan, who is working on Trn from our end, port your...
> > Increases memory efficiency > > Do you have measurements on how DATA improves memory efficiency? Thanks.
There is a misconception here. RMSNorm will lower into HLO as decomposed JAX ops, but the XLA GPU compiler will fuse those ops together, and potentially with more...
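To illustrate what "decomposed ops" means here, a minimal sketch of RMSNorm as the elementwise chain a compiler would see (written in NumPy for illustration; the actual lowering emits the equivalent HLO ops, which XLA can fuse into one kernel):

```python
import numpy as np

def rmsnorm(x, weight, eps=1e-6):
    # Decomposed into simple ops: square, mean, add-eps, rsqrt, two multiplies.
    # An XLA-style compiler can fuse this whole chain into a single kernel.
    variance = np.mean(np.square(x), axis=-1, keepdims=True)
    return x * (1.0 / np.sqrt(variance + eps)) * weight

x = np.array([[1.0, 2.0, 3.0, 4.0]])
w = np.ones(4)
out = rmsnorm(x, w)
```

The point is that seeing several small ops in the unoptimized IR does not mean several kernels at runtime; fusion happens later in the pipeline.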
I'm glad to see TPU is using the setup passes I contributed!
jnp.take causes the indices to be assumed to be in bounds, and this assumption is faster on chip. See the IR here: https://github.com/openxla/xla/issues/20899#issuecomment-2570010611 The jnp.take also seems to...
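As a rough NumPy analogy (not the XLA lowering itself), the cost of bounds handling is visible in `np.take`'s `mode` parameter: `mode='raise'` validates every index, while `mode='clip'` skips validation by clamping out-of-bounds indices into range, which is the kind of assumption that lets a compiler emit a cheaper gather:

```python
import numpy as np

a = np.array([10, 20, 30])
idx = np.array([0, 2, 5])  # 5 is out of bounds

# mode='clip' forces indices into range without a check: 5 is clamped to 2
clipped = np.take(a, idx, mode='clip')  # -> [10, 30, 30]

# mode='raise' (the default) checks every index and rejects 5
try:
    np.take(a, idx, mode='raise')
except IndexError:
    print("out-of-bounds index rejected")
```

Which semantics the JAX gather actually gets (clamp, fill, or undefined) depends on the `mode` argument and on how XLA lowers it; the linked issue shows the resulting IR.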
Why has this not been merged? vLLM has no way right now to easily access the underlying model. That is a rather basic feature.
@ezhulenev can this be merged?
@xla-rotation gentle ping
@fhoushmand can you help get this merged?