n8programs

34 comments by n8programs

But my god, float32 is brutal. 1/10th the speed of float16...
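To put a rough number on that on a given machine, a minimal matmul micro-benchmark along these lines should do it (a sketch, assuming `mlx.core` is available; matrix size and iteration count are arbitrary):

```python
import time
import mlx.core as mx

def bench_matmul(dtype, n=4096, iters=20):
    """Time repeated n x n matmuls in the given dtype."""
    a = mx.random.normal((n, n)).astype(dtype)
    b = mx.random.normal((n, n)).astype(dtype)
    mx.eval(a, b)  # materialize inputs before timing
    start = time.perf_counter()
    for _ in range(iters):
        c = a @ b
        mx.eval(c)  # force MLX's lazy evaluation
    return time.perf_counter() - start

print("float16:", bench_matmul(mx.float16))
print("float32:", bench_matmul(mx.float32))
```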

How come mlx fails in 16-bit if most big models are pretrained that way? Is it because it doesn't use bfloat16?
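For context on the two 16-bit formats: float16 has a small dynamic range (max around 65504), while bfloat16 keeps float32's exponent range and only gives up mantissa bits, which is one reason pretraining setups favor it. A minimal illustration (a sketch, assuming `mlx.core`):

```python
import mlx.core as mx

x = mx.array(70000.0, dtype=mx.float32)
print(x.astype(mx.float16))   # inf: past float16's ~65504 ceiling
print(x.astype(mx.bfloat16))  # finite, just coarsely rounded
```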

Got it. Thank you for the info!

Can confirm the effectiveness of float32 end-to-end tuning on tinyllama.
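For anyone wanting to try the same thing, casting a model up to float32 before the tune is a one-liner over the parameter tree. A minimal sketch with a stand-in `nn.Linear`; the real model would be whatever TinyLlama loader you already use:

```python
import mlx.core as mx
import mlx.nn as nn
from mlx.utils import tree_map

# Stand-in module; any mlx.nn.Module (e.g. a loaded TinyLlama) works the same way.
model = nn.Linear(256, 256)

# Checkpoints usually ship in half precision...
model.update(tree_map(lambda p: p.astype(mx.float16), model.parameters()))

# ...cast everything back up to float32 before the end-to-end tune.
model.update(tree_map(lambda p: p.astype(mx.float32), model.parameters()))
print(model.weight.dtype)  # mlx.core.float32
```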

Do you perform your full fine-tune in float32?

Tried training qwen-1.8b. NaN loss immediately. Will try phi-2.

Think it's the float16.

Just checked - NaN w/ phi.
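A cheap way to confirm it's the dtype blowing up (rather than the data) is to trap the first non-finite loss in the training loop. A minimal sketch; `loss_and_grad_fn`, `model`, `batch`, and `step` are placeholders for whatever the loop already uses:

```python
import mlx.core as mx

def assert_finite(loss, step):
    """Stop on the first NaN/inf loss instead of optimizing on garbage."""
    bad = mx.isnan(loss).any() | mx.isinf(loss).any()
    if bad.item():
        raise FloatingPointError(f"non-finite loss at step {step}: {loss.item()}")

# Inside the loop (names are placeholders):
#   loss, grads = loss_and_grad_fn(model, batch)
#   mx.eval(loss)
#   assert_finite(loss, step)
```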

In-python implementation, yoinked from torch and ported w/ Claude - appears to work in training, though:

```python
def _compute_T1(A):
    """I + A"""
    return mx.eye(A.shape[-1]) + A

def _compute_T2(A):
    """I +...
```
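The listing cuts the snippet off inside `_compute_T2`. Going by the same Taylor-polynomial structure torch's `matrix_exp` helpers use, the next function would plausibly look like this (a reconstruction, not the exact ported code):

```python
import mlx.core as mx

def _compute_T2(A):
    """I + A + A @ A / 2"""
    A2 = A @ A
    return mx.eye(A.shape[-1]) + A + A2 / 2
```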