Johannes E. M. Mosig

Results: 5 issues by Johannes E. M. Mosig

This package works perfectly on Linux, but I cannot install it on macOS. Both installing with `pip install sparse-som` and installing from source with `cd src && make all`...

**Have you read the [Longevity Statement](https://standardnotes.org/longevity)?** Yes

**Is your feature request related to a problem? Please describe.** Dates in a spreadsheet cannot be formatted in the international [ISO 8601 standard](https://en.wikipedia.org/wiki/ISO_8601). I...
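For reference, this is what the requested ISO 8601 formatting looks like (a minimal Python illustration; the snippet is mine and not part of the issue):

```py
from datetime import datetime

# ISO 8601: YYYY-MM-DD for dates, YYYY-MM-DDTHH:MM:SS for timestamps
now = datetime(2024, 1, 31, 13, 5)
print(now.date().isoformat())  # 2024-01-31
print(now.isoformat())         # 2024-01-31T13:05:00
```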

It'd be great if penzai supported model quantization out of the box. I know this is a lot of work to implement, but right now the lack of quantization...

feature-request

When I run

```py
hf_model = transformers.LlamaForCausalLM.from_pretrained("Unbabel/TowerInstruct-7B-v0.2")
pz_model = penzai.models.transformer.variants.llama.llama_from_huggingface_model(hf_model)
```

the second line fails with

```sh
ValueError: Conversion of a LlamaForCausalLM does not support these configuration attributes: {'use_cache': False,...
```
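One possible workaround (untested, and assuming `use_cache` is the only unsupported attribute in your case; the full set in the error message is truncated above) is to reset the attribute to its default before converting:

```py
import transformers
from penzai.models.transformer.variants import llama

hf_model = transformers.LlamaForCausalLM.from_pretrained("Unbabel/TowerInstruct-7B-v0.2")

# Hypothetical fix: restore the default so the converter no longer sees a
# non-default configuration attribute it does not recognize.
hf_model.config.use_cache = True

pz_model = llama.llama_from_huggingface_model(hf_model)
```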

When I use the config

```py
config = LlamalikeTransformerConfig(
    num_kv_heads=32,
    query_head_multiplier=1,
    embedding_dim=4096,
    projection_dim=128,
    mlp_hidden_dim=11008,
    num_decoder_blocks=32,
    vocab_size=32007,
    mlp_variant='swiglu',
    tie_embedder_and_logits=False,
    rope_wavelength=10000.0,
    rms_norm_eps=1e-05,
    attention_type=AttentionTypeGlobalCausal(),
    use_post_attn_norm=False,
    use_post_ffw_norm=False,
    final_logit_softcap=None,
    attn_logits_soft_cap=None,
    query_scaling_factor='default',
    parameter_dtype=jax.numpy.bfloat16,
    activation_dtype=jax.numpy.float32,
    use_layer_stack=True,
)
```

...
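For context, a sketch of how such a config is typically turned into a model, assuming the names used bare in the excerpt (`LlamalikeTransformerConfig`, `AttentionTypeGlobalCausal`) come from penzai's `llamalike_common` module and that it exposes a `build_llamalike_transformer` builder; the import path and arguments here are my assumption, not taken from the issue:

```py
import jax
from penzai.models.transformer.variants import llamalike_common

# Assumed builder call: construct an uninitialized-architecture-plus-parameters
# model from the config above. The RNG seed is illustrative.
model = llamalike_common.build_llamalike_transformer(
    config,
    init_base_rng=jax.random.PRNGKey(0),
)
```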

bug