Remo Dentato
I believe, for example, that `int64_t` is clearer than `long`. However, consider that I come from an age when `int` was usually 16 bits, and I still have to switch...
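To make it concrete (just a toy program, not code from the repo): `long` is 4 bytes on LLP64 Windows but 8 bytes on LP64 Linux, while `int64_t` is 8 bytes everywhere, so the intent is explicit.
```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* long varies by platform data model (LLP64 vs LP64);
       int64_t is always exactly 64 bits. */
    printf("sizeof(long)    = %zu\n", sizeof(long));
    printf("sizeof(int64_t) = %zu\n", sizeof(int64_t));
    return 0;
}
```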
FWIW, I believe it is key to have a CUDA implementation in the repo (and later an OpenCL one, and so on). This will allow focusing the efforts on...
I know I'm annoying, but this is exactly why I believe it's beneficial to have this version in the repo.
To keep them aligned, I would push the differences into specific functions like `load_weights`, etc. If you are not opposed to the idea of creating an llm "object" (like I...
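A minimal sketch of what I have in mind (all names here, like `llm` and the `cpu_*` functions, are placeholders, not code from the repo): each backend plugs its own functions into the same slots, so the rest of the code stays identical across CPU and CUDA versions.
```c
#include <stdio.h>

/* Hypothetical llm "object": the backends differ only in the
   functions they wire up, not in the surrounding driver code. */
typedef struct llm llm;
struct llm {
    void *weights;                                   /* backend-specific */
    int  (*load_weights)(llm *m, const char *path);
    void (*forward)(llm *m, int token, int pos, float *logits);
};

/* Toy CPU backend, just to show the wiring. */
static int cpu_load_weights(llm *m, const char *path) {
    printf("cpu: loading %s\n", path);
    m->weights = NULL;  /* real code would mmap the checkpoint here */
    return 0;
}
static void cpu_forward(llm *m, int token, int pos, float *logits) {
    (void)m; (void)token; (void)pos; (void)logits;  /* real matmuls here */
}

llm make_cpu_llm(void) {
    llm m = { NULL, cpu_load_weights, cpu_forward };
    return m;
}

int main(void) {
    llm m = make_cpu_llm();
    m.load_weights(&m, "model.bin");
    return 0;
}
```
A `make_cuda_llm()` would then only have to supply its own `load_weights`/`forward` pair.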
@karpathy, I see your point; that's why I submitted those minimal PRs in the hope that they can help you move faster toward your desired state. However, not having this...
I got the same issue, but I only have 16GB of RAM at the moment. I told myself I would try with a bigger machine but never did. How...
Ok. I see you went for a much deeper change. Did you manage to test it?
The point is that they can be loaded directly into the GPU. Not needing on-the-fly conversion (and having a smaller file to load) significantly reduces the load time (which, for...
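Roughly what I mean (a hedged sketch, assuming the checkpoint is a flat blob of `__half` values; the function name and layout are made up, not from the repo): the bytes on disk go to the device bit-for-bit, with no per-weight conversion pass.
```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

/* Load a flat fp16 weight blob straight onto the GPU:
   read the raw bytes and copy them as-is. */
__half *load_fp16_weights(const char *path, size_t n_params) {
    FILE *f = fopen(path, "rb");
    if (!f) { perror("fopen"); return NULL; }

    size_t bytes = n_params * sizeof(__half);  /* half the fp32 size */
    void *host = malloc(bytes);
    if (fread(host, 1, bytes, f) != bytes) {
        fprintf(stderr, "short read from %s\n", path);
        free(host); fclose(f); return NULL;
    }
    fclose(f);

    __half *dev = NULL;
    cudaMalloc((void **)&dev, bytes);
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);  /* bit-for-bit */
    free(host);
    return dev;
}
```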
It will be the "legacy version" but with fp16 weights. Because of how `export.py` works, you need to give it a version number:
```
usage: export.py [-h] [--version VERSION]...
```
Just thought of another way, but I'm not sure I like it: use the extension of the output file to determine the fp32/fp16 size. For example:
```
python export.py --hf...
```