Chapter 1: "How do Transformers work?"
January 2022: [InstructGPT](https://huggingface.co/papers/2203.02155), a version of GPT-3 that was trained to follow instructions better

The sentence "This list is far from comprehensive, and is just meant to highlight a few of the different kinds of Transformer models. Broadly, they can be grouped into three categories:" in huggingface/course/blob/main/chapters/en/chapter1/4.mdx appears to be in the wrong place (in the middle of the list of influential models, where it should presumably come after the list of models).
...
November 2024: [SmolLM2](https://huggingface.co/papers/2502.02737), a state-of-the-art small language model (135 million to 1.7 billion parameters) that achieves impressive performance despite its compact size, unlocking new possibilities for mobile and edge devices.
GPT-like (also called auto-regressive Transformer models)
BERT-like (also called auto-encoding Transformer models)
T5-like (also called sequence-to-sequence Transformer models)
January 2022: InstructGPT, a version of GPT-3 trained to follow instructions better. This list is far from comprehensive, and is just meant to highlight a few of the different kinds of Transformer models. Broadly, they can be grouped into three categories.