
Zami
-
Founded Date February 24, 1932
-
Sectors Education
-
Posted Jobs 0
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, yielding better performance than the reasoning patterns discovered through reinforcement learning (RL) on small models directly.
The models listed here were created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
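The distillation described above, fine-tuning a smaller model on text generated by a larger one, is commonly implemented as ordinary supervised fine-tuning: the student is trained to assign high probability to the tokens the teacher produced. The sketch below illustrates that objective in plain Python on a toy vocabulary. It is an assumption about the general technique, not DeepSeek's actual training code; the function names `softmax` and `distillation_sft_loss` and the toy numbers are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_sft_loss(student_logits, teacher_token_ids):
    """Mean negative log-likelihood of teacher-generated tokens under the student.

    student_logits: one list of vocabulary scores per position.
    teacher_token_ids: the token the teacher emitted at each position.
    Hypothetical helper for illustration only.
    """
    if not teacher_token_ids:
        raise ValueError("need at least one teacher token")
    if len(student_logits) != len(teacher_token_ids):
        raise ValueError("one logit vector is required per teacher token")
    nll = 0.0
    for logits, token_id in zip(student_logits, teacher_token_ids):
        probs = softmax(logits)
        nll -= math.log(probs[token_id])
    return nll / len(teacher_token_ids)


# Toy example: vocabulary of 3 tokens, teacher emitted tokens 2 then 0.
teacher_tokens = [2, 0]
untrained_student = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # uniform guesses
better_student = [[0.0, 0.0, 3.0], [3.0, 0.0, 0.0]]     # favors teacher tokens

loss_before = distillation_sft_loss(untrained_student, teacher_tokens)
loss_after = distillation_sft_loss(better_student, teacher_tokens)
print(loss_before > loss_after)
```

A student that guesses uniformly incurs a loss of ln(3) per token on this three-token vocabulary. Fine-tuning lowers the loss by shifting probability toward the teacher's tokens, which is how the teacher's reasoning traces are transferred to the smaller model.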
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.