Overview

  • Founded Date February 24, 1932
  • Sectors Education

Company Description

DeepSeek’s First-generation Reasoning Models

DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Models

DeepSeek-R1

Distilled models

The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.

Below are the models created by fine-tuning several dense models widely used in the research community on data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.

DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-7B

DeepSeek-R1-Distill-Llama-8B

DeepSeek-R1-Distill-Qwen-14B

DeepSeek-R1-Distill-Qwen-32B

DeepSeek-R1-Distill-Llama-70B
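
The distillation described above amounts to supervised fine-tuning of a small model on text produced by a larger one. The following is a minimal sketch of the data-collection step only, not DeepSeek's actual pipeline: `SFTExample`, `build_distillation_set`, and `fake_teacher` are hypothetical names introduced for illustration, and the teacher is a stub standing in for a large reasoning model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class SFTExample:
    """One supervised fine-tuning pair: a prompt and the teacher's output."""
    prompt: str
    completion: str


def build_distillation_set(
    prompts: Iterable[str],
    teacher: Callable[[str], str],
) -> List[SFTExample]:
    """Collect teacher generations to use as fine-tuning targets."""
    examples = []
    for prompt in prompts:
        completion = teacher(prompt)
        # Skip empty or whitespace-only generations.
        if completion.strip():
            examples.append(SFTExample(prompt, completion))
    return examples


def fake_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a large reasoning model (assumption).
    return f"Reasoning about: {prompt}\nAnswer: (placeholder)"


if __name__ == "__main__":
    data = build_distillation_set(
        ["What is 2 + 2?", "Sort [3, 1, 2]."], fake_teacher
    )
    for example in data:
        print(example)
```

In a real setup the resulting pairs would be passed to a fine-tuning framework to train the smaller dense model; that step is omitted here because the source does not describe it.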

License

The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.