    Repositories list

    • DeepGEMM

      Public
      DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
      C++
      6665.6k281 Updated Aug 3, 2025
    • FlashMLA

      Public
      FlashMLA: Efficient MLA kernels
      C++
      88312k420 Updated Aug 1, 2025
    • DeepEP

      Public
      DeepEP: an efficient expert-parallel communication library
      Cuda
      8838.3k9616 Updated Aug 1, 2025
    • 3FS

      Public
      A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
      C++
      9239.2k9925 Updated Jul 28, 2025
    • 841.2k92 Updated Jul 18, 2025
    • Python
      16k98k3035 Updated Jun 27, 2025
    • 12k91k5725 Updated Jun 27, 2025
    • ESFT

      Public
      Expert Specialized Fine-Tuning
      Python
      25365550 Updated May 22, 2025
    • Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
      2827.9k00 Updated May 15, 2025
    • Integrate the DeepSeek API into popular software
      3.7k33k8554 Updated May 13, 2025
    • [ICLR 2024] Official implementation of DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
      Python
      3573k340 Updated Apr 22, 2025
    • EPLB

      Public
      Expert Parallelism Load Balancer
      Python
      1961.2k71 Updated Mar 24, 2025
    • Analyze computation-communication overlap in V3/R1.
      1441.1k110 Updated Mar 21, 2025
    • DualPipe

      Public
      A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
      Python
      3002.8k40 Updated Mar 10, 2025
    • smallpond

      Public
      A lightweight data processing framework built on DuckDB and 3FS.
      Python
      4184.8k226 Updated Mar 5, 2025
    • DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
      Python
      1.8k5k9715 Updated Feb 26, 2025
    • Janus

      Public
      Janus-Series: Unified Multimodal Understanding and Generation Models
      Python
      2.2k17k15324 Updated Feb 1, 2025
    • DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
      5284.9k783 Updated Sep 25, 2024
    • DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
      9386k532 Updated Sep 24, 2024
    • Python
      23353280 Updated Aug 16, 2024
    • DeepSeek Coder: Let the Code Write Itself
      Python
      2.6k22k11222 Updated May 21, 2024
    • DeepSeek-VL: Towards Real-World Vision-Language Understanding
      Python
      5763.9k412 Updated Apr 24, 2024
    • DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
      Python
      5332.8k332 Updated Apr 15, 2024
    • A curated list of open-source projects related to DeepSeek Coder
      20271600 Updated Apr 3, 2024
    • DeepSeek LLM: Let there be answers
      Makefile
      1k6.5k332 Updated Feb 4, 2024
    • DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
      Python
      2871.8k174 Updated Jan 16, 2024