@ictnlp

ICTNLP

Natural Language Processing Group, Institute of Computing Technology, Chinese Academy of Sciences

Pinned

  1. LLaMA-Omni Public

    LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

    Python · 3.1k stars · 216 forks

  2. StreamSpeech Public

    StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

    Python · 1.1k stars · 86 forks

  3. BayLing Public

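    BayLing (百聆) is a LLaMA-based English/Chinese large language model enhanced with language alignment. It has strong English/Chinese capabilities and reaches 90% of ChatGPT's performance across multiple evaluations, including multilingual and general tasks. BayLing is an English/Chinese LLM equipped with advanced language alignment, showing superior capability in English/Ch…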

    Python · 318 stars · 20 forks

  4. LLaVA-Mini Public

    LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

    Python · 523 stars · 28 forks

  5. Auto-RAG Public

    This is the official repository for Auto-RAG.

    Python · 222 stars · 20 forks

  6. FlexRAG Public

    FlexRAG: A RAG Framework for Information Retrieval and Generation.

    Python · 219 stars · 20 forks

Repositories

Showing 10 of 84 repositories
  • PSO-Merging Public

    PSO-Merging is a deep model fusion method that uses particle swarm optimization to automatically find optimal model fusion weights.

    Python · 5 stars · MIT · 1 fork · 0 issues · 0 pull requests · Updated Aug 26, 2025
  • FastLongSpeech Public

    FastLongSpeech is a novel framework designed to extend the capabilities of Large Speech-Language Models for efficient long-speech processing without necessitating dedicated long-speech training data.

    Python · 10 stars · 0 forks · 0 issues · 0 pull requests · Updated Jul 22, 2025
  • Auto-RAG Public

    This is the official repository for Auto-RAG.

    Python · 222 stars · Apache-2.0 · 20 forks · 4 issues · 0 pull requests · Updated Jul 18, 2025
  • StreamUni Public

    StreamUni is a framework that efficiently enables unified Large Speech-Language Models to accomplish streaming speech translation in a cohesive manner.

    Python · 10 stars · 1 fork · 0 issues · 0 pull requests · Updated Jul 14, 2025
  • LLaVA-Mini Public

    LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

    Python · 523 stars · Apache-2.0 · 28 forks · 25 issues · 2 pull requests · Updated Jun 29, 2025
  • StreamSpeech Public

    StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

    Python · 1,146 stars · MIT · 86 forks · 13 issues · 1 pull request · Updated Jun 29, 2025
  • Stream-Omni Public

    Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across various modality combinations.

    Python · 343 stars · GPL-3.0 · 35 forks · 5 issues · 0 pull requests · Updated Jun 17, 2025
  • FlexRAG Public

    FlexRAG: A RAG Framework for Information Retrieval and Generation.

    Python · 219 stars · MIT · 20 forks · 5 issues · 1 pull request · Updated Jun 17, 2025
  • SLED-TTS Public

    A streamable text-to-speech model that uses a language modeling approach, without vector quantization.

    Python · 98 stars · 5 forks · 4 issues · 0 pull requests · Updated May 20, 2025
  • MonoAttn-Transducer Public

    Code for the ICML 2025 paper "Overcoming Non-monotonicity in Transducer-based Streaming Generation".

    Python · 12 stars · 2 forks · 0 issues · 0 pull requests · Updated May 20, 2025
