Welcome to the OpenLLM-France 🇫🇷 GitHub

The aim of the OpenLLM France community is to collaborate on the development of truly open source large language models (LLMs).

This space contains the software tools used to:

  • collect and clean data;
  • pretrain the foundation models;
  • train and align the instruction models.

This space is closely linked to our Hugging Face space, which hosts and documents the datasets and models.

According to the OSI, an open source AI model means that we provide:

  • the training corpus under an open license --> Hugging Face;
  • model weights under an open source non-restrictive license --> Hugging Face;
  • code for data curation and training algorithms under open source licenses --> this GitHub space.

Pinned

  1. Manifesto (Public)

    Preconfiguration page for the OpenLLM-France community

    47 1

  2. Lucie-dataset-filtering (Public)

    Code to compile and preprocess the training data for Lucie

    Python 1

  3. Lucie-Training (Public)

    Code for continual pretraining of LUCIE

    Jupyter Notebook 48 7

  4. wikiplaintext (Public)

    Get plain text from Wikipedia pages

    HTML 5

Repositories

Showing 10 of 21 repositories
