Welcome to the 🐶 InstructLab Project

Quick Links: Documentation | FAQ | Hugging Face | Calendar | Slack | YouTube | X | Reddit

InstructLab is an approachable open source AI community project. Our community's mission is to enable anyone to shape the future of generative AI via the collaborative improvement of open source-licensed Granite large language models (LLMs) using InstructLab's fine-tuning technology.

Who should use InstructLab?

Have you ever asked an LLM a question about a subject you know well, and it gave an incorrect answer? “If only I could submit a patch to fix it,” you may have thought. With InstructLab, you can!

InstructLab allows anyone to improve an existing LLM by fine-tuning it with additional data sources. This allows LLMs to continuously gain new knowledge and fill gaps in their initial training, including knowledge of current events that happened after their pre-training phase.

Subject matter experts from any domain—for example, archaeology, astronomy, and human anatomy—can use InstructLab to teach the Granite LLM family new information.

Project users can experiment with updates to a quantized version of the Granite model and then rebuild that version locally to check whether the update improved the quality of the model’s responses. InstructLab’s tools allow contributors both to preview a newly tuned model and to check that a submission to improve the community model is properly composed.

You can learn more about InstructLab and how it works in the project documentation.

Where can I find more information about the LLMs created by InstructLab?

Check out InstructLab on Hugging Face.

Get started

Additional information

InstructLab uses Large-Scale Alignment for ChatBots [1] (LAB), a new alignment tuning method for LLMs that leverages synthetic data. To learn more about InstructLab’s origins, visit the About Taxonomy page.

[1] Shivchander Sudalairaj*, Abhishek Bhandwaldar*, Aldo Pareja*, Kai Xu, David D. Cox, Akash Srivastava*. "LAB: Large-Scale Alignment for ChatBots", arXiv preprint arXiv:2403.01081, 2024. (* denotes equal contributions)

Pinned

  1. instructlab Public

    InstructLab Command-Line Interface. Use this to chat with a model and execute the InstructLab workflow to train a model using custom taxonomy data.

    Python 1.1k 362

  2. taxonomy Public

    Taxonomy tree that will allow you to create models tuned with your data

    Python 216 948

  3. community Public

    InstructLab community-wide collaboration space, including contributing, security, code of conduct, etc.

    Python 75 45

  4. dev-docs Public

    Developer documents for the InstructLab organization

    Makefile 5 34

Repositories

Showing 10 of 18 repositories
  • website Public
    TypeScript 2 CC-BY-4.0 24 9 8 Updated Dec 26, 2024
  • community Public

    InstructLab community-wide collaboration space, including contributing, security, code of conduct, etc.

    Python 75 Apache-2.0 45 16 8 Updated Dec 25, 2024
  • ui Public

    Place to hack on UI for InstructLab

    TypeScript 21 Apache-2.0 41 51 (2 issues need help) 17 Updated Dec 23, 2024
  • training Public

    InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data

    Python 23 Apache-2.0 48 60 (3 issues need help) 12 Updated Dec 20, 2024
  • instructlab Public

    InstructLab Command-Line Interface. Use this to chat with a model and execute the InstructLab workflow to train a model using custom taxonomy data.

    Python 1,069 Apache-2.0 362 283 (18 issues need help) 79 Updated Dec 20, 2024
  • taxonomy Public

    Taxonomy tree that will allow you to create models tuned with your data

    Python 216 Apache-2.0 947 4 27 Updated Dec 19, 2024
  • .github Public

    InstructLab GitHub organization community files.

    Makefile 2 Apache-2.0 11 0 6 Updated Dec 19, 2024
  • sdg Public

    Python library for Synthetic Data Generation

    Python 27 Apache-2.0 40 60 (1 issue needs help) 18 Updated Dec 19, 2024
  • docs.instructlab.ai Public

    The docs.instructlab.ai public documentation repository. PRs accepted and encouraged!

    Shell 11 16 7 5 Updated Dec 18, 2024
  • dev-docs Public

    Developer documents for the InstructLab organization

    Makefile 5 Apache-2.0 34 11 22 Updated Dec 18, 2024