
Build Colab training notebook (Unsloth + LoRA) #98

@William-Hill

Description


Summary

Create a single "Run All" Colab notebook that trains fine-tuned models using Unsloth + LoRA on A100 GPU. Replace the MLX-based training/finetune.py with an Unsloth-backed wrapper.

Depends On

Design Principles

  • Single "Run All" execution — no babysitting, no manual cell-by-cell
  • Parameterized config cell — only cell the user edits (school, model sizes, tokens, epochs)
  • Checkpoint and resume — SKIP_DOMAIN_ADAPTATION=True resumes from Phase 2 after a disconnect
  • Chat template alignment — uses tokenizer.apply_chat_template() throughout (D4BL critical lesson)
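
The chat-template principle can be sketched with a toy stand-in. The real notebook would call tokenizer.apply_chat_template() so that training and inference render prompts with the model's own template; the role markers below are invented purely for illustration:

```python
# Illustrative stand-in for tokenizer.apply_chat_template(). The real method
# renders the model's own Jinja chat template; the <|role|> markers here are
# hypothetical, not any real model's special tokens.
def apply_chat_template(messages, add_generation_prompt=False):
    parts = [f"<|{m['role']}|>\n{m['content']}<|end|>" for m in messages]
    if add_generation_prompt:
        # At inference time, cue the model to produce the assistant turn.
        parts.append("<|assistant|>\n")
    return "\n".join(parts)

msgs = [
    {"role": "user", "content": "Summarize the season."},
    {"role": "assistant", "content": "A winning year."},
]
prompt = apply_chat_template(msgs)
```

The point of the lesson: if training data is formatted with one template and inference uses another, quality collapses silently, so a single template call should be used everywhere.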

Tasks

  • Create notebooks/training/bishop_state_fine_tuning.ipynb
  • Cell 1: Config (SCHOOL, MODEL_SIZES=["4b","9b"], HF_TOKEN, epoch counts, skip flags)
  • Cell 2+: Autonomous pipeline:
    • GPU detection + validation
    • pip install unsloth, trl, peft
    • Clone repo, load config.yaml
    • Phase 1: Domain adaptation (per model size) — LoRA r=16, all modules, 1 epoch, lr 2e-4, bf16
    • Phase 2: Task adapters (narrator, summarizer, explainer per model size) — task-specific LoRA config
    • Phase 3: GGUF export (q4_k_m) + upload to Drive/HF Hub
    • Summary: comparison table 4B vs 9B, recommend winner
  • Replace training/finetune.py (MLX) with Unsloth-backed training/finetune.py
  • Delete training/export.py (MLX Ollama export) — GGUF export moves into notebook
  • Error handling: try/except per phase, save partial state before re-raising
  • Loss curves saved as PNG alongside GGUFs
  • Progress via tqdm + print (no interactive widgets)
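
The Cell 1 config block described above might look like the following; the variable names mirror the task list, and the default values are illustrative, not prescribed:

```python
# Cell 1 sketch: the only cell a user edits. Defaults are illustrative.
SCHOOL = "bishop_state"          # school identifier used to select data/config
MODEL_SIZES = ["4b", "9b"]       # model variants to train and compare
HF_TOKEN = ""                    # paste a Hugging Face token here
DOMAIN_EPOCHS = 1                # Phase 1 (domain adaptation) epochs
TASK_EPOCHS = 5                  # Phase 2 (task adapter) epochs
SKIP_DOMAIN_ADAPTATION = False   # True = resume from Phase 2 after a disconnect
```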

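The per-phase error handling could follow a pattern like this (the state-file name and layout are assumptions, not part of the spec):

```python
import json
import pathlib

# Hypothetical checkpoint file used to record which phases completed.
STATE_FILE = pathlib.Path("training_state.json")

def run_phase(name, fn, state):
    """Run one pipeline phase; on failure, save partial state before re-raising."""
    try:
        fn()
        state[name] = "done"
    except Exception:
        state[name] = "failed"
        STATE_FILE.write_text(json.dumps(state))  # persist progress for resume
        raise
    STATE_FILE.write_text(json.dumps(state))
```

On reconnect, the notebook would read the state file and skip phases already marked "done", which is what the SKIP_DOMAIN_ADAPTATION flag automates for Phase 1.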
Training Hyperparameters (from D4BL)

| Parameter       | Phase 1 (Domain) | Phase 2 (Tasks) |
|-----------------|------------------|-----------------|
| LoRA rank       | 16               | 8-16            |
| LoRA alpha      | 32               | 16-32           |
| Learning rate   | 2e-4             | 1e-4            |
| Effective batch | 32               | 16              |
| Epochs          | 1                | 4-7             |
| Sequence length | 4096             | 4096-8192       |
| Precision       | bf16             | bf16            |
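
Transcribed into code, the table maps to per-phase configs roughly like this. Plain dicts are used for illustration; in the notebook these values would feed Unsloth's PEFT setup and the TRL trainer. The split of effective batch into per-device batch times gradient accumulation is an assumption:

```python
# Per-phase hyperparameters transcribed from the table above.
PHASE1 = dict(lora_r=16, lora_alpha=32, lr=2e-4,
              per_device_batch=4, grad_accum=8,   # effective batch 32
              epochs=1, max_seq_len=4096, bf16=True)

PHASE2 = dict(lora_r=8, lora_alpha=16, lr=1e-4,   # r=8-16, alpha=16-32 per task
              per_device_batch=4, grad_accum=4,   # effective batch 16
              epochs=5, max_seq_len=4096, bf16=True)  # 4-7 epochs, up to 8192 tokens
```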

Acceptance Criteria

  • Notebook runs end-to-end on Colab A100 via "Run All" without manual intervention
  • Produces GGUF files for all 3 tasks × 2 model sizes
  • Prints comparison metrics table at end
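
The final comparison and recommendation step could be as simple as the following; the metric name and the selection rule (lowest eval loss wins) are assumptions:

```python
# Hypothetical end-of-run summary: recommend the size with the lowest eval loss.
def pick_winner(results):
    """results: {model_size: {"eval_loss": float}} -> recommended model size."""
    return min(results, key=lambda size: results[size]["eval_loss"])

results = {"4b": {"eval_loss": 1.42}, "9b": {"eval_loss": 1.31}}
for size, metrics in results.items():
    print(f"{size}: eval_loss={metrics['eval_loss']}")
print("Recommended:", pick_winner(results))
```

In practice the table would carry more columns (tokens/sec, GGUF size, per-task loss), but the recommend-a-winner logic stays a one-line comparison.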
