
Build Colab training notebook (Unsloth + LoRA) #98

@William-Hill

Description


Summary

Create a single "Run All" Colab notebook that trains fine-tuned models using Unsloth + LoRA on A100 GPU. Replace the MLX-based training/finetune.py with an Unsloth-backed wrapper.

Depends On

Design Principles

  • Single "Run All" execution — no babysitting, no manual cell-by-cell
  • Parameterized config cell — only cell the user edits (school, model sizes, tokens, epochs)
  • Checkpoint and resume — SKIP_DOMAIN_ADAPTATION=True resumes from Phase 2 after a disconnect
  • Chat template alignment — uses tokenizer.apply_chat_template() throughout (D4BL critical lesson)
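
The chat-template principle can be sketched with a toy stand-in. The real notebook would call tokenizer.apply_chat_template() so that training and inference render prompts with the model's own template; the role markers below are invented purely for illustration:

```python
# Illustrative stand-in for tokenizer.apply_chat_template(). The real method
# renders the model's own Jinja chat template; the <|role|> markers here are
# hypothetical, not any real model's special tokens.
def apply_chat_template(messages, add_generation_prompt=False):
    parts = [f"<|{m['role']}|>\n{m['content']}<|end|>" for m in messages]
    if add_generation_prompt:
        # At inference time, cue the model to produce the assistant turn.
        parts.append("<|assistant|>\n")
    return "\n".join(parts)

msgs = [
    {"role": "user", "content": "Summarize the season."},
    {"role": "assistant", "content": "A winning year."},
]
prompt = apply_chat_template(msgs)
```

The point of the lesson: if training data is formatted with one template and inference uses another, quality collapses silently, so a single template call should be used everywhere.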

Tasks

  • Create notebooks/training/bishop_state_fine_tuning.ipynb
  • Cell 1: Config (SCHOOL, MODEL_SIZES=["4b","9b"], HF_TOKEN, epoch counts, skip flags)
  • Cell 2+: Autonomous pipeline:
    • GPU detection + validation
    • pip install unsloth, trl, peft
    • Clone repo, load config.yaml
    • Phase 1: Domain adaptation (per model size) — LoRA r=16, all modules, 1 epoch, lr 2e-4, bf16
    • Phase 2: Task adapters (narrator, summarizer, explainer per model size) — task-specific LoRA config
    • Phase 3: GGUF export (q4_k_m) + upload to Drive/HF Hub
    • Summary: comparison table 4B vs 9B, recommend winner
  • Replace training/finetune.py (MLX) with Unsloth-backed training/finetune.py
  • Delete training/export.py (MLX Ollama export) — GGUF export moves into notebook
  • Error handling: try/except per phase, save partial state before re-raising
  • Loss curves saved as PNG alongside GGUFs
  • Progress via tqdm + print (no interactive widgets)
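
The Cell 1 config block described above might look like the following; the variable names mirror the task list, and the default values are illustrative, not prescribed:

```python
# Cell 1 sketch: the only cell a user edits. Defaults are illustrative.
SCHOOL = "bishop_state"          # school identifier used to select data/config
MODEL_SIZES = ["4b", "9b"]       # model variants to train and compare
HF_TOKEN = ""                    # paste a Hugging Face token here
DOMAIN_EPOCHS = 1                # Phase 1 (domain adaptation) epochs
TASK_EPOCHS = 5                  # Phase 2 (task adapter) epochs
SKIP_DOMAIN_ADAPTATION = False   # True = resume from Phase 2 after a disconnect
```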

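The per-phase error handling could follow a pattern like this (the state-file name and layout are assumptions, not part of the spec):

```python
import json
import pathlib

# Hypothetical checkpoint file used to record which phases completed.
STATE_FILE = pathlib.Path("training_state.json")

def run_phase(name, fn, state):
    """Run one pipeline phase; on failure, save partial state before re-raising."""
    try:
        fn()
        state[name] = "done"
    except Exception:
        state[name] = "failed"
        STATE_FILE.write_text(json.dumps(state))  # persist progress for resume
        raise
    STATE_FILE.write_text(json.dumps(state))
```

On reconnect, the notebook would read the state file and skip phases already marked "done", which is what the SKIP_DOMAIN_ADAPTATION flag automates for Phase 1.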
Training Hyperparameters (from D4BL)

| Parameter       | Phase 1 (Domain) | Phase 2 (Tasks) |
|-----------------|------------------|-----------------|
| LoRA rank       | 16               | 8-16            |
| LoRA alpha      | 32               | 16-32           |
| Learning rate   | 2e-4             | 1e-4            |
| Effective batch | 32               | 16              |
| Epochs          | 1                | 4-7             |
| Sequence length | 4096             | 4096-8192       |
| Precision       | bf16             | bf16            |
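
Transcribed into code, the table maps to per-phase configs roughly like this. Plain dicts are used for illustration; in the notebook these values would feed Unsloth's PEFT setup and the TRL trainer. The split of effective batch into per-device batch times gradient accumulation is an assumption:

```python
# Per-phase hyperparameters transcribed from the table above.
PHASE1 = dict(lora_r=16, lora_alpha=32, lr=2e-4,
              per_device_batch=4, grad_accum=8,   # effective batch 32
              epochs=1, max_seq_len=4096, bf16=True)

PHASE2 = dict(lora_r=8, lora_alpha=16, lr=1e-4,   # r=8-16, alpha=16-32 per task
              per_device_batch=4, grad_accum=4,   # effective batch 16
              epochs=5, max_seq_len=4096, bf16=True)  # 4-7 epochs, up to 8192 tokens
```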

Acceptance Criteria

  • Notebook runs end-to-end on Colab A100 via "Run All" without manual intervention
  • Produces GGUF files for all 3 tasks × 2 model sizes
  • Prints comparison metrics table at end
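
The final comparison and recommendation step could be as simple as the following; the metric name and the selection rule (lowest eval loss wins) are assumptions:

```python
# Hypothetical end-of-run summary: recommend the size with the lowest eval loss.
def pick_winner(results):
    """results: {model_size: {"eval_loss": float}} -> recommended model size."""
    return min(results, key=lambda size: results[size]["eval_loss"])

results = {"4b": {"eval_loss": 1.42}, "9b": {"eval_loss": 1.31}}
for size, metrics in results.items():
    print(f"{size}: eval_loss={metrics['eval_loss']}")
print("Recommended:", pick_winner(results))
```

In practice the table would carry more columns (tokens/sec, GGUF size, per-task loss), but the recommend-a-winner logic stays a one-line comparison.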
