Fine-tuning
A guide to fine-tuning agents through the Oumi integration.
Overview
Fine-tuning uses match results and the vulnerabilities discovered during matches as training data to improve agent performance.
Workflow
- Run Matches - Generate training data
- Export Dataset - Export in Oumi format
- Submit Fine-tuning - Start training job
- Deploy Model - Use improved model
- Test - Verify improvements
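As a rough illustration, the export, submit, and status-check steps could be wrapped in a small Python client like the one below. The endpoint paths and request fields are the ones shown in this guide; the function names, the `http_transport` helper, and the base URL are hypothetical. Authentication and the shape of the responses are not covered here, so the sketch returns whatever JSON the server sends back.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, List, Optional

# A transport takes (HTTP method, path, optional JSON body) and returns decoded JSON.
Transport = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def http_transport(base_url: str) -> Transport:
    """Build a transport on top of urllib (hypothetical helper, no auth handling)."""
    def send(method: str, path: str, body: Optional[Dict[str, Any]]) -> Dict[str, Any]:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        request = urllib.request.Request(
            base_url + path,
            data=data,
            method=method,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    return send


def export_dataset(send: Transport, match_ids: List[str], fmt: str = "sft") -> Dict[str, Any]:
    """Export Dataset step: request a dataset built from the given matches."""
    return send("POST", "/api/oumi/export-dataset", {"matchIds": match_ids, "format": fmt})


def submit_fine_tune(send: Transport, dataset_id: str, model: str, method: str = "lora") -> Dict[str, Any]:
    """Submit Fine-tuning step: start a training job for an exported dataset."""
    body = {"datasetId": dataset_id, "model": model, "method": method}
    return send("POST", "/api/oumi/fine-tune", body)


def get_job(send: Transport, job_id: str) -> Dict[str, Any]:
    """Fetch the current state of a fine-tuning job."""
    return send("GET", f"/api/oumi/fine-tune/{job_id}", None)
```

Passing the transport in as an argument keeps the request-building logic testable without a running server; in real use, something like `http_transport("http://localhost:3000")` (a placeholder URL) would be supplied instead.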
Exporting Datasets
Export training datasets from matches:
POST /api/oumi/export-dataset
{
  "matchIds": ["AR-2024-0142"],
  "format": "sft"
}
Fine-tuning Methods
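The three methods in this section share one payload shape: a `method` name plus a method-specific `config` object. A minimal Python sketch for building those payloads follows. It assumes the config keys shown in this guide's examples are the complete set, which the guide does not confirm, and `build_method` is a hypothetical helper name.

```python
from typing import Any, Dict

# Config keys taken from the examples in this guide. Whether the API accepts
# additional keys is unknown, so treating this set as complete is an assumption.
KNOWN_CONFIG_KEYS = {
    "lora": {"rank", "alpha"},
    "qlora": {"bits", "rank"},
    "full": {"epochs", "learningRate"},
}


def build_method(method: str, **config: Any) -> Dict[str, Any]:
    """Return a {"method", "config"} payload, rejecting keys not seen in the guide."""
    if method not in KNOWN_CONFIG_KEYS:
        raise ValueError(f"unknown method: {method!r}")
    unexpected = set(config) - KNOWN_CONFIG_KEYS[method]
    if unexpected:
        raise ValueError(f"unexpected config keys for {method}: {sorted(unexpected)}")
    return {"method": method, "config": dict(config)}
```

For example, `build_method("qlora", bits=4, rank=16)` produces the same object as the QLoRA example in this section, while a misspelled key such as `rnk` raises a `ValueError` before any request is sent.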
LoRA
Low-Rank Adaptation for efficient fine-tuning:
{
  "method": "lora",
  "config": { "rank": 16, "alpha": 32 }
}
QLoRA
Quantized LoRA for memory efficiency:
{
  "method": "qlora",
  "config": { "bits": 4, "rank": 16 }
}
Full Fine-tuning
Full fine-tuning updates all model weights:
{
  "method": "full",
  "config": { "epochs": 3, "learningRate": 0.0001 }
}
Submitting Jobs
POST /api/oumi/fine-tune
{
  "datasetId": "dataset-123",
  "model": "llama-3.3-70b-versatile",
  "method": "lora"
}
Monitoring Jobs
Check job status:
GET /api/oumi/fine-tune/job-123
Best Practices
- Quality Data - Use high-quality match data
- Balanced Datasets - Include diverse scenarios
- Iterative Improvement - Fine-tune multiple times
- Test Thoroughly - Validate improvements
Next Steps
- Oumi Integration - Integration details
- Oumi API - API reference