# Groq Models Reference
Complete list of available Groq models with their IDs, capabilities, and recommended use cases.
## Available Models

### Llama Models

| Model ID | Name | Tier | Context Window (tokens) | Best For |
|---|---|---|---|---|
| `llama-3.3-70b-versatile` | Llama 3.3 70B Versatile | powerful | 131,072 | Complex reasoning, creative attacks |
| `llama-3.1-70b-versatile` | Llama 3.1 70B Versatile | powerful | 131,072 | High-quality responses, analysis |
| `llama-3.1-8b-instant` | Llama 3.1 8B Instant | fast | 131,072 | Quick responses, high throughput |
| `llama-3.1-405b-reasoning` | Llama 3.1 405B Reasoning | powerful | 131,072 | Ultra-powerful reasoning (if available) |
### Mixtral Models

| Model ID | Name | Tier | Context Window (tokens) | Best For |
|---|---|---|---|---|
| `mixtral-8x7b-32768` | Mixtral 8x7B | balanced | 32,768 | Balanced performance, large context |
| `mixtral-8x22b-instruct` | Mixtral 8x22B Instruct | powerful | 65,536 | Complex tasks, high quality |
### Gemma Models

| Model ID | Name | Tier | Context Window (tokens) | Best For |
|---|---|---|---|---|
| `gemma-7b-it` | Gemma 7B Instruct | fast | 8,192 | Fast responses, efficient |
| `gemma2-9b-it` | Gemma 2 9B Instruct | balanced | 8,192 | Balanced performance |
### Qwen Models

| Model ID | Name | Tier | Context Window (tokens) | Best For |
|---|---|---|---|---|
| `qwen-2.5-72b-instruct` | Qwen 2.5 72B Instruct | powerful | 32,768 | Large model, high quality |
| `qwen-2.5-32b-instruct` | Qwen 2.5 32B Instruct | balanced | 32,768 | Medium model, balanced |
| `qwen-2.5-14b-instruct` | Qwen 2.5 14B Instruct | balanced | 32,768 | Small-medium model |
| `qwen-2.5-7b-instruct` | Qwen 2.5 7B Instruct | fast | 32,768 | Compact, fast model |
### DeepSeek Models

| Model ID | Name | Tier | Context Window (tokens) | Best For |
|---|---|---|---|---|
| `deepseek-r1-distill-llama-8b` | DeepSeek R1 Distill Llama 8B | balanced | 131,072 | Reasoning tasks |
| `deepseek-chat` | DeepSeek Chat | balanced | 32,768 | General chat |
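For programmatic use, the tables above map naturally onto a small registry. The sketch below is illustrative only; the `ModelInfo` shape, `MODELS` constant, and `byTier` helper are assumptions, not part of any Groq SDK:

```typescript
// Illustrative registry mirroring the tables above; not a Groq SDK type.
type Tier = "fast" | "balanced" | "powerful";

interface ModelInfo {
  id: string;
  name: string;
  tier: Tier;
  contextWindow: number; // tokens
}

const MODELS: ModelInfo[] = [
  { id: "llama-3.3-70b-versatile", name: "Llama 3.3 70B Versatile", tier: "powerful", contextWindow: 131072 },
  { id: "llama-3.1-8b-instant", name: "Llama 3.1 8B Instant", tier: "fast", contextWindow: 131072 },
  { id: "mixtral-8x7b-32768", name: "Mixtral 8x7B", tier: "balanced", contextWindow: 32768 },
  // ...remaining rows from the tables above
];

// Collect every model in a given tier, e.g. to build a fallback pool.
const byTier = (tier: Tier) => MODELS.filter((m) => m.tier === tier);
```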
## Recommended Model Selection

### For Red Agents (Attackers)

- **Primary:** `llama-3.3-70b-versatile` (best creativity)
- **Fallbacks:** `llama-3.1-70b-versatile`, `mixtral-8x7b-32768`, `qwen-2.5-72b-instruct`
### For Blue Agents (Defenders)

- **Primary:** `mixtral-8x7b-32768` (excellent analysis)
- **Fallbacks:** `llama-3.1-70b-versatile`, `qwen-2.5-32b-instruct`, `gemma2-9b-it`
### For Target Agents

- **Primary:** `llama-3.1-8b-instant` (fast responses)
- **Fallbacks:** `gemma-7b-it`, `qwen-2.5-7b-instruct`, `deepseek-chat`
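These recommendations can be captured as a single lookup keyed by agent type. A minimal sketch; the `TEAM_DEFAULTS` name and shape are assumptions, not an existing API:

```typescript
// Hypothetical mapping of agent type to primary + fallback model IDs,
// taken directly from the recommendations above.
const TEAM_DEFAULTS: Record<"red" | "blue" | "target", { primary: string; fallbacks: string[] }> = {
  red: {
    primary: "llama-3.3-70b-versatile",
    fallbacks: ["llama-3.1-70b-versatile", "mixtral-8x7b-32768", "qwen-2.5-72b-instruct"],
  },
  blue: {
    primary: "mixtral-8x7b-32768",
    fallbacks: ["llama-3.1-70b-versatile", "qwen-2.5-32b-instruct", "gemma2-9b-it"],
  },
  target: {
    primary: "llama-3.1-8b-instant",
    fallbacks: ["gemma-7b-it", "qwen-2.5-7b-instruct", "deepseek-chat"],
  },
};
```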
## Automatic Model Fallback
The system automatically tries fallback models when the primary model is rate limited:
1. The primary model is tried first.
2. If it is rate limited, same-tier models are tried.
3. If those are also rate limited, lower-tier models are tried.
4. The system falls back to the mock provider only if ALL models fail.
### Example Fallback Chain

For `llama-3.3-70b-versatile` (powerful tier):

1. `llama-3.3-70b-versatile` (primary)
2. `llama-3.1-70b-versatile` (same tier)
3. `llama-3.1-405b-reasoning` (same tier)
4. `mixtral-8x22b-instruct` (same tier)
5. `qwen-2.5-72b-instruct` (same tier)
6. `mixtral-8x7b-32768` (balanced tier)
7. `qwen-2.5-32b-instruct` (balanced tier)
8. ...and more
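The chain above can be derived mechanically: the primary first, then the rest of its tier, then every lower tier. A sketch of that ordering plus a retry loop, reusing the illustrative `MODELS` registry and `byTier` helper from earlier; `callModel` is an assumed helper hitting Groq's OpenAI-compatible endpoint:

```typescript
// Assumed helper: call one model via Groq's OpenAI-compatible endpoint.
// Throws on any non-OK response, including HTTP 429 (rate limited).
async function callModel(modelId: string, prompt: string): Promise<string> {
  const res = await fetch("https://api.groq.com/openai/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.GROQ_API_KEY}`,
    },
    body: JSON.stringify({
      model: modelId,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) throw new Error(`Groq API error ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}

// Build a fallback chain: the primary, then the rest of its tier,
// then every lower tier, mirroring the example chain above.
function buildFallbackChain(primaryId: string): string[] {
  const order: Tier[] = ["powerful", "balanced", "fast"];
  const primary = MODELS.find((m) => m.id === primaryId);
  if (!primary) return MODELS.map((m) => m.id);
  const chain = [primary.id];
  for (const tier of order.slice(order.indexOf(primary.tier))) {
    for (const m of byTier(tier)) {
      if (m.id !== primary.id) chain.push(m.id);
    }
  }
  return chain;
}

// Try each model in order; surface an error only if every model fails.
async function completeWithFallback(primaryId: string, prompt: string): Promise<string> {
  let lastError: unknown;
  for (const id of buildFallbackChain(primaryId)) {
    try {
      return await callModel(id, prompt);
    } catch (err) {
      lastError = err; // rate limited or unavailable: try the next model
    }
  }
  throw lastError; // caller may fall back to the mock provider here
}
```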
## Model Tiers

- **Fast:** Quick responses, lower resource usage, good for high throughput
- **Balanced:** Good balance of speed and quality
- **Powerful:** Best quality, higher resource usage, best for complex tasks
## Using Models in Code

```jsonc
// In agent creation
{
  "name": "My Agent",
  "type": "red",
  "model": "llama-3.3-70b-versatile", // will auto-fallback if rate limited
  "systemPrompt": "..."
}
```

The system will automatically try fallback models if the primary is rate limited!
## Why Mock Data?
Mock data is used when:
- All Groq models are rate limited (daily token limit exceeded)
- No API keys are configured
- You are testing without incurring API costs
The system will automatically use real models once:
- Rate limits reset (usually daily)
- Different models become available
- API keys are configured
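One way to implement that switch is a provider factory that prefers real models and degrades to a mock. A hedged sketch; the `Provider` interface and class names are illustrative, not real SDK types:

```typescript
// Hypothetical provider interface; names are illustrative only.
interface Provider {
  complete(prompt: string): Promise<string>;
}

// Mock provider: canned responses, no tokens consumed.
class MockProvider implements Provider {
  async complete(prompt: string): Promise<string> {
    return `[mock response to: ${prompt.slice(0, 40)}]`;
  }
}

// Prefer the real provider; degrade to mock only when no key is
// configured or every real model has been exhausted by rate limits.
function selectProvider(real: Provider | null, allRateLimited: boolean): Provider {
  return !real || allRateLimited ? new MockProvider() : real;
}
```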
## API Key Configuration

### Single API Key

```bash
GROQ_API_KEY=your_key_here
```

### Per-Team API Keys

```bash
GROQ_API_KEY=general_key
RED_TEAM_GROQ_API_KEY=red_team_key
BLUE_TEAM_GROQ_API_KEY=blue_team_key
```

This allows better rate-limit management by distributing requests across multiple keys.
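A key-resolution helper might look like the following; the function name and precedence (team-specific key first, then the general key) are assumptions inferred from the variable names above:

```typescript
// Hypothetical helper: resolve the Groq key for a team, preferring the
// team-specific environment variable and falling back to the general one.
function resolveGroqKey(team?: "red" | "blue"): string {
  const teamKey =
    team === "red"
      ? process.env.RED_TEAM_GROQ_API_KEY
      : team === "blue"
        ? process.env.BLUE_TEAM_GROQ_API_KEY
        : undefined;
  const key = teamKey ?? process.env.GROQ_API_KEY;
  if (!key) throw new Error("No Groq API key configured");
  return key;
}
```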
## Model Selection Best Practices

- **Match model to task:** Use powerful models for complex reasoning, fast models for high throughput
- **Consider rate limits:** Have fallback models ready
- **Use per-team keys:** Distribute load across multiple API keys
- **Monitor usage:** Track which models are being used most
- **Test fallback:** Ensure the fallback chain works correctly
## Troubleshooting

### Rate Limit Errors
If you see rate limit errors:
- Wait for the rate limit to reset (usually daily)
- Use different API keys for different teams
- Upgrade your Groq plan for higher limits
- Let the system try fallback models automatically
### Model Not Found

If a model ID is not found:
- Check the model ID spelling
- Verify the model is available in your Groq account
- The system will fall back to available models
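A defensive lookup can catch typos before a request is sent. A tiny sketch, reusing the illustrative `MODELS` registry from earlier:

```typescript
// Fail fast on typos by checking IDs against the registry sketched above.
function validateModelId(id: string): void {
  if (!MODELS.some((m) => m.id === id)) {
    const known = MODELS.map((m) => m.id).join(", ");
    throw new Error(`Unknown model "${id}". Known models: ${known}`);
  }
}
```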
### Slow Responses

If responses are slow:
- Try faster-tier models (`llama-3.1-8b-instant`, `gemma-7b-it`)
- Check your network connection
- Monitor Groq API status
## Next Steps
- Agent Configuration - How to configure agents with models
- API Keys - Setting up API keys
- Model Fallback - Understanding fallback system