Configuration¶
Environment Setup¶
Create a `.env` file in the project root, then edit it with your preferred LLM provider configuration.
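If your checkout includes a template (the `.env.example` filename is an assumption; adjust to whatever your repository provides), copying it is the quickest start:

```bash
# Copy the template (filename is an assumption), then fill in your provider settings
cp .env.example .env
$EDITOR .env
```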
Command Line Configuration¶
You can explicitly specify which configuration file to use for any command:
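For example (the `compounding` command name is an assumption inferred from the tool's config directory; substitute your actual CLI entry point):

```bash
# Point any command at an explicit configuration file
compounding --env-file /path/to/custom.env <command>
```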
Or set the COMPOUNDING_ENV environment variable:
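For example, in your shell session or profile:

```bash
# Applies to every command run in this shell session
export COMPOUNDING_ENV=/path/to/custom.env
```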
Configuration Priority¶
The tool resolves configuration from multiple locations in a prioritized sequence:
1. `--env-file` Flag: Highest priority.
2. `COMPOUNDING_ENV` Variable: Environment-level override.
3. Local `.env`: Found in the current working directory.
4. Tool Global Config: `~/.config/compounding/.env`.
5. User Home `.env`: Final fallback.
Multi-Repo Best Practice
Store your secret API keys in the Tool Global Config (~/.config/compounding/.env) and use Local .env files in each project to specify the DSPY_LM_MODEL that works best for that repository's language or complexity.
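For instance (values are illustrative; the `provider/model` identifier format assumes DSPy's LiteLLM-style naming):

```bash
# ~/.config/compounding/.env — shared secrets for all repositories
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# ./my-project/.env — per-repository model override
DSPY_LM_MODEL=anthropic/claude-3-5-sonnet-20241022
```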
LLM Provider Options¶
OpenAI¶
For GPT-4, GPT-3.5, or other OpenAI models:
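A minimal `.env` sketch (assuming DSPy's LiteLLM-style `provider/model` identifiers; adjust the model name as needed):

```bash
DSPY_LM_MODEL=openai/gpt-4-turbo
OPENAI_API_KEY=sk-...   # no quotes around the key
```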
Anthropic Claude¶
For Claude 3.5 Sonnet, Haiku, or Opus:
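A corresponding sketch (the `ANTHROPIC_API_KEY` name follows the common convention; verify it matches what the tool expects):

```bash
DSPY_LM_MODEL=anthropic/claude-3-5-sonnet-20241022
ANTHROPIC_API_KEY=sk-ant-...
```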
Ollama (Local)¶
For local, privacy-first AI using Ollama:
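A sketch for a local setup (the `ollama/` model prefix assumes LiteLLM-style naming; no API key is needed):

```bash
DSPY_LM_MODEL=ollama/qwen2.5-coder:32b
# Ollama serves on localhost:11434 by default
```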
Ollama Setup
- Install Ollama from ollama.ai
- Pull a model: `ollama pull qwen2.5-coder:32b`
- Ollama runs automatically on `localhost:11434`
Recommended models for coding:
- `qwen2.5-coder:32b` - Best quality, requires 20GB+ RAM
- `qwen2.5-coder:14b` - Good balance, 16GB+ RAM
- `deepseek-coder-v2:16b` - Alternative, good for code
- `codellama:13b` - Lighter option, 8GB+ RAM
OpenRouter¶
Access multiple models through one API:
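A sketch (assuming LiteLLM-style `openrouter/` model prefixes and the conventional `OPENROUTER_API_KEY` variable name):

```bash
DSPY_LM_MODEL=openrouter/anthropic/claude-3-haiku
OPENROUTER_API_KEY=sk-or-...
```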
OpenRouter provides access to:
- Anthropic Claude models
- OpenAI GPT models
- Google Gemini
- Meta Llama
- And many more...
[Get an API key](https://openrouter.ai/keys) | [Browse models](https://openrouter.ai/models)
Model Selection Guide¶
Choose based on your priorities:
| Priority | Recommended Provider | Model |
|---|---|---|
| Best Quality | Anthropic | claude-3-5-sonnet-20241022 |
| Fast & Good | OpenAI | gpt-4-turbo |
| Privacy | Ollama | qwen2.5-coder:32b |
| Cost-Effective | OpenRouter | anthropic/claude-3-haiku |
| Free | Ollama | qwen2.5-coder:7b |
Advanced Configuration¶
Adjusting Model Parameters¶
You can set additional environment variables for fine-tuning:
```bash
# Temperature (0.0 - 1.0, lower = more deterministic)
DSPY_LM_TEMPERATURE=0.7

# Max tokens for responses
DSPY_LM_MAX_TOKENS=4096

# Timeout in seconds
DSPY_LM_TIMEOUT=120

# Max tokens for documentation fetching (paging support)
DOCS_MAX_TOKENS=32768
```
Documentation Paging¶
When documentation content exceeds `DOCS_MAX_TOKENS`, it is automatically truncated. The agent receives a warning with instructions on how to fetch the next "page" of content using an `offset_tokens` parameter. This prevents context window overflows while still allowing access to massive documentation sites.
Knowledge Base Settings¶
The knowledge base is stored in .knowledge/ and configured automatically. To customize:
```bash
# Maximum learnings to inject into context
KB_MAX_RETRIEVED=10

# Similarity threshold for retrieval (0.0 - 1.0)
KB_SIMILARITY_THRESHOLD=0.6
```
Embedding Configuration¶
Configure how the system generates vectors for semantic search and code indexing:
```bash
# Provider: openai, fastembed, openrouter, or ollama
EMBEDDING_PROVIDER=openai

# Model name (must match provider)
EMBEDDING_MODEL=text-embedding-3-small

# Custom API base if using local proxies or OpenRouter
EMBEDDING_BASE_URL=https://...
```
Supported Local Models:
- Mxbai: `mxbai-embed-large:latest` (1024 dims) - High-performance local embedding.
- Nomic: `nomic-embed-text` (768 dims).
- Jina: `jinaai/jina-embeddings-v2-small-en` (512 dims).
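For example, a local Ollama-backed embedding setup might look like this (a sketch; the `EMBEDDING_BASE_URL` value assumes Ollama's default port):

```bash
EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=mxbai-embed-large:latest
EMBEDDING_BASE_URL=http://localhost:11434
```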
FastEmbed Fallback
If no OPENAI_API_KEY is found, the system automatically falls back to FastEmbed using the Jina small model for local execution.
Verifying Configuration¶
Test your configuration:
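The tool's own CLI may expose a dedicated check; absent that, a quick way to exercise the configured LM is to call DSPy directly (a sketch assuming `dspy` and `python-dotenv` are available in the environment):

```bash
# Load .env, initialize the LM from DSPY_LM_MODEL, and send a trivial prompt
uv run python -c "
import os, dspy
from dotenv import load_dotenv
load_dotenv()
lm = dspy.LM(os.environ['DSPY_LM_MODEL'])
print(lm('Reply with OK')[0])
"
```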
You should see output indicating your LM is initialized and able to respond.
Next Steps¶
Configuration Complete!
Your environment is ready to use.
Continue to Quick Start to run your first workflow.
Troubleshooting¶
API Key Not Found¶
If you see `API key not found`:
- Verify the `.env` file exists in the project root
- Check the variable name matches your provider (e.g., `OPENAI_API_KEY`)
- Ensure no quotes around the key value
- Restart your shell or re-run `uv sync`
Ollama Connection Error¶
If Ollama fails to connect:
- Check Ollama is running: `ollama list`
- Verify the base URL: `curl http://localhost:11434/api/version`
- Ensure the model is pulled: `ollama pull qwen2.5-coder:32b`
Model Not Found¶
Check your model name:
- OpenAI: Available models
- Anthropic: Available models
- Ollama: `ollama list` (shows installed models)
- OpenRouter: Model list