Configuration

ChunkHound is configured through a JSON file, environment variables, and CLI flags.

Configuration File

Create .chunkhound.json in your project root. Here is an example covering the main sections (the research section is described separately below):

{
  "database": {
    "provider": "duckdb",
    "path": ".chunkhound/db"
  },
  "embedding": {
    "provider": "voyageai",
    "model": "voyage-3.5",
    "batch_size": 100
  },
  "indexing": {
    "exclude": ["**/node_modules/**", "**/dist/**"],
    "exclude_mode": "combined",
    "per_file_timeout_seconds": 3.0,
    "batch_size": 50,
    "db_batch_size": 100,
    "detect_embedded_sql": true
  },
  "llm": {
    "provider": "anthropic",
    "utility_model": "claude-haiku-4-5-20251001",
    "synthesis_model": "claude-sonnet-4-5-20250929"
  }
}

Configuration Precedence

Settings are resolved in this order (highest priority first):

  1. CLI arguments: flags passed directly on the command line
  2. Config file: loaded via --config or CHUNKHOUND_CONFIG_FILE
  3. Local .chunkhound.json: auto-detected in the target directory
  4. Environment variables: CHUNKHOUND_* prefixed variables
  5. Defaults: built-in fallback values
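
The precedence order above can be pictured as a layered merge, applied from lowest to highest priority. The following is a minimal sketch under stated assumptions, not ChunkHound's actual loader: deep_merge and resolve_settings are hypothetical helpers, and the sample layer values are made up for illustration.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections (e.g. "indexing") key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_settings(*layers_low_to_high: dict[str, Any]) -> dict[str, Any]:
    """Apply layers in order, so that later (higher-priority) layers win."""
    resolved: dict[str, Any] = {}
    for layer in layers_low_to_high:
        resolved = deep_merge(resolved, layer)
    return resolved


# Hypothetical layers, listed from lowest to highest priority.
defaults = {"database": {"provider": "duckdb"}, "indexing": {"batch_size": 50}}
env_vars = {"database": {"provider": "lancedb"}}
local_file = {"indexing": {"batch_size": 25}}
config_file = {"database": {"provider": "duckdb"}}
cli_args = {"indexing": {"batch_size": 10}}

settings = resolve_settings(defaults, env_vars, local_file, config_file, cli_args)
print(settings)
```

In this sketch the explicit config file overrides the environment variable for database.provider, and the CLI value overrides both files for indexing.batch_size.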

Embedding Providers

| Provider | Config Value | Env Var | Default Model | Notes |
| --- | --- | --- | --- | --- |
| VoyageAI | voyageai | CHUNKHOUND_EMBEDDING__API_KEY | voyage-3.5 | Recommended for code search |
| OpenAI | openai | CHUNKHOUND_EMBEDDING__API_KEY | text-embedding-3-small | Widely available |

Embedding Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| base_url | string | null | Custom embedding endpoint. Required for self-hosted OpenAI-compatible embeddings. |
| ssl_verify | boolean | true | Verify TLS certificates for requests sent to base_url. Ignored when base_url is unset. |
| rerank_model | string | null | Reranking model name (enables multi-hop reranking) |
| rerank_url | string | null | Separate rerank endpoint URL (optional when reranking is served from base_url) |
| rerank_ssl_verify | boolean | null | Verify TLS certificates for rerank requests. Inherits ssl_verify when unset. |
| rerank_format | string | "auto" | Reranking API format: cohere, tei, or auto |
| rerank_batch_size | number | null | Max documents per rerank request |
| timeout | number | 30 | Request timeout in seconds |
| max_retries | number | 3 | Max retry attempts on failure |
| api_version | string | null | Azure OpenAI API version (YYYY-MM-DD) |
| azure_endpoint | string | null | Azure OpenAI endpoint (mutually exclusive with base_url) |
| azure_deployment | string | null | Azure OpenAI deployment name |

Database Backends

| Backend | Status | Recommended |
| --- | --- | --- |
| duckdb | Stable | Yes, use this |
| lancedb | Experimental | No, for evaluation only |

DuckDB (default)

Stable — recommended for all use cases.

Fast analytical queries and efficient storage.

{
  "database": {
    "provider": "duckdb",
    "path": ".chunkhound/db"
  }
}

LanceDB

Experimental — not recommended for production use. The LanceDB integration is actively developed but may have rough edges around index rebuilding, migration, and edge-case query correctness. Use DuckDB unless you are evaluating LanceDB specifically.

{
  "database": {
    "provider": "lancedb",
    "path": ".chunkhound/db"
  }
}

Database Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| max_disk_usage_mb | number | null | Max DB size in MB before indexing stops (CLI flag uses GB) |
| lancedb_index_type | string | null | LanceDB vector index type: auto, ivf_hnsw_sq, or ivf_rq |
| lancedb_optimize_fragment_threshold | number | 100 | Fragment count to trigger LanceDB compaction |

Indexing Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| exclude | string[] | built-in list | Glob patterns to exclude from indexing |
| include | string[] | all supported file types | Glob patterns limiting which files are indexed; files not matching any pattern are skipped |
| exclude_mode | string | null | combined, config_only, or gitignore_only. When an explicit exclude list is provided, defaults to "combined"; otherwise defaults to "gitignore_only" |
| force_reindex | boolean | false | Force re-indexing of all files |
| max_concurrent | number | 5 | Max concurrent parser workers |
| cleanup | boolean | true | Remove orphaned DB records after indexing |
| max_file_size_mb | number | 10 | Skip files larger than this (MB) |
| config_file_size_threshold_kb | number | 20 | Skip structured config files (JSON/YAML/TOML) larger than this (KB); 0 to disable |
| per_file_timeout_seconds | number | 3.0 | Max parse time per file (0 to disable) |
| batch_size | number | 50 | Files per parsing batch |
| db_batch_size | number | 100 | Chunks per database write batch |
| detect_embedded_sql | boolean | true | Index SQL in string literals |
| per_file_timeout_min_size_kb | number | 128 | Only apply per-file timeout to files at least this large (KB) |
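
The interaction between per_file_timeout_seconds and per_file_timeout_min_size_kb can be sketched as a simple gate. This is an illustration only: should_apply_timeout is a hypothetical helper, not a ChunkHound function, and it assumes 1 KB means 1024 bytes.

```python
def should_apply_timeout(
    file_size_bytes: int,
    per_file_timeout_seconds: float = 3.0,
    per_file_timeout_min_size_kb: int = 128,
) -> bool:
    """Decide whether the per-file parse timeout applies to a file."""
    # A timeout of 0 disables the feature entirely.
    if per_file_timeout_seconds <= 0:
        return False
    # Assumption: sizes are compared in bytes, with 1 KB = 1024 bytes.
    return file_size_bytes >= per_file_timeout_min_size_kb * 1024


print(should_apply_timeout(64 * 1024))    # small file
print(should_apply_timeout(512 * 1024))   # large file
```

Under these defaults, small files are parsed without a timeout, while files of 128 KB or more are limited to 3 seconds of parse time.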

By default, ChunkHound excludes common noise paths (node_modules, dist, __pycache__, .git, lock files, build artifacts). To start with a clean slate, set exclude_mode to "config_only" and exclude to [].

Exclude Modes

  • combined (default when custom exclude patterns are provided) — merges .gitignore rules with your indexing.exclude patterns
  • config_only — only uses patterns from indexing.exclude, ignores .gitignore
  • gitignore_only (default when no custom exclude patterns are provided) — only uses .gitignore rules, ignores config excludes
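
The default-selection rule for exclude_mode can be written out as a small function. This is a sketch of the documented behavior, not ChunkHound's code: effective_exclude_mode is a hypothetical name, and treating any non-null exclude value (including an empty list) as "provided" is an assumption.

```python
from typing import Optional, Sequence

VALID_MODES = ("combined", "config_only", "gitignore_only")


def effective_exclude_mode(
    exclude: Optional[Sequence[str]],
    exclude_mode: Optional[str],
) -> str:
    """Pick the exclude mode, falling back to the documented defaults."""
    if exclude_mode is not None:
        if exclude_mode not in VALID_MODES:
            raise ValueError(f"unknown exclude_mode: {exclude_mode!r}")
        return exclude_mode
    # No explicit mode: the default depends on whether an exclude list was given.
    # Assumption: an explicitly provided empty list still counts as "provided".
    return "combined" if exclude is not None else "gitignore_only"


print(effective_exclude_mode(["**/dist/**"], None))
print(effective_exclude_mode(None, None))
print(effective_exclude_mode([], "config_only"))
```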

LLM Configuration

The LLM provider is used for deep code research (chunkhound research and the code_research MCP tool).

| Provider | Config Value | Env Var | Utility Default | Synthesis Default | Notes |
| --- | --- | --- | --- | --- | --- |
| Claude Code CLI | claude-code-cli | | claude-haiku-4-5-20251001 | claude-haiku-4-5-20251001 | Uses local Claude Code installation |
| Codex CLI | codex-cli | | codex | codex | Uses local Codex CLI installation |
| OpenCode CLI | opencode-cli | | opencode/grok-code | opencode/grok-code | Uses local OpenCode CLI installation |
| Anthropic | anthropic | CHUNKHOUND_LLM_API_KEY | claude-haiku-4-5-20251001 | claude-sonnet-4-5-20250929 | Direct API access |
| OpenAI | openai | CHUNKHOUND_LLM_API_KEY | gpt-5-nano | gpt-5 | Direct API access |
| Gemini | gemini | CHUNKHOUND_LLM_API_KEY | gemini-3-pro-preview | gemini-3-pro-preview | Google Gemini API |
| Grok | grok | CHUNKHOUND_LLM_API_KEY | grok-4-1-fast-reasoning | grok-4-1-fast-reasoning | xAI API |

"model" is a convenience shorthand that sets both utility_model and synthesis_model to the same value. To use different models per role, set utility_model and synthesis_model explicitly.

When an OpenAI-compatible LLM provider points at a custom base_url, ChunkHound treats it as a generic custom backend. In that mode you must set an explicit model name; ChunkHound does not guess a local default. This applies to provider: "openai", to Grok when routed through a non-official endpoint, and to per-role overrides that resolve to those providers.
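
The two rules above (the "model" shorthand and the explicit-model requirement for custom endpoints) can be sketched as follows. This is illustrative only: resolve_llm_models is a hypothetical helper, it covers only the top-level provider rather than per-role overrides, and letting an explicit utility_model or synthesis_model win over "model" is an assumption the text does not spell out. A returned None stands for "use the provider's built-in default".

```python
from typing import Optional

# Providers the text describes as OpenAI-compatible when routed to a custom base_url.
CUSTOM_BACKEND_PROVIDERS = ("openai", "grok")


def resolve_llm_models(llm: dict) -> tuple[Optional[str], Optional[str]]:
    """Return (utility_model, synthesis_model) for an llm config section."""
    shorthand = llm.get("model")
    # Assumption: role-specific settings take precedence over the shorthand.
    utility = llm.get("utility_model") or shorthand
    synthesis = llm.get("synthesis_model") or shorthand

    custom_backend = bool(llm.get("base_url")) and llm.get("provider") in CUSTOM_BACKEND_PROVIDERS
    if custom_backend and not (utility and synthesis):
        # No local default is guessed for a generic custom backend.
        raise ValueError("set an explicit model when base_url points at a custom endpoint")
    return utility, synthesis


print(resolve_llm_models({"provider": "openai", "model": "qwen3-coder:30b",
                          "base_url": "http://localhost:11434/v1"}))
```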

LLM Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| utility_provider | string | null | Override provider for utility operations |
| synthesis_provider | string | null | Override provider for synthesis operations |
| timeout | number | 120 | LLM request timeout in seconds |
| max_retries | number | 3 | Max retry attempts |
| codex_reasoning_effort | string | null | Default reasoning effort for Codex/OpenAI: minimal, low, medium, high, xhigh |
| codex_reasoning_effort_utility | string | null | Reasoning effort override for utility stage |
| codex_reasoning_effort_synthesis | string | null | Reasoning effort override for synthesis stage |

Anthropic-specific Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| anthropic_thinking_enabled | boolean | false | Enable extended thinking |
| anthropic_thinking_budget_tokens | number | 10000 | Token budget for thinking (min 1024) |
| anthropic_interleaved_thinking | boolean | false | Interleaved thinking for tool use (Claude 4+) |
| anthropic_effort | string | null | Effort parameter: low, medium, high |

Research Configuration

Controls the code_research MCP tool and chunkhound research command.

| Option | Type | Default | Env Var | Description |
| --- | --- | --- | --- | --- |
| algorithm | "v1", "v2", or "v3" | "v3" | CHUNKHOUND_RESEARCH_ALGORITHM | Research algorithm version |
| exhaustive_mode | bool | false | CHUNKHOUND_RESEARCH_EXHAUSTIVE_MODE | Retrieve everything (no time/count limit) |
| multi_hop_time_limit | number | 5.0 | CHUNKHOUND_RESEARCH_MULTI_HOP_TIME_LIMIT | Max seconds for evidence expansion |
| multi_hop_result_limit | number | 500 | CHUNKHOUND_RESEARCH_MULTI_HOP_RESULT_LIMIT | Max accumulated chunks |
| target_tokens | number | 20000 | CHUNKHOUND_RESEARCH_TARGET_TOKENS | Output token budget for synthesis |
| query_expansion_enabled | bool | true | CHUNKHOUND_RESEARCH_QUERY_EXPANSION_ENABLED | LLM-based query expansion |
| relevance_threshold | number | 0.5 | CHUNKHOUND_RESEARCH_RELEVANCE_THRESHOLD | Min rerank score for inclusion |

{
  "research": {
    "algorithm": "v3",
    "exhaustive_mode": false,
    "target_tokens": 20000,
    "query_expansion_enabled": true,
    "relevance_threshold": 0.5
  }
}

The full list of parameters is available in research_config.py.

Algorithm Versions

The algorithm setting controls how ChunkHound explores your codebase to answer a research question. All three versions produce the same output format; they differ only in how thoroughly they search.

New to ChunkHound? Start with "v3" (the default).

| Version | Strategy | LLM calls | Best for |
| --- | --- | --- | --- |
| v1 | BFS: generates follow-up questions, explores one level deep | Minimal | Quick lookups, simple codebases |
| v2 | Wide coverage: depth-first on top files, then gap detection | Medium | Balanced discovery; most production use cases |
| v3 (default) | Runs v1 + v2 in parallel, merges results | Most (parallel, not sequential) | Complex codebases where missing context is costly |

v3 is not slower than v2 — both strategies run concurrently via asyncio.gather, so the wall-clock time is roughly the same as v2 alone while covering more ground.
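
To see why running two strategies under asyncio.gather costs roughly the time of the slower one rather than the sum, consider this toy sketch. The coroutines run_v1, run_v2, and run_v3, the sleep durations, and the file names are all stand-ins invented for illustration; they are not ChunkHound's research code.

```python
import asyncio
import time


async def run_v1(query: str) -> list[str]:
    await asyncio.sleep(0.2)  # stand-in for the v1 strategy's work
    return ["config.py", "loader.py"]


async def run_v2(query: str) -> list[str]:
    await asyncio.sleep(0.3)  # stand-in for the v2 strategy's work
    return ["loader.py", "auth.py"]


async def run_v3(query: str) -> list[str]:
    # Both strategies are awaited concurrently, so total wall-clock time
    # is close to the slower one, not the sum of both.
    v1_hits, v2_hits = await asyncio.gather(run_v1(query), run_v2(query))
    # Merge results, dropping duplicates while preserving order.
    return list(dict.fromkeys(v1_hits + v2_hits))


start = time.perf_counter()
merged = asyncio.run(run_v3("how does the config loader work?"))
elapsed = time.perf_counter() - start
print(merged, f"{elapsed:.2f}s")
```

The wall-clock saving does not reduce LLM spend: both strategies still issue their own calls, which is why the table lists v3 as using the most.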

When to switch away from v3:

  • Use v1 when cost matters and the question is narrow and self-contained (“explain how the config loader works”)
  • Use v2 when you want a good balance without the extra LLM spend of dual-strategy merging
  • v3 is the right default for open-ended research questions (“how does authentication flow through this system?”)

Gap detection parameters (min_gaps, max_gaps, gap_similarity_threshold) only affect v2 and v3. They are silently ignored for v1.

Environment Variables

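Most environment variables use the CHUNKHOUND_ prefix with __ (double underscore) as the section delimiter. The LLM and research sections use a single underscore (CHUNKHOUND_LLM_*, CHUNKHOUND_RESEARCH_*), and a few top-level settings such as CHUNKHOUND_CONFIG_FILE and CHUNKHOUND_DEBUG have no section at all.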
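
The naming convention can be captured in a few lines. This is a sketch of the pattern only: env_var_name is a hypothetical helper, and it does not imply that every config option has a matching environment variable (for example, the disk-usage limit is max_disk_usage_mb in the file but CHUNKHOUND_DATABASE__MAX_DISK_USAGE_GB in the environment).

```python
# Sections whose variables use a single underscore after the section name,
# as seen in the CHUNKHOUND_LLM_* and CHUNKHOUND_RESEARCH_* variables.
SINGLE_UNDERSCORE_SECTIONS = {"llm", "research"}


def env_var_name(section: str, option: str) -> str:
    """Build the environment variable name for a section/option pair."""
    delimiter = "_" if section in SINGLE_UNDERSCORE_SECTIONS else "__"
    return f"CHUNKHOUND_{section.upper()}{delimiter}{option.upper()}"


print(env_var_name("embedding", "api_key"))
print(env_var_name("llm", "base_url"))
print(env_var_name("research", "target_tokens"))
```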

| Variable | Description |
| --- | --- |
| CHUNKHOUND_EMBEDDING__PROVIDER | Embedding provider name |
| CHUNKHOUND_EMBEDDING__MODEL | Embedding model name |
| CHUNKHOUND_EMBEDDING__API_KEY | API key for embedding provider |
| CHUNKHOUND_EMBEDDING__BASE_URL | Base URL for OpenAI-compatible endpoints |
| CHUNKHOUND_EMBEDDING__SSL_VERIFY | Verify TLS certificates for embedding requests sent to base_url |
| CHUNKHOUND_EMBEDDING__RERANK_SSL_VERIFY | Verify TLS certificates for rerank requests (overrides ssl_verify) |
| CHUNKHOUND_DATABASE__PROVIDER | Database backend (duckdb or lancedb) |
| CHUNKHOUND_DATABASE__PATH | Database storage path |
| CHUNKHOUND_LLM_PROVIDER | LLM provider for research |
| CHUNKHOUND_LLM_UTILITY_MODEL | LLM model for utility tasks (fast, lower cost) |
| CHUNKHOUND_LLM_SYNTHESIS_MODEL | LLM model for synthesis tasks (primary output) |
| CHUNKHOUND_LLM_API_KEY | API key for LLM provider |
| CHUNKHOUND_LLM_BASE_URL | Base URL for LLM provider (proxy / custom endpoint) |
| CHUNKHOUND_LLM_SSL_VERIFY | Verify TLS certificates for requests sent to llm.base_url |
| CHUNKHOUND_INDEXING__EXCLUDE_MODE | Exclusion mode (combined, config_only, gitignore_only) |
| CHUNKHOUND_INDEXING__PER_FILE_TIMEOUT_SECONDS | Per-file parse timeout |
| CHUNKHOUND_INDEXING__DETECT_EMBEDDED_SQL | Enable embedded SQL detection |
| CHUNKHOUND_INDEXING__GIT_PATHSPEC_CAP | Max git pathspec entries (default: 128) |
| CHUNKHOUND_DB_EXECUTE_TIMEOUT | Database executor timeout |
| CHUNKHOUND_YAML_ENGINE | YAML parser engine (rapid or tree) |
| CHUNKHOUND_LLM_CODEX_REASONING_EFFORT | Reasoning effort for Codex models (minimal, low, medium, high, xhigh) |
| CHUNKHOUND_CONFIG_FILE | Path to config file (alternative to --config) |
| CHUNKHOUND_DEBUG | Enable debug logging |
| CHUNKHOUND_DATABASE__MAX_DISK_USAGE_GB | Max database size in GB |
| CHUNKHOUND_INDEXING__FORCE_REINDEX | Force re-indexing |
| CHUNKHOUND_INDEXING__MAX_CONCURRENT | Max concurrent workers |
| CHUNKHOUND_EMBEDDING__RERANK_MODEL | Reranking model |
| VOYAGE_API_KEY | Fallback API key for VoyageAI provider |

Advanced routing

The homepage configurator generates a minimal quick-start configuration. Enterprise deployments often need to route through Azure, a self-hosted endpoint, or an LLM proxy. This section covers what ChunkHound supports for those setups and what it does not.

TLS verification for custom endpoints

ssl_verify is an explicit setting: ChunkHound never disables certificate verification automatically.

  • embedding.ssl_verify only affects requests sent to an explicit embedding.base_url.
  • embedding.rerank_ssl_verify only affects rerank requests and overrides inherited ssl_verify when set.
  • llm.ssl_verify only affects requests sent to an explicit llm.base_url.
  • If base_url is unset, ssl_verify is ignored as a security measure, so verification stays enabled for the provider's default endpoint.
  • If rerank_url is unset, rerank_ssl_verify is ignored.
  • Prefer a proper CA trust chain when possible. Use false only for local endpoints or trusted internal networks with self-signed/private certificates.
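
The rules in this list can be condensed into two small functions. This is a sketch under stated assumptions rather than ChunkHound's implementation: embedding_verify and rerank_verify are hypothetical names, and the fallback used when rerank_url is unset (follow the embedding endpoint's setting) is an assumption.

```python
from typing import Optional


def embedding_verify(base_url: Optional[str], ssl_verify: bool = True) -> bool:
    """ssl_verify only matters for an explicit base_url; otherwise verification stays on."""
    return ssl_verify if base_url else True


def rerank_verify(
    base_url: Optional[str],
    rerank_url: Optional[str],
    ssl_verify: bool = True,
    rerank_ssl_verify: Optional[bool] = None,
) -> bool:
    """Resolve TLS verification for rerank requests."""
    if rerank_url:
        # rerank_ssl_verify overrides the inherited ssl_verify when it is set.
        return ssl_verify if rerank_ssl_verify is None else rerank_ssl_verify
    # rerank_url unset: rerank_ssl_verify is ignored.
    # Assumption: rerank traffic then follows the embedding endpoint's setting.
    return embedding_verify(base_url, ssl_verify)


# Official embedding endpoint plus a local self-signed reranker.
print(embedding_verify(None, ssl_verify=False))
print(rerank_verify(None, "https://localhost:8001/rerank", rerank_ssl_verify=False))
```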

Azure OpenAI (embeddings)

ChunkHound’s OpenAI embedding provider speaks Azure OpenAI natively. Supply the four Azure fields and omit base_url — the two are mutually exclusive.

{
  "embedding": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "api_key": "<YOUR_AZURE_KEY>",
    "azure_endpoint": "https://<resource>.openai.azure.com",
    "api_version": "2024-02-01",
    "azure_deployment": "<your-deployment-name>"
  }
}
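
Before starting an index run, it can help to sanity-check an Azure embedding section against the constraints described above. The check below is a hypothetical pre-flight helper, not ChunkHound's validator; treating all four fields as required whenever any Azure field is present is an assumption drawn from the text.

```python
AZURE_FIELDS = ("azure_endpoint", "api_version", "azure_deployment")


def check_azure_embedding(embedding: dict) -> list[str]:
    """Return a list of problems found in an Azure OpenAI embedding section."""
    problems: list[str] = []
    if not any(embedding.get(field) for field in AZURE_FIELDS):
        return problems  # not an Azure config; nothing to check here
    if embedding.get("base_url"):
        problems.append("azure_endpoint and base_url are mutually exclusive")
    # Assumption: api_key plus the three Azure fields must all be supplied together.
    for field in ("api_key",) + AZURE_FIELDS:
        if not embedding.get(field):
            problems.append(f"missing {field}")
    return problems


config = {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "api_key": "<YOUR_AZURE_KEY>",
    "azure_endpoint": "https://<resource>.openai.azure.com",
    "api_version": "2024-02-01",
    "azure_deployment": "<your-deployment-name>",
}
print(check_azure_embedding(config))
```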

LLM-side Azure OpenAI is not supported yet — the llm section has no Azure fields. Use a proxy (see below) if you need to route LLM traffic through Azure.

VoyageAI on Azure ML / AI Foundry

VoyageAI models are available on the Azure Marketplace and in Microsoft Foundry. ChunkHound can target an Azure-hosted Voyage deployment via base_url:

{
  "embedding": {
    "provider": "voyageai",
    "model": "voyage-3.5",
    "api_key": "<YOUR_AZURE_VOYAGE_KEY>",
    "base_url": "https://<your-resource>.services.ai.azure.com/models",
    "ssl_verify": true,
    "rerank_url": "https://<your-rerank-endpoint>/rerank",
    "rerank_ssl_verify": true,
    "rerank_format": "tei"
  }
}

Caveats:

  • Native Voyage API required. The Azure deployment must expose /v1/embeddings with the native Voyage shape (true for Voyage marketplace listings; verify your specific deployment).
  • Bundled reranker unavailable. VoyageAI’s rerank-* models are not accessible through a custom base_url — the embedding endpoint doesn’t expose /rerank. Run a separate reranker and point rerank_url at it. vLLM with Qwen/Qwen3-Reranker-0.6B is a drop-in option:
    vllm serve Qwen/Qwen3-Reranker-0.6B --task score --port 8000
  • TLS disablement is primarily for the HTTP reranker path. The separate rerank_url path respects ssl_verify / rerank_ssl_verify. For the VoyageAI SDK path, prefer trusted CA configuration such as REQUESTS_CA_BUNDLE.
  • Concurrency throttled to 1 by default when base_url is set, to respect Azure serverless rate limits. Override via max_concurrent_batches if your SKU permits.
  • api_key still required. The validator doesn’t enforce it when base_url is present, but Azure-hosted endpoints still need their own key — supply it.

LLM via proxy (Anthropic, OpenAI, Grok)

The Anthropic, OpenAI, and Grok LLM providers all forward base_url to their SDK. Point them at a gateway like LiteLLM to centralize auth, logging, and rate limiting:

{
  "llm": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-5-20250929",
    "api_key": "<YOUR_GATEWAY_KEY>",
    "base_url": "https://your-gateway.example.com",
    "ssl_verify": true
  }
}

The gateway must preserve each provider’s native request/response shape — ChunkHound uses the vendor SDKs, not a generic HTTP client.

Local OpenAI-compatible servers (Ollama, vLLM)

Local inference servers that speak the OpenAI API work via provider: "openai" with base_url pointing at the local endpoint. No api_key is needed for servers that don’t enforce auth, but you must set an explicit model.

Ollama

Ollama provides embeddings, reranking, and LLM inference in a single process. Pull the models you need, then point ChunkHound at the Ollama endpoint:

# Embedding + reranker models
ollama pull qwen3-embedding && ollama pull qwen3-reranker

# LLM — pick one
ollama pull qwen3-coder:30b
ollama pull gemma4:27b

Embedding and reranker config (.chunkhound.json):

{
  "embedding": {
    "provider": "openai",
    "model": "qwen3-embedding",
    "base_url": "http://localhost:11434/v1",
    "ssl_verify": false,
    "rerank_model": "qwen3-reranker",
    "rerank_format": "cohere"
  }
}

No rerank_url is needed — it is auto-derived from base_url.

LLM config:

{
  "llm": {
    "provider": "openai",
    "model": "qwen3-coder:30b",
    "base_url": "http://localhost:11434/v1",
    "ssl_verify": false
  }
}

Use whichever model you pulled in llm.model. For example, set "model": "gemma4:27b" if you want the Gemma 4 path instead of Qwen. ChunkHound does not infer a local default model from base_url.

If your embeddings stay on the official provider but reranking goes to a local HTTPS service with a self-signed certificate, override the reranker only:

{
  "embedding": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "api_key": "<YOUR_OPENAI_KEY>",
    "rerank_model": "Qwen/Qwen3-Reranker-0.6B",
    "rerank_url": "https://localhost:8001/rerank",
    "rerank_ssl_verify": false,
    "rerank_format": "tei"
  }
}

vLLM

vLLM gives you dedicated processes per model, which is better for throughput and lets you serve HuggingFace model IDs directly. When embeddings and reranking are served from the same OpenAI-compatible endpoint, ChunkHound infers the reranker path from base_url just like it does for Ollama:

# Embedding + reranker server
vllm serve Qwen/Qwen3-Embedding-0.6B --port 8000

# LLM server
vllm serve Qwen/Qwen3-Coder-30B-A3B-Instruct --port 11434

Embedding and reranker config (.chunkhound.json):

{
  "embedding": {
    "provider": "openai",
    "model": "Qwen/Qwen3-Embedding-0.6B",
    "base_url": "http://localhost:8000/v1",
    "rerank_model": "Qwen/Qwen3-Reranker-0.6B",
    "rerank_format": "cohere"
  }
}

No rerank_url is needed when the reranker lives behind the same OpenAI-compatible endpoint. ChunkHound auto-derives /rerank from base_url.

If you split embeddings and reranking across different services, keep base_url pointed at the embedding server and set rerank_url explicitly:

{
  "embedding": {
    "provider": "openai",
    "model": "Qwen/Qwen3-Embedding-0.6B",
    "base_url": "http://localhost:8025/v1",
    "rerank_model": "Qwen/Qwen3-Reranker-0.6B",
    "rerank_url": "http://localhost:8000/rerank",
    "rerank_format": "cohere"
  }
}

LLM config:

{
  "llm": {
    "provider": "openai",
    "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "base_url": "http://localhost:11434/v1"
  }
}

Ollama vs vLLM: Ollama is simpler — one process, one command per model. vLLM is better for throughput and gives you full control over each serving process. Both work equally well with ChunkHound as long as llm.model is set explicitly.