# Settings & LLM Configuration
File Palaces routes all LLM calls through LiteLLM, which means you can swap providers without changing any code. This page covers every setting available in the Settings modal.
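Under the hood, LiteLLM normalizes every provider behind a single `completion()` call. The sketch below (illustrative, not File Palaces' actual code) shows why switching providers is just a settings change:

```python
from litellm import completion

messages = [{"role": "user", "content": "Summarize my meeting notes."}]

# Local Ollama model: no API key, talks to the default local server
resp = completion(
    model="ollama/llama3.2",
    messages=messages,
    api_base="http://localhost:11434",
)

# Hosted OpenAI model: only the model string (plus an OPENAI_API_KEY
# environment variable) changes; the calling code stays identical
resp = completion(model="gpt-4o-mini", messages=messages)

print(resp.choices[0].message.content)
```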
## Opening Settings
- Click the gear icon in the top-right of the app toolbar, or
- Press `Ctrl+,` (Windows/Linux) / `Cmd+,` (macOS)
## LLM Provider

### Ollama (recommended for local use)
Ollama runs models entirely on your machine — no API key, no internet connection required.
Setup:

- Install Ollama from [ollama.com](https://ollama.com)
- Pull a model:

  ```bash
  ollama pull llama3.2   # or, for a smaller model: ollama pull qwen2.5:3b
  ```

- In File Palaces Settings, set Provider to `ollama`
- Set Model to the model name you pulled (e.g. `llama3.2`)
- Leave Base URL blank (defaults to `http://localhost:11434`)
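Before pointing File Palaces at Ollama, you can confirm the server and model are up. These are standard Ollama commands, not File Palaces ones:

```bash
# List the models installed locally
ollama list

# The HTTP API File Palaces connects to; should return your models as JSON
curl http://localhost:11434/api/tags
```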
If Ollama is not installed, File Palaces shows an Ollama Setup Wizard on first launch that guides you through installation and model selection. You can also trigger it from Settings → Provider → Ollama → Launch Setup Wizard.
Recommended models by use case:
| Use case | Model |
|---|---|
| Fast answers, low RAM | `qwen2.5:3b`, `phi4-mini` |
| Balanced (8 GB+ RAM) | `llama3.2`, `mistral` |
| Best quality (64 GB+ RAM) | `llama3.1:70b`, `qwen2.5:72b` |
| Code-heavy documents | `qwen2.5-coder`, `deepseek-coder-v2` |
### OpenAI
- Set Provider to `openai`
- Set Model to your preferred model, e.g. `gpt-4o`, `gpt-4o-mini`, `o3-mini`
- Enter your API Key (stored locally in `config.json`, never sent to File Palaces servers)
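If chat requests fail after saving, you can sanity-check the key outside the app using OpenAI's standard models endpoint:

```bash
# A 200 response with a model list means the key is valid
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```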
### Anthropic
- Set Provider to `anthropic`
- Set Model to e.g. `claude-sonnet-4-5`, `claude-haiku-4-5`, or `claude-opus-4`
- Enter your API Key
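As with OpenAI, the key can be verified outside the app using Anthropic's standard Messages API:

```bash
# A small test request; any non-error JSON reply means the key works
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-haiku-4-5", "max_tokens": 32,
       "messages": [{"role": "user", "content": "ping"}]}'
```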
### OpenAI-compatible (Custom endpoint)
Use this for any server that implements the OpenAI chat completions API — LM Studio, Together AI, Groq, a local vLLM instance, etc.
- Set Provider to `openai-compat`
- Set Base URL to your server's base URL, e.g. `http://localhost:1234/v1`
- Set Model to the model identifier expected by your server
- Enter an API Key if your server requires one (or any non-empty string if it doesn't)
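To check that the endpoint speaks the OpenAI protocol before saving, you can hit it directly (this uses the LM Studio default port from the example above; `your-model-id` is a placeholder):

```bash
# Most OpenAI-compatible servers list their models here
curl http://localhost:1234/v1/models

# Minimal chat completion against the same base URL
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "your-model-id",
       "messages": [{"role": "user", "content": "ping"}]}'
```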
## Search settings
| Setting | Default | Description |
|---|---|---|
| Top-K | 8 | Number of chunks retrieved per query. Higher = more context but slower and more tokens used. |
| Similarity threshold | 0.3 | Minimum cosine similarity for a chunk to be included. Lower = more permissive. |
If answers are missing information you know is in your documents, try increasing Top-K to 12 or 16 and lowering the threshold to 0.2.
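To make the interaction concrete, here is a minimal sketch of how these two settings typically combine during retrieval (illustrative only, not File Palaces' actual implementation):

```python
def select_chunks(scored_chunks, top_k=8, threshold=0.3):
    """scored_chunks: list of (chunk_text, cosine_similarity) pairs."""
    # The threshold is a hard floor: a chunk scoring 0.28 is dropped at the
    # default 0.3 even if fewer than top_k chunks qualify.
    eligible = [c for c in scored_chunks if c[1] >= threshold]
    # Of the survivors, only the top_k most similar are sent to the LLM.
    eligible.sort(key=lambda c: c[1], reverse=True)
    return eligible[:top_k]
```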
## System prompt
The system prompt is prepended to every chat session. Use it to:
- Set the tone or persona ("You are a concise legal analyst…")
- Restrict the response format ("Always reply in bullet points")
- Add domain-specific context ("The documents are internal engineering RFCs")
The default system prompt instructs the LLM to cite sources and stay grounded in the retrieved context.
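A complete example that combines all three uses (a starting point, not the built-in default):

```text
You are a concise analyst. Answer only from the provided context; if the
context does not contain the answer, say so explicitly. Cite the source
file for every claim, and reply in bullet points unless asked otherwise.
```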
## Config file
All settings are persisted to `config.json` in the platform-specific app data directory:
| Platform | Path |
|---|---|
| Windows | `%APPDATA%\File Palaces\config.json` |
| macOS | `~/Library/Application Support/FilePalaces/config.json` |
| Linux | `~/.local/share/FilePalaces/config.json` |
You can edit this file directly; the sidecar reads it on startup. Do not commit this file to version control — it may contain API keys.
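For orientation, a configuration might look roughly like the following. The key names below are illustrative guesses, not the confirmed schema, so check your own `config.json` before editing:

```json
{
  "provider": "ollama",
  "model": "llama3.2",
  "base_url": "http://localhost:11434",
  "api_key": "",
  "top_k": 8,
  "similarity_threshold": 0.3,
  "system_prompt": "You are a concise analyst..."
}
```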