File Palaces

Settings & LLM Configuration

File Palaces routes all LLM calls through LiteLLM, which means you can swap providers without changing any code. This page covers every setting available in the Settings modal.
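LiteLLM addresses models with `provider/model` strings, which is what makes provider swapping code-free. The mapping below is a hypothetical sketch of the translation the app might perform, not File Palaces' actual code:

```python
# Hypothetical sketch: translating a provider + model setting into the
# "provider/model" identifier LiteLLM expects. The mapping is an
# assumption for illustration, not File Palaces' actual implementation.

def litellm_model_string(provider: str, model: str) -> str:
    """Build a LiteLLM-style model identifier from the Settings values."""
    prefixes = {
        "ollama": "ollama",
        "openai": "openai",
        "anthropic": "anthropic",
        "openai-compat": "openai",  # custom endpoints speak the OpenAI API
    }
    if provider not in prefixes:
        raise ValueError(f"unknown provider: {provider}")
    return f"{prefixes[provider]}/{model}"
```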

Opening Settings

  • Click the gear icon in the top-right of the app toolbar, or
  • Press Ctrl+, (Windows/Linux) / Cmd+, (macOS)

LLM Provider

Ollama (recommended for local use)

Ollama runs models entirely on your machine — no API key, no internet connection required.

Setup:

  1. Install Ollama from ollama.com
  2. Pull a model:
    ollama pull llama3.2
    # or for a smaller model:
    ollama pull qwen2.5:3b
    
  3. In File Palaces Settings, set Provider to ollama
  4. Set Model to the model name you pulled (e.g. llama3.2)
  5. Leave Base URL blank (defaults to http://localhost:11434)
OLLAMA SETUP WIZARD

If Ollama is not installed, File Palaces shows an Ollama Setup Wizard on first launch that guides you through installation and model selection. You can also trigger it from Settings → Provider → Ollama → Launch Setup Wizard.
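Ollama exposes a local HTTP API, and `GET /api/tags` lists the models you have pulled. A small sketch for checking what's available before picking a model name in Settings (the response shape shown in the docstring is my assumption of Ollama's format):

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # File Palaces' default Base URL

def parse_model_names(tags_json: str) -> list[str]:
    """Extract model names from an /api/tags response body.
    Assumes a shape like {"models": [{"name": "llama3.2:latest"}, ...]}."""
    data = json.loads(tags_json)
    return [m["name"] for m in data.get("models", [])]

def pulled_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Query a running Ollama server for locally pulled models."""
    with urlopen(f"{base_url}/api/tags") as resp:
        return parse_model_names(resp.read().decode())
```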

Recommended models by use case:

| Use case | Model |
| --- | --- |
| Fast answers, low RAM | qwen2.5:3b, phi4-mini |
| Balanced (8 GB+ RAM) | llama3.2, mistral |
| Best quality (16 GB+ RAM) | llama3.1:70b, qwen2.5:72b |
| Code-heavy documents | qwen2.5-coder, deepseek-coder-v2 |

OpenAI

  1. Set Provider to openai
  2. Set Model to your preferred model, e.g. gpt-4o, gpt-4o-mini, o3-mini
  3. Enter your API Key (stored locally in config.json, never sent to File Palaces servers)

Anthropic

  1. Set Provider to anthropic
  2. Set Model to e.g. claude-sonnet-4-5, claude-haiku-4-5, claude-opus-4
  3. Enter your API Key

OpenAI-compatible (Custom endpoint)

Use this for any server that implements the OpenAI chat completions API — LM Studio, Together AI, Groq, a local vLLM instance, etc.

  1. Set Provider to openai-compat
  2. Set Base URL to your server's base URL, e.g. http://localhost:1234/v1
  3. Set Model to the model identifier expected by your server
  4. Enter an API Key if your server requires one (or any non-empty string if it doesn't)
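OpenAI-compatible servers accept a POST to `{base_url}/chat/completions`. The helper below sketches the request shape implied by the steps above (the endpoint path and payload follow the OpenAI chat completions convention; the function itself is illustrative, not the app's code):

```python
import json
from urllib.request import Request

def build_chat_request(base_url: str, model: str, api_key: str,
                       messages: list[dict]) -> Request:
    """Assemble an OpenAI-style chat completions request.
    base_url should include the /v1 suffix, e.g. http://localhost:1234/v1."""
    payload = {"model": model, "messages": messages}
    return Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Any non-empty string works if the server ignores auth
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```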

Search settings

| Setting | Default | Description |
| --- | --- | --- |
| Top-K | 8 | Number of chunks retrieved per query. Higher = more context, but slower and more tokens used. |
| Similarity threshold | 0.3 | Minimum cosine similarity for a chunk to be included. Lower = more permissive. |
TIP

If answers are missing information you know is in your documents, try increasing Top-K to 12 or 16 and lowering the threshold to 0.2.
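The two settings combine as a filter-then-truncate step over similarity-ranked chunks. A sketch of the selection logic (not the app's actual retrieval code):

```python
def select_chunks(scored, top_k: int = 8, threshold: float = 0.3):
    """Keep chunks whose cosine similarity meets the threshold,
    then return at most top_k of them, best first.
    `scored` is a list of (chunk_text, similarity) pairs."""
    kept = [(text, sim) for text, sim in scored if sim >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

Raising Top-K widens the cut-off after sorting; lowering the threshold lets weaker matches survive the filter, which is why the tip above adjusts both together.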

System prompt

The system prompt is prepended to every chat session. Use it to:

  • Set the tone or persona ("You are a concise legal analyst…")
  • Restrict the response format ("Always reply in bullet points")
  • Add domain-specific context ("The documents are internal engineering RFCs")

The default system prompt instructs the LLM to cite sources and stay grounded in the retrieved context.
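"Prepended" means the prompt travels as the first message of every request. A sketch, assuming the usual role-based message format:

```python
def build_messages(system_prompt: str, history: list[dict]) -> list[dict]:
    """Place the system prompt ahead of the chat history on every request."""
    return [{"role": "system", "content": system_prompt}] + history
```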

Config file

All settings are persisted to config.json in the platform app data directory:

| Platform | Path |
| --- | --- |
| Windows | %APPDATA%\File Palaces\config.json |
| macOS | ~/Library/Application Support/FilePalaces/config.json |
| Linux | ~/.local/share/FilePalaces/config.json |

You can edit this file directly; the sidecar reads it on startup. Do not commit this file to version control — it may contain API keys.
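If you script against the config file, the per-platform paths above can be resolved like this (a sketch based on the table, not the app's own code):

```python
import os
import sys
from pathlib import Path

def config_path(platform: str = sys.platform) -> Path:
    """Resolve config.json per the platform table above."""
    if platform.startswith("win"):
        return Path(os.environ["APPDATA"]) / "File Palaces" / "config.json"
    if platform == "darwin":
        return Path.home() / "Library" / "Application Support" / "FilePalaces" / "config.json"
    return Path.home() / ".local" / "share" / "FilePalaces" / "config.json"
```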