Local AI | Intermediate | 25 min to complete | 14 min read

How to Set Up AnythingLLM with Ollama (2026 Guide)

Install AnythingLLM Desktop or Docker, connect Ollama, and set up a local RAG workspace. Covers the Ollama URL fix, nomic-embed-text embedding, and agents.

By Amara | Updated 17 May 2026

AnythingLLM is an open-source app that wraps RAG, workspaces, agents, and multi-user access around your local model setup. It passed 54,000 GitHub stars in early 2026. The difference from Open-WebUI is simple: Open-WebUI is a chat frontend, while AnythingLLM is built around document Q&A. You upload files and query them privately, with nothing leaving your machine.

The setup needs two things from Ollama: a chat model and an embedding model. The embedding model converts your documents into searchable vectors at upload time. Skip it and document uploads fail quietly, which is confusing.

The other common failure is the Ollama URL. Inside a Docker container, `localhost` refers to the container, not the host machine where Ollama runs. One wrong URL in the connection settings and the whole thing looks broken. This guide covers that fix specifically, along with both installation methods, workspace setup, and agents.

Prerequisites

Ollama installed and running (follow the Ollama setup guide if you have not set it up yet)
A chat model pulled in Ollama (recommended: llama3.3:8b or mistral:7b)
The nomic-embed-text embedding model pulled in Ollama (required for document Q&A)
Docker Engine 24.x+ installed (for the Docker method only)
Port 3001 free on your machine
4 GB free disk space for the AnythingLLM Docker image
8 GB RAM minimum when running a 7B chat model alongside AnythingLLM
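
The port requirement above can be checked before you install anything. This is a minimal sketch, assuming Python 3 is available on the host; `port_is_free` is a hypothetical helper name, not something shipped with AnythingLLM or Docker.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 only when a listener accepts the connection.
        return probe.connect_ex((host, port)) != 0


if __name__ == "__main__":
    # 3001 is the port this guide uses for AnythingLLM.
    print("Port 3001 free:", port_is_free(3001))
```

If this reports the port as busy, stop the process using it or map a different host port when you start the container.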

Install AnythingLLM

AnythingLLM offers two installation paths. The Desktop App is the fastest option for personal use on a single machine. Docker is the right choice for server deployments, always-on access, or when you want multiple users sharing the same instance.

Desktop App (Recommended for Single Users)

The Desktop App bundles everything into a standalone installer. No Docker or Node.js required.

Download the installer for your operating system from the official docs:

OS	File	Download
Windows	AnythingLLMDesktop.exe	docs.useanything.com/installation/desktop/windows
macOS (Apple Silicon)	AnythingLLM.dmg	docs.useanything.com/installation/desktop/mac
Linux	AnythingLLM.AppImage	docs.useanything.com/installation/desktop/linux

After installation, launch the app. It opens a browser window at `http://localhost:3001` automatically. The Desktop App includes its own embedded vector database and storage, so no external configuration is required for the app itself.

Docker (Recommended for Servers and Multi-User Access)

Create a directory for persistent storage, then run the official image:

# Create a storage directory on the host
mkdir -p ~/anythingllm/storage

# Run AnythingLLM
docker run -d \
  -p 3001:3001 \
  --cap-add SYS_ADMIN \
  -v ~/anythingllm/storage:/app/server/storage \
  -e STORAGE_DIR=/app/server/storage \
  --name anythingllm \
  --restart unless-stopped \
  mintplexlabs/anythingllm

The image is approximately 2 GB. Wait for the download and startup to complete, then verify:

docker ps
# Expected output:
# CONTAINER ID   IMAGE                        STATUS         PORTS
# a1b2c3d4e5f6   mintplexlabs/anythingllm    Up 2 minutes   0.0.0.0:3001->3001/tcp

Open `http://localhost:3001` in your browser.

ℹ️

Note: The `--cap-add SYS_ADMIN` flag is required. AnythingLLM uses Chromium under the hood for some document parsing operations, and Chromium requires this capability to run in a container.

Docker Compose (Optional)

If you prefer to manage everything in one file, create a `docker-compose.yml`:

yaml

version: '3.8'

volumes:
  anythingllm_storage:

services:
  anythingllm:
    image: mintplexlabs/anythingllm
    restart: unless-stopped
    ports:
      - "3001:3001"
    cap_add:
      - SYS_ADMIN
    volumes:
      - anythingllm_storage:/app/server/storage
    environment:
      - STORAGE_DIR=/app/server/storage

Start it with:

docker compose up -d

Pull Required Ollama Models

AnythingLLM needs two types of models from Ollama: a chat model that generates answers, and an embedding model that converts your documents into vectors for search.

Chat Models

Pull one of these depending on your hardware. For a deeper comparison of model quality and hardware requirements, see the best local LLM models guide.

Model	Pull Command	Disk Size	RAM Required	Best For
llama3.3:8b	`ollama pull llama3.3:8b`	4.9 GB	8 GB	General use, balanced quality
mistral:7b	`ollama pull mistral:7b`	4.1 GB	8 GB	Faster inference, lower RAM
qwen2.5:7b	`ollama pull qwen2.5:7b`	4.7 GB	8 GB	Coding and multilingual docs
phi4	`ollama pull phi4`	9.1 GB	16 GB	Strong reasoning on technical docs

Embedding Model (Required for Document Q&A)

The embedding model converts your uploaded documents into vectors that AnythingLLM searches at query time. Without it, document uploads fail silently.

# Pull the embedding model
ollama pull nomic-embed-text

# Expected output:
# pulling manifest
# pulling 970aa74c0a90... 100% 274 MB
# verifying sha256 digest
# success

`nomic-embed-text` is 274 MB and runs on CPU without GPU memory. It is the standard embedding model for Ollama-based setups and works well for English and multilingual documents.
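
To make "vectors that AnythingLLM searches at query time" concrete, here is a toy sketch of similarity search over stored chunks. It is illustrative only: the three-dimensional vectors are made up (real embeddings have hundreds of dimensions), `cosine_similarity` and `top_k` are hypothetical names, and AnythingLLM's actual vector database may rank results differently.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the text of the k stored chunks most similar to the query vector."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # Made-up vectors standing in for embeddings of three document chunks.
    store = [
        ("Invoices are due in 30 days.", [0.9, 0.1, 0.0]),
        ("The office closes at 6 pm.", [0.0, 0.2, 0.9]),
        ("Late payments incur a 2% fee.", [0.8, 0.3, 0.1]),
    ]
    # A made-up query vector that points roughly toward the payment chunks.
    print(top_k([1.0, 0.2, 0.0], store, k=2))
```

The chunks returned by a search like this are what end up in the model's context. Without an embedding model there are no vectors to compare, so there is nothing to retrieve.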

Verify Both Models Are Available

ollama list
# Expected output (your versions may differ):
# NAME                    ID              SIZE    MODIFIED
# llama3.3:8b             a6eb4748fd29    4.9 GB  2 minutes ago
# nomic-embed-text:latest 0a109f422b47    274 MB  1 minute ago

Both models must appear in this list before you configure AnythingLLM.
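
The same check can be scripted against Ollama's `/api/tags` endpoint, which lists the models available locally. This is a minimal sketch, assuming Ollama is on its default port; `fetch_tags` and `missing_models` are illustrative helper names, and the model names are the ones used in this guide.

```python
import json
import urllib.request


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the list of local models from Ollama's /api/tags endpoint."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


def missing_models(tags_response: dict, required: list[str]) -> list[str]:
    """Return the required model names that are absent from a /api/tags response."""
    installed = {model["name"] for model in tags_response.get("models", [])}
    # Treat "name" and "name:latest" as the same model.
    installed |= {name.removesuffix(":latest") for name in installed}
    return [name for name in required if name not in installed]


if __name__ == "__main__":
    try:
        gaps = missing_models(fetch_tags(), ["llama3.3:8b", "nomic-embed-text"])
        print("Missing models:", gaps or "none")
    except OSError as exc:
        print("Could not reach Ollama:", exc)
```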

Connect AnythingLLM to Ollama

The Ollama connection URL is the most common source of setup failures. The correct URL depends on how AnythingLLM is running.

Which URL to Use

AnythingLLM Installation	Ollama Location	URL to Enter
Desktop App	Same machine	`http://localhost:11434`
Docker (Windows or macOS)	Same host machine	`http://host.docker.internal:11434`
Docker (Linux)	Same host machine	`http://172.17.0.1:11434`
Docker	Another server	`http://<server-ip>:11434`

ℹ️

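Note: If AnythingLLM is running in Docker and you enter `http://localhost:11434`, the connection will fail. Inside a Docker container, `localhost` refers to the container itself, not the host machine where Ollama is running. Use `host.docker.internal` on Windows and macOS, or `172.17.0.1` on Linux. On Linux, Ollama listens only on `127.0.0.1` by default, so it must also be configured to accept connections from the Docker bridge (for example, by setting `OLLAMA_HOST=0.0.0.0` for the Ollama service) before `172.17.0.1` will respond.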
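
The URL rules above reduce to a small lookup. The sketch below only restates the table; `ollama_base_url` and its argument values (`"desktop"`, `"docker"`, `"windows"`, `"macos"`, `"linux"`) are hypothetical names chosen for illustration.

```python
OLLAMA_PORT = 11434


def ollama_base_url(install: str, host_os: str = "", remote_host: str = "") -> str:
    """Pick the Ollama base URL for a given AnythingLLM deployment."""
    if remote_host:
        # Ollama runs on another server: address it directly.
        return f"http://{remote_host}:{OLLAMA_PORT}"
    if install == "desktop":
        # The Desktop App shares the host's network, so localhost works.
        return f"http://localhost:{OLLAMA_PORT}"
    if install == "docker" and host_os in ("windows", "macos"):
        return f"http://host.docker.internal:{OLLAMA_PORT}"
    if install == "docker" and host_os == "linux":
        # 172.17.0.1 is the address this guide uses for the Docker bridge.
        return f"http://172.17.0.1:{OLLAMA_PORT}"
    raise ValueError(f"unknown combination: install={install!r}, host_os={host_os!r}")


if __name__ == "__main__":
    print(ollama_base_url("docker", "linux"))
```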

Setup Wizard Steps

The first time you open `http://localhost:3001`, a setup wizard walks you through the configuration:

1. Create an admin account (username and password)
2. On the LLM configuration screen, select "Ollama" from the provider list
3. Enter the Ollama base URL for your deployment (see table above)
4. Select your chat model from the dropdown (it fetches the list from Ollama automatically)
5. On the embedding configuration screen, select "Ollama" again
6. Select `nomic-embed-text:latest` as the embedding model
7. Complete the wizard

Verify the Connection After Setup

In the AnythingLLM interface, go to Settings (gear icon) > LLM Provider. The status indicator next to the Ollama URL should show green. If it shows red, the URL is wrong or Ollama is not running.

Test Ollama directly to confirm it is reachable:

# If using Desktop App or checking from the host machine
curl http://localhost:11434
# Expected: Ollama is running

# If using Docker on Linux, test the bridge IP
curl http://172.17.0.1:11434
# Expected: Ollama is running

If Ollama is not running, start it:

# Linux (systemd service)
sudo systemctl start ollama

# macOS or manual start
ollama serve
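
If you want to script the reachability check rather than run `curl` by hand, the sketch below does the same thing: it requests the base URL and looks for the "Ollama is running" banner. `http_get_text` and `ollama_is_running` are illustrative names, and the `fetch` parameter exists only so the check can be exercised without a live server.

```python
import urllib.request


def http_get_text(url: str, timeout: float = 5.0) -> str:
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def ollama_is_running(base_url: str, fetch=http_get_text) -> bool:
    """Return True if base_url answers with Ollama's status banner."""
    try:
        return "Ollama is running" in fetch(base_url)
    except OSError:
        # URLError, timeouts, and refused connections are all OSError subclasses.
        return False


if __name__ == "__main__":
    url = "http://localhost:11434"
    print(url, "->", "reachable" if ollama_is_running(url) else "unreachable")
```

Run it from the same place AnythingLLM runs. A check that passes on the host says nothing about what a container can reach.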

Create a Workspace and Upload Documents

Workspaces are the core concept in AnythingLLM. Each workspace is a separate document container with its own vector store, chat history, and model settings. Documents uploaded to one workspace are not visible to other workspaces. If you came from Open-WebUI, think of workspaces as separate chat sessions that each have their own private document library.

Create Your First Workspace

1. Click the "+" button or "New Workspace" in the left sidebar
2. Enter a name for the workspace (for example, "Research Notes" or "Company Docs")
3. Click Create

Upload Documents

Click the paperclip icon in the chat input area, or use the Document Manager (the folder icon in the sidebar). Supported file types include:

PDF, DOCX, TXT, Markdown, HTML
CSV, XLSX, PPTX
JSON files and most code file types (.py, .js, .ts, .go, etc.)
GitHub repository URLs (imports the full repo content)
YouTube video URLs (imports the transcript)

After selecting files, AnythingLLM processes each document and shows a progress indicator. Processing time depends on file size. A 100-page PDF typically takes 10-30 seconds on a modern CPU.

Processing: research-paper.pdf
  Chunking document...
  Embedding 47 chunks with nomic-embed-text...
  Done. 47 vectors stored.
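
The "chunking" step in that output splits the document into pieces small enough to embed one at a time. Here is a rough sketch of the idea using fixed-size character windows with overlap. The function name and the default sizes are assumptions for illustration; AnythingLLM's own splitter and settings may differ.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            # The last window already reaches the end of the text.
            break
    return chunks


if __name__ == "__main__":
    sample = "AnythingLLM splits documents into chunks before embedding them. " * 40
    pieces = chunk_text(sample)
    print(len(pieces), "chunks; first chunk has", len(pieces[0]), "characters")
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, provided it is no longer than the overlap. Each chunk becomes one stored vector.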

Chat Mode vs Query Mode

Each workspace has two response modes, controlled by the toggle in the workspace settings:

Mode	Behavior	Best For
Chat	Uses uploaded documents as context alongside the model's general knowledge	Mixed document + general Q&A
Query	Returns answers only from uploaded documents; refuses off-topic questions	Strict document retrieval

Switch modes via the settings icon inside the workspace. For research and document Q&A, Query mode gives more focused answers and reduces the risk of the model hallucinating information that is not in your files.
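
The difference between the two modes can be pictured as a small decision made before the model is called. This is a conceptual sketch only, not AnythingLLM's implementation: `plan_response`, the prompt layout, and the refusal text are all made up for illustration.

```python
# Placeholder refusal text; the wording is an assumption, not AnythingLLM's message.
REFUSAL = "No relevant information was found in this workspace."


def plan_response(question: str, chunks: list[str], mode: str) -> tuple[str, str]:
    """Return ("refuse", message) or ("prompt", text to send to the model)."""
    if mode not in ("chat", "query"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "query" and not chunks:
        # Query mode answers only from retrieved document chunks.
        return ("refuse", REFUSAL)
    if chunks:
        context = "\n\n".join(chunks)
        return ("prompt", f"Context:\n{context}\n\nQuestion: {question}")
    # Chat mode with no matching chunks falls back to the model's general knowledge.
    return ("prompt", f"Question: {question}")


if __name__ == "__main__":
    print(plan_response("What is the capital of France?", [], "query"))
    print(plan_response("What is the capital of France?", [], "chat"))
```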

Test the Setup

After uploading a document, ask a question about its content in the chat input. A working RAG response includes a "Sources" section below the answer showing which document chunks were retrieved.

ℹ️

Note: If the model answers without citing any sources, the embedding step may have failed silently. Check Settings > Embedding Provider to confirm the embedding model is set to `nomic-embed-text`.

Enable Agents and MCP Tools

AnythingLLM includes an agent system that lets the model take actions beyond answering questions. Agents can browse the web, run code, call external APIs, and use MCP (Model Context Protocol) servers.

Activate an Agent

Type `@agent` at the start of any chat message to activate agent mode in that conversation:

@agent Search the web for the latest AI papers from May 2026 and summarize the top 3.

The agent works through the task step by step, showing each action in the chat window. Default built-in agent capabilities include:

Web browsing and URL fetching
Basic calculation and data analysis
Reading documents already in the workspace

ℹ️

Note: Web browsing requires an internet connection. If AnythingLLM is running on an air-gapped machine, web-based agent actions will fail. Document-based agents work fully offline.

Enable or Disable Agent Capabilities

Go to Settings > Agents to see which capabilities are active. You can toggle individual capabilities on or off. For example, disable web browsing if you want agents restricted to local documents only.

Add MCP Servers

AnythingLLM has built-in MCP support as of 2026. To add an MCP server:

1. Go to Settings > Agent Tools > MCP Servers
2. Click "Add MCP Server"
3. Paste the server configuration in JSON format

Example: adding a local filesystem MCP server:

json

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/yourname/Documents"
      ]
    }
  }
}

4. Click Save and restart AnythingLLM if prompted
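
If you manage several MCP servers, it can be less error-prone to generate the JSON than to hand-edit it. The sketch below builds the same structure as the filesystem example above; `mcp_server_config` is a hypothetical helper, and the path is the same placeholder used in that example.

```python
import json


def mcp_server_config(name: str, command: str, args: list[str]) -> dict:
    """Build a single-server config in the same shape as the example above."""
    return {"mcpServers": {name: {"command": command, "args": list(args)}}}


if __name__ == "__main__":
    config = mcp_server_config(
        "filesystem",
        "npx",
        ["-y", "@modelcontextprotocol/server-filesystem", "/Users/yourname/Documents"],
    )
    # Paste the printed JSON into the MCP server configuration field.
    print(json.dumps(config, indent=2))
```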

After adding a server, the agent can use its tools automatically when relevant. You can also call a specific tool by naming it in your `@agent` message. For programmatic access to Ollama models outside the UI, see the Ollama Python guide.

Update AnythingLLM

For Docker deployments, pull the latest image and restart:

docker pull mintplexlabs/anythingllm
docker stop anythingllm
docker rm anythingllm

# Re-run with the same flags as the original install
docker run -d \
  -p 3001:3001 \
  --cap-add SYS_ADMIN \
  -v ~/anythingllm/storage:/app/server/storage \
  -e STORAGE_DIR=/app/server/storage \
  --name anythingllm \
  --restart unless-stopped \
  mintplexlabs/anythingllm

Your documents, chat history, and settings are preserved in the `~/anythingllm/storage` directory. The Docker container itself is stateless.
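
Because an update means re-running the container with identical flags, it helps to keep those flags in one place. This is a small sketch, assuming you want to generate the command rather than retype it; `docker_run_args` is a hypothetical helper that only assembles the arguments shown above and does not call Docker itself.

```python
import os
import shlex

CONTAINER_STORAGE = "/app/server/storage"


def docker_run_args(storage_dir: str, port: int = 3001,
                    name: str = "anythingllm",
                    image: str = "mintplexlabs/anythingllm") -> list[str]:
    """Assemble the docker run command used in this guide as an argument list."""
    return [
        "docker", "run", "-d",
        "-p", f"{port}:3001",
        "--cap-add", "SYS_ADMIN",
        "-v", f"{storage_dir}:{CONTAINER_STORAGE}",
        "-e", f"STORAGE_DIR={CONTAINER_STORAGE}",
        "--name", name,
        "--restart", "unless-stopped",
        image,
    ]


if __name__ == "__main__":
    args = docker_run_args(os.path.expanduser("~/anythingllm/storage"))
    # Print a copy-pasteable command line; nothing is executed here.
    print(shlex.join(args))
```

Generating the command from a single function keeps the install and update steps from drifting apart, for example if you change the host port or the storage path.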

For the Desktop App, check for updates from the Help menu inside the app.

Troubleshooting

"Ollama is not available" or empty model dropdown

Cause: AnythingLLM cannot reach the Ollama API. In Docker deployments, `localhost` in the URL resolves to the container itself, not the host machine.

Fix: Change the Ollama URL in Settings > LLM Provider. Use `http://host.docker.internal:11434` on Windows or macOS. Use `http://172.17.0.1:11434` on Linux. Use `http://localhost:11434` only for the Desktop App.

Document upload completes but queries return no sources

Cause: The embedding model is not configured or nomic-embed-text is not pulled in Ollama.

Fix: Run `ollama pull nomic-embed-text` on the host running Ollama. Then go to Settings > Embedding Provider in AnythingLLM, select Ollama, and choose nomic-embed-text:latest. Re-upload the documents to re-embed them.

Document embedding hangs or crashes partway through

Cause: Insufficient RAM when embedding a large file, or a corrupt PDF that Chromium cannot parse.

Fix: Split large PDFs into smaller files (under 50 pages each) before uploading. For corrupt files, convert to plain text first using a tool like pdftotext. Check Docker container logs: `docker logs anythingllm` for specific error messages.
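
To plan the split, you only need the page ranges. The sketch below computes them for the 50-page limit suggested above; `page_ranges` is a hypothetical helper, and it does not touch the PDF itself (you would still need a PDF tool to cut the file at these boundaries).

```python
def page_ranges(total_pages: int, max_pages: int = 50) -> list[tuple[int, int]]:
    """Return 1-based inclusive (first, last) page ranges of at most max_pages each."""
    if total_pages < 0 or max_pages <= 0:
        raise ValueError("total_pages must be >= 0 and max_pages must be > 0")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


if __name__ == "__main__":
    # A 120-page PDF becomes three files.
    for first, last in page_ranges(120):
        print(f"pages {first}-{last}")
```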

@agent command is not recognized or does nothing

Cause: Agent mode is disabled or no agent provider is configured.

Fix: Go to Settings > Agents and confirm agent mode is enabled. The agent uses the same LLM configured under Settings > LLM Provider. Make sure a chat model is selected and the Ollama connection is working.

AnythingLLM container exits immediately on start

Cause: The `--cap-add SYS_ADMIN` flag is missing, or the storage volume path does not exist.

Fix: Confirm the docker run command includes `--cap-add SYS_ADMIN`. Create the storage directory before running: `mkdir -p ~/anythingllm/storage`. Check logs with `docker logs anythingllm` to see the specific exit reason.

Alternatives to Consider

Tool	Type	Price	Best For
Open-WebUI	Self-hosted	Free	A clean ChatGPT-style interface for Ollama with conversation history and model switching. No RAG or agents, but easier to set up.
PrivateGPT	Self-hosted	Free	API-first RAG server for developers who want to integrate document Q&A into their own applications rather than use a web UI.
LibreChat	Self-hosted	Free	Multi-model chat with support for OpenAI, Anthropic, Ollama, and 10+ other providers in one interface. Better for multi-model comparison than document RAG.
Flowise	Self-hosted	Free	Visual drag-and-drop builder for RAG pipelines and AI agents. More flexible than AnythingLLM for custom flows but requires more configuration.

Frequently Asked Questions

Is AnythingLLM free to use?

The self-hosted version is completely free and open-source under the MIT license. The Desktop App is also free to download and use. Mintplex Labs offers a paid cloud version (AnythingLLM Cloud) for teams that do not want to manage their own server, but self-hosting has no cost beyond your hardware or VPS.

What is the difference between AnythingLLM and Open-WebUI?

Open-WebUI is a chat frontend for Ollama. It provides a clean ChatGPT-style interface with model switching and conversation history, but it does not have built-in RAG, workspaces, or multi-user management beyond basic accounts.

AnythingLLM adds document upload and Q&A (RAG), isolated workspaces per project or team, AI agents that can browse the web and call tools, and role-based user management. If you only need to chat with local models, Open-WebUI is simpler to set up. If you need to query your own documents privately, AnythingLLM is the better fit.

Does AnythingLLM work without an internet connection?

Yes, the core functionality works fully offline. Chat, document upload, embedding, and Q&A all run locally using Ollama models. The only features that require internet are agent web browsing and any external MCP server that fetches data from the web.

To use AnythingLLM in an air-gapped environment, pull all required Ollama models while you have internet access, then disconnect. The models stay on disk and do not need to be re-downloaded.

What file types does AnythingLLM support?

AnythingLLM supports PDF, DOCX, TXT, Markdown (.md), HTML, CSV, XLSX, PPTX, and JSON files. It also supports over 50 code file types including .py, .js, .ts, .go, .java, and .cpp.

Beyond local files, you can import content from GitHub repository URLs (it clones and indexes the full repo) and YouTube video URLs (it fetches the transcript). The document processor uses Chromium for HTML and PDF rendering, which is why the Docker image requires the SYS_ADMIN capability.

What is a workspace in AnythingLLM?

A workspace is an isolated document container with its own vector store and chat history. Documents uploaded to one workspace are invisible to other workspaces. This lets you maintain separate contexts for different projects or clients without them interfering with each other.

Each workspace has its own settings including response mode (Chat or Query), context window configuration, and which model to use if you want different workspaces to run different models.

Which embedding model should I use with Ollama?

nomic-embed-text is the standard recommendation for Ollama-based AnythingLLM setups. It is 274 MB, runs on CPU without requiring GPU memory, handles documents in most languages, and produces strong retrieval quality for general content.

If you have RAM to spare and want better multilingual support, mxbai-embed-large (670 MB) or snowflake-arctic-embed:l (1.2 GB) are alternatives available in the Ollama library. Pull one with, for example, `ollama pull mxbai-embed-large`, then select it in AnythingLLM's embedding settings.

Can multiple users share one AnythingLLM instance?

Yes, multi-user mode is available in the Docker deployment. Go to Settings > Multi-User Mode to enable it. You can create accounts with three roles: Admin (full access), Manager (can create workspaces and invite users), and Default (can only access workspaces assigned to them).

The Desktop App is designed for single-user use only. For teams, deploy AnythingLLM via Docker on a server and expose it through an Nginx reverse proxy with SSL.

How do I activate and use agents in AnythingLLM?

Type `@agent` at the beginning of a chat message in any workspace. The model enters agent mode and works through the task step by step. Default capabilities include web browsing, calculation, and workspace document access.

To configure or disable specific capabilities, go to Settings > Agents. To add external tools, go to Settings > Agent Tools > MCP Servers and add a server configuration in JSON format. Agents require a chat model that supports tool use. Llama 3.3 8B, Qwen2.5 7B, and Mistral 7B all support it.

How much RAM does AnythingLLM need?

AnythingLLM itself uses around 512 MB of RAM for the Node.js process and vector database. The real memory requirement comes from the Ollama models running alongside it.

A typical setup with a 7B or 8B chat model (llama3.3:8b or mistral:7b) requires 7-8 GB of RAM for the model alone, plus 512 MB for AnythingLLM, plus OS overhead. A machine with 16 GB RAM handles this comfortably. For 13B models, treat 16 GB RAM as the minimum.
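
The arithmetic is simple enough to sketch. The model and AnythingLLM figures come from the paragraph above. The 4 GB allowance for the operating system is an assumption made for illustration, and `estimate_ram_gb` is a hypothetical helper.

```python
def estimate_ram_gb(model_gb: float, app_gb: float = 0.5, os_gb: float = 4.0) -> float:
    """Rough total RAM: model + AnythingLLM (~512 MB) + OS overhead (assumed 4 GB)."""
    return model_gb + app_gb + os_gb


if __name__ == "__main__":
    total = estimate_ram_gb(8.0)  # upper end of the 7-8 GB range for a 7B model
    print(f"Estimated total: {total} GB; fits in 16 GB: {total <= 16}")
```

With the upper-end figure of 8 GB for the model, the estimate comes to 12.5 GB, which is why 16 GB leaves comfortable headroom.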

How do I update AnythingLLM to the latest version?

For Docker deployments, pull the latest image and recreate the container. Your data is safe in the named volume or host directory mount, so the update is non-destructive:

docker pull mintplexlabs/anythingllm
docker stop anythingllm && docker rm anythingllm
# Re-run the original docker run command

For the Desktop App, open the app and check Help > Check for Updates. The updater downloads and installs the new version automatically while preserving your data.

Related Guides

Beginner | 20 min

How to Run Ollama Locally: Complete Setup Guide (2026)

Beginner | 15 min

How to Set Up Open-WebUI with Ollama (Docker Guide)

Beginner | 10 min

Best Local LLM Models to Run in 2026 (Benchmarks + Use Cases)

Intermediate | 35 min

How to Run OpenClaw with Ollama Local Models (2026 Guide)

Intermediate | 25 min

How to Use Ollama with Python: API Integration Tutorial (2026)