Running HridaAI in offline mode 🔌

If you want to run HridaAI in offline mode, your installation approach determines which features will be available, so plan the features you need before you start. This guide covers the different ways to reach a setup that closely matches the online version.

What does offline mode mean?

The offline mode of HridaAI lets you run the application without the need for an active internet connection. This allows you to create an 'air-gapped' environment for your LLMs and tools (a fully 'air-gapped' environment requires isolating the instance from the internet).

info

Disabled functionality when offline mode is enabled:

Automatic version update checks (controlled by ENABLE_VERSION_UPDATE_CHECK)
Downloads of embedding models from Hugging Face Hub (controlled by HF_HUB_OFFLINE)
- If you did not download an embedding model prior to activating offline mode, RAG, web search and document analysis functionality will not work properly
Automatic model updates for embeddings, reranking, and Whisper models
Update notifications in the UI

Still functional:

External LLM API connections (OpenAI, etc.)
OAuth authentication providers
Web search and RAG with external APIs

How to enable offline mode?

Offline mode requires setting multiple environment variables to fully disconnect HridaAI from external network dependencies. The primary variables are:

Required Environment Variables:

OFFLINE_MODE=true - Disables version checks and prevents automatic model downloads
HF_HUB_OFFLINE=1 - Tells Hugging Face Hub to operate in offline mode, preventing all automatic downloads

Optional but Recommended:

RAG_EMBEDDING_MODEL_AUTO_UPDATE=false - Prevents automatic updates of embedding models
RAG_RERANKING_MODEL_AUTO_UPDATE=false - Prevents automatic updates of reranking models
WHISPER_MODEL_AUTO_UPDATE=false - Prevents automatic updates of Whisper models

Apply these environment variables depending on your deployment method.
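
As a rough illustration of the variables listed above, the following Python sketch checks an environment mapping against the required and recommended values. The helper name check_offline_env is hypothetical and not part of HridaAI, and the case-insensitive comparison is an assumption about how the values are parsed.

```python
import os

# Expected values come from the lists above; the helper itself is hypothetical.
REQUIRED = {"OFFLINE_MODE": "true", "HF_HUB_OFFLINE": "1"}
RECOMMENDED = {
    "RAG_EMBEDDING_MODEL_AUTO_UPDATE": "false",
    "RAG_RERANKING_MODEL_AUTO_UPDATE": "false",
    "WHISPER_MODEL_AUTO_UPDATE": "false",
}


def _mismatches(env, expected):
    """Describe every variable whose value differs from the expected one."""
    # Assumption: values are compared case-insensitively ("True" == "true").
    return [
        f"{name} should be {want!r}, found {env.get(name)!r}"
        for name, want in expected.items()
        if (env.get(name) or "").strip().lower() != want
    ]


def check_offline_env(env):
    """Return (errors, warnings) for required and recommended variables."""
    return _mismatches(env, REQUIRED), _mismatches(env, RECOMMENDED)


if __name__ == "__main__":
    errors, warnings = check_offline_env(os.environ)
    for line in errors:
        print("ERROR:", line)
    for line in warnings:
        print("WARNING:", line)
```

Running the script inside the deployment environment prints one line per variable that is missing or set to an unexpected value.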

Critical: HF_HUB_OFFLINE Behavior

When HF_HUB_OFFLINE=1 is set:

Downloads of models, sentence transformers, and other Hugging Face content will NOT WORK
RAG will not work on a default installation if this is enabled without pre-downloading models
Only pre-downloaded models in the correct cache directories will be accessible

This variable provides the strictest offline enforcement but requires careful preparation.
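
Before setting HF_HUB_OFFLINE=1, it can help to confirm that a model is really present in the cache directory. The sketch below assumes the standard Hugging Face Hub cache layout (models--ORG--NAME/snapshots/REVISION/), which is what snapshot_download produces when given a cache_dir. The function name cached_snapshots is hypothetical.

```python
from pathlib import Path


def cached_snapshots(cache_dir, repo_id):
    """List non-empty snapshot directories for repo_id in a Hugging Face style cache."""
    # The Hub cache stores "org/name" as a folder called "models--org--name".
    repo_dir = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return []
    # An empty snapshot folder means the download never completed.
    return sorted(p for p in snapshots.iterdir() if p.is_dir() and any(p.iterdir()))
```

For example, cached_snapshots("/app/backend/data/cache/embedding/models", "sentence-transformers/all-MiniLM-L6-v2") returning an empty list would suggest that RAG cannot load its embedding model once downloads are disabled.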

tip

Consider whether you need to start the application offline from the very beginning of your deployment. If your use case does not require immediate offline capability, follow Approach II for an easier setup.

Approach I

I: Speech-To-Text

The local Whisper installation does not include a model by default. If you plan to use an external model or provider instead, this step does not apply. To use the local Whisper application, you must first download the model of your choice (e.g. Hugging Face - Systran).

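from faster_whisper import WhisperModel

faster_whisper_kwargs = {
    "model_size_or_path": "Systran/faster-whisper-large-v3",
    # The model is loaded on this device after the download;
    # use "cpu" if the download machine has no CUDA-capable GPU.
    "device": "cuda",
    "compute_type": "int8",
    # Directory the model files are downloaded into.
    "download_root": "/path/of/your/choice",
}

# Instantiating the model downloads it into download_root if it is not there yet.
WhisperModel(**faster_whisper_kwargs)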

The contents of the download directory must be copied to /app/backend/data/cache/whisper/models/ within your HridaAI deployment. It is also a good idea to declare your Whisper model explicitly via the environment variable, like this: WHISPER_MODEL=Systran/faster-whisper-large-v3.
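
The copy step can be scripted. The following is a minimal sketch using only the Python standard library. The function name copy_model_files is hypothetical, and the target path in the usage note assumes the host directory is mounted into the deployment at the location named above.

```python
import shutil
from pathlib import Path


def copy_model_files(download_root, target_dir):
    """Copy everything inside download_root into target_dir, keeping the folder layout."""
    src = Path(download_root)
    dst = Path(target_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"download directory not found: {src}")
    # symlinks=True keeps links as links instead of duplicating the linked files;
    # dirs_exist_ok=True merges into an existing cache directory (Python 3.8+).
    shutil.copytree(src, dst, symlinks=True, dirs_exist_ok=True)
    # Return the files now present under target_dir for a quick visual check.
    return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file())
```

A call such as copy_model_files("/path/of/your/choice", "./hrida-ai-data/cache/whisper/models") would populate the host folder that backs /app/backend/data in the container.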

I: Text-To-Speech

The default local transformer can already handle the text-to-speech function. If you prefer a different approach, follow one of the guides.

I: Embedding Model

Several features, such as RAG, require an embedding model. You will first have to download a model of your choice (e.g. Hugging Face - sentence-transformers).

from huggingface_hub import snapshot_download

snapshot_download(repo_id="sentence-transformers/all-MiniLM-L6-v2", cache_dir="/path/of/your/choice")

The contents of the download directory must be copied to /app/backend/data/cache/embedding/models/ within your HridaAI deployment. It is also a good idea to declare your embedding model explicitly via the environment variable, like this: RAG_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2.

Approach II

Running HridaAI with internet connection during setup

This is the easiest approach to achieving the offline setup with almost all features available in the online version. Apply only the features you want to use for your deployment.

II: Embedding Model

In your HridaAI installation, navigate to Admin Settings > Settings > Documents and select the embedding model you would like to use (e.g. sentence-transformers/all-MiniLM-L6-v2). After the selection, click the download button next to it.

After you have installed all your desired features, set the environment variable OFFLINE_MODE=true using the method that matches your type of HridaAI deployment.

Sidenote

As previously mentioned, to achieve a fully offline experience with HridaAI, you must disconnect your instance from the internet. The offline mode only prevents errors within HridaAI when there is no internet connection.

How you disconnect your instance is your choice. Here is an example via docker-compose:

services:
  # requires a reverse-proxy
  hrida-ai:
    image: ghcr.io/hrida-ai/hrida-ai-studio:main
    restart: unless-stopped
    environment:
      # Core offline mode settings
      - OFFLINE_MODE=true
      - HF_HUB_OFFLINE=1
      
      # Disable automatic model updates
      - RAG_EMBEDDING_MODEL_AUTO_UPDATE=false
      - RAG_RERANKING_MODEL_AUTO_UPDATE=false
      - WHISPER_MODEL_AUTO_UPDATE=false
      
      # Specify pre-downloaded models
      - RAG_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
      - WHISPER_MODEL=Systran/faster-whisper-large-v3
    volumes:
      - ./hrida-ai-data:/app/backend/data
      - ./models/sentence-transformers/all-MiniLM-L6-v2:/app/backend/data/cache/embedding/models/
      - ./models/Systran/faster-whisper-large-v3:/app/backend/data/cache/whisper/models/
    networks:
      - hrida-ai-internal

networks:
  hrida-ai-internal:
    name: hrida-ai-internal-network
    driver: bridge
    internal: true
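
To check that the instance is actually isolated, a small connectivity probe can be run from inside the container. This is a sketch using only the Python standard library. The function name can_reach and the host in the usage note are illustrative choices, not part of HridaAI.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, unreachable networks and timeouts.
        return False
```

On an internal-only network such as the one defined above, a call like can_reach("huggingface.co", 443) should return False, while services on the same internal network should remain reachable.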

This content is for informational purposes only and does not constitute a warranty, guarantee, or contractual commitment. Hrida AI is proprietary software owned by Zlabs Innovation, provided "as is." See your license for applicable terms. © 2026 Zlabs Innovation. All rights reserved.
