Kokoro-FastAPI Using Docker
What is Kokoro-FastAPI?
Kokoro-FastAPI is a dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model that implements the OpenAI API endpoint specification. It offers high-performance text-to-speech with impressive generation speeds.
Key Features
- OpenAI-compatible Speech endpoint with inline voice combination
- NVIDIA GPU-accelerated or CPU ONNX inference
- Streaming support with variable chunking
- Multiple audio format support (.mp3, .wav, .opus, .flac, .aac, .pcm)
- Integrated web interface on localhost:8880/web (or an additional container in the repo for Gradio)
- Phoneme endpoints for conversion and generation
Voices
- af
- af_bella
- af_irulan
- af_nicole
- af_sarah
- af_sky
- am_adam
- am_michael
- am_gurney
- bf_emma
- bf_isabella
- bm_george
- bm_lewis
Languages
- en_us
- en_uk
Requirements
- Docker installed on your system
- HridaAI running
- For GPU support: NVIDIA GPU with CUDA 12.3
- For CPU-only: No special requirements
Quick start
You can choose between GPU or CPU versions
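If you are unsure which version applies to a machine, a small helper can make the choice explicit. This is a minimal sketch, not part of Kokoro-FastAPI: the function name kokoro_run_command is hypothetical, and it assumes that a working nvidia-smi on the host is a reasonable signal of a usable NVIDIA GPU. It does not check whether the NVIDIA Container Toolkit is installed.

```shell
# Hypothetical helper: print the docker run command that matches this host.
# Assumption: a working `nvidia-smi` on the host means an NVIDIA GPU is usable.
kokoro_run_command() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    # GPU image, with all GPUs exposed to the container
    echo "docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu"
  else
    # CPU image (ONNX inference), no GPU flags
    echo "docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu"
  fi
}

# Print the suggested command without running it
kokoro_run_command
```

The helper only prints the command, so you can review it before running it yourself.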
GPU Version (Requires NVIDIA GPU with CUDA 12.8)
Using docker run:
docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu

Or use docker compose, by creating a docker-compose.yml file and running docker compose up. For example:
name: kokoro
services:
  kokoro-fastapi-gpu:
    ports:
      - 8880:8880
    image: ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.1
    restart: always
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu

You may need to install and configure the NVIDIA Container Toolkit.
CPU Version (ONNX optimized inference)
With docker run:
docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu

With docker compose:
name: kokoro
services:
  kokoro-fastapi-cpu:
    ports:
      - 8880:8880
    image: ghcr.io/remsky/kokoro-fastapi-cpu
    restart: always

Setting up HridaAI to use Kokoro-FastAPI
To use Kokoro-FastAPI with HridaAI, follow these steps:
- Open the Admin Panel and go to Settings -> Audio
- Set your TTS Settings to match the following:
  - Text-to-Speech Engine: OpenAI
  - API Base URL: http://localhost:8880/v1 (you may need to use host.docker.internal instead of localhost)
  - API Key: not-needed
  - TTS Voice: af_bella (also accepts mapping of existing OAI voices for compatibility)
  - TTS Model: kokoro
The default API key is the string not-needed. You do not have to change that value if you do not need the added security.
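To confirm these settings outside HridaAI, you can send the same kind of request by hand. The sketch below assumes the server exposes the OpenAI-style /audio/speech route under the API Base URL, as the OpenAI compatibility described earlier suggests. The helper names speech_payload and speak and the KOKORO_BASE_URL variable are illustrative and not part of Kokoro-FastAPI.

```shell
# Illustrative helpers for a manual request against the OpenAI-style speech route.
# KOKORO_BASE_URL is a hypothetical variable; it defaults to the base URL from the settings above.
KOKORO_BASE_URL="${KOKORO_BASE_URL:-http://localhost:8880/v1}"

# speech_payload TEXT [VOICE] [FORMAT] -> prints a JSON request body.
# TEXT is inserted verbatim, so it must not contain double quotes or backslashes.
speech_payload() {
  local text="$1"
  local voice="${2:-af_bella}"
  local format="${3:-mp3}"
  printf '{"model":"kokoro","input":"%s","voice":"%s","response_format":"%s"}' \
    "$text" "$voice" "$format"
}

# speak TEXT OUTFILE [VOICE] -> POSTs the payload and writes the audio to OUTFILE.
# Assumes the route is $KOKORO_BASE_URL/audio/speech, following the OpenAI API layout.
speak() {
  curl -sS --fail -X POST "$KOKORO_BASE_URL/audio/speech" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer not-needed" \
    -d "$(speech_payload "$1" "${3:-af_bella}")" \
    -o "$2"
}

# Usage (requires the container to be running):
#   speak "Hello from Kokoro" hello.mp3
```

If the call succeeds and hello.mp3 plays, the base URL, key, voice, and model values should also work in the HridaAI settings.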
Building the Docker Container
git clone
cd Kokoro-FastAPI
cd docker/cpu # or docker/gpu
docker compose up --build

That's it!
For more information on building the Docker container, including changing ports, please refer to the Kokoro-FastAPI repository.
Troubleshooting
NVIDIA GPU Not Detected
If the GPU version isn't using your GPU:
- Install NVIDIA Container Toolkit:
  # Ubuntu/Debian
  distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
  curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
  curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
  sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
  sudo systemctl restart docker
- Verify GPU access:
  docker run --rm --gpus all nvidia/cuda:12.2.0-base nvidia-smi
Connection Issues from HridaAI
If HridaAI can't reach Kokoro, this is usually a Docker networking issue. Choose the method that matches your setup:
Option 1 - Docker Desktop (Windows/Mac):
Use host.docker.internal instead of localhost:
http://host.docker.internal:8880/v1
Option 2 - Docker Compose (same network):
Use the service name directly:
http://kokoro-fastapi-gpu:8880/v1
Option 3 - Docker Network (recommended for Linux):
If host.docker.internal doesn't work, create a shared Docker network:
# Create a Docker network
docker network create local-llm
# Connect both containers to the network
docker network connect local-llm hrida-ai
docker network connect local-llm kokoro-fastapi
# Restart both containers
docker restart hrida-ai kokoro-fastapi

Then set your API Base URL to http://kokoro-fastapi:8880/v1
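Before changing the API Base URL, it can help to confirm that both containers actually joined the shared network. The following is a rough sketch: container_on_network is a hypothetical helper, and it relies on docker network inspect with a Go template that lists attached container names. The network and container names are the ones used in the commands above, so adjust them if yours differ.

```shell
# Hypothetical helper: container_on_network NETWORK CONTAINER
# Returns 0 if CONTAINER is attached to NETWORK, 1 if not, 2 if the inspect call fails.
container_on_network() {
  local names
  # List the names of all containers attached to the network, separated by spaces
  names=$(docker network inspect "$1" --format '{{range .Containers}}{{.Name}} {{end}}') || return 2
  case " $names " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires Docker and the network created above):
#   container_on_network local-llm hrida-ai && echo "hrida-ai is attached"
#   container_on_network local-llm kokoro-fastapi && echo "kokoro-fastapi is attached"
```

If either check fails, rerun the matching docker network connect command before restarting the containers.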
- Verify the service is running:
curl http://localhost:8880/health
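Right after startup the container may still be loading, so a single curl can fail even when nothing is wrong. As a hedged sketch, the check above can be wrapped in a short retry loop. The function name wait_for_kokoro and its default attempt count and delay are assumptions chosen for illustration.

```shell
# Hypothetical helper: wait_for_kokoro [URL] [ATTEMPTS] [DELAY_SECONDS]
# Polls the health URL until it answers with a success status or attempts run out.
wait_for_kokoro() {
  local url="${1:-http://localhost:8880/health}"
  local attempts="${2:-10}"
  local delay="${3:-3}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time keeps a hung request from blocking
    if curl -sf --max-time 5 -o /dev/null "$url"; then
      echo "Kokoro responded after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Kokoro did not respond at $url" >&2
  return 1
}

# Usage:
#   wait_for_kokoro                                            # defaults to localhost:8880
#   wait_for_kokoro http://host.docker.internal:8880/health 5 2
```

A non-zero exit status after all attempts points back to the networking options above.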