Supported Providers
Pipit supports AI providers for post-processing your transcriptions:

OpenRouter
Access to Google Gemini, Claude, and many other models through a unified API.
Default model: google/gemini-2.5-flash-lite

Custom Endpoint
Any OpenAI-compatible API, including local models on your machine.
Recommended for: Privacy-focused users, offline use, cost savings
Why Use Local Models?
Running AI models locally offers three key advantages: Privacy, Cost, and Offline Use. Your transcriptions never leave your machine, and you don’t pay per-token fees.

Local Model Guide
Learn how to set up Ollama, LM Studio, and other local inference servers for Pipit.
OpenRouter
For cloud-based AI without managing infrastructure, OpenRouter provides a unified API to access dozens of models, including Google Gemini, Anthropic Claude, and open-source models.

Get API Key
- Go to openrouter.ai and sign in
- Create an API key from your account settings
- Copy the key (you will not see it again)
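
If you want to confirm the key works before pasting it into Pipit, one option is to call OpenRouter's OpenAI-compatible endpoint directly. A minimal sketch using the openai Python client (the prompt is arbitrary; the model is the default mentioned above):

```python
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API at this base URL.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # the key you just created
)

response = client.chat.completions.create(
    model="google/gemini-2.5-flash-lite",  # Pipit's default OpenRouter model
    messages=[{"role": "user", "content": "Reply with OK if you can read this."}],
)
print(response.choices[0].message.content)
```

If this prints a reply, the same key will work in Pipit's OpenRouter settings.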
Custom Endpoints (Generic)
Any service that provides an OpenAI-compatible API works with Pipit. This includes local inference servers as well as specialist cloud providers.

Configuration
To use a custom endpoint, select Custom Endpoint in Pipit Settings:

| Setting | Description |
|---|---|
| Endpoint URL | The base URL of the API (e.g., https://api.together.xyz/v1 or http://localhost:11434/v1) |
| API Key | Your secret API key (leave blank for local servers) |
| Model Name | The specific model identifier to use |
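
These three settings map directly onto the parameters of a standard OpenAI-compatible client. As an illustration only (the local URL and model name below are placeholders, not values Pipit ships with), the same configuration expressed with the openai Python client looks roughly like this:

```python
from openai import OpenAI

# Endpoint URL -> base_url
# API Key      -> api_key (most local servers ignore it; use any placeholder)
# Model Name   -> model
client = OpenAI(
    base_url="http://localhost:11434/v1",  # e.g. a local Ollama server
    api_key="not-needed",                  # placeholder for a local server
)

response = client.chat.completions.create(
    model="llama3.2",  # whatever model identifier your server exposes
    messages=[{"role": "user", "content": "Clean up this transcription: hello wrold"}],
)
print(response.choices[0].message.content)
```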
Common Compatible Services
| Service | URL Format | Notes |
|---|---|---|
| Ollama | http://localhost:11434/v1 | See Local Model Guide |
| LM Studio | http://localhost:1234/v1 | See Local Model Guide |
| Together AI | https://api.together.xyz/v1 | High-performance open-source models |
| Fireworks | https://api.fireworks.ai/v1 | Fast, reliable serverless inference |
| LocalAI | http://localhost:8080/v1 | Docker-based self-hosted AI |
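
The Model Name field must match an identifier the server actually exposes. Most OpenAI-compatible servers, including Ollama and LM Studio, list their models at /v1/models, so a quick way to find valid names (assuming the same openai client and placeholder URL as above) is:

```python
from openai import OpenAI

# Point the client at whichever server you configured in Pipit.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")

# /v1/models returns the identifiers you can put in the Model Name field.
for model in client.models.list():
    print(model.id)
```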
Timeouts
- OpenRouter: 3 seconds, so Pipit falls back quickly to the raw transcription if the request stalls.
- Custom endpoints: 15 seconds, to accommodate slower local inference.
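
If a request exceeds its timeout, Pipit falls back to the raw transcription. A minimal sketch of that behaviour (not Pipit's actual code; the timeout values mirror the ones above) might look like:

```python
import openai
from openai import OpenAI

def post_process(transcript: str, client: OpenAI, model: str, timeout_s: float) -> str:
    """Ask the model to clean up the transcript; fall back to the raw text on failure."""
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": f"Clean up this transcription:\n{transcript}"}],
            timeout=timeout_s,  # e.g. 3 s for OpenRouter, 15 s for custom endpoints
        )
        return response.choices[0].message.content or transcript
    except (openai.APITimeoutError, openai.APIConnectionError):
        # Timed out or unreachable: use the transcription unchanged.
        return transcript
```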
