Why Run Locally?
Privacy
Your transcriptions never leave your machine. Ideal for sensitive notes.
Offline
Works without an internet connection once models are downloaded.
Zero Cost
No API bills or subscriptions. Use it as much as your hardware allows.
Getting Started
To keep the app size manageable, Pipit doesn’t bundle its own LLM engine. Instead, it connects to local “inference servers” using an OpenAI-compatible API. The two most popular ways to run local models on macOS are Ollama and LM Studio.
Option 1: Ollama (Recommended)
Ollama is a command-line tool that makes running models extremely simple, and it’s one of the most efficient ways to run local AI on a Mac.
1. Installation
Download it from ollama.com or install it via Homebrew:
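If you go the Homebrew route, the install is a single command (the formula is named ollama):

```bash
brew install ollama
```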
2. Start the Server
Ollama runs as a background process. You can start it from your Applications folder or via the terminal:
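From the terminal, ollama serve runs the server in the foreground; if you launch the Ollama app from Applications instead, the server starts in the background automatically:

```bash
ollama serve
```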
3. Download a Model
Open a terminal and “pull” the model you want to use. We suggest starting with Llama 3.2:
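The default llama3.2 tag currently resolves to the lightweight 3B variant, which is a good starting point:

```bash
ollama pull llama3.2
```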
4. Configure Pipit
- Open Pipit Settings → AI Processing.
- Select Custom Endpoint as the provider.
- Use the following settings:
| Setting | Value |
|---|---|
| Endpoint URL | http://localhost:11434/v1 |
| API Key | (Leave blank) |
| Model Name | llama3.2 |
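Before switching over to Pipit, you can sanity-check the endpoint from the terminal. A minimal request sketch, assuming you pulled llama3.2 above; it should come back as an OpenAI-style JSON chat completion:

```bash
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "Reply with one word: ready"}]
      }'
```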
Option 2: LM Studio
LM Studio provides a beautiful graphical interface. Use this if you prefer finding and managing models through a UI rather than the terminal.
1. Installation
Download the macOS version from lmstudio.ai.
2. Download a Model
- Open LM Studio and search for Llama 3.2 or Qwen 2.5.
- Click Download on a version that fits your RAM (look for “Recommended”).
3. Start the Local Server
- Click the Developer (double chevron) icon in the sidebar.
- Select your downloaded model from the dropdown at the top.
- Click Start Server.
4. Configure Pipit
| Setting | Value |
|---|---|
| Endpoint URL | http://localhost:1234/v1 |
| API Key | (Leave blank) |
| Model Name | Use the exact name shown in LM Studio |
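The Model Name must match the identifier LM Studio reports. If you’re unsure what that is, you can ask the local server directly (assuming the default port of 1234):

```bash
curl http://localhost:1234/v1/models
```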
Recommended Models
For the best experience in Pipit (balancing speed and accuracy), we recommend:
| Model | Size | RAM Required | Performance |
|---|---|---|---|
| Llama 3.2 (3B) | ~2GB | 8GB+ | Fast, great for general cleanup |
| Qwen 2.5 (3B) | ~2GB | 8GB+ | Excellent at following formatting |
| Llama 3.1 (8B) | ~5GB | 16GB+ | More “intelligent” but slower |
| DeepSeek R1 (7B) | ~5GB | 16GB+ | Exceptional for technical content |
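If you’re using Ollama, these map roughly to the tags below. Tag names change over time, so treat them as examples and check the Ollama library for the current ones:

```bash
ollama pull llama3.2:3b      # Llama 3.2 (3B)
ollama pull qwen2.5:3b       # Qwen 2.5 (3B)
ollama pull llama3.1:8b      # Llama 3.1 (8B)
ollama pull deepseek-r1:7b   # DeepSeek R1 (7B distill)
```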
Troubleshooting Local Models
Connection Refused
If Pipit says it can’t connect:
- Is the server running? Run curl http://localhost:11434/v1/models (for Ollama) or check the “Start Server” button in LM Studio; see the quick checks below.
- Check the port: Ensure the port in Pipit matches the server (11434 for Ollama, 1234 for LM Studio).
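Both servers expose an OpenAI-style models endpoint you can query from the terminal; if a command fails with “connection refused”, that server isn’t running (ports assumed to be the defaults):

```bash
curl http://localhost:11434/v1/models   # Ollama
curl http://localhost:1234/v1/models    # LM Studio
```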
Model Not Found
- Spelling: The model name must be exact. In Ollama, run ollama list to see the exact names.
- Is it loaded? In LM Studio, you must explicitly load the model into memory before starting the server.
Slow Processing
- Reduce Model Size: If you have 8GB of RAM, avoid 8B+ models. Stick to “3B” or smaller.
- GPU Offloading: In LM Studio, ensure GPU offloading is enabled on Apple Silicon (M1/M2/M3/M4) so the model runs on the GPU via Metal rather than the CPU.
- Background Apps: Close memory-heavy apps like Chrome or Photoshop if you experience lag.
