Ollama

Connect hi-shell to Ollama to run LLM inference locally. Ollama runs large language models on your own machine and exposes them through a simple HTTP API.

Prerequisites

  • Ollama installed and running
  • At least one model pulled (e.g., ollama pull llama3.2)

Setup

1. Install and Start Ollama

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start the server
ollama serve
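
Because ollama serve runs in the foreground, a script that starts it in the background usually has to wait until the API answers. A minimal sketch, assuming curl is installed and the default port 11434 is used; the function name wait_for_ollama and the retry count are illustrative and not part of Ollama or hi-shell:

```shell
# Hypothetical helper: poll the Ollama API until it responds or attempts run out.
# Usage: wait_for_ollama [base_url] [attempts]
wait_for_ollama() {
  url="${1:-http://localhost:11434}"
  attempts="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # /api/tags lists local models; any successful reply means the server is up
    if curl -fsS --max-time 2 "$url/api/tags" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example: start the server in the background, then wait for it
#   ollama serve >/dev/null 2>&1 &
#   wait_for_ollama && echo "Ollama is ready"
```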

2. Pull a Model

# Recommended models for hi-shell
ollama pull llama3.2          # Good general-purpose model
ollama pull mistral           # Fast and capable
ollama pull codellama         # Optimized for code
ollama pull qwen2.5-coder     # Great for shell commands

3. Configure hi-shell

hi-shell --init
# Select "Local" → "Ollama"
# Enter the model name (e.g., "llama3.2")

Manual Configuration

Edit ~/.config/hi-shell/config.toml:

llm_provider = "Local"
local_provider = "Ollama"
local_url = "http://localhost:11434"
local_model = "llama3.2"
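
The same file can be created from a script, which is handy when provisioning several machines. This is a sketch under stated assumptions: the helper name write_hi_shell_config and its arguments are hypothetical, and it refuses to touch an existing config so that other settings are never overwritten:

```shell
# Hypothetical helper: create a hi-shell config for Ollama without overwriting an existing one.
# Usage: write_hi_shell_config [config_dir] [model]
write_hi_shell_config() {
  config_dir="${1:-$HOME/.config/hi-shell}"
  model="${2:-llama3.2}"
  config_file="$config_dir/config.toml"
  if [ -e "$config_file" ]; then
    echo "refusing to overwrite $config_file" >&2
    return 1
  fi
  mkdir -p "$config_dir"
  # Same four keys as the manual configuration shown above
  cat > "$config_file" <<EOF
llm_provider = "Local"
local_provider = "Ollama"
local_url = "http://localhost:11434"
local_model = "$model"
EOF
}
```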

Configuration Options

Option        Default                   Description
local_url     http://localhost:11434    Ollama API endpoint
local_model   (none)                    Model name to use

Custom Ollama URL

If Ollama is running on a different host or port:

local_url = "http://192.168.1.100:11434"

Verifying the Connection

Test that hi-shell can connect to Ollama:

hi-shell list files in current directory

If you see a generated command, the connection works.
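
If no command appears, querying Ollama directly helps tell an Ollama problem apart from a hi-shell configuration problem. The sketch below uses Ollama's /api/generate endpoint with streaming turned off; the helper names are hypothetical, and the payload builder assumes the model name and prompt contain no quotes or backslashes, since it does no JSON escaping:

```shell
# Build the JSON body for POST /api/generate.
# Assumes model and prompt contain no quotes or backslashes (no JSON escaping is done).
ollama_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Send one non-streaming prompt to Ollama and print the raw JSON reply.
# Usage: ollama_generate <model> <prompt> [base_url]
ollama_generate() {
  curl -fsS "${3:-http://localhost:11434}/api/generate" \
    -d "$(ollama_payload "$1" "$2")"
}

# Example:
#   ollama_generate llama3.2 "Reply with the word ok"
```

A JSON reply containing a "response" field means Ollama and the model are working, so any remaining issue is likely in the hi-shell settings.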

Recommended Models

Model           Size     Quality     Speed       Best For
llama3.2        2.0 GB   Good        Fast        General use
llama3.2:1b     1.3 GB   OK          Very fast   Quick commands
mistral         4.1 GB   Very good   Medium      Complex commands
codellama       3.8 GB   Very good   Medium      Code-focused
qwen2.5-coder   4.7 GB   Excellent   Medium      Shell commands

Troubleshooting

“Connection refused” Error

Make sure Ollama is running:

ollama serve

Check that Ollama is listening on port 11434:

curl http://localhost:11434/api/tags
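
When a custom local_url is configured, it is worth probing the URL hi-shell actually uses rather than the default. A minimal sketch, assuming the config lives at the path given under Manual Configuration and that local_url is written on one line in double quotes; the helper name hi_shell_local_url is hypothetical:

```shell
# Hypothetical helper: print the local_url value from a hi-shell config file.
# Usage: hi_shell_local_url [config_file]
hi_shell_local_url() {
  sed -n 's/^[[:space:]]*local_url[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' \
    "${1:-$HOME/.config/hi-shell/config.toml}" | head -n 1
}

# Example: probe the configured endpoint instead of the default one
#   curl -fsS "$(hi_shell_local_url)/api/tags"
```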

Model Not Found

Pull the model first:

ollama pull llama3.2

List available models:

ollama list
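
The two steps can be combined so that a model is pulled only when it is missing. This sketch assumes ollama list prints a header row followed by one model per line with the name in the first column, and that an untagged name corresponds to the :latest tag; has_model is a hypothetical helper:

```shell
# Hypothetical helper: succeed if the named model appears in `ollama list` output on stdin.
# Usage: ollama list | has_model llama3.2
has_model() {
  awk -v want="$1" '
    NR > 1 && ($1 == want || $1 == (want ":latest")) { found = 1 }
    END { exit !found }
  '
}

# Example: pull only if the model is not installed yet
#   ollama list | has_model llama3.2 || ollama pull llama3.2
```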

Slow Responses

  • Use a smaller model (e.g., llama3.2:1b)
  • Ensure you have enough RAM for the model
  • Check if GPU acceleration is working: ollama ps
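
The GPU check can be scripted. The sketch below assumes ollama ps prints a header row and reports where each loaded model runs in a PROCESSOR column with values such as "100% GPU" or "100% CPU"; the exact layout may differ between Ollama versions, and cpu_bound_models is a hypothetical helper:

```shell
# Hypothetical helper: print loaded models that run at least partly on the CPU.
# Reads `ollama ps` output on stdin; succeeds only if such a model is found.
# Usage: ollama ps | cpu_bound_models
cpu_bound_models() {
  awk 'NR > 1 && /CPU/ { print $1; found = 1 } END { exit !found }'
}

# Example: warn when a model is not fully on the GPU
#   ollama ps | cpu_bound_models && echo "these models are using the CPU"
```

A model only shows up in ollama ps while it is loaded, so run a hi-shell request shortly before checking.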