
llama.cpp
High-performance C/C++ LLM inference with minimal setup
About llama.cpp

llama.cpp delivers high-performance C/C++ LLM inference with minimal setup. It works 100% offline, is open source, is completely free to use, and runs on CPU without a dedicated GPU.
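
As a rough illustration of the minimal setup, the usual path is to build from source with CMake and run the bundled llama-cli tool. The commands below follow the project's standard build steps; model.gguf is a placeholder for whatever GGUF model file you have downloaded:

    git clone https://github.com/ggml-org/llama.cpp
    cd llama.cpp
    cmake -B build
    cmake --build build --config Release
    # generate up to 128 tokens from a prompt, CPU-only by default
    ./build/bin/llama-cli -m model.gguf -p "Explain what GGUF is." -n 128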
Platform Support
Available for: Windows, macOS, Linux
System Requirements
- Minimum RAM: 8 GB
- GPU: Not required — runs on CPU
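
Two details behind these numbers: a 7B-parameter model quantized to 4 bits occupies roughly 4 GB on disk and in memory, which is why 8 GB of RAM is a workable floor, and CPU-only inference is tuned mainly through the thread-count and context-size flags. A sketch of a CPU-tuned run (model.gguf is again a placeholder):

    # run on 8 CPU threads with a 2048-token context window
    ./build/bin/llama-cli -m model.gguf -t 8 -c 2048 -p "Hello"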
Links
Full description coming soon. For source code, documentation, and releases, see the GitHub repository: https://github.com/ggml-org/llama.cpp
Frequently Asked Questions
What is llama.cpp?
llama.cpp is an open-source C/C++ project for running large language model (LLM) inference with high performance and minimal setup. It works 100% offline, is completely free to use, and runs on CPU without requiring a dedicated GPU.
Is llama.cpp free?
Yes, llama.cpp is completely free to use. It's also open source.
Does llama.cpp work offline?
Yes, llama.cpp works 100% offline once installed.
What platforms does llama.cpp support?
llama.cpp is available for Windows, macOS, and Linux.
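
On macOS and Linux, a prebuilt package can save the build step; for example, Homebrew ships a llama.cpp formula (assuming you already use Homebrew; model.gguf remains a placeholder):

    brew install llama.cpp
    llama-cli -m model.gguf -p "Hello"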
Related Tools
Ollama
Run large language models locally with a simple CLI interface

LM Studio
Discover, download, and run local LLMs with an easy-to-use desktop app

Jan
Open-source ChatGPT alternative that runs 100% offline on your computer

GPT4All
Free-to-use, locally running, privacy-aware chatbot by Nomic AI

Text Generation WebUI
The AUTOMATIC1111 of text generation - maximum control for LLMs

KoboldCpp
Easy-to-use AI text generation software for GGML/GGUF models
