
llama.cpp
High-performance C/C++ LLM inference with minimal setup
About llama.cpp

llama.cpp delivers high-performance C/C++ LLM inference with minimal setup. It works 100% offline, is open source, is completely free to use, and runs on CPU without a dedicated GPU.
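
As a rough illustration of the minimal setup, the usual path is to build from source with CMake and run the bundled llama-cli tool. The commands below follow the project's standard build steps; model.gguf is a placeholder for whatever GGUF model file you have downloaded:

    git clone https://github.com/ggml-org/llama.cpp
    cd llama.cpp
    cmake -B build
    cmake --build build --config Release
    # generate up to 128 tokens from a prompt, CPU-only by default
    ./build/bin/llama-cli -m model.gguf -p "Explain what GGUF is." -n 128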
Platform Support
Available for: Windows, macOS, Linux
System Requirements
- Minimum RAM: 8 GB
- GPU: Not required — runs on CPU
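
Two details behind these numbers: a 7B-parameter model quantized to 4 bits occupies roughly 4 GB on disk and in memory, which is why 8 GB of RAM is a workable floor, and CPU-only inference is tuned mainly through the thread-count and context-size flags. A sketch of a CPU-tuned run (model.gguf is again a placeholder):

    # run on 8 CPU threads with a 2048-token context window
    ./build/bin/llama-cli -m model.gguf -t 8 -c 2048 -p "Hello"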
Links
Full description coming soon. For source code, documentation, and releases, see the GitHub repository: https://github.com/ggml-org/llama.cpp
Frequently Asked Questions
What is llama.cpp?
llama.cpp is an open-source C/C++ project for running large language model (LLM) inference with high performance and minimal setup. It works 100% offline, is completely free to use, and runs on CPU without requiring a dedicated GPU.
Is llama.cpp free?
Yes, llama.cpp is completely free to use. It's also open source.
Does llama.cpp work offline?
Yes, llama.cpp works 100% offline once installed.
What platforms does llama.cpp support?
llama.cpp is available for Windows, macOS, and Linux.
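
On macOS and Linux, a prebuilt package can save the build step; for example, Homebrew ships a llama.cpp formula (assuming you already use Homebrew; model.gguf remains a placeholder):

    brew install llama.cpp
    llama-cli -m model.gguf -p "Hello"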
Related Tools
Ollama
Run large language models locally with a simple CLI interface

LM Studio
Discover, download, and run local LLMs with an easy-to-use desktop app

Jan
Open-source ChatGPT alternative that runs 100% offline on your computer

GPT4All
Free-to-use, locally running, privacy-aware chatbot by Nomic AI

Text Generation WebUI
The AUTOMATIC1111 of text generation - maximum control for LLMs

KoboldCpp
Easy-to-use AI text generation software for GGML/GGUF models
