Welcome to Inference
Inference is a distributed GPU cluster built on Solana, designed for Large Language Model (LLM) inference. It provides fast, scalable APIs with token-based payments for models such as DeepSeek V3 and Llama 3.3.

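As a rough illustration of what calling such a service can look like, the sketch below builds a chat-completion request body. The endpoint URL, model identifier, and field names are assumptions (many LLM inference providers expose an OpenAI-compatible schema), not the actual Inference API.

```python
import json

# Placeholder endpoint -- assumed, not from the Inference docs.
API_URL = "https://api.example-inference.net/v1/chat/completions"

def build_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Model name is illustrative; consult the service's model list.
payload = build_request("deepseek-v3", "Explain Solana in one sentence.")
print(json.dumps(payload, indent=2))
```

In an OpenAI-compatible setup, this payload would be sent as a POST request to the completions endpoint with an API key in the `Authorization` header.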