Llama 3.1 Nemotron Ultra 253B

NVIDIA-tuned Llama 3.1 253B for maximum capability.

llama-3.1-nemotron-ultra-253b

STABLE

128,000 context

Starting at $0.60/M input tokens

Starting at $1.80/M output tokens

Streaming

JSON Output

Select Provider

LLM Gateway routes requests to the best providers that are able to handle your prompt size and parameters.

nebius/llama-3.1-nemotron-ultra-253b

Context Size

128k

Stability

STABLE

Pricing

Input

$0.60

Cached

—

Output

$1.80

Per Request

$0.000

/req

Capabilities

Streaming

JSON Output