Llama 3.1 Nemotron Ultra 253B

NVIDIA-tuned Llama 3.1 253B for maximum capability.

llama-3.1-nemotron-ultra-253b
STABLE
128,000 context
Starting at $0.60/M input tokens
Starting at $1.80/M output tokens
Streaming
JSON Output

Select Provider

All Providers for Llama 3.1 Nemotron Ultra 253B

LLM Gateway routes requests to the best providers that are able to handle your prompt size and parameters.

Nebius AI

nebius/llama-3.1-nemotron-ultra-253b
Context Size
128k
Stability
STABLE
Pricing
Input
$0.60
/M
Cached
Output
$1.80
/M
Per Request
$0.000
/req
Capabilities
Streaming
JSON Output
Try in Playground