Large language models

The chat API currently supports these models:

| Model | Full model name | Supported context length |
|---|---|---|
| llama3.1 | neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8 | 8192 tokens |
| llama3.1:70b | neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w4a16 | 128k (131 072) tokens |
| llama3.1:405b | neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w4a16 | 128k (131 072) tokens |
| mistral-nemo | neuralmagic/Mistral-Nemo-Instruct-2407-quantized.w4a16 | 128k (128 000) tokens |
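
The context lengths listed for the chat models can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the names `CHAT_MODELS` and `fits_context` are hypothetical, and it assumes the context length covers the prompt and the generated tokens together, which this page does not state.

```python
# Short name -> (full model name, context length in tokens), copied from the list above.
CHAT_MODELS = {
    "llama3.1": ("neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8", 8192),
    "llama3.1:70b": ("neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w4a16", 131072),
    "llama3.1:405b": ("neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w4a16", 131072),
    "mistral-nemo": ("neuralmagic/Mistral-Nemo-Instruct-2407-quantized.w4a16", 128000),
}


def fits_context(model, prompt_tokens, max_new_tokens=0):
    """Return True if the prompt plus the requested completion fits the model's window.

    Assumption: the limit applies to prompt and completion tokens combined.
    """
    if model not in CHAT_MODELS:
        raise KeyError(f"unknown chat model: {model!r}")
    _full_name, limit = CHAT_MODELS[model]
    return prompt_tokens + max_new_tokens <= limit
```

For example, `fits_context("llama3.1", 8000, 500)` is false under this assumption, because 8500 tokens exceed the 8192-token window, while the same request fits any of the other three models.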

Embedding models

The embeddings API currently supports these models:

| Model | Full model name | Supported context length | Output dimensions |
|---|---|---|---|
| gte-large-en-v1.5 | Alibaba-NLP/gte-large-en-v1.5 | 8192 tokens | 1024 |
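
Because gte-large-en-v1.5 returns 1024-dimensional vectors, client code can validate the vector length before storing or comparing embeddings. The sketch below is illustrative only: `check_embedding` and `cosine_similarity` are hypothetical helper names, and it assumes the embeddings have already been retrieved as plain lists of floats.

```python
import math

EMBEDDING_DIM = 1024  # output dimensions of gte-large-en-v1.5, as listed above


def check_embedding(vector):
    """Raise ValueError if the vector does not have the expected length."""
    if len(vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")
    return vector


def cosine_similarity(a, b):
    """Cosine similarity of two embeddings after validating their length."""
    check_embedding(a)
    check_embedding(b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

A length check like this catches vectors produced by a different embedding model before they are mixed into the same index.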

Image generation models

The image generation and editing API currently supports these models:

| Model name | Supported features |
|---|---|
| stabilityai/stable-diffusion-xl-base-1.0 | Text2Image, Image2Image |