NVIDIA · Hosted on Luminet

Nemotron Ultra 340B

nvidia/nemotron-ultra-340b

NVIDIA's open dense flagship. Trained on 15T tokens with FP8-native pipelines. Strong instruction-following and tool use.

Context window

256K tokens

Input price

$1.40 / 1M tokens

Output price

$4.20 / 1M tokens

P50 latency

175 ms
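
The per-token prices listed above can be turned into a rough per-request cost. The following is a minimal sketch, not an official billing formula: `estimateCostUSD` is a hypothetical helper, and it assumes input and output tokens are billed linearly at the listed rates with no minimums, caching discounts, or rounding.

```typescript
// Prices in USD per 1M tokens, copied from the listing above.
const INPUT_PRICE_PER_M = 1.4;
const OUTPUT_PRICE_PER_M = 4.2;

// Hypothetical helper: estimate the cost of one request in USD,
// assuming simple linear per-token billing.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_PRICE_PER_M;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_PRICE_PER_M;
  return inputCost + outputCost;
}

// Example: 200,000 input tokens and 50,000 output tokens.
console.log(estimateCostUSD(200_000, 50_000).toFixed(2)); // "0.49"
```

Output tokens cost three times as much as input tokens here, so long generations tend to dominate the bill even when prompts are large.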

Quick start

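// Assumes the `openai` npm package and an ES module context (top-level await).
import OpenAI from "openai";

// Point the OpenAI-compatible client at the Luminet endpoint.
const client = new OpenAI({
  baseURL: "https://api.luminet.ai/v1",
  apiKey: process.env.LUMINET_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "nvidia/nemotron-ultra-340b",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the assistant's reply.
console.log(resp.choices[0].message.content);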

Capabilities

text · code

Throughput: 240 tok/s · Released 2026-03
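
The latency and throughput figures can be combined into a back-of-the-envelope response time. This is only a sketch under stated assumptions: it treats the 175 ms P50 latency as time to first token and the 240 tok/s throughput as a steady per-request generation rate, neither of which the listing confirms. `estimateResponseSeconds` is a hypothetical helper.

```typescript
// Figures copied from the listing; how they are measured is an assumption.
const P50_LATENCY_MS = 175; // assumed to be time to first token
const THROUGHPUT_TOK_PER_S = 240; // assumed steady per-request generation rate

// Hypothetical helper: rough wall-clock time to generate `outputTokens` tokens.
function estimateResponseSeconds(outputTokens: number): number {
  return P50_LATENCY_MS / 1000 + outputTokens / THROUGHPUT_TOK_PER_S;
}

// Example: a 480-token completion.
console.log(estimateResponseSeconds(480).toFixed(3)); // "2.175"
```

Real timings will vary with prompt length, load, and network conditions, so treat the result as an order-of-magnitude guide.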