Topics tagged nemotron

Introducing NVIDIA NemoClaw

NVIDIA NemoClaw is an open source stack that simplifies running OpenClaw always-on assistants more safely, with a single command. It installs the NVIDIA OpenShell runtime, part of the NVIDIA Agent Toolkit, a secure envir…

0

4775

March 16, 2026

Deliberations on 4-sparks cluster advantages

12

150

June 16, 2026

Rate limit increase request — 40 RPM to 200 RPM for Hermes Agent

0

24

June 16, 2026

ASUS Ascent GX10 — Public API Endpoints permission missing from NGC Personal Key — NIM containers returning 403

3

173

June 15, 2026

Asus GX10 Stable: Hermes Twin Qwen3.6-35B-A3B + Qwen3.6-27B + ComfyUI

5

540

June 15, 2026

8x DGX Spark Cluster Build Report: CRS812 + 400DD→4x100G Breakouts, Nemotron 3 Ultra at TP=8

2

224

June 14, 2026

"Qwen3.6-35B-A3B-NVFP4 hangs after attention backend selection across 3 vLLM images, including NVIDIA's own official recipe

2

172

June 14, 2026

Nemotron 3 Super & Ultra Models leaking metadata and chatting in longform content

0

31

June 14, 2026

Multi-Node Inference Crash on Blackwell GB10: Memory Allocation (0x51) & NCCL Timeouts (Tested on Qwen 122B & Nemotron 120B)

26

813

June 13, 2026

Request for NVIDIA NIM API Rate Limit Increase (40 → 200 RPM) – Student Learning & Agentic Coding Workflow

0

33

June 12, 2026

DGX Manager — an open-source control plane for your DGX Spark cluster (looking for testers & feedback)

0

179

June 12, 2026

Pushing GB10 to the Limit: Qwen3 235B MoE + Concurrent Best-of-4 + Persistent Agent Layer. Architecture check & Optimization tips?

0

124

June 12, 2026

A Spark to beat M5 Ultra and a MegaSpark to beat 2x Rubin PRO 6000!

28

737

June 11, 2026

Request for NVIDIA NIM API Rate Limit Increase (40 → 200 RPM)

0

23

June 11, 2026

NVIDIA NIM API Rate Limit Increase Request (40 → 200 RPM) – Agentic Coding Workflows

0

26

June 11, 2026

Open-webui and utilizing Nemotron VL Embed 1B

0

18

June 10, 2026

Why are Nemotron 3 NVFP4 models not deterministic when using vLLM?

3

258

June 10, 2026

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16

31

1972

June 10, 2026

Evaluate Clinical ASR Models Faster with Agent Skills and NVIDIA Nemotron Speech

0

18

June 9, 2026

Request to enable "Public API Endpoints" permission for my personal organization

0

22

June 9, 2026

Nemotron-3-Ultra-550B-A55B-NVFP4 on 4× DGX Spark via SGLang (TP=4 EP=4, RoCE) — it works, ~42–43 tok/s n8 peak

0

265

June 9, 2026

HTTP 402 PAYMENT_REQUIRED downloading nim/openai/gpt-oss-120b model files — same key downloads nim/nvidia models fine

7

181

June 8, 2026

Request to increase NVIDIA NIM API rate limit from 40 RPM to 250-300 RPM

1

81

June 8, 2026

Sparkrun - central command with tab completion for launching inference on Spark Clusters

90

3758

June 8, 2026

NVIDIA NIM API Rate Limit Increase Request (40 → 200 RPM) - Hermes Agent Workflows

0

35

June 7, 2026

Nemotron-3-Ultra-550B-A55B (2-bit GGUF) across 2× DGX Spark via llama.cpp RPC — it works (~5 tok/s)

7

668

June 7, 2026

Fine-tuning Nemotron-3-Nano-30B-A3B on ASUS Ascent GX10 / GB10 (DGX Spark): CUDA OOM at load, box freezes, 4-bit QLoRA fix

0

132

June 5, 2026

CUDA illegal memory access with MTP speculative decoding on Nemotron-3-Super-120B-NVFP4 (vLLM cu130-nightly, single DGX Spark GB10)

2

394

June 4, 2026

NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents

0

75

June 4, 2026

Nemotron-3-Nano on Jetson Thor vLLM: ITL degrades 4.7x with concurrency, MTP rejected

6

202

June 4, 2026