NVIDIA NIM API Rate Limit Increase Request (40 → 200 RPM) - Hermes Agent Workflows

Hi NVIDIA NIM Team,

I’m writing to formally request a rate limit increase for my NVIDIA NIM API account.

Current Limit: 40 RPM
Requested Limit: 200 RPM

Account Email: akmalaka1122@gmail.com
API Key (last 4 characters): AXr4

Use Case: Project Hermes — Multi-Agent AI Automation Framework

I am building Project Hermes, a multi-agent AI automation pipeline that uses NVIDIA NIM as its primary inference backend. The architecture consists of 5 specialized agents running sequentially:

ORCHESTRATOR (nvidia/nemotron-3-ultra-550b-a55b) — pipeline controller
PLANNER (mistralai/mistral-large-3-675b-instruct-2512) — task decomposition
RESEARCH-BOT (deepseek-ai/deepseek-v4-flash) — intelligence gathering
BUILDER (qwen/qwen3-coder-480b-a35b-instruct) — code generation
REVIEWER (mistralai/devstral-2-123b-instruct-2512) — quality gate with eval loop

Each pipeline run triggers a sequential chain of API calls across all 5 agents. A single task (e.g. research → plan → build → review → retry) generates 10–25 API calls in rapid succession. With the current 40 RPM limit, the pipeline consistently hits HTTP 429 errors mid-run — even for simple test prompts — making development and testing extremely difficult.
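
To illustrate why a burst of sequential calls runs into the limit, here is a minimal Python sketch of a client-side throttle. It assumes the limit behaves like a 60-second sliding window, which is an assumption and not something confirmed by NIM documentation. The class name `SlidingWindowLimiter` and its parameters are hypothetical and are not part of Project Hermes or any NVIDIA SDK.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side throttle: at most max_calls per window_s seconds."""

    def __init__(self, max_calls=40, window_s=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window_s = window_s
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._stamps = deque()   # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call is allowed; return the seconds spent waiting."""
        waited = 0.0
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._stamps and now - self._stamps[0] >= self.window_s:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return waited
            # Window is full: wait until the oldest call expires.
            delay = self.window_s - (now - self._stamps[0])
            self._sleep(delay)
            waited += delay
```

Calling `acquire()` before each agent request would pace the pipeline under the assumed window. With `max_calls=40`, the 41st call in a burst waits until the oldest call leaves the window, which can stall a run for up to a minute.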

Why 200 RPM:
With 5 agents and an eval loop that can retry up to 3 times per task, a single complex pipeline run can consume 15–40 requests, so one or two runs within the same minute already exhaust the current 40 RPM limit. 200 RPM would allow stable development and testing without interruption.
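
The arithmetic behind this estimate can be sketched as follows. The helper `runs_per_minute` is a hypothetical name, and the calculation assumes every request in a run counts once against the per-minute limit.

```python
def runs_per_minute(limit_rpm: int, calls_per_run: int) -> int:
    """Number of complete pipeline runs that fit in one minute of quota."""
    return limit_rpm // calls_per_run


# Compare the current and requested limits for the 15-40 request range above.
for limit in (40, 200):
    for calls in (15, 40):
        print(f"{limit} RPM, {calls} calls/run: {runs_per_minute(limit, calls)} runs/min")
```

Under these assumptions, 40 RPM covers 1 to 2 complete runs per minute, while 200 RPM covers 5 to 13.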

This is a personal/development use case — not production traffic. I am committed to responsible API usage and staying within fair use guidelines.

Thank you for considering this request. I am happy to provide any additional information if needed.

Best regards,
Akmalaka1122