Requesting RPM Increase (40 to 200 RPM) for Multi-Agent Workflows and RAG Development

Hello NVIDIA Support Team,

I would like to request a rate limit increase for my NVIDIA NIM API account.

Account Details

Email: montanarco.17104@gmail.com
Current limit: 40 RPM
Requested limit: 100–200 RPM (whatever is the next available tier for individual developers)

Use Case
I am using the NVIDIA NIM API for personal development, security auditing, and non-production testing of advanced multi-agent AI workflows. My setup focuses on local automation, system architecture, and specialized data engineering tasks.

A typical workflow involves:

Multi-agent orchestration with parallel subagent calls to coordinate, route, and execute complex programming tasks.
Application security checks, vulnerability scanning assistance, and secure code review using advanced code models.
Data analytics pipelines, system design generation, and detailed infrastructure implementation plans.
Building and evaluating local RAG (Retrieval-Augmented Generation) architectures that require frequent embedding and generation calls.

Models evaluated in this workflow:

Qwen (including Qwen 2.5 Coder variants)
Gemma 4
Kimi 2.6 / Moonshot models and other high-context reasoning LLMs.

Why 40 RPM is Insufficient
Even with client-side throttling and exponential backoff, a single agentic execution loop (which plans, tests, reads files, and self-corrects) triggers dozens of concurrent API calls within a few seconds. The default 40 RPM global limit causes immediate 429 “Too Many Requests” errors, completely breaking the autonomous agent loops mid-execution.
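For illustration, client-side throttling at the current limit can be sketched as a rolling-window limiter. This is a minimal sketch, not the exact code in use: the class name `SlidingWindowLimiter`, its parameters, and the injectable `clock` are hypothetical, and the 40-calls-per-60-seconds defaults simply mirror the limit quoted above.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any rolling `window_s` seconds."""

    def __init__(self, max_calls: int = 40, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._stamps: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if it is allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the rolling window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        # The oldest call in the window decides when a slot frees up.
        return self.window_s - (now - self._stamps[0])

    def acquire(self) -> None:
        """Block until a slot is free, then record the call."""
        while (delay := self.wait_time()) > 0:
            time.sleep(delay)
        self._stamps.append(self.clock())
```

Each agent would call `limiter.acquire()` before every API request. With the default settings, a burst of 40 calls fills the window, so the 41st call has to wait until the oldest one is 60 seconds old; that wait is what stalls an agent loop partway through a task.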

Commitment

Personal development and R&D only: no production or public-facing services.
Low-concurrency usage; I will continue to respect Retry-After headers.
Full compliance with NVIDIA’s developer fair-use policy.
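As a rough sketch of what respecting Retry-After looks like on the client side: the hypothetical helper `retry_delay` below prefers the server's header when it can be parsed and otherwise falls back to exponential backoff with full jitter. It assumes the header arrives in one of the two standard HTTP forms (a number of seconds or an HTTP date); the base and cap values are illustrative choices, not NVIDIA recommendations.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base_s: float = 1.0, cap_s: float = 60.0,
                now: Optional[datetime] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    if retry_after is not None:
        value = retry_after.strip()
        if value.isdigit():
            # Form 1: delay in whole seconds, e.g. "Retry-After: 7".
            return float(value)
        try:
            # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            when = None  # unparseable header: fall through to backoff
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            current = now or datetime.now(timezone.utc)
            return max(0.0, (when - current).total_seconds())
    # No usable header: exponential backoff with full jitter, capped at cap_s.
    return random.uniform(0.0, min(cap_s, base_s * 2 ** attempt))
```

A caller would sleep for `retry_delay(attempt, response_headers.get("Retry-After"))` seconds before re-sending the request, where `response_headers` stands in for whatever header mapping the HTTP client exposes.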

Thank you for considering my request.

Best regards,
Miguel