Request for NVIDIA NIM API Rate Limit Increase (40 → 200 RPM)

Hello NVIDIA NIM Team,

I am writing to request a rate limit increase for my NVIDIA NIM API account.

Account Details:

⦁ Registered Email: zihan@mczihan.link

⦁ API Key ID (last 5 chars): 4180b

⦁ Current Limit: 40 RPM

⦁ Requested Limit: 200 RPM

Use Case:

I am an individual developer using NVIDIA NIM for personal learning and to build a personal AI agent system. My primary models are:

⦁ z-ai/glm-5.1 — for general reasoning and multi-step task orchestration

⦁ google/gemma-4 — for lightweight tool-calling and quick inference tasks

My agent workflow involves sequential LLM calls: prompt construction, tool selection, result validation, and response generation. A single user interaction can easily trigger 5–8 chained requests, so the current 40 RPM limit covers only about 5 to 8 interactions per minute. It is frequently exceeded during normal usage, causing HTTP 429 errors that disrupt the entire agent loop.

Why the Current Limit Is Restrictive:

Even with concurrency disabled and a local token-bucket rate limiter set to ~0.6 req/s (about 36 RPM, just under the cap), a multi-step agent session saturates the limiter within seconds and then runs pinned against the 40 RPM ceiling. This makes it difficult to test and iterate on agent workflows in any meaningful way.
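
For reference, a client-side token-bucket limiter of the kind described above might look roughly like the following sketch. It is illustrative only: the class name `TokenBucket`, the burst capacity of 1, and the injectable `clock` and `sleep` parameters are assumptions made for the example, not the exact code in use.

```python
import time


class TokenBucket:
    """Token-bucket limiter: refills `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity=1.0, clock=time.monotonic, sleep=time.sleep):
        self.rate = rate          # tokens added per second (0.6 is about 36 RPM)
        self.capacity = capacity  # maximum burst size (assumed to be 1 here)
        self.tokens = capacity    # start with a full bucket
        self.clock = clock        # injectable so the limiter can be tested
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        # Credit the tokens accrued since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def acquire(self):
        """Block until one token is available, consume it, return seconds waited."""
        self._refill()
        waited = 0.0
        if self.tokens < 1.0:
            # Sleep exactly long enough for the missing fraction of a token.
            waited = (1.0 - self.tokens) / self.rate
            self.sleep(waited)
            self._refill()
        self.tokens -= 1.0
        return waited
```

With these settings, calling `acquire()` before each request spaces an 8-request chain over roughly 12 seconds, which is why a single interaction already feels throttled.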

Commitments:

⦁ This request is strictly for personal, non-commercial development and learning purposes

⦁ I am not running any public-facing service, production workload, or reselling access

⦁ I am not using multiple accounts or key rotation to bypass limits

⦁ I will implement proper exponential-backoff retry logic on 429 errors
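
As an illustration of the last commitment, retry logic with exponential backoff could be sketched as below. This is a minimal example under stated assumptions: `RateLimited`, `call_with_backoff`, and the `send` callable are hypothetical names, the delays (1 s base, 30 s cap) are placeholder values, and honoring a `Retry-After` value assumes the server returns one, which is not confirmed here.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller's request function on HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("HTTP 429 Too Many Requests")
        self.retry_after = retry_after  # seconds, if the server supplied a hint


def call_with_backoff(send, max_retries=5, base=1.0, cap=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call `send()`; on RateLimited, retry with capped exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimited as err:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the agent loop
            # Upper bound doubles each attempt: base, 2*base, 4*base, ... up to cap.
            delay = min(cap, base * 2 ** attempt)
            # Jitter: scale by a random factor in [0, 1) to avoid synchronized retries.
            delay *= rng()
            # Never retry sooner than the server asked for (assumed optional hint).
            if err.retry_after is not None:
                delay = max(delay, err.retry_after)
            sleep(delay)
```

Here `send` stands for any zero-argument function that performs one API request and raises `RateLimited` when the response status is 429.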

An increase to 200 RPM would allow me to develop, test, and iterate on my agent system reliably while staying within reasonable fair-use boundaries.

Thank you for making NVIDIA NIM available to the developer community. I appreciate your consideration.

Best regards,

Zihan