403 Forbidden ("Authorization failed") on integrate.api.nvidia.com/v1 — newly generated nvapi- key works on NVCF but not on Chat Completions endpoint (original) (raw)

NVIDIA Developer Program member here. I’m trying to use the free serverless NIM
inference endpoint at https://integrate.api.nvidia.com/v1 but every request
returns HTTP 403 with {“status”:403,“title”:“Forbidden”,“detail”:“Authorization
failed”}.

Steps already taken:
1. Generated a fresh API key (nvapi- prefix) from build.nvidia.com/settings/api-keys
2. Generated another key directly from the model's page
   (build.nvidia.com/deepseek-ai/deepseek-v4-flash) after accepting the terms
3. Tested with both curl and OpenAI Python SDK
4. Tried multiple models: deepseek-ai/deepseek-v4-flash,
   meta/llama-3.1-70b-instruct — same 403 on all
5. Confirmed the key IS valid — calling api.nvcf.nvidia.com returns 404 (Not Found),
   not 401/403, meaning authentication passes

Key observations:
- The key was just created and never used before — no rate limit possible
- HTTP 403 = Forbidden (authenticated but not authorized), not 401 (unauthorized)
  or 429 (rate limit)
- Model availability page on build.nvidia.com shows "Free Endpoint: Available"
- Account has phone and email verified

I suspect the "Public API Endpoints" permission is missing from my personal
organization, as documented in other forum threads on this same issue.

Account email: proyecto.lumen.lux@gmail.com
Organization: Personal