Request: Add google/gemma-4-26b-a4b-it as hosted API on build.nvidia.com (original) (raw)
Hi,
Could you please add google/gemma-4-26b-a4b-it as a hosted API
on build.nvidia.com?
The NIM container already exists on NGC Catalog:
Problems with the current gemma-4-31b-it hosted API:
- Thinking mode cannot be fully disabled — even with thinkingBudget:0,
it still produces empty tags that break real-time
applications like subtitle translation - Also, google/gemma-4-31b-it is unavailable most of the time on the Playground and via the hosted API (integrate.api.nvidia.com/v1/chat/completions)
- The 26B A4B has the same quality (-2/3 points) without
thinking issues on E2B/E4B models
The 26B A4B is MoE with only 3.8B active parameters —
very efficient to serve. The container is already ready!
Thank you so much,
Regards, Pedro
they allready added the model bro 😅🤷♂️