Problems moving to cuDNN Frontend
September 17, 2025, 2:30pm 1
We are working on moving the implementation of our convolutional layer from the legacy cuDNN API (e.g. cudnnConvolutionBiasActivationForward) to cuDNN Frontend.
However, we are experiencing severe problems with the new API.
Internally, our tensors are stored in NCHW layout, so we use this layout with cuDNN as well. We never encountered any problems with this layout in the legacy API. With the new API, however, some configurations simply fail in NCHW layout with “no engine configuration found”. A simple example is a stand-alone sigmoid activation with [N=2, C=1, H=1, W=2]. We could only work around this problem by changing the shape of the tensor (which is harmless here, since the activation is an elementwise operation).
The same problem occurs with the slice operation, where unfortunately changing the shape of the tensor is not an option.
Most of these problems vanish if we use NHWC layout with cuDNN Frontend (we tried that because tensor cores are optimized for this layout). Not all of them do: activation, for example, seems to find engines reliably only with 3-dimensional tensors. Still, relying on NHWC is a big issue for us, as moving to it would imply a major rewrite of our codebase.
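To make the layout difference concrete, here is a minimal sketch of the stride arithmetic, assuming tensors are described by dimensions listed in N, C, H, W order plus one stride per dimension. The helper `packed_strides` is hypothetical (it is not part of cuDNN or cuDNN Frontend) and only covers fully packed tensors.

```python
def packed_strides(dims, layout):
    """Return strides in (N, C, H, W) order for a fully packed tensor.

    dims   -- sizes in logical (N, C, H, W) order
    layout -- physical order in memory, outermost first, e.g. "NCHW" or "NHWC"
    """
    axes = "NCHW"
    if sorted(layout) != sorted(axes):
        raise ValueError("layout must be a permutation of 'NCHW'")
    size = dict(zip(axes, dims))

    stride = {}
    running = 1
    # Walk from the innermost (fastest-varying) axis to the outermost one.
    for axis in reversed(layout):
        stride[axis] = running
        running *= size[axis]
    return tuple(stride[a] for a in axes)


# The failing stand-alone sigmoid example: [N=2, C=1, H=1, W=2]
print(packed_strides((2, 1, 1, 2), "NCHW"))  # (2, 2, 2, 1)
print(packed_strides((2, 1, 1, 2), "NHWC"))  # (2, 1, 2, 1)

# The convolution input: [N=1, C=256, H=80, W=128]
print(packed_strides((1, 256, 80, 128), "NCHW"))  # (2621440, 10240, 128, 1)
print(packed_strides((1, 256, 80, 128), "NHWC"))  # (2621440, 1, 32768, 256)
```

In both layouts the dimensions stay in N, C, H, W order; only the strides change.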
Another problem is performance. Basically, all operations are slower with the NCHW layout, even slower than with the legacy API (and that is with autotuning implemented on top of cuDNN Frontend). For example, we have a convolution with sigmoid activation that is up to 20 times slower than with the legacy API. We could work around this problem by using two separate graphs, but this introduces unnecessary additional memory usage.
The configuration for this graph is the following:
Input: N=1, C=256, H=80, W=128
Number of filters: 90
Kernel Size: 3x3
Padding: 1
Activation: Sigmoid
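As a rough sketch of the tensor sizes involved, the snippet below applies the standard convolution output-size formula to this configuration. Stride, dilation, and data type are not stated above, so stride 1, dilation 1, and 4-byte (fp32) elements are assumptions, and `conv_output_shape` is a hypothetical helper, not a cuDNN call.

```python
def conv_output_shape(in_shape, num_filters, kernel, padding, stride=1, dilation=1):
    """Output shape (N, K, H_out, W_out) of a 2D convolution on an NCHW-ordered shape."""
    n, _channels, h, w = in_shape

    def out_len(size):
        # Extent covered by a (possibly dilated) kernel along one axis.
        effective_kernel = dilation * (kernel - 1) + 1
        return (size + 2 * padding - effective_kernel) // stride + 1

    return (n, num_filters, out_len(h), out_len(w))


# Configuration from the list above; stride=1 and dilation=1 are assumed.
out_shape = conv_output_shape((1, 256, 80, 128), num_filters=90, kernel=3, padding=1)

elements = 1
for dim in out_shape:
    elements *= dim

bytes_fp32 = elements * 4  # assumes 4-byte elements

print(out_shape)   # (1, 90, 80, 128)
print(elements)    # 921600
print(bytes_fp32)  # 3686400
```

Under these assumptions the convolution output has 921,600 elements, about 3.5 MiB in fp32. That would be the size of an intermediate tensor if the convolution and the sigmoid run as two separate graphs and the sigmoid does not operate in place.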
What is your advice on how to proceed? Will the situation regarding NCHW layout improve in cuDNN Frontend (or in the new cuDNN backend)? Or is a rewrite towards NHWC our only option?
yunzheq September 18, 2025, 5:19pm 3
Hi @winkler,
Happy to help!
To help us understand and resolve your issue, please provide the following details:
- Hardware Information: Which hardware are you using?
- cuDNN Versions: Could you share the cuDNN and cuDNN frontend versions you are currently using?
- Reproduction Script: Do you have a simple script that we can use to reproduce the issue?
- API Log: Please enable cuDNN frontend and cuDNN backend logging and attach the API log to this thread.
For API logging, you could use the following.
# For cudnn_frontend
export CUDNN_FRONTEND_LOG_FILE=fe.log
export CUDNN_FRONTEND_LOG_INFO=1
# For cudnn_backend
export CUDNN_LOGLEVEL_DBG=3
export CUDNN_LOGDEST_DBG=be.log
Moreover, our engineers monitor and track issues on the cuDNN frontend GitHub repository more closely. Using the bug report template provided there lets them gather all the necessary information up front, which can lead to faster responses to future inquiries.
Thanks!