Incorrect Detection Results After ONNX Conversion of Fine-Tuned BodyPose2D Model in DeepStream: Missing Lower-Body Keypoints

I have fine-tuned a BodyPose2D model with 17 keypoints and tested it using the .hdf5 model for inference; the results were good and all keypoints were correctly detected.

However, when I converted the model to a deployable ONNX model, integrated it into the sample pipeline, and ran inference in DeepStream, the detections are noticeably worse: lower-body keypoints are frequently missing.

Debugging Steps Taken:

  1. Tested different TensorRT engine files generated from the ONNX model.
  2. Tried various DeepStream parameters to match the pre-conversion inference results.
  3. Experimented with different input resolutions (320x448, 288x384, and the trained model resolution 640x640) for the BPNet conversion.
  4. Verified keypoint mapping in the pre-processing and post-processing steps.

Despite these efforts, the overall detection results remain incorrect and inconsistent with the original model inference. A way to compare the .hdf5 and ONNX outputs outside DeepStream is sketched below.
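As a first isolation step, it can help to run the same preprocessed image through the .hdf5 model and the exported ONNX model outside DeepStream and compare the raw output tensors. The sketch below is an assumption-laden example, not the exact setup from this thread: it assumes the .hdf5 loads in plain Keras (TAO-trained models may need custom objects), that onnxruntime is installed, and it uses placeholder file names plus the normalization implied by the DeepStream config (offsets 128, scale 1/256), which may differ from the training-time preprocessing.

```python
# Minimal sketch: compare .hdf5 vs ONNX outputs on one image, outside DeepStream.
# File names, input size and normalization are assumptions -- adapt to your setup.
import numpy as np
import cv2
import onnxruntime as ort
from tensorflow import keras

H, W = 320, 448                                   # exported input resolution (assumed)
img = cv2.cvtColor(cv2.imread("sample.jpg"),      # hypothetical test image, BGR -> RGB
                   cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (W, H)).astype(np.float32)
inp = (img - 128.0) / 256.0                       # mirrors offsets=128, net-scale-factor=1/256
inp = inp[None, ...]                              # NHWC batch of 1 (BPNet is a TF model)

keras_model = keras.models.load_model("bpnet_finetuned.hdf5", compile=False)
keras_out = keras_model.predict(inp)
keras_out = keras_out if isinstance(keras_out, list) else [keras_out]

sess = ort.InferenceSession("bpnet_model.deploy-320-448.onnx",
                            providers=["CPUExecutionProvider"])
onnx_out = sess.run(None, {sess.get_inputs()[0].name: inp})

for k, o in zip(keras_out, onnx_out):
    print("max abs diff:", float(np.abs(np.asarray(k) - np.asarray(o)).max()))
```

If the raw tensors match closely here, the export itself is fine and the problem is more likely in the DeepStream preprocessing or post-processing; if the output lists come back in a different order, match them by name or shape before comparing.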

Questions:

Sample Outputs:

Sample ONNX model output (DeepStream inference output)

[image]

Sample hdf5 model output

[image]

Configurations

bodypose2d_pgie_config.yml

property:
  gpu-id: 0
  model-engine-file: /home/nvidia/deepstream_tao_apps/Iteration4-updated/bpnet_model.deploy-320-448.onnx_b1_gpu0_fp16.engine
  tlt-encoded-model: ../../models/bodypose2d/model.etlt
  onnx-file: /home/nvidia/deepstream_tao_apps/Iteration4-updated/bpnet_model.deploy-320-448.onnx
  tlt-model-key: nvidia_tlt
  #int8-calib-file: /home/nvidia/deepstream_tao_apps/apps/Models1/calibration.640.640.deploy.bin
  network-input-order: 1
  infer-dims: 3;320;448
  # dynamic batch size
  batch-size: 1
  # 0: FP32, 1: INT8, 2: FP16 mode
  network-mode: 2
  num-detected-classes: 1
  gie-unique-id: 1
  output-blob-names: conv2d_transpose_1/BiasAdd:0;heatmap_out/BiasAdd:0
  # 0: Detection 1: Classifier 2: Segmentation 100: other
  network-type: 100
  # Enable tensor metadata output
  output-tensor-meta: 1
  # 1: Primary 2: Secondary
  process-mode: 1
  net-scale-factor: 0.00390625
  offsets: 128.0;128.0;128.0
  # 0: RGB 1: BGR 2: GRAY
  model-color-format: 0
  maintain-aspect-ratio: 1
  symmetric-padding: 1
  scaling-filter: 1

class-attrs-all:
  threshold: 0.8
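For reference, the sketch below approximates what nvinfer does to each frame with the settings above (RGB input, maintain-aspect-ratio with symmetric padding, offsets 128, net-scale-factor 0.00390625). It is an approximation based on the documented nvinfer behaviour, not DeepStream's exact code; comparing its output with the preprocessing used when testing the .hdf5 model is a quick way to spot a normalization or letterboxing mismatch.

```python
# Sketch (assumption): NumPy approximation of nvinfer preprocessing for this config.
import cv2
import numpy as np

def nvinfer_like_preprocess(bgr_frame, net_w=448, net_h=320):
    rgb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB).astype(np.float32)   # model-color-format: 0
    h, w = rgb.shape[:2]
    scale = min(net_w / w, net_h / h)                    # maintain-aspect-ratio: 1
    new_w, new_h = int(round(w * scale)), int(round(h * scale))
    resized = cv2.resize(rgb, (new_w, new_h))
    canvas = np.zeros((net_h, net_w, 3), np.float32)     # black padding
    top, left = (net_h - new_h) // 2, (net_w - new_w) // 2   # symmetric-padding: 1
    canvas[top:top + new_h, left:left + new_w] = resized
    return (canvas - 128.0) * 0.00390625                 # offsets, then net-scale-factor

frame = cv2.imread("frame.jpg")                          # hypothetical frame
tensor = nvinfer_like_preprocess(frame)
print(tensor.shape, tensor.min(), tensor.max())
```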

deepstream-bodypose2d-app/bodypose2d_app_config.yml

source-list:
  list: file:///home/nvidia/deepstream_tao_apps/apps/tao_others/deepstream-bodypose2d-app/trimmed_videonew.ts

output:
  # 1: file output 2: fake output 3: eglsink output 4: RTSP output
  type: 1
  # 0: H264 encoder 1: H265 encoder
  codec: 0
  # encoder type 0=Hardware 1=Software
  enc-type: 0
  bitrate: 2000000
  udpport: 2345
  rtspport: 8554
  ## The file name without suffix
  filename: test

streammux:
  width: 640
  height: 480
  batched-push-timeout: 40000

primary-gie:
  # 0: nvinfer, 1: nvinferserver
  plugin-type: 0
  config-file-path: /home/nvidia/deepstream_tao_apps/configs/bodypose2d_tao/bodypose2d_pgie_config.yml
  #config-file-path: ../../../configs/triton/bodypose2d_tao/bodypose2d_pgie_config.yml
  #config-file-path: ../../../configs/triton-grpc/bodypose2d_tao/bodypose2d_pgie_config.yml
  unique-id: 1

model-config:
  config-file-path: /home/nvidia/deepstream_tao_apps/configs/bodypose2d_tao/sample_bodypose2d_model_config.yml

yuweiw April 7, 2025, 9:21am 3

Could you attach the pipeline for this scenario? And are you using DeepStream in this scenario?

Please provide complete information as applicable to your setup. Thanks
• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs)
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. for which plugin or which sample application, and the function description.)

Below are the complete environment setup and reproduction details for the issue:


Issue Summary

I fine-tuned a BodyPose2D model with 17 keypoints. The inference results from the .hdf5 model are accurate, with all 17 keypoints correctly detected.

However, after converting the model to ONNX and deploying it in DeepStream using the deepstream-bodypose2d-app, I observed that lower-body keypoints are missing and the detections no longer match the .hdf5 results.

How to Reproduce

  1. Fine-tune the BodyPose2D model on a custom dataset to output 17 keypoints.
  2. Export the .hdf5 model to ONNX using tf2onnx.
  3. Deploy the ONNX model in DeepStream using the configuration files shown above.
  4. Run the deepstream-bodypose2d-app.
  5. Compare inference results between the .hdf5 model and the ONNX + DeepStream pipeline; the DeepStream output is missing lower-body keypoints.

A quick way to sanity-check the exported ONNX graph against the config is sketched below.
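Since the export was done with tf2onnx, it is worth confirming that the exported graph's input and output tensors actually line up with what bodypose2d_pgie_config.yml expects (infer-dims, network-input-order, output-blob-names). A small sketch, assuming the onnx Python package and the file name used above:

```python
# Sketch: inspect the exported ONNX graph and compare against the DeepStream config.
import onnx

model = onnx.load("bpnet_model.deploy-320-448.onnx")
onnx.checker.check_model(model)

for inp in model.graph.input:
    dims = [d.dim_value or d.dim_param for d in inp.type.tensor_type.shape.dim]
    print("input :", inp.name, dims)

for out in model.graph.output:
    dims = [d.dim_value or d.dim_param for d in out.type.tensor_type.shape.dim]
    print("output:", out.name, dims)

# Per the config, the outputs should be named conv2d_transpose_1/BiasAdd:0 and
# heatmap_out/BiasAdd:0, and the input shape should be consistent with
# infer-dims: 3;320;448 and network-input-order: 1.
```

If the names or shapes differ, nvinfer may silently bind the wrong tensors or reinterpret the layout, which would explain plausible-looking but wrong keypoints.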

Environment Details

Sample App Used

deepstream-bodypose2d-app (from deepstream_tao_apps)

Sample Outputs

[image]

[image]

yuweiw April 8, 2025, 10:06am 5

Are you using DeepStream in this scenario? Also, we recommend comparing the results on the same image.

For the .hdf5 model I am not using DeepStream. But for integration into the sample BodyPose2D app, the converted deployable ONNX model is used with DeepStream for inference.
Sample BodyPose2D with DeepStream results:

[image]

yuweiw April 9, 2025, 5:40am 7

Maybe there’s something wrong with the way you converted your hdf5 model to ONNX. Let’s try to narrow it down a little. You can try to use TensorRT to infer your images directly (a sketch of this is given below). For details on how to achieve this, you can go to the TAO forum for professional consultation.
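One way to follow that suggestion is Polygraphy's Python API, which can build a TensorRT engine from the ONNX file and run it directly, bypassing DeepStream entirely. The sketch below is only an outline under assumptions: polygraphy and tensorrt are installed, FP16 mirrors network-mode: 2, and the random dummy input should be replaced by a frame preprocessed the same way as during the .hdf5 test.

```python
# Sketch: run the exported ONNX through TensorRT directly via Polygraphy.
import numpy as np
from polygraphy.backend.trt import (CreateConfig, EngineFromNetwork,
                                    NetworkFromOnnxPath, TrtRunner)

build_engine = EngineFromNetwork(
    NetworkFromOnnxPath("bpnet_model.deploy-320-448.onnx"),
    config=CreateConfig(fp16=True))              # same FP16 mode as network-mode: 2

with TrtRunner(build_engine) as runner:
    in_meta = runner.get_input_metadata()
    name = list(in_meta.keys())[0]
    shape = [d if isinstance(d, int) and d > 0 else 1 for d in in_meta[name].shape]
    dummy = np.random.rand(*shape).astype(np.float32)   # replace with a real preprocessed frame
    outputs = runner.infer(feed_dict={name: dummy})
    for out_name, arr in outputs.items():
        print(out_name, arr.shape)
```

If the heatmaps produced here match the .hdf5 outputs on the same preprocessed frame, the ONNX/TensorRT path is fine and the remaining suspects are the DeepStream pre- and post-processing settings.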

yingliu July 1, 2025, 8:30am 8

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

system Closed July 15, 2025, 8:31am 9

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.