Is it possible to convert the onnx model to fp16 model? · Issue #489 · huggingface/diffusers

The torch example passes the parameter revision="fp16"; can the onnx model be given the same optimization? Currently, onnx inference (using CUDAExecutionProvider) is slower than the torch version and uses more GPU memory (12 GB vs. 4 GB).
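For context, a minimal sketch of what I have in mind, using the `onnxconverter-common` package to cast an exported ONNX graph to fp16 (the `unet/model.onnx` path is illustrative, not an actual diffusers layout guarantee):

```python
# pip install onnx onnxconverter-common
import onnx
from onnxconverter_common import float16

# Load the fp32 ONNX model exported from the pipeline
# (path is an assumption; adjust to your export location).
model = onnx.load("unet/model.onnx")

# Convert all float32 tensors/initializers in the graph to float16.
model_fp16 = float16.convert_float_to_float16(model)

# Save the converted model for use with CUDAExecutionProvider.
onnx.save(model_fp16, "unet/model_fp16.onnx")
```

Would something like this be the supported route, or is there a diffusers-native way to export an fp16 ONNX model directly?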