I2VGenXLPipeline - missing components? · huggingface/diffusers · Discussion #7952

Hi everyone,

I have been playing with I2VGenXLPipeline, the Hugging Face diffusers implementation. I noticed some discrepancies between the method described in the paper and this implementation. Can someone help me check whether my understanding is correct?

In the paper, they have the following diagram:
[Image: screenshot of the diagram from the paper]

Is my reading of the code correct? I would also like to access the intermediate low-dimensional video that is produced at the end of the base stage, but I don't know exactly how to get at it. Can anyone tell me how to access that representation? I would appreciate it.