Add VisualCloze by lzyhha · Pull Request #11377 · huggingface/diffusers

@a-r-r-o-w Hello, I can make the changes, but I still have some questions about the proposed solution and would like to confirm it first.

I instantiate a pipeline inside another pipeline because I followed the implementation in pipeline_stable_cascade_combined.py, which instantiates two pipelines inside another pipeline as follows:

self.prior_pipe = StableCascadePriorPipeline(
    prior=prior_prior,
    text_encoder=prior_text_encoder,
    tokenizer=prior_tokenizer,
    scheduler=prior_scheduler,
    image_encoder=prior_image_encoder,
    feature_extractor=prior_feature_extractor,
)
self.decoder_pipe = StableCascadeDecoderPipeline(
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    decoder=decoder,
    scheduler=scheduler,
    vqgan=vqgan,
)

If we instead instantiate the two pipelines separately via from_pretrained, as follows, it seems that the same network will be instantiated twice and take up twice the memory, since both stages use exactly the same model architecture and weights.

pipe1 = VisualClozeGenerationPipeline.from_pretrained(...)
pipe2 = VisualClozeUpsamplingPipeline.from_pretrained(...)
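
To make the memory concern concrete, here is a minimal sketch in plain Python. It does not use diffusers: TinyModel, load_model, GenerationPipe and UpsamplingPipe are hypothetical stand-ins, and load_model only imitates a loader that builds a fresh copy of the network on every call. It compares loading the network once per pipeline with loading it once and handing the same object to both pipelines.

```python
class TinyModel:
    """Hypothetical stand-in for a large network (not a diffusers class)."""

    def __init__(self, n_params: int = 1000):
        self.weights = [0.0] * n_params


def load_model() -> TinyModel:
    # Assumption: imitates a loader that builds a fresh copy on every call.
    return TinyModel()


class GenerationPipe:
    """Hypothetical first-stage pipeline that only stores its network."""

    def __init__(self, transformer: TinyModel):
        self.transformer = transformer


class UpsamplingPipe:
    """Hypothetical second-stage pipeline using the same architecture."""

    def __init__(self, transformer: TinyModel):
        self.transformer = transformer


# Option A: each pipeline loads its own copy, so two networks live in memory.
gen_a = GenerationPipe(load_model())
up_a = UpsamplingPipe(load_model())
separate_copies = gen_a.transformer is not up_a.transformer

# Option B: load once and pass the same object to the second pipeline.
gen_b = GenerationPipe(load_model())
up_b = UpsamplingPipe(gen_b.transformer)
shares_weights = gen_b.transformer.weights is up_b.transformer.weights

# Count the distinct network objects held by each pair of pipelines.
copies_a = len({id(gen_a.transformer), id(up_a.transformer)})
copies_b = len({id(gen_b.transformer), id(up_b.transformer)})
```

In option B both pipelines hold references to a single network, which is the same effect as constructing the inner pipelines from already-loaded modules in the combined-pipeline approach. In diffusers, a loaded pipeline exposes its modules through the components property, so a second pipeline could in principle be built from them; whether the two VisualCloze pipelines accept exactly the same set of components is an assumption that would need to be checked.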