Compiling SD1.5 for Neuron with resolution other than 512x512 fails
September 11, 2024, 11:38am 1
I’m trying to export SD 1.5 at a portrait resolution of 512x768 for use with Neuron / Inferentia 2. This is my export command:
optimum-cli export neuron \
--model jyoung105/stable-diffusion-v1-5 \
--task stable-diffusion \
--batch_size 1 --num_images_per_prompt 1 \
--height 768 --width 512 \
stable-diffusion-v1-5.neuron
It works at 512x512 but fails at 512x768 with this error in the vae_encoder step:
***** Compiling vae_encoder *****
...........
[GCA035] Instruction: I-5715-0 with opcode: TensorTensor couldn't be allocated in SB
Memory Location Accessed:
add.1_reload_7077_i0: 196608 Bytes per Partition and total of: 25165824 Bytes in SB
_add.1104-t7919_i0: 4 Bytes per Partition and total of: 512 Bytes in SB
add.6_i0: 2048 Bytes per Partition and total of: 262144 Bytes in SB
Total Accessed Bytes per partition by instruction: 198660
Total SB Partition Size: 196608
- Please open a support ticket at https://github.com/aws-neuron/aws-neuron-sdk/issues/new. You may also be able to obtain more information using the 'XLA_IR_DEBUG' and 'XLA_HLO_DEBUG' environment variables.
An error occured when trying to trace vae_encoder with the error message: neuronx-cc failed with 70.
The export is failed and vae_encoder neuron model won't be stored.
Do I need any other parameters, or is this a bug that needs fixing? I’m running it on an AWS inf2.2xlarge instance.
Somehow I managed to export an SD1.5 checkpoint at 512x768 a couple of weeks ago, but I’m unable to reproduce it now. Could this be a regression in optimum-neuron or neuronx-cc?
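For reference, this is roughly the Python equivalent of the export command above (a sketch based on the NeuronStableDiffusionPipeline API described in the Optimum Neuron docs; I haven’t verified that it behaves any differently from the CLI):
from optimum.neuron import NeuronStableDiffusionPipeline

# Static input shapes the sub-models get compiled for (same as the CLI flags above)
input_shapes = {"batch_size": 1, "num_images_per_prompt": 1, "height": 768, "width": 512}

pipe = NeuronStableDiffusionPipeline.from_pretrained(
    "jyoung105/stable-diffusion-v1-5",
    export=True,  # trace and compile the text encoder, unet and vae for Neuron
    **input_shapes,
)
pipe.save_pretrained("stable-diffusion-v1-5.neuron")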
hgfckwla September 13, 2024, 6:04pm 2
@Jingya, sorry to bother you directly, but you’re always so helpful.
Any idea why compiling the model for any resolution other than 512x512 fails?
I have one Neuron model that I was able to compile for 512x768 a few weeks ago, but I no longer have that setup, don’t remember the exact command, and now the export always fails.
Is it something that can be fixed? Or am I doing something wrong?
Jingya September 16, 2024, 10:22am 3
Hi @hgfckwla,
The compilation error most likely comes from the AWS Neuron SDK rather than Optimum Neuron. According to the AWS folks, compilation of SD models with unequal height/width should be supported by the SDK versions after 2.18.2, i.e. 2.19.0 and 2.19.1: enable unequal height and width by yahavb · Pull Request #592 · huggingface/optimum-neuron · GitHub.
Can you still recall the SDK version you used for successful compilation?
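(If any of the old environments is still around, a small snippet like this prints the relevant package versions; the package names below are the usual Neuron ones and may need adjusting for your setup:)
import importlib.metadata as md

# Usual Neuron-related packages; adjust the list to your environment
for pkg in ("neuronx-cc", "torch-neuronx", "libneuronxla", "optimum-neuron"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")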
Jingya September 16, 2024, 11:54am 4
I got this when compiling the vae_encoder of an SD model with unequal height/width using Neuron SDK 2.19.1 on an inf2.8xlarge instance.
[NLA001] Unhandled exception with message: === BIR error ===
Reason: Access pattern out of bound.
Instruction: identity_pool_1_I-5532-441602-tc
Opcode: TensorCopy
Instruction Source: (float32<128 x 1027> $5532[i2_369_0_0, i2_369_0_1, i1_370_6433, i3_369_0_6433, i3_369_1_0_6433_0_0, i3_369_1_0_6433_0_1, i3_369_1_0_6433_1, i3_369_1_1_0_6433_0, i3_369_1_1_0_6433_1, i3_369_1_1_1_6433_0, i3_369_1_1_1_6433_1, i2_370_6433]:5532)0:
Argument AP:
Access Pattern: [[2051,64],[1,1],[1,1027]]
Offset: 1028
Memory Location: {add.11_VN_191_ReloadStore111619}@SB<0,175096>(128x8204)#Internal DebugInfo: <add.11||UNDEF||[128, 2051, 1]>
- Please open a support ticket at https://github.com/aws-neuron/aws-neuron-sdk/issues/new. You may also be able to obtain more information using the 'XLA_IR_DEBUG' and 'XLA_HLO_DEBUG' environment variables.
I will ask the Annapurna folks. It would also be super helpful if you could share the environment where you succeeded in compiling it!
hgfckwla September 16, 2024, 8:52pm 5
Hi @Jingya, thanks for confirming the issue. Unfortunately, I can’t find my old virtual env with the versions that worked; I think it was on a spot instance that’s now gone.
Jingya September 26, 2024, 9:24am 7
No worries. I talked with the Annapurna team, and they are working on a fix for the compiler regression. Thanks again for letting us know; I will add a unit test for unequal width/height once the patch is out.
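For illustration, a rough sketch of what such a test could look like (hypothetical; the tiny checkpoint name and the exact shapes are assumptions, not the actual optimum-neuron test):
import pytest
from optimum.neuron import NeuronStableDiffusionPipeline

@pytest.mark.parametrize("height, width", [(64, 32), (32, 64)])
def test_stable_diffusion_export_unequal_height_width(height, width, tmp_path):
    # Export a tiny SD checkpoint with height != width and check that compilation succeeds
    pipe = NeuronStableDiffusionPipeline.from_pretrained(
        "hf-internal-testing/tiny-stable-diffusion-torch",  # assumed tiny test checkpoint
        export=True,
        batch_size=1,
        num_images_per_prompt=1,
        height=height,
        width=width,
    )
    pipe.save_pretrained(tmp_path / "sd-unequal-hw-neuron")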