Optimizing Transformer Models for Variable-Length Input Sequences

How PyTorch NestedTensors, FlashAttention2, and xFormers Can Boost Performance and Reduce AI Costs