torchtitan

Topic | Replies | Views | Activity
About the torchtitan category | 0 | 145 | September 9, 2024
Capture training graph with collectives via TorchTitan | 2 | 19 | June 19, 2025
Tensor parallel numeric mismatch | 1 | 21 | June 18, 2025
[Distributed w/ TorchTitan] Breaking Barriers: Training Long Context LLMs with 1M Sequence Length in PyTorch Using Context Parallel | 8 | 6138 | June 17, 2025
[Distributed w/ TorchTitan] Introducing Async Tensor Parallelism in PyTorch | 11 | 12003 | May 29, 2025
[Distributed w/ TorchTitan] Semi synchronous training using TorchFT | 0 | 149 | May 8, 2025
PyTorch Tensor Parallel | 0 | 68 | May 1, 2025
Dcp.save straight to cloud storage | 5 | 159 | April 15, 2025
How to avoid casting DTensor to Tensor before calling a custom operator (a CUDA kernel) | 1 | 58 | April 2, 2025
[Distributed w/ TorchTitan] Training with Zero-Bubble Pipeline Parallelism | 0 | 2286 | December 19, 2024
[Distributed w/ TorchTitan] Optimizing Checkpointing Efficiency with PyTorch DCP | 0 | 2451 | October 7, 2024
[Distributed w/ Torchtitan] Enabling Float8 All-Gather in FSDP2 | 0 | 1794 | September 9, 2024