Release CUTLASS 1.2 · NVIDIA/cutlass
CUTLASS 1.2.0
(2018-10-26)
- Parallelized reductions across threadblocks ("Split-K")
- Improved IGEMM performance
- Batched strided WMMA GEMMs
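
To illustrate the idea behind "Split-K", here is a minimal CPU-side sketch in plain C++. It is not CUTLASS code and does not use the CUTLASS API: the function name `gemm_split_k`, its parameters, and the workspace layout are all hypothetical and chosen only for illustration. The K dimension of the product is cut into `splits` partitions. Each partition accumulates its own partial M x N result (on a GPU, the work that separate threadblocks would do in parallel), and a second pass reduces the partial results into the final output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of split-K; not part of CUTLASS.
// Computes C = A * B for row-major A (M x K) and B (K x N),
// partitioning the K dimension into `splits` slices.
std::vector<float> gemm_split_k(const std::vector<float>& A,
                                const std::vector<float>& B,
                                std::size_t M, std::size_t N, std::size_t K,
                                std::size_t splits) {
    assert(A.size() == M * K && B.size() == K * N && splits > 0);
    const std::size_t chunk = (K + splits - 1) / splits;  // ceil(K / splits)

    // Phase 1: every K slice accumulates its own partial M x N product.
    // The slices are independent, so they could run concurrently.
    std::vector<float> workspace(splits * M * N, 0.0f);
    for (std::size_t s = 0; s < splits; ++s) {
        const std::size_t k_begin = std::min(K, s * chunk);
        const std::size_t k_end = std::min(K, k_begin + chunk);
        float* partial = workspace.data() + s * M * N;
        for (std::size_t i = 0; i < M; ++i)
            for (std::size_t k = k_begin; k < k_end; ++k)
                for (std::size_t j = 0; j < N; ++j)
                    partial[i * N + j] += A[i * K + k] * B[k * N + j];
    }

    // Phase 2: reduce the partial products across slices.
    std::vector<float> C(M * N, 0.0f);
    for (std::size_t s = 0; s < splits; ++s)
        for (std::size_t idx = 0; idx < M * N; ++idx)
            C[idx] += workspace[s * M * N + idx];
    return C;
}
```

Calling `gemm_split_k(A, B, M, N, K, 1)` degenerates to an ordinary GEMM. Larger `splits` values expose more independent work when M and N are small relative to K, at the cost of extra workspace and the reduction pass.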