Vulkan Implementation by 0cc4m · Pull Request #2059 · ggml-org/llama.cpp

Use semaphores for synchronization instead of fences or waitidle

Rework async write/read for synchronization

Fix Python script to generate shaders

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>

Add validation check to compare shader results to CPU results
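A check of this kind can be pictured as an element-wise comparison between the buffer read back from the GPU and a reference buffer computed on the CPU. The following is only a minimal sketch under stated assumptions, not the code from this PR: the helper names `avg_abs_error` and `results_match`, the use of an average absolute error, and the tolerance parameter are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): average absolute difference
// between a GPU result buffer and a CPU reference buffer.
double avg_abs_error(const std::vector<float>& gpu, const std::vector<float>& cpu) {
    assert(gpu.size() == cpu.size());  // both buffers must describe the same tensor
    if (gpu.empty()) {
        return 0.0;
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < gpu.size(); ++i) {
        sum += std::fabs(static_cast<double>(gpu[i]) - static_cast<double>(cpu[i]));
    }
    return sum / static_cast<double>(gpu.size());
}

// Hypothetical helper (not from the PR): true when the average error is
// within the given tolerance. A NaN in either buffer makes the comparison
// false, so NaN results are reported as a mismatch.
bool results_match(const std::vector<float>& gpu, const std::vector<float>& cpu,
                   double tolerance) {
    return avg_abs_error(gpu, cpu) <= tolerance;
}
```

The tolerance would need to be chosen per operation and per data type, since reduced-precision shaders are not expected to reproduce CPU results bit for bit.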

Prepare broadcasting support for mul mat

Add repeat op
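For readers unfamiliar with the operation, a repeat op tiles a smaller tensor until it fills a larger destination shape. The CPU-side sketch below illustrates that idea for a row-major 2D case only. It is an assumption-laden illustration, not the shader added in this PR: the function name `repeat_2d` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference implementation (not from the PR): tile a row-major
// src_rows x src_cols matrix into a dst_rows x dst_cols matrix.
std::vector<float> repeat_2d(const std::vector<float>& src,
                             std::size_t src_rows, std::size_t src_cols,
                             std::size_t dst_rows, std::size_t dst_cols) {
    assert(src_rows > 0 && src_cols > 0);
    assert(src.size() == src_rows * src_cols);
    // The destination shape must be a whole multiple of the source shape.
    assert(dst_rows % src_rows == 0 && dst_cols % src_cols == 0);

    std::vector<float> dst(dst_rows * dst_cols);
    for (std::size_t r = 0; r < dst_rows; ++r) {
        for (std::size_t c = 0; c < dst_cols; ++c) {
            // Wrap destination coordinates back into the source tile.
            dst[r * dst_cols + c] = src[(r % src_rows) * src_cols + (c % src_cols)];
        }
    }
    return dst;
}
```

Because each destination element depends on exactly one source element, the same index mapping parallelizes naturally with one GPU invocation per output element.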

Add not-yet-enabled async backend ops

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>


Co-authored-by: Henri Vasserman <henv@hot.ee>
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>