GitHub - iohub/collama: VSCode AI coding assistant powered by self-hosted llama.cpp endpoint.
Deprecated: AI Copilot with LLaMA.cpp
Redirect: Open Copilot
"VSCode AI coding assistant powered by self-hosted llama.cpp endpoint."
Get started
- Install Open Copilot from the VSCode marketplace.
- Set your llama.cpp server's address, for example http://192.168.0.101:8080, in the "Cody » Llama Server Endpoint" setting.
- Now enjoy coding with your locally deployed models.
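Before pointing the extension at the server, it can help to confirm that the address answers requests. The following is a minimal sketch and not part of the extension. It assumes the llama.cpp example server exposes a POST /completion route that accepts a JSON body with prompt and n_predict fields and returns JSON with a content field; this may differ between llama.cpp builds. The function names and the default address are illustrative.

```python
import json
import urllib.request

# Example address from the setting above; replace with your own server.
ENDPOINT = "http://192.168.0.101:8080"


def build_completion_request(endpoint, prompt, n_predict=16):
    """Build (but do not send) a POST request for the server's /completion route.

    Assumption: the route name and the "prompt"/"n_predict" fields match the
    llama.cpp example server; check them against your build.
    """
    url = endpoint.rstrip("/") + "/completion"
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_endpoint(endpoint=ENDPOINT, timeout=10):
    """Send a tiny prompt and return the generated text, or raise on failure."""
    request = build_completion_request(endpoint, "def add(a, b):")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    # Assumption: the generated text is returned under the "content" key.
    return reply.get("content", "")
```

With the server running, calling check_endpoint() should return a short completion. A connection error suggests the address or port in the setting is wrong.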
Chat
Code completion
Code generation
Code explanation
Quick start your model service
Windows
- Download the llama.cpp binary release archive.
- Unzip llama-bxxx-bin-win-cublas-cuxx.x.x-x64.zip to a folder.
- Download a GGUF model file, for example: wizardcoder-python-13b-v1.0.Q4_K_M.gguf
- Run server.exe with one of the startup commands below.
CPU only (-t sets the number of threads, -c the context size):
D:\path_to_unzip_files\server.exe -m D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -c 1024
With GPU (-ngl sets the number of model layers offloaded to the GPU):
D:\path_to_unzip_files\server.exe -m D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -ngl 81 -c 1024
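The two commands differ only in the -ngl flag. As an illustration, the sketch below assembles the same argument list from Python, which can be handy when scripting the startup. build_server_command and start_server are hypothetical helpers, and only the flags shown in the commands above (-m, -t, -c, -ngl) are used.

```python
import subprocess


def build_server_command(server_path, model_path, threads=8, ctx_size=1024, gpu_layers=0):
    """Return the argument list for the llama.cpp server.

    gpu_layers=0 omits -ngl, which corresponds to the CPU-only command.
    """
    command = [server_path, "-m", model_path, "-t", str(threads)]
    if gpu_layers > 0:
        command += ["-ngl", str(gpu_layers)]
    command += ["-c", str(ctx_size)]
    return command


def start_server(command):
    """Launch the server as a child process and return its handle."""
    return subprocess.Popen(command)
```

For example, build_server_command(r"D:\path_to_unzip_files\server.exe", r"D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf", gpu_layers=81) reproduces the GPU command above, and passing the result to start_server launches it.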
Linux or macOS
Compile the llama.cpp project yourself, then follow the same startup steps.
Contributing
All code in this repository is open source (Apache 2).
Quickstart: pnpm install && cd vscode && pnpm run dev to run a local build of the Cody VS Code extension.



