GitHub - iohub/collama: VSCode AI coding assistant powered by self-hosted llama.cpp endpoint.

Deprecated: AI Copilot with LLaMA.cpp

Redirect: Open Copilot

"VSCode AI coding assistant powered by self-hosted llama.cpp endpoint."

Get started

chat

chat with llama.cpp server

code completion

code generation

code explanation

explain code

Quick start: run your model service

Windows

  1. Download a llama.cpp binary release archive.
  2. Unzip llama-bxxx-bin-win-cublas-cuxx.x.x-x64.zip to a folder.
  3. Download a GGUF model file, for example wizardcoder-python-13b-v1.0.Q4_K_M.gguf.
  4. Run server.exe with one of the startup commands below.

CPU only

D:\path_to_unzip_files\server.exe -m D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -c 1024

With GPU (-ngl sets the number of model layers offloaded to the GPU)

D:\path_to_unzip_files\server.exe -m D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -ngl 81 -c 1024
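
The two commands above differ only in the -ngl flag. As a rough sketch, the argument list could be assembled in Python like this. The function name build_server_command and its parameter names are hypothetical and only for illustration. The flags themselves (-m for the model file, -t for CPU threads, -ngl for GPU layers, -c for context size) are taken from the commands shown above.

```python
from typing import List, Optional


def build_server_command(
    server_path: str,
    model_path: str,
    threads: int = 8,
    ctx_size: int = 1024,
    gpu_layers: Optional[int] = None,
) -> List[str]:
    """Return the argument list for launching the llama.cpp server.

    Hypothetical helper: it mirrors the two README commands and
    nothing more. gpu_layers=None means CPU only (no -ngl flag).
    """
    cmd = [server_path, "-m", model_path, "-t", str(threads)]
    if gpu_layers is not None:
        # Offload this many model layers to the GPU.
        cmd += ["-ngl", str(gpu_layers)]
    cmd += ["-c", str(ctx_size)]
    return cmd


if __name__ == "__main__":
    # Print the GPU variant of the command; paths are the README placeholders.
    print(" ".join(build_server_command(
        r"D:\path_to_unzip_files\server.exe",
        r"D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf",
        gpu_layers=81,
    )))
```

Passing the returned list to subprocess.run would start the server. Returning a list rather than a single string avoids quoting problems when a path contains spaces.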

Linux or macOS

Compile the llama.cpp project from source yourself, then start the resulting server binary with the same arguments as on Windows.
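
Once the server is running on any platform, a small request can confirm that it answers before the extension is pointed at it. The sketch below rests on assumptions this README does not state: that the server listens on http://127.0.0.1:8080 (the llama.cpp server default when no host or port is passed), and that it exposes a POST /completion endpoint taking prompt and n_predict and returning the generated text in a content field. Check these against the README of the llama.cpp build you are using. The function names are hypothetical.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    n_predict: int = 64,
    base_url: str = "http://127.0.0.1:8080",  # assumed default address
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /completion endpoint."""
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/completion",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs) -> str:
    """Send the request and return the generated text.

    Assumes the response is a JSON object with a "content" field.
    """
    req = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))["content"]
```

For example, complete("def fibonacci(n):", n_predict=32) should return a short continuation if the server is up. A connection error usually means the server is not running or is listening on a different address.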

Contributing

All code in this repository is open source (Apache 2).

Quickstart: run pnpm install && cd vscode && pnpm run dev to start a local build of the Cody VS Code extension.