Install MLC LLM Python Package — mlc-llm 0.1.0 documentation


The MLC LLM Python package can be installed directly from a prebuilt developer package, or built from source.

Option 1. Prebuilt Package

We provide nightly-built pip wheels for MLC LLM. Select your operating system and compute platform, then run the command in your terminal:

Note

❗ Whenever using Python, it is highly recommended to use conda to manage an isolated Python environment to avoid missing dependencies, incompatible versions, and package conflicts. Please make sure your conda environment has Python and pip installed.

conda activate your-environment
python -m pip install --pre -U -f https://mlc.ai/wheels mlc-llm-nightly-cpu mlc-ai-nightly-cpu
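To double-check that pip picked up the nightly packages, you can query the installed versions with the standard library. This is an optional sketch, assuming the default package names used above:

# optional sanity check: print the installed nightly package versions
from importlib.metadata import version

print(version("mlc-llm-nightly-cpu"))
print(version("mlc-ai-nightly-cpu"))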

Note

git-lfs is required on the system; you can install it via

conda install -c conda-forge git-lfs

If you encounter a "GLIBC not found" issue, please install the latest glibc in conda:

conda install -c conda-forge libgcc-ng

In addition, we recommend Python 3.11, so if you are creating a new environment, you can use the following command:

conda create --name mlc-prebuilt python=3.11

Then you can verify the installation from the command line:

python -c "import mlc_llm; print(mlc_llm)"

Prints out: <module 'mlc_llm' from '/path-to-env/lib/python3.11/site-packages/mlc_llm/__init__.py'>
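If the import succeeds, you can optionally run a short chat completion through mlc_llm.MLCEngine as an end-to-end smoke test. The sketch below follows the OpenAI-style chat API from the MLC LLM quickstart; the model ID is only an example (any MLC-format model works), and the first run downloads and compiles the model weights:

# end-to-end smoke test (a sketch); the model ID is an example only
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
engine = MLCEngine(model)

# OpenAI-style streaming chat completion
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print()

engine.unload()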

Option 2. Build from Source

We also provide the option to build the MLC runtime libraries and mlc_llm from source. This step is useful when you want to make modifications or obtain a specific version of the MLC runtime.

Step 1. Set up build dependencies. To build from source, you need to ensure that the following build dependencies are satisfied:

Set up build dependencies in Conda

# make sure to start with a fresh environment
conda env remove -n mlc-chat-venv

# create the conda environment with build dependency
conda create -n mlc-chat-venv -c conda-forge \
    "cmake>=3.24" \
    rust \
    git \
    python=3.11

# enter the build environment
conda activate mlc-chat-venv

Note

The TVM Unity compiler is not a runtime dependency of the MLCChat CLI or the Python API; only TVM's runtime is required, and it is automatically included in 3rdparty/tvm. However, if you would like to compile your own models, you need to follow TVM Unity.

Step 2. Configure and build. A standard git-based workflow is recommended to download MLC LLM, after which you can specify build requirements with our lightweight config generation tool:

Configure and build

# clone from GitHub
git clone --recursive https://github.com/mlc-ai/mlc-llm.git && cd mlc-llm/

# create build directory
mkdir -p build && cd build

# generate build configuration
python ../cmake/gen_cmake_config.py

# build mlc_llm libraries
cmake .. && cmake --build . --parallel $(nproc) && cd ..

Note

If you are using CUDA and your compute capability is above 80, then it is required to build with set(USE_FLASHINFER ON). Otherwise, you may run into a "Cannot find Function" error at runtime.

To check your CUDA compute capability, you can use nvidia-smi --query-gpu=compute_cap --format=csv.
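If you prefer to run this check from Python, the following sketch parses the same nvidia-smi query and reports whether the FlashInfer flag applies; it is illustrative only and assumes nvidia-smi is on your PATH:

# illustrative sketch: parse nvidia-smi's compute_cap query and report
# whether set(USE_FLASHINFER ON) applies (compute capability 80, i.e. 8.0, or newer)
import subprocess

out = subprocess.run(
    ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
caps = [float(line) for line in out.stdout.splitlines() if line.strip()]
print("compute capabilities:", caps)
print("build with set(USE_FLASHINFER ON):", any(cap >= 8.0 for cap in caps))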

Step 3. Install via Python. We recommend installing mlc_llm as a Python package, which gives you access to mlc_llm.compile, mlc_llm.MLCEngine, and the CLI. One way to do so is to point your environment at the source tree:

export MLC_LLM_SOURCE_DIR=/path-to-mlc-llm
export PYTHONPATH=$MLC_LLM_SOURCE_DIR/python:$PYTHONPATH
alias mlc_llm="python -m mlc_llm"

Step 4. Validate installation. You can validate that the MLC libraries and the mlc_llm CLI were compiled successfully using the following commands:

Validate installation

# expected to see libmlc_llm.so and libtvm_runtime.so
ls -l ./build/

# expected to see help message
mlc_llm chat -h

Finally, you can verify the installation from the command line; the printed module path should point to the source tree you built from:

python -c "import mlc_llm; print(mlc_llm)"
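If you have both a prebuilt wheel and a source checkout on the same machine, this small illustrative check shows which installation Python actually resolved:

# illustrative check: did Python import mlc_llm from the source tree
# (via PYTHONPATH) or from a wheel in site-packages?
import pathlib
import mlc_llm

path = pathlib.Path(mlc_llm.__file__).resolve()
if "site-packages" in path.parts:
    print("prebuilt wheel:", path)
else:
    print("source tree:", path)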