Download lmdeploy-v0.2.5:
cd ~/
git clone /~
cd lmdeploy
git checkout c5f4014 # make sure this is version 0.2.5
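To confirm the checkout landed on the intended commit, the hash of HEAD can be compared against the prefix used above. The following is a minimal sketch; the helper name commit_matches is hypothetical and not part of lmdeploy or git.

```shell
# Hypothetical helper: succeed when the full commit hash in $1
# starts with the (non-empty) prefix in $2.
commit_matches() {
    # An empty prefix would match anything, so reject it explicitly.
    [ -n "$2" ] || return 1
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

Assuming the repository was cloned to ~/lmdeploy, it could be used as: commit_matches "$(git -C ~/lmdeploy rev-parse HEAD)" c5f4014 && echo "on the expected commit"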
Activate conda environment:
conda activate lmdeploy
Install dependencies.
cd ~/lmdeploy
pip install -r requirements/build.txt
Create a new file named
under ~/lmdeploy, and fill in the following content:
builder="-G Ninja"
if [ "$1" == "make" ]; then
cmake ${builder} .. \
-DCMAKE_CUDA_FLAGS="-lineinfo" \
Modify file permissions.
chmod +x
Install Ninja.
sudo apt-get install ninja-build
Create a new build folder.
cd ~/lmdeploy
mkdir build && cd build
Compile LMDeploy.
ninja install
During compilation, the system may run out of memory, in which case the build is terminated with the message Killed. If that happens, expand the swap space as follows and then run ninja install again.
# Create a 6 GB swap file. Adjust the size to suit the available disk space.
sudo fallocate -l 6G /var/swapfile
# Modify file permissions.
sudo chmod 600 /var/swapfile
# Format the file as swap space.
sudo mkswap /var/swapfile
# Enable the swap file.
sudo swapon /var/swapfile
# Enable the swap file automatically at boot.
sudo bash -c 'echo "/var/swapfile swap swap defaults 0 0" >> /etc/fstab'
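After enabling swap, it can help to check how much memory the build now has to work with. The sketch below sums the MemTotal and SwapTotal entries of /proc/meminfo on Linux. The function name total_mem_and_swap_kb is hypothetical, and the source does not state how much memory the build actually needs.

```shell
# Hypothetical helper: print MemTotal + SwapTotal in kB, read from a
# meminfo-style file (defaults to /proc/meminfo on Linux).
total_mem_and_swap_kb() {
    awk '/^(MemTotal|SwapTotal):/ { sum += $2 } END { print sum + 0 }' \
        "${1:-/proc/meminfo}"
}
```

Running total_mem_and_swap_kb with no argument prints the combined size in kB. The standard commands swapon --show and free -h also list the active swap file and its size.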
Attention: Use vim to edit requirements/runtime.txt, and delete the lines containing torch<=2.1.2,>=2.0.0 and triton>=2.1.0,<2.2.0
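As an alternative to editing the file by hand, the same two lines can be removed with sed. This is a sketch under the assumption that each requirement sits on its own line and starts with the package name; the function name strip_torch_triton is hypothetical.

```shell
# Hypothetical helper: delete the torch and triton requirement lines from
# the file given in $1, keeping a backup copy with a .bak suffix.
# The pattern requires a version operator, a space, or end of line after
# the name, so packages such as torchvision are left untouched.
strip_torch_triton() {
    sed -i.bak -E '/^(torch|triton)([<>=!~ ]|$)/d' "$1"
}
```

From ~/lmdeploy it could be invoked as: strip_torch_triton requirements/runtime.txt. The original file remains available as requirements/runtime.txt.bak if the change needs to be reverted.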
Note: To simplify dependencies, triton has been removed. As a result, models deployed with lmdeploy can only be invoked through the turbomind method, and not through the API method.
Install lmdeploy-v0.2.5 locally.
cd ~/lmdeploy
pip install -e .[serve]