202250311-Running vLLM in Docker locally on Windows with 4 GB of VRAM
Prerequisite: register a Hugging Face account and obtain an access token (HUGGING_FACE_HUB_TOKEN). Run vLLM: docker run --name LocalvLLM_qwen1.5B_Int4 --runtime nvidia --gpus all -v D:/vLLM/.cache/huggingface:/root/.cache/huggingface --env "HUGGING_FAC…
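The command in this excerpt is cut off, so what follows is only a rough sketch of how such an invocation might be assembled, not the author's exact command. The container name, GPU flags, and cache mount are taken from the excerpt. The image name (vllm/vllm-openai:latest), the port mapping, --ipc=host, and the --model argument follow the general pattern in the vLLM Docker documentation and are assumptions here. MODEL_ID and the REPLACE_ME token are hypothetical placeholders. The script only prints the command, so nothing is launched by accident.

```shell
#!/usr/bin/env bash
# Sketch: build (but do not run) a docker command for the vLLM OpenAI-compatible server.
set -eu

# Token from the environment; REPLACE_ME is a placeholder, not a real value.
HF_TOKEN_VALUE="${HUGGING_FACE_HUB_TOKEN:-REPLACE_ME}"

# Assumptions: the image tag and model id are NOT given in the original excerpt.
IMAGE="vllm/vllm-openai:latest"
MODEL_ID="${MODEL_ID:-REPLACE_WITH_MODEL_ID}"

cmd=(
  docker run
  --name LocalvLLM_qwen1.5B_Int4          # from the excerpt
  --runtime nvidia --gpus all             # from the excerpt
  -v D:/vLLM/.cache/huggingface:/root/.cache/huggingface   # from the excerpt
  --env "HUGGING_FACE_HUB_TOKEN=${HF_TOKEN_VALUE}"
  -p 8000:8000                            # assumption: expose the API port
  --ipc=host                              # assumption: shared memory for the server
  "$IMAGE"
  --model "$MODEL_ID"                     # assumption: model passed to the server
)

# Print the assembled command for review.
printf '%s ' "${cmd[@]}"
echo
# To launch for real, replace the printf above with: "${cmd[@]}"
```

On a 4 GB card the model id and any memory-related server options matter a great deal. They are left as placeholders here because the excerpt does not show them.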
2025-11-17