supported_models.md
| ModelId | ModelSeries | ModelType | Supported Engines | Supported Instances | Supported Services | Support China Region |
|---|---|---|---|---|---|---|
| glm-4-9b-chat | glm4 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-20b-chat-4bit-awq | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-20b-chat | internlm2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-7b-chat | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-7b-chat-4bit | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-1_8b-chat | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-7B-Instruct | qwen2.5 | llm | vllm,tgi | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge,inf2.8xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-72B-Instruct-AWQ | qwen2.5 | llm | vllm,tgi | g5.12xlarge,g5.24xlarge,g5.48xlarge,inf2.24xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-72B-Instruct-AWQ-inf2 | qwen2.5 | llm | tgi | inf2.24xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-72B-Instruct | qwen2.5 | llm | vllm | g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-72B-Instruct-AWQ-128k | qwen2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-32B-Instruct | qwen2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-32B-Instruct-inf2 | qwen2.5 | llm | tgi | inf2.24xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-0.5B-Instruct | qwen2.5 | llm | vllm,tgi | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge,inf2.8xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-1.5B-Instruct | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-3B-Instruct | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-14B-Instruct-AWQ | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-14B-Instruct | qwen2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| QwQ-32B-Preview | qwen reasoning model | llm | huggingface,vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| llama-3.3-70b-instruct-awq | llama | llm | tgi | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-32B | deepseek reasoning model | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-14B | deepseek reasoning model | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-7B | deepseek reasoning model | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-1.5B | deepseek reasoning model | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-1.5B_ollama | deepseek reasoning model | llm | ollama | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-1.5B-GGUF | deepseek reasoning model | llm | llama.cpp | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Llama-8B | deepseek reasoning model | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| deepseek-r1-distill-llama-70b-awq | deepseek reasoning model | llm | vllm,tgi | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| deepseek-r1-671b-1.58bit_ollama | deepseek reasoning model | llm | ollama | g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| deepseek-r1-671b-1.58bit_gguf | deepseek reasoning model | llm | llama.cpp | g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| deepseek-v3-UD-IQ1_M_ollama | deepseek v3 | llm | ollama | g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Baichuan-M1-14B-Instruct | baichuan | llm | vllm,huggingface | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2-VL-72B-Instruct-AWQ | qwen2vl | vlm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async | |
| QVQ-72B-Preview-AWQ | qwen reasoning model | vlm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async | |
| Qwen2-VL-7B-Instruct | qwen2vl | vlm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge,g6e.2xlarge | sagemaker,sagemaker_async | |
| InternVL2_5-78B-AWQ | internvl2.5 | vlm | lmdeploy | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async | |
| txt2video-LTX | comfyui | video | comfyui | g5.4xlarge,g5.8xlarge,g6e.2xlarge | sagemaker_async | |
| whisper | whisper | whisper | huggingface | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker_async | |
| bge-base-en-v1.5 | bge | embedding | vllm | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker | |
| bge-m3 | bge | embedding | vllm | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,ecs | |
| bge-reranker-v2-m3 | bge | rerank | vllm | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker | |
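
The table above can also be used as a pre-flight check before a deployment is attempted. The following is a minimal sketch in Python, not part of this project: the names `ModelSupport`, `SUPPORTED`, and `check_deployment` are hypothetical, and only three rows are copied in for illustration. It treats the engine, instance, and service columns as independent lists, which is an assumption; a listed engine may not run on every listed instance (for example, the `inf2` entries may apply to one engine only).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSupport:
    """One table row held in memory (hypothetical structure)."""
    model_id: str
    engines: frozenset
    instances: frozenset
    services: frozenset


def split_cell(cell):
    """Turn a comma-separated table cell into a set of trimmed values."""
    return frozenset(part.strip() for part in cell.split(",") if part.strip())


def make_row(model_id, engines, instances, services):
    """Build a ModelSupport from the raw cell strings used in the table."""
    return ModelSupport(
        model_id, split_cell(engines), split_cell(instances), split_cell(services)
    )


# A few rows copied from the table; the remaining rows follow the same shape.
SUPPORTED = {
    r.model_id: r
    for r in [
        make_row("Qwen2.5-72B-Instruct", "vllm", "g5.48xlarge",
                 "sagemaker,sagemaker_async,ecs"),
        make_row("Qwen2.5-0.5B-Instruct", "vllm,tgi",
                 "g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge,inf2.8xlarge",
                 "sagemaker,sagemaker_async,ecs"),
        make_row("bge-m3", "vllm",
                 "g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge",
                 "sagemaker,ecs"),
    ]
}


def check_deployment(model_id, engine, instance, service):
    """Return a list of problems; an empty list means every value is listed."""
    entry = SUPPORTED.get(model_id)
    if entry is None:
        return [f"unknown model id: {model_id}"]
    problems = []
    # Each column is checked on its own (assumption: columns are independent).
    for label, value, allowed in [
        ("engine", engine, entry.engines),
        ("instance", instance, entry.instances),
        ("service", service, entry.services),
    ]:
        if value not in allowed:
            problems.append(
                f"{label} {value!r} is not listed for {model_id}; "
                f"listed values: {sorted(allowed)}"
            )
    return problems


if __name__ == "__main__":
    # g5.12xlarge is not listed for the unquantized 72B model, so one problem is reported.
    for problem in check_deployment(
        "Qwen2.5-72B-Instruct", "vllm", "g5.12xlarge", "ecs"
    ):
        print(problem)
```

Returning a list of problems rather than raising on the first mismatch lets a caller report every unsupported value in one pass.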