| glm-4-9b-chat | glm4 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| internlm2_5-20b-chat-4bit-awq | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| internlm2_5-20b-chat | internlm2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| internlm2_5-7b-chat | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| internlm2_5-7b-chat-4bit | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ❎ |
| internlm2_5-1_8b-chat | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-7B-Instruct | qwen2.5 | llm | vllm,tgi | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge,inf2.8xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-72B-Instruct-AWQ | qwen2.5 | llm | vllm,tgi | g5.12xlarge,g5.24xlarge,g5.48xlarge,inf2.24xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-72B-Instruct-AWQ-inf2 | qwen2.5 | llm | tgi | inf2.24xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-72B-Instruct | qwen2.5 | llm | vllm | g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-72B-Instruct-AWQ-128k | qwen2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-32B-Instruct | qwen2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-32B-Instruct-inf2 | qwen2.5 | llm | tgi | inf2.24xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-0.5B-Instruct | qwen2.5 | llm | vllm,tgi | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge,inf2.8xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-1.5B-Instruct | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-3B-Instruct | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-14B-Instruct-AWQ | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| Qwen2.5-14B-Instruct | qwen2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| QwQ-32B-Preview | qwen reasoning model | llm | huggingface,vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| llama-3.3-70b-instruct-awq | llama | llm | tgi | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ❎ |
| DeepSeek-R1-Distill-Qwen-32B | deepseek reasoning model | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| DeepSeek-R1-Distill-Qwen-14B | deepseek reasoning model | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| DeepSeek-R1-Distill-Qwen-7B | deepseek reasoning model | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| DeepSeek-R1-Distill-Qwen-1.5B | deepseek reasoning model | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| DeepSeek-R1-Distill-Qwen-1.5B_ollama | deepseek reasoning model | llm | ollama | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| DeepSeek-R1-Distill-Qwen-1.5B-GGUF | deepseek reasoning model | llm | llama.cpp | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| DeepSeek-R1-Distill-Llama-8B | deepseek reasoning model | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| deepseek-r1-distill-llama-70b-awq | deepseek reasoning model | llm | vllm,tgi | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| deepseek-r1-671b-1.58bit_ollama | deepseek reasoning model | llm | ollama | g5.48xlarge | sagemaker,sagemaker_async,ecs | ❎ |
| deepseek-r1-671b-1.58bit_gguf | deepseek reasoning model | llm | llama.cpp | g5.48xlarge | sagemaker,sagemaker_async,ecs | ✅ |
| deepseek-v3-UD-IQ1_M_ollama | deepseek v3 | llm | ollama | g5.48xlarge | sagemaker,sagemaker_async,ecs | ❎ |
| Baichuan-M1-14B-Instruct | baichuan | llm | vllm,huggingface | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | ❎ |
| Qwen2-VL-72B-Instruct-AWQ | qwen2vl | vlm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async | ✅ |
| QVQ-72B-Preview-AWQ | qwen reasoning model | vlm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async | ❎ |
| Qwen2-VL-7B-Instruct | qwen2vl | vlm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge,g6e.2xlarge | sagemaker,sagemaker_async | ✅ |
| InternVL2_5-78B-AWQ | internvl2.5 | vlm | lmdeploy | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async | ❎ |
| txt2video-LTX | comfyui | video | comfyui | g5.4xlarge,g5.8xlarge,g6e.2xlarge | sagemaker_async | ❎ |
| whisper | whisper | whisper | huggingface | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker_async | ❎ |
| bge-base-en-v1.5 | bge | embedding | vllm | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker | ✅ |
| bge-m3 | bge | embedding | vllm | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,ecs | ✅ |
| bge-reranker-v2-m3 | bge | rerank | vllm | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker | ✅ |