Latest News 🔥
- [2025/03] We officially released EMD!
EMD (Easy Model Deployer) is a lightweight tool designed to simplify model deployment. It is built for developers who need reliable, scalable model serving without complex setup.
Key Features
- One-click deployment of models to the cloud (Amazon SageMaker, Amazon ECS, Amazon EC2)
- Diverse model types (LLMs, VLMs, Embeddings, Vision, etc.)
- Multiple inference engines (vLLM, TGI, LMDeploy, etc.)
- Different instance types (CPU/GPU/AWS Inferentia)
- Convenient integration (OpenAI Compatible API, LangChain client, etc.)
Notes
- The OpenAI Compatible API is supported only for Amazon ECS and Amazon EC2 deployments.
Install EMD with pip. Only Python 3.9 and above is currently supported:
pip install https://github.com/aws-samples/easy-model-deployer/releases/download/main/emd-0.6.0-py3-none-any.whl
Visit our documentation to learn more.
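Because only Python 3.9 and above is supported, it can help to check the interpreter version before installing. The following is a minimal sketch, not part of EMD: `python_version_ok` is a hypothetical helper name, and it assumes the version is given as a plain `major.minor[.patch]` string.

```shell
# Succeeds (exit status 0) when a "major.minor[.patch]" version string is 3.9 or later.
# python_version_ok is a hypothetical helper, not an EMD command.
python_version_ok() {
  pv_major=${1%%.*}        # text before the first dot
  pv_rest=${1#*.}          # text after the first dot
  pv_minor=${pv_rest%%.*}  # text up to the next dot, if any
  [ "$pv_major" -gt 3 ] || { [ "$pv_major" -eq 3 ] && [ "$pv_minor" -ge 9 ]; }
}

# Usage: read the interpreter version and warn if it is too old.
# python_version_ok "$(python3 -c 'import platform; print(platform.python_version())')" \
#   || echo "Python 3.9 or later is required" >&2
```

The check compares the major and minor components numerically, so 3.10 is correctly treated as newer than 3.9.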
emd config set-default-profile-name
Notes: If you don't set an AWS profile, EMD uses the default profile in your environment (suitable for temporary credentials). Whenever you want to switch deployment accounts, run emd config set-default-profile-name again.
emd bootstrap
Notes: This sets up the resources required for model deployment. Run this command again whenever you change the EMD version.
To quickly see which models are supported, run emd list-supported-models. This command outputs all deployment-related information. To check only the model type, the following command is recommended. (Please check Supported Models for complete information.)
emd list-supported-models | jq -r '.[] | "\(.model_id)\t\(.model_type)"' | column -t -s $'\t' | sort
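The tab-separated `model_id` / `model_type` lines produced by the jq filter above can be narrowed further to a single model type. The sketch below is an illustration only: `models_of_type` is a hypothetical helper, and the type value `llm` used in the usage comment is an assumed example, so check the real command output for the actual values.

```shell
# Prints the model IDs whose type equals $1.
# Input (stdin): lines of "model_id<TAB>model_type", as emitted by the jq filter above.
# models_of_type is a hypothetical helper, not an EMD command.
models_of_type() {
  awk -F '\t' -v want="$1" '$2 == want { print $1 }'
}

# Usage ("llm" is an assumed type value):
# emd list-supported-models | jq -r '.[] | "\(.model_id)\t\(.model_type)"' | models_of_type llm
```

The helper must read the raw jq output, before `column -t` replaces the tabs with spaces.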
emd deploy --model-id DeepSeek-R1-Distill-Qwen-1.5B --instance-type g5.8xlarge --engine-type vllm --framework-type fastapi --service-type sagemaker --extra-params {} --skip-confirm
Notes: Run emd deploy --help for the complete list of parameters, and find the values of the required parameters here.
When you see "Waiting for model: ...", the deployment task has started; you can quit the current task with Ctrl+C.
Notes: For more details about the deployment parameters, please refer to Deployment parameters.
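When deploying several models, it may be convenient to build the deploy command from a few variables and review it before running. This is a rough sketch under stated assumptions: `emd_deploy_cmd` is a hypothetical helper that only prints a command, it reuses exactly the flags from the example above, and it keeps the engine and framework fixed at `vllm` and `fastapi`. Valid values for each parameter should be taken from `emd deploy --help`.

```shell
# Prints (does not run) an emd deploy command using the flags shown above.
# $1: model ID, $2: instance type, $3: service type.
# emd_deploy_cmd is a hypothetical helper, not an EMD command.
emd_deploy_cmd() {
  printf 'emd deploy --model-id %s --instance-type %s --engine-type vllm --framework-type fastapi --service-type %s --extra-params {} --skip-confirm\n' \
    "$1" "$2" "$3"
}

# Usage: print the command, check it, then paste it into the shell.
emd_deploy_cmd DeepSeek-R1-Distill-Qwen-1.5B g5.8xlarge sagemaker
```

Printing the command instead of executing it keeps the step reviewable, which matters because the deploy example passes --skip-confirm.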
emd status
Notes: EMD allows you to launch multiple deployment tasks at the same time.
Run a quick functional verification, or check our documentation for integration examples.
emd invoke DeepSeek-R1-Distill-Qwen-1.5B
Notes: Find the ModelId in the output of emd status. Refer to EMD Client, LangChain interface and OpenAI compatible interface for more details.
emd destroy DeepSeek-R1-Distill-Qwen-1.5B
Notes: Find the ModelId in the output of emd status.
For advanced configurations and detailed guides, visit our documentation site.
We welcome contributions! Please see CONTRIBUTING.md for guidelines.