# gemma
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).

Topics: bloom, falcon, moe, gemma, mistral, mixture-of-experts, model-quantization, multi-gpu-inference, m2m100, llamacpp, llm-inference, internlm, llama2, qwen, baichuan2, mixtral, phi-2, deepseek, minicpm

Updated Mar 15, 2024 · C++