This package provides fast on-device SSM (state space model) inference on Apple silicon.
To install this package, first follow the installation instructions for `cartesia-metal`.
Next, in your Python environment, install the `cartesia-mlx` package:

```shell
pip install cartesia-mlx
```
Note: This package has been tested on macOS Sonoma 14.1 with the M3 chip.
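Once both packages are installed, a quick import check can confirm the environment is set up. This is a minimal sketch; the import names `mlx.core` and `cartesia_mlx` are assumed to mirror the package names:

```python
# Minimal install sanity check.
# Assumes the import names mirror the package names (mlx, cartesia_mlx).
import mlx.core as mx
import cartesia_mlx as cmx  # noqa: F401

# On Apple silicon this should report the GPU device.
print(mx.default_device())
```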
The following models are available on Hugging Face:

- cartesia-ai/Llamba-1B-4bit-mlx
- cartesia-ai/Llamba-3B-4bit-mlx
- cartesia-ai/Llamba-8B-8bit-mlx
- cartesia-ai/Mohawk-v0.1-1.3B-4bit-mlx
- cartesia-ai/Rene-v0.1-1.3b-4bit-mlx
- cartesia-ai/mamba2-130m-8bit-mlx
- cartesia-ai/mamba2-130m-mlx
- cartesia-ai/mamba2-370m-8bit-mlx
- cartesia-ai/mamba2-780m-8bit-mlx
- cartesia-ai/mamba2-1.3b-4bit-mlx
- cartesia-ai/mamba2-2.7b-4bit-mlx
A simple example script for generation can be found at `cartesia-mlx/example.py`.
Usage example (clone this repo and run the command below from within the `cartesia-mlx` directory):

```shell
python example.py --model cartesia-ai/Mohawk-v0.1-1.3B-4bit-mlx --prompt "Rene Descartes was"
```
You can pass any of the models listed above to the `--model` argument; for a full list of command-line options, pass `--help`.
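For programmatic use, the same checkpoints can be loaded and sampled from directly in Python. The sketch below mirrors what `example.py` does; the exact API surface (`from_pretrained`, `set_dtype`, and a streaming `generate` with these keyword arguments) is an assumption based on that script, so treat `example.py` as the authoritative reference:

```python
import mlx.core as mx
import cartesia_mlx as cmx

# Load a quantized checkpoint by its Hugging Face ID (any model listed above).
model = cmx.from_pretrained("cartesia-ai/Llamba-1B-4bit-mlx")
model.set_dtype(mx.float32)

prompt = "Rene Descartes was"
print(prompt, end="", flush=True)

# Stream generated text as it is produced.
for text in model.generate(
    prompt,
    max_tokens=500,
    top_p=0.99,
    temperature=0.85,
):
    print(text, end="", flush=True)
```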
Our SSM-based LMs deliver state-of-the-art quality and throughput, with tokens-per-second (tok/s) and memory requirements that stay constant regardless of context length, making them an ideal choice for on-device applications.
As context size grows, the throughput of transformer-based LMs drops rapidly and their memory consumption skyrockets. In contrast, our distilled pure-SSM Llamba retains constant tok/s and memory consumption, unlocking reasoning capabilities over much larger contexts on-device and making it well suited to edge applications.
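To make the scaling argument concrete, the back-of-envelope comparison below contrasts a transformer's KV cache, which grows linearly with context length, against an SSM's fixed-size recurrent state. All layer and dimension numbers here are illustrative assumptions, not measurements of our models:

```python
# Illustrative memory comparison: transformer KV cache vs. SSM recurrent state.
# All sizes below are hypothetical, chosen only to show the scaling behavior.

def kv_cache_bytes(context_len, n_layers=24, n_heads=16, head_dim=64, bytes_per_val=2):
    """Transformer KV cache: keys + values per layer, per token -> grows with context."""
    return 2 * n_layers * n_heads * head_dim * bytes_per_val * context_len

def ssm_state_bytes(n_layers=24, d_model=2048, d_state=64, bytes_per_val=2):
    """SSM recurrent state: fixed size per layer, independent of context length."""
    return n_layers * d_model * d_state * bytes_per_val

for ctx in (1_024, 8_192, 65_536):
    kv_gb = kv_cache_bytes(ctx) / 1e9
    ssm_gb = ssm_state_bytes() / 1e9
    print(f"context {ctx:>6}: KV cache {kv_gb:6.2f} GB | SSM state {ssm_gb:6.2f} GB")
```

Under these illustrative settings the KV cache grows from roughly 0.1 GB at 1k tokens to several GB at 64k tokens, while the SSM state stays at a few megabytes throughout.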