Releases: ggml-org/llama.cpp

b4793

28 Feb 14:35
70680c4
ggml : upgrade init_tensor API to return a ggml_status (#11854)

* Upgrade init_tensor API to return a ggml_status

To prepare for an 'abort-free' ggml
(ggml should not abort on OOM but return an OOM status),
as agreed with Diego in the ggml repo,
upgrade the init_tensor() and view_init() APIs
to return a ggml_status.

* misc fixes

---------

Co-authored-by: slaren <slarengh@gmail.com>
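
The 'abort-free' idea in this commit can be illustrated with a minimal sketch. This is not the ggml API: the enum, struct, and function names below (`sketch_status`, `sketch_tensor`, `sketch_init_tensor`, `sketch_free_tensor`) and the `simulate_oom` parameter are hypothetical and exist only to show an initializer that reports an allocation failure to the caller instead of aborting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a status enum; not the real ggml definition. */
enum sketch_status {
    SKETCH_STATUS_ALLOC_FAILED = -2,
    SKETCH_STATUS_FAILED       = -1,
    SKETCH_STATUS_SUCCESS      =  0
};

/* Hypothetical tensor that owns some per-tensor backend data. */
struct sketch_tensor {
    size_t extra_bytes;
    void  *extra;
};

/* Abort-free style: return a status on failure rather than calling abort().
 * simulate_oom (an assumption made for this demo) forces the failure path. */
static enum sketch_status sketch_init_tensor(struct sketch_tensor *t,
                                             size_t extra_bytes,
                                             int simulate_oom) {
    if (t == NULL || extra_bytes == 0) {
        return SKETCH_STATUS_FAILED;
    }
    t->extra_bytes = extra_bytes;
    t->extra = simulate_oom ? NULL : malloc(extra_bytes);
    if (t->extra == NULL) {
        t->extra_bytes = 0;
        return SKETCH_STATUS_ALLOC_FAILED;
    }
    return SKETCH_STATUS_SUCCESS;
}

/* Release the per-tensor data and reset the struct to an empty state. */
static void sketch_free_tensor(struct sketch_tensor *t) {
    assert(t != NULL);
    free(t->extra);
    t->extra = NULL;
    t->extra_bytes = 0;
}
```

A caller would check the returned status and propagate or handle `SKETCH_STATUS_ALLOC_FAILED` itself, which is what lets the library avoid terminating the process on OOM.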

b4792

28 Feb 12:25
c43a3e7
llama : add Phi-4-mini support (supersede #12099) (#12108)

* Added Phi-4-mini-instruct support

* Update regex per ngxson

* Change the vocab base to Xenova/gpt-4o

* fix conversion update script

* no need to check longrope

* minor style fix

* fix python style

---------

Co-authored-by: Nicholas Sparks <nisparks@microsoft.com>

b4790

28 Feb 09:32
438a839
vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizatio…

b4789

28 Feb 09:00
9c42b17
CUDA: fix logic for V100 + GGML_CUDA_FORCE_MMQ (#12098)

b4788

28 Feb 08:36
05e6f5a
ggml: aarch64: implement SVE kernels for q2_k_q8_k vector dot (#12064)

* Added SVE Support for Q2_K Quantized Models

* Use 4-space indentation in the switch cases

* removed comment lines

* Remove the loop; retain the curly braces for better understanding of the code

* Remove the comment line added for q3_k_q8_k kernel

---------

Co-authored-by: vithulep <p.m.vithule1517@gmail.com>

b4786

28 Feb 08:17
fbeda90
vulkan: matmul dequantization improvements (#12015)

* faster dequant for old quants

* don't use unpack for iq4_nl

* vec2 unpack for q8

b4785

28 Feb 07:57
581650b
vulkan: improve im2col (#11826)

* vulkan: improve im2col performance

b4784

27 Feb 08:23
b95c8af
cmake: Fix ggml backend dependencies and installation (#11818)

* Fix dependencies between ggml and backends

ggml backends link only to ggml-base and ggml links to all backends.

* Fix installation of ggml backends

Set up GNUInstallDirs before setting the installation directory of ggml backends

b4783

26 Feb 15:05
a800ae4
llava : add struct for FFI bindgen (#12079)

* add struct for FFI bindgen

* Apply suggestions from code review

---------

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

b4778

25 Feb 16:06
a82c9e7
vulkan: fix assertion when qy_needs_dequant (#12068)

Looks like a copy/paste bug from qx_needs_dequant.