Releases: gotzmann/llama.go
v1.4: Server Mode
Better Defaults
Nothing special: more stable inference and saner default parameters.
AVX2 and NEON
Inference performance was boosted for CPUs that support vector math extensions.
Please use:
the --neon flag for Apple Silicon (M1-M3 processors) and ARM servers
the --avx flag for Intel and AMD CPUs that support the AVX2 instruction set
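A quick sketch of how these flags might be combined with a model file. The binary name and model path below are illustrative assumptions, not taken from these notes; see the README for the exact invocation:

```shell
# Apple Silicon / ARM servers: enable NEON-accelerated inference
# (binary name and model path are assumed for illustration)
./llama-go --model ./models/llama-7b.bin --neon

# Intel / AMD CPUs with AVX2 support: enable AVX2-accelerated inference
./llama-go --model ./models/llama-7b.bin --avx
```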
Big Models are OK
This version supports bigger, multipart LLaMA models (tested with 7B and 13B) converted into the latest GGMJ binary format with a custom Python script (see the README).
April 12 - First Man in Space
The very first public release of LLaMA.go.