[Feature request] Implement 8-bit GPT-J #5
Labels: enhancement (New feature or request)
Comments
Results in ~11 GB of weights vs. 16 GB; this is already implemented in PyTorch via load_in_8bit=True:
https://huggingface.co/hivemind/gpt-j-6B-8bit
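For reference, a minimal sketch of what the existing PyTorch path looks like, assuming Hugging Face transformers with bitsandbytes installed and a CUDA GPU; the model id, prompt, and generation settings below are illustrative, not taken from the linked repo:

```python
# Sketch: load GPT-J-6B with int8 weights via transformers' load_in_8bit path.
# Requires: pip install transformers accelerate bitsandbytes (and a CUDA GPU).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6B"  # illustrative; the issue links a pre-converted 8-bit variant
tokenizer = AutoTokenizer.from_pretrained(model_id)

# load_in_8bit=True quantizes the linear-layer weights to int8 at load time,
# which is where the ~11 GB vs. ~16 GB weight-size difference cited above comes from.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=True,
)

inputs = tokenizer("GPT-J in 8-bit is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The feature request is to support an equivalent 8-bit weight format on the ggml side rather than relying on the PyTorch/bitsandbytes path.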