caten/gguf
GGUF
Looking to work with quantized models? The caten/gguf package enables you to read quantized model weights directly from GGUF files, construct a StateDict, and create tokenizers for caten/llm. It's a convenient tool for integrating quantized models into your projects.
Please note that only a limited number of data types are currently supported: as of this writing, fp32 and fp16 only. Int8 quantization will be added soon, and we are continuously working to expand this feature.
[class] GGUF
A class that represents the GGUF file format.
- (gguf-version gguf) returns a fixnum indicating the version of the GGUF file.
- (gguf-tensor-count gguf) returns the number of tensors in the GGUF file.
- (gguf-metadata-kv-count gguf) returns the number of metadata key-value pairs in the GGUF file.
- (gguf-metadata gguf) returns a list of metadata key-value pairs.
- (gguf-tensor-info gguf) returns a list of tensor information.
- (gguf-metadata-get gguf key) returns the value associated with key, where key is a string.
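As a rough illustration of the accessors above, the following sketch loads a file and prints its header counts. It assumes that caten/gguf is already loaded and that these symbols are exported from the caten/gguf package; "model.gguf" is a hypothetical path, and summarize-gguf is a hypothetical helper name.

```lisp
;; Sketch only: assumes the caten/gguf system is loaded and exports these symbols.
(defun summarize-gguf (pathname)
  "Print the version and header counts of the GGUF file at PATHNAME."
  (let ((gguf (caten/gguf:load-gguf pathname)))
    (format t "version:        ~a~%" (caten/gguf:gguf-version gguf))
    (format t "tensor count:   ~a~%" (caten/gguf:gguf-tensor-count gguf))
    (format t "metadata count: ~a~%" (caten/gguf:gguf-metadata-kv-count gguf))
    ;; Keys are strings. This key is the default used by gguf->bpe-tokenizer.
    (caten/gguf:gguf-metadata-get gguf "tokenizer.ggml.tokens")))

;; Hypothetical usage:
;; (summarize-gguf "model.gguf")
```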
[function] make-gguf
Creates a GGUF object from the given stream.
The definition is described in the following link:
In short, this function accepts the following format:
[Magic Number (4 Byte)] | [GGUF Version (4 Byte)] | [Tensor_Count (8 Byte)] | [Metadata_KV_Count (8 Byte)] | [Rest_data]
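To make the layout above concrete, here is a minimal sketch that reads only the fixed-size header using standard Common Lisp. It is not how make-gguf is implemented; read-uint-le and read-gguf-header are hypothetical helper names, and the sketch assumes little-endian integers and the ASCII magic "GGUF".

```lisp
;; Sketch only: parses the fixed-size header, not the metadata or tensors.
(defun read-uint-le (stream n-bytes)
  "Read an unsigned little-endian integer of N-BYTES bytes from STREAM."
  (let ((value 0))
    (dotimes (i n-bytes value)
      ;; Byte i contributes bits [8i, 8i+8) of the result.
      (setf (ldb (byte 8 (* 8 i)) value) (read-byte stream)))))

(defun read-gguf-header (stream)
  "Return (values version tensor-count metadata-kv-count) read from STREAM.
STREAM must be a binary input stream of (unsigned-byte 8)."
  (let ((magic (make-array 4 :element-type '(unsigned-byte 8))))
    (read-sequence magic stream)
    ;; 71 71 85 70 are the ASCII codes of \"GGUF\".
    (unless (equalp magic #(71 71 85 70))
      (error "Not a GGUF file: unexpected magic number ~a" magic))
    (let* ((version (read-uint-le stream 4))
           (tensor-count (read-uint-le stream 8))
           (metadata-kv-count (read-uint-le stream 8)))
      (values version tensor-count metadata-kv-count))))

;; Hypothetical usage ("model.gguf" is a placeholder path):
;; (with-open-file (stream "model.gguf" :element-type '(unsigned-byte 8))
;;   (read-gguf-header stream))
```

After these 24 bytes the stream is positioned at the start of [Rest_data].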
[function] load-gguf
Creates a GGUF object from the file at the given pathname.
[function] load-gguf-url
Creates a GGUF object from the given URL. The downloaded file is saved in output-directory under the name filename. If the file already exists, it is not downloaded again.
[function] gguf->state-dict
Creates a caten/state-dict from the tensor-info of the given GGUF object.
[function] gguf->bpe-tokenizer
(gguf->bpe-tokenizer gguf &key (metadata-tokens "tokenizer.ggml.tokens") (metadata-merges "tokenizer.ggml.merges"))
Creates a BPE tokenizer (which is caten/llm:Tokenizer) from the given gguf file's metadata.
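Put together, the loading functions might be used as in the sketch below. It assumes caten/gguf is loaded and exports these symbols; load-weights-and-tokenizer is a hypothetical helper name and "model.gguf" a hypothetical path. The keyword arguments are spelled out only to show the defaults from the signature above.

```lisp
;; Sketch only: assumes the caten/gguf system is loaded and exports these symbols.
(defun load-weights-and-tokenizer (pathname)
  "Return (values state-dict tokenizer) built from the GGUF file at PATHNAME."
  (let* ((gguf (caten/gguf:load-gguf pathname))
         (state-dict (caten/gguf:gguf->state-dict gguf))
         ;; These are the default metadata keys; pass other strings if the
         ;; file stores its vocabulary and merges under different keys.
         (tokenizer (caten/gguf:gguf->bpe-tokenizer
                     gguf
                     :metadata-tokens "tokenizer.ggml.tokens"
                     :metadata-merges "tokenizer.ggml.merges")))
    (values state-dict tokenizer)))

;; Hypothetical usage:
;; (load-weights-and-tokenizer "model.gguf")
```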
[struct] Metadata
A structure to represent metadata in GGUF file.
- (metadata-key metadata) returns the key, a string.
- (metadata-value-type metadata) returns the type of the value, a keyword.
- (metadata-value metadata) returns the value, which is a number, boolean, or simple-array.
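For example, the keys and value types in a file could be listed with a sketch like the one below. It assumes that gguf-metadata returns a list of Metadata structures and that the accessors are exported from caten/gguf; print-gguf-metadata is a hypothetical helper name.

```lisp
;; Sketch only: assumes gguf-metadata returns a list of Metadata structures.
(defun print-gguf-metadata (gguf)
  "Print one line per metadata entry of GGUF: its key and its value type."
  (dolist (metadata (caten/gguf:gguf-metadata gguf))
    ;; Values are not printed because array values such as token lists can be large.
    (format t "~a (~a)~%"
            (caten/gguf:metadata-key metadata)
            (caten/gguf:metadata-value-type metadata))))
```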
[struct] Tensor-Info
Tensor-Info stores the information of a tensor in the GGUF file.
- (tensor-info-name tensor-info) returns the name of the tensor, a string.
- (tensor-info-n-dimension tensor-info) returns the rank of the tensor.
- (tensor-info-dimensions tensor-info) returns the shape of the tensor.
- (tensor-info-ggml-type tensor-info) returns the data type of the tensor, a keyword.
- (tensor-info-relative-offset tensor-info) returns the offset of the tensor in the GGUF file.
- (tensor-info-absolute-offset tensor-info) returns the absolute offset of the tensor in the stream (inconveniently, buffers are stored at this offset, but caten precomputes them).
- (tensor-info-buffer tensor-info) returns the parsed buffer of the tensor.
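A quick way to inspect what a file contains is to walk the tensor list, as in this sketch. It assumes that gguf-tensor-info returns a list of Tensor-Info structures and that the accessors are exported from caten/gguf; print-gguf-tensors is a hypothetical helper name.

```lisp
;; Sketch only: assumes gguf-tensor-info returns a list of Tensor-Info structures.
(defun print-gguf-tensors (gguf)
  "Print the name, rank, shape, and data type of each tensor in GGUF."
  (dolist (info (caten/gguf:gguf-tensor-info gguf))
    (format t "~a rank=~a shape=~a type=~a~%"
            (caten/gguf:tensor-info-name info)
            (caten/gguf:tensor-info-n-dimension info)
            (caten/gguf:tensor-info-dimensions info)
            (caten/gguf:tensor-info-ggml-type info))))
```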