Caten Documentation

caten/gguf

GGUF

The caten/gguf package reads model weights directly from GGUF files, constructs a StateDict from them, and creates tokenizers for caten/llm, which makes it a convenient way to bring GGUF models into your projects.

Please note that only a limited number of bit depths are currently supported. As of this writing, only fp32 and fp16 are available; Int8 quantization will be added soon.

[class] GGUF

A class that represents the GGUF file format.

  • (gguf-version gguf) returns a fixnum indicating the version of the GGUF file.

  • (gguf-tensor-count gguf) returns the number of tensors in the GGUF file.

  • (gguf-metadata-kv-count gguf) returns the number of metadata key-value pairs in the GGUF file.

  • (gguf-metadata gguf) returns a list of metadata key-value pairs.

  • (gguf-tensor-info gguf) returns a list of tensor information.

  • (gguf-metadata-get gguf key) returns the value associated with key, where key is a string.

[function] make-gguf

Creates a GGUF object from the given stream.

The file format is defined at the following link:

  • https://github.com/ggerganov/ggml/blob/master/docs/gguf.md#file-structure

In short, this function accepts the following format:

[Magic Number (4 bytes)] | [GGUF Version (4 bytes)] | [Tensor_Count (8 bytes)] | [Metadata_KV_Count (8 bytes)] | [Rest_data]
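
The header layout above can be illustrated with a small, self-contained sketch. This is not how make-gguf is implemented; parse-gguf-header and read-uint-le are hypothetical helpers introduced only for this illustration. The sketch assumes little-endian byte order (the GGUF default) and the ASCII magic number "GGUF".

```lisp
;; Illustration only: decode the fixed-size part of a GGUF header from a
;; vector of octets. READ-UINT-LE and PARSE-GGUF-HEADER are hypothetical
;; helpers, not part of caten/gguf.

(defun read-uint-le (octets start n-bytes)
  "Decode N-BYTES little-endian bytes of OCTETS, starting at START, as an unsigned integer."
  (loop for i from 0 below n-bytes
        sum (ash (aref octets (+ start i)) (* 8 i))))

(defun parse-gguf-header (octets)
  "Return a plist describing the first 24 bytes of a GGUF file."
  (assert (>= (length octets) 24) () "A GGUF header needs at least 24 bytes.")
  ;; The magic number is the four ASCII characters G, G, U, F.
  (let ((magic (map 'string #'code-char (subseq octets 0 4))))
    (assert (string= magic "GGUF") () "Not a GGUF file: magic is ~s" magic)
    (list :magic magic
          :version (read-uint-le octets 4 4)             ; 4 bytes
          :tensor-count (read-uint-le octets 8 8)        ; 8 bytes
          :metadata-kv-count (read-uint-le octets 16 8)))) ; 8 bytes

;; A hand-written 24-byte header used as example input.
(defparameter *example-header*
  (coerce '(#x47 #x47 #x55 #x46   ; "GGUF"
            3 0 0 0               ; version = 3
            2 0 0 0 0 0 0 0       ; tensor count = 2
            5 0 0 0 0 0 0 0)      ; metadata key-value count = 5
          '(vector (unsigned-byte 8))))

(parse-gguf-header *example-header*)
;; => (:MAGIC "GGUF" :VERSION 3 :TENSOR-COUNT 2 :METADATA-KV-COUNT 5)
```

Everything after these 24 bytes ([Rest_data]) holds the metadata key-value pairs, the tensor information, and the tensor data, which the sketch does not attempt to decode.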

[function] load-gguf

(load-gguf pathname)
Creates a GGUF object from the file at the given pathname.

[function] load-gguf-url

(load-gguf-url url filename &optional output-directory)
Downloads a GGUF file from the given URL and creates a GGUF object from it. The downloaded file is saved in output-directory under the name filename. If the file already exists, it is not downloaded again.

[function] gguf->state-dict

(gguf->state-dict gguf)
Creates a caten/state-dict from the tensor-info of the given GGUF object.

[function] gguf->bpe-tokenizer

(gguf->bpe-tokenizer gguf &key (metadata-tokens "tokenizer.ggml.tokens") (metadata-merges "tokenizer.ggml.merges"))

Creates a BPE tokenizer (a caten/llm:Tokenizer) from the metadata of the given GGUF object.
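
Taken together, the functions documented so far suggest a workflow along the following lines. This is a sketch under stated assumptions, not a verified recipe: it assumes Caten is already loaded, that the documented symbols are exported from a package named caten/gguf (matching this page's title), and that the file contains the tokenizer metadata read by gguf->bpe-tokenizer. inspect-gguf is a hypothetical helper and "model.gguf" is a placeholder path.

```lisp
;; Assumption: Caten is loaded and these symbols are exported from caten/gguf.
;; INSPECT-GGUF is a hypothetical helper written for this illustration.
(defun inspect-gguf (pathname)
  "Load a GGUF file, print a short summary, and return its StateDict and tokenizer."
  (let ((gguf (caten/gguf:load-gguf pathname)))
    ;; Summarize the header fields exposed by the GGUF class accessors.
    (format t "GGUF version ~a, ~a tensors, ~a metadata pairs~%"
            (caten/gguf:gguf-version gguf)
            (caten/gguf:gguf-tensor-count gguf)
            (caten/gguf:gguf-metadata-kv-count gguf))
    ;; Return the weights and the tokenizer as two values.
    (values (caten/gguf:gguf->state-dict gguf)
            (caten/gguf:gguf->bpe-tokenizer gguf))))

;; Placeholder path; substitute a real GGUF file.
;; (inspect-gguf #P"model.gguf")
```

Replacing load-gguf with load-gguf-url would fetch the file first, using the argument order shown in its signature above.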

[struct] Metadata

A structure that represents a metadata entry in a GGUF file.

  • (metadata-key metadata) returns the key, which is a string.

  • (metadata-value-type metadata) returns the type of the value, which is a keyword.

  • (metadata-value metadata) returns the value, which is a number, boolean, or simple-array.

[struct] Tensor-Info

Tensor-Info stores the information of a tensor in the GGUF file.

  • (tensor-info-name tensor-info) returns the name of the tensor, which is a string.

  • (tensor-info-n-dimension tensor-info) returns the rank of the tensor.

  • (tensor-info-dimensions tensor-info) returns the shape of the tensor.

  • (tensor-info-ggml-type tensor-info) returns the data type of the tensor, which is a keyword.

  • (tensor-info-relative-offset tensor-info) returns the offset of the tensor in the GGUF file.

  • (tensor-info-absolute-offset tensor-info) returns the absolute offset of the tensor in the stream. Inconveniently, buffers are stored at this offset, but Caten precomputes the offsets.

  • (tensor-info-buffer tensor-info) returns the parsed buffer of the tensor.
