thunder.plugins.QuantizeInt4
class thunder.plugins.QuantizeInt4
Bases: Plugin
Plugin for 4-bit integer quantization using BitsAndBytes.
This plugin applies a 4-bit linear quantization transform to model weights, reducing memory footprint and improving throughput for both training and inference.
See https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/bitsandbytes/functional.py#L889 for more details.
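To make the idea of 4-bit linear quantization concrete, here is a minimal pure-Python sketch of blockwise absmax quantization. It is an illustration only, not the BitsAndBytes implementation and not what this plugin executes: the function names `quantize_int4` and `dequantize_int4`, the block size, and the symmetric signed code range [-7, 7] are all assumptions chosen for readability.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float], block_size: int = 4) -> Tuple[List[int], List[float]]:
    """Hypothetical helper: map floats to signed 4-bit codes in [-7, 7].

    Each block of `block_size` weights shares one scale, chosen so that the
    largest magnitude in the block maps to code 7 (absmax scaling).
    """
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        absmax = max(abs(w) for w in block)
        # Avoid dividing by zero for an all-zero block.
        scale = absmax / 7 if absmax > 0 else 1.0
        scales.append(scale)
        # Round to the nearest code and clamp to the 4-bit signed range.
        codes.extend(max(-7, min(7, round(w / scale))) for w in block)
    return codes, scales


def dequantize_int4(codes: List[int], scales: List[float], block_size: int = 4) -> List[float]:
    """Hypothetical helper: reconstruct approximate floats from codes and scales."""
    return [code * scales[i // block_size] for i, code in enumerate(codes)]


weights = [0.7, -0.35, 0.1, 0.0, 1.4, -1.4, 0.2, 0.6]
codes, scales = quantize_int4(weights)
restored = dequantize_int4(codes, scales)
```

Each weight is stored as a 4-bit code plus a per-block scale, which is where the memory saving comes from. In this sketch the reconstruction error of any weight is bounded by half of its block's scale.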
Methods
__init__(): Fetches the BitsAndBytes quantization executor.
setup_lookasides(): Fetches the BitsAndBytes quantization transform.
Attributes
policy