thunder.plugins.QuantizeInt4
- class thunder.plugins.QuantizeInt4
Bases: Plugin
Plugin for 4-bit integer quantization using BitsAndBytes.
This plugin applies a 4-bit linear quantization transform to model weights, reducing memory footprint and improving throughput for both training and inference.
See https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/bitsandbytes/functional.py#L889 for more details.
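To illustrate what a 4-bit linear quantization of weights involves, here is a minimal pure-Python sketch. It is an assumption-laden illustration and not the BitsAndBytes implementation or the plugin's code: the function names, the block size of 4, and the signed code range [-7, 7] are all hypothetical choices made for readability. Each block of weights is scaled by its largest absolute value and rounded to one of a small number of integer levels, so only a 4-bit code per weight and one scale per block need to be stored.

```python
def quantize_4bit_linear(weights, block_size=4):
    """Map floats to signed 4-bit codes in [-7, 7], one absmax scale per block.

    Hypothetical helper for illustration only; not part of thunder or bitsandbytes.
    """
    codes, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        # Fall back to 1.0 so an all-zero block does not divide by zero.
        absmax = max(abs(w) for w in block) or 1.0
        scales.append(absmax)
        # Linear mapping: the largest magnitude in the block lands on +/-7.
        codes.extend(round(w / absmax * 7) for w in block)
    return codes, scales


def dequantize_4bit_linear(codes, scales, block_size=4):
    """Recover approximate floats from the codes and per-block scales."""
    return [c / 7 * scales[i // block_size] for i, c in enumerate(codes)]


weights = [0.5, -1.0, 0.25, 0.0, 2.0, -0.5, 1.0, 0.1]
codes, scales = quantize_4bit_linear(weights)
restored = dequantize_4bit_linear(codes, scales)
```

In this sketch the reconstruction error of each weight is at most half a quantization step, that is, the block's scale divided by 14. Production kernels additionally pack two 4-bit codes into each byte and may use non-uniform code tables, which this example does not attempt to show.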
Methods
__init__()  Fetches the BitsAndBytes quantization executor.
setup_lookasides()  Fetches the BitsAndBytes quantization transform.
Attributes
policy