thunder.plugins.QuantizeInt4

class thunder.plugins.QuantizeInt4

Bases: Plugin

Plugin for 4-bit integer quantization using BitsAndBytes.

This plugin applies a 4-bit linear quantization transform to model weights, reducing memory footprint and improving throughput for both training and inference.

See https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/bitsandbytes/functional.py#L889 for more details.
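To make the memory saving concrete, here is a minimal pure-Python sketch of blockwise 4-bit quantization. It is an illustration only: the function names, the block size, and the uniform absmax scheme are assumptions chosen for readability, and it is not the BitsAndBytes implementation, which runs on the GPU and may use a different 4-bit code book.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float], block_size: int = 64) -> Tuple[bytes, List[float]]:
    """Hypothetical helper: quantize floats to 4-bit codes, two codes per byte."""
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        absmax = max(abs(w) for w in block)
        # One scale per block maps the block's largest magnitude to +/-7.
        scale = absmax / 7.0 if absmax > 0 else 1.0
        scales.append(scale)
        for w in block:
            q = max(-7, min(7, round(w / scale)))
            codes.append(q + 8)  # shift into the unsigned nibble range 1..15
    if len(codes) % 2:
        codes.append(8)  # pad with the code for zero so codes pair up
    packed = bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
    return packed, scales


def dequantize_int4(packed: bytes, scales: List[float], n: int, block_size: int = 64) -> List[float]:
    """Hypothetical helper: recover approximate floats from packed 4-bit codes."""
    codes: List[int] = []
    for byte in packed:
        codes.append(byte >> 4)    # high nibble holds the first code
        codes.append(byte & 0x0F)  # low nibble holds the second code
    return [(codes[i] - 8) * scales[i // block_size] for i in range(n)]


weights = [0.5, -1.0, 0.25, 0.0, 1.0, -0.75, 0.1]
packed, scales = quantize_int4(weights, block_size=4)
restored = dequantize_int4(packed, scales, len(weights), block_size=4)
```

Each weight occupies half a byte plus a share of its block's scale, which is where the reduced footprint comes from. The price is a rounding error of at most half a scale step per weight.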

__init__()
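Since the constructor takes no arguments, using the plugin comes down to passing an instance to the compiler. The sketch below assumes that `thunder.compile` accepts plugin instances through a `plugins` argument and that the `bitsandbytes` package and a supported GPU are available; `compile_with_int4` is a hypothetical helper name, not part of the library.

```python
def compile_with_int4(model):
    """Hypothetical helper: compile `model` with the 4-bit quantization plugin."""
    # Imports are deferred so the helper can be defined without thunder installed.
    import thunder
    from thunder.plugins import QuantizeInt4

    # Assumption: thunder.compile takes Plugin instances via `plugins`.
    return thunder.compile(model, plugins=[QuantizeInt4()])
```

A call such as `compile_with_int4(my_module)` would then return the compiled module, where `my_module` stands for any `torch.nn.Module` of your own.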

Methods

`__init__`()
`setup_executors`()	Fetches the BitsAndBytes quantization executor.
`setup_lookasides`()	Return type: `Optional`[`list`[`Lookaside`]]
`setup_transforms`()	Fetches the BitsAndBytes quantization transform.

Attributes

policy

setup_executors()

Fetches the BitsAndBytes quantization executor.

Returns: A list containing the BitsAndBytes quantization executor.
Return type: list[Executor]

setup_transforms()

Fetches the BitsAndBytes quantization transform.

Returns: A list containing the BitsAndBytes quantization transform.
Return type: list[Transform]