BitsandbytesPrecision

class lightning.pytorch.plugins.precision.BitsandbytesPrecision(mode, dtype=None, ignore_modules=None)[source]

Bases: Precision, BitsandbytesPrecision

Plugin for quantizing weights with bitsandbytes.

Warning

This is an experimental feature.

Note

The optimizer is not automatically replaced with bitsandbytes.optim.Adam8bit or equivalent 8-bit optimizers.
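If you want an 8-bit optimizer, you must select it yourself. A minimal sketch, assuming a LightningModule subclass (the class name and learning rate here are hypothetical, not from this page):

    import bitsandbytes as bnb
    from lightning.pytorch import LightningModule

    class MyModule(LightningModule):
        def configure_optimizers(self):
            # The plugin does not swap the optimizer, so choose an 8-bit one explicitly.
            return bnb.optim.Adam8bit(self.parameters(), lr=1e-4)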

Parameters:
  • mode (Literal['nf4', 'nf4-dq', 'fp4', 'fp4-dq', 'int8', 'int8-training']) – The quantization mode to use.

  • dtype (Optional[dtype]) – The compute dtype to use.

  • ignore_modules (Optional[set[str]]) – The submodules whose Linear layers should not be replaced, for example {"lm_head"}. This might be desirable for numerical stability. The string is checked as a prefix, so a value like "transformer.blocks" will ignore all Linear layers in all of the transformer blocks.
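
A minimal usage sketch, passing the plugin to a Trainer (the specific mode, dtype, and ignored module are illustrative choices, not defaults):

    import torch
    from lightning.pytorch import Trainer
    from lightning.pytorch.plugins import BitsandbytesPrecision

    # Quantize Linear layer weights to 4-bit NF4 with double quantization,
    # run compute in bfloat16, and leave the LM head un-quantized.
    precision = BitsandbytesPrecision(mode="nf4-dq", dtype=torch.bfloat16, ignore_modules={"lm_head"})
    trainer = Trainer(plugins=precision)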
