The sharp edges
This section describes features in the language that are not (yet) supported by Thunder, along with their workarounds. You might encounter these when compiling a module or function with Thunder. The examples should give you a good idea of how to change your program so that it works.
Note that the fact that something is not supported today doesn’t mean it won’t be supported at some point in the future. Feel free to reach out to help us prioritize.
Inplace operations
Inplace PyTorch operations like t.add_(1.0) are not supported in Thunder yet. Support for inplace operations is coming soon.
Tensor subclasses
Thunder currently supports Python data types and PyTorch tensors as inputs of functions and models.
Subclasses of these types, e.g. lazy tensors, nested tensors, or sparse tensors are not supported today.
Tracing Python builtins, standard library operations and functions that call other languages
Calling a Python builtin, standard library operation, or a function that calls into another language is safe to trace, so long as the following rules are observed:
The function should not have side effects. For example, calling
print()
will execute theprint()
function while tracing, but since it’s not a Thunder operation it will not appear in a trace, and so future cached executions will not execute theprint()
statement.The function must not manipulate tensor data or metadata. Since the operation won’t appear in a trace, these manipulations won’t be repeated by Thunder, and may even cause a crash while tracing. To implement such operations, see Adding Custom Operators
The function must not produce different results across invocations. Again, since the operation won’t appear in traces, Thunder cannot replicate an operation that produces different results when it’s invoked, like
random.random()
will.
Using Thunder-optimized Modules
Compiling a module produces a Thunder-optimized module”. A Thunder-optimized module is less dynamic than the original module, which facilitates tracing and optimization. It has a reference to the original module, and it shares its parameters with it.
While modifying the original model’s parameters will reflect in the Thunder-optimized module, other changes to the original module will not. In particular:
Whether model is in
train
oreval
mode is captured at compilation time and constantThe structure of the module is captured at compilation time, and changing the original module’s structure will likely break the Thunder-optimized module
Non-parameter attributes of the module may or may not be captured at compile time and treated as constants
Not all features of PyTorch modules are currently supported, either. Module hooks are not supported, and adding new module attributes in a module’s forward()
method is only partially supported.