TracerPythonScript
- class lightning_app.components.python.tracer.TracerPythonScript(script_path, script_args=None, outputs=None, env=None, code=None, **kwargs)
Bases: lightning_app.core.work.LightningWork
The TracerPythonScript class makes it easy to run a Python script.
When subclassing this class, you can configure your own Tracer by overriding the configure_tracer() method.
The tracer lets you inject code into a script's execution without changing the script itself.
- Parameters
- Raises
FileNotFoundError – If the provided script_path doesn't exist.
How does it work?
It works by executing the Python script with the run_path function from Python's built-in runpy module. Because run_path accepts a dictionary of globals that is set up before the script executes, and the script runs in the same process, you can, for example, modify classes or functions that the script uses.
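As a rough illustration of the runpy mechanism described above, the sketch below runs a small script with a pre-populated global. It only uses the standard library and does not reproduce what TracerPythonScript does internally; the script contents and the helper name run_with_injected_globals are hypothetical.

```python
import os
import runpy
import tempfile

# A tiny stand-in script; it reads a global (`name`) that it never defines itself.
SCRIPT_SOURCE = "message = f'Hello {name}!'\n"


def run_with_injected_globals(name: str) -> dict:
    """Hypothetical helper: write the script to a temp file and run it with a seeded global."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        script_path = os.path.join(tmp_dir, "a.py")
        with open(script_path, "w") as f:
            f.write(SCRIPT_SOURCE)
        # init_globals pre-populates the module namespace before the script executes.
        return runpy.run_path(script_path, init_globals={"name": name}, run_name="__main__")


result = run_with_injected_globals("World")
print(result["message"])  # Hello World!
```

run_path returns the script's resulting globals, so values computed by the script can be inspected after it finishes.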
Example
>>> import os
>>> from lightning_app.components.python import TracerPythonScript
>>> f = open("a.py", "w")
>>> f.write("print('Hello World !')")
22
>>> f.close()
>>> python_script = TracerPythonScript("a.py")
>>> python_script.run()
Hello World !
>>> os.remove("a.py")
In the example below, we subclass the TracerPythonScript component and override its configure_tracer method.
Using the Tracer, we patch the __init__ method of the PyTorch Lightning Trainer. Once the script is running, whenever a Trainer is instantiated, the provided pre_fn is called and we inject a Lightning callback.
This callback holds a reference to the work and, at the end of every batch, captures the trainer's global_step and best_model_path.
Even more interesting, this component works for ANY PyTorch Lightning script and its state can be used in real time in a UI.
from lightning.app.components import TracerPythonScript
from lightning.app.storage import Path
from lightning.app.utilities.tracer import Tracer
from pytorch_lightning import Trainer


class PLTracerPythonScript(TracerPythonScript):
    """This component can be used for ANY PyTorch Lightning script to track its progress
    and extract its best model path."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Define the component state.
        self.global_step = None
        self.best_model_path = None

    def configure_tracer(self) -> Tracer:
        from pytorch_lightning.callbacks import Callback

        class MyInjectedCallback(Callback):
            def __init__(self, lightning_work):
                self.lightning_work = lightning_work

            def on_train_start(self, trainer, pl_module) -> None:
                print("This code doesn't belong to the script but was injected.")
                print("Even the Lightning Work is available and state transfer works !")
                print(self.lightning_work)

            def on_train_batch_end(self, trainer, *_) -> None:
                # On every batch end, collects some information.
                # This is communicated automatically to the rest of the app,
                # so you can track your training in real time in the Lightning App UI.
                self.lightning_work.global_step = trainer.global_step
                best_model_path = trainer.checkpoint_callback.best_model_path
                if best_model_path:
                    self.lightning_work.best_model_path = Path(best_model_path)

        # This hook would be called every time
        # before a Trainer `__init__` method is called.
        def trainer_pre_fn(trainer, *args, **kwargs):
            kwargs["callbacks"] = kwargs.get("callbacks", []) + [MyInjectedCallback(self)]
            return {}, args, kwargs

        tracer = super().configure_tracer()
        tracer.add_traced(Trainer, "__init__", pre_fn=trainer_pre_fn)
        return tracer


if __name__ == "__main__":
    comp = PLTracerPythonScript(Path(__file__).parent / "pl_script.py")
    res = comp.run()
Once implemented, this component can easily be integrated within a larger app to execute a specific python script.
import os
from pathlib import Path

from examples.components.python.component_tracer import PLTracerPythonScript

import lightning as L


class RootFlow(L.LightningFlow):
    def __init__(self):
        super().__init__()
        script_path = Path(__file__).parent / "pl_script.py"
        self.tracer_python_script = PLTracerPythonScript(script_path)

    def run(self):
        assert os.getenv("GLOBAL_RANK", "0") == "0"
        if not self.tracer_python_script.has_started:
            self.tracer_python_script.run()
        if self.tracer_python_script.has_succeeded:
            self.stop("tracer script succeed")
        if self.tracer_python_script.has_failed:
            self.stop("tracer script failed")


app = L.LightningApp(RootFlow())
- configure_tracer()
Override this hook to customize your tracer when running PythonScript.
- Return type: Tracer
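To make the idea of a pre-call hook more concrete, here is a minimal sketch of how a method can be wrapped so that a pre_fn runs before it. This is not the Tracer implementation: patch_method and FakeTrainer are hypothetical names used only for illustration, and the sketch merely mirrors the (extra, args, kwargs) return convention used by trainer_pre_fn in the example above.

```python
import functools


def patch_method(cls, method_name, pre_fn=None):
    """Hypothetical helper: wrap cls.method_name so that pre_fn runs first."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        if pre_fn is not None:
            # Assumed contract: pre_fn returns (extra, args, kwargs);
            # this sketch forwards only the possibly modified args and kwargs.
            _, args, kwargs = pre_fn(self, *args, **kwargs)
        return original(self, *args, **kwargs)

    setattr(cls, method_name, wrapper)
    return original  # returned so the patch can be undone


class FakeTrainer:
    """Stand-in class; not the PyTorch Lightning Trainer."""

    def __init__(self, callbacks=None):
        self.callbacks = callbacks or []


def trainer_pre_fn(trainer, *args, **kwargs):
    # Append an extra "callback" before the real __init__ sees the arguments.
    kwargs["callbacks"] = (kwargs.get("callbacks") or []) + ["injected-callback"]
    return {}, args, kwargs


original_init = patch_method(FakeTrainer, "__init__", pre_fn=trainer_pre_fn)
trainer = FakeTrainer(callbacks=["user-callback"])
print(trainer.callbacks)  # ['user-callback', 'injected-callback']

# Restore the unpatched method once tracing is no longer needed.
FakeTrainer.__init__ = original_init
```

The script that instantiates the class does not need to change; the patch is applied to the class before the script runs.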
- run(params=None, restart_count=None, code_dir='.', **kwargs)
- Parameters
params (Optional[Dict[str, Any]]) – A dictionary of arguments to be added to script_args.
restart_count (Optional[int]) – Passes an incrementing counter to enable the re-execution of LightningWorks.
code_dir (Optional[str]) – A path string determining where the source is extracted; defaults to the current directory.
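This section does not say how params is combined with script_args. The sketch below shows one plausible approach, assuming each entry is rendered as a `--key=value` flag; merge_params is a hypothetical helper, not part of the Lightning API, and the real formatting may differ.

```python
from typing import Any, Dict, List, Optional


def merge_params(script_args: List[str], params: Optional[Dict[str, Any]]) -> List[str]:
    """Hypothetical helper: append params as --key=value flags (assumed format)."""
    if not params:
        return list(script_args)
    # Dictionaries preserve insertion order, so flags follow the order of params.
    return list(script_args) + [f"--{key}={value}" for key, value in params.items()]


args = merge_params(["--max_epochs=1"], {"lr": 0.01, "seed": 42})
print(args)  # ['--max_epochs=1', '--lr=0.01', '--seed=42']
```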