# Copyright The PyTorch Lightning team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
XLA Profiler will help you debug and optimize training workload performance for your models using Cloud TPU
performance tools.

Manual capture via TensorBoard

The following instructions are for capturing a trace from a running program.

0. This [guide](https://cloud.google.com/tpu/docs/pytorch-xla-performance-profiling-tpu-vm#tpu-vm) will
help you with the Cloud TPU setup and the required installations.

1. Start a TensorBoard server:

>> tensorboard --logdir ./tensorboard --port 9001

You can view the TensorBoard output at http://localhost:9001 on your local machine, and then open the
``PROFILE`` plugin from the top right dropdown or open http://localhost:9001/#profile

2. Once the code you'd like to profile is running, click the ``CAPTURE PROFILE`` button. Enter
``localhost:9012`` (default port for XLA Profiler) as the Profile Service URL. Then enter
the number of milliseconds for the profiling duration, and click ``CAPTURE``.

3. Make sure the code is still running while you capture the traces. A profiling duration longer than
the step time leads to better performance insights.

4. Once the capture is finished, the page will refresh and you can browse through the insights using the
``Tools`` dropdown at the top left.
"""
import logging
from typing import Dict

from pytorch_lightning.profiler.base import BaseProfiler
from pytorch_lightning.utilities import _TPU_AVAILABLE

if _TPU_AVAILABLE:
    import torch_xla.debug.profiler as xp

log = logging.getLogger(__name__)


class XLAProfiler(BaseProfiler):

    STEP_FUNCTIONS = {"training_step_and_backward", "validation_step", "test_step", "predict_step"}
    RECORD_FUNCTIONS = {
        "training_step_and_backward",
        "training_step",
        "backward",
        "validation_step",
        "test_step",
        "predict_step",
    }

    def __init__(self, port: int = 9012) -> None:
        """This Profiler will help you debug and optimize training workload performance for your models using Cloud
        TPU performance tools."""
        super().__init__(dirpath=None, filename=None)
        self.port = port
        self._recording_map: Dict = {}
        self._step_recoding_map: Dict = {}
        self._start_trace: bool = False
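
The listing above stops at `__init__`, so it does not show how `STEP_FUNCTIONS`, `RECORD_FUNCTIONS`, and the two maps are consumed. The following is a minimal, standalone sketch of one plausible reading: actions outside `RECORD_FUNCTIONS` are ignored, actions in `STEP_FUNCTIONS` get a per-action step number, and the rest are traced without one. The names `classify_action` and `StepCounter` are hypothetical and are not part of Lightning or `torch_xla`; the sketch deliberately avoids any TPU dependency.

```python
from typing import Dict, Optional, Set

# Copies of the two sets defined on XLAProfiler in the listing above.
STEP_FUNCTIONS: Set[str] = {"training_step_and_backward", "validation_step", "test_step", "predict_step"}
RECORD_FUNCTIONS: Set[str] = {
    "training_step_and_backward",
    "training_step",
    "backward",
    "validation_step",
    "test_step",
    "predict_step",
}


def classify_action(action_name: str) -> Optional[str]:
    """Hypothetical helper: decide how an action name would be handled.

    Returns "step" for actions that would carry a step number, "trace" for
    actions recorded without one, and None for actions that are not recorded.
    This mapping is an assumption, not the library's confirmed behavior.
    """
    if action_name not in RECORD_FUNCTIONS:
        return None
    if action_name in STEP_FUNCTIONS:
        return "step"
    return "trace"


class StepCounter:
    """Hypothetical stand-in for a per-action step map such as ``_step_recoding_map``."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def next_step(self, action_name: str) -> int:
        # Return the current step for this action, then advance it by one.
        step = self._counts.get(action_name, 0)
        self._counts[action_name] = step + 1
        return step
```

Keeping a separate counter per action name means validation and test steps are numbered independently of each other, which is one reason a dictionary keyed by action name is a natural fit here.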