Topic | Replies | Views | Activity
About the Trainer category | 0 | 411 | August 26, 2020
Training stuck on resume | 1 | 45 | May 31, 2023
How to suppress trainer from printing directly to console? | 0 | 10 | May 31, 2023
Confusing # of optimizer steps when using gradient accumulation with DeepSpeed | 0 | 16 | May 25, 2023
Training when data is stored in batches | 2 | 29 | May 21, 2023
Trainer prints every step in validation | 2 | 128 | May 17, 2023
Weird result in convolutional network | 2 | 80 | May 14, 2023
Retraining a model with new data | 1 | 55 | May 9, 2023
How to use SWA with a cyclic scheduler | 0 | 34 | May 7, 2023
Resume training / load module from DeepSpeed checkpoint | 14 | 429 | May 6, 2023
Resuming training gives different model result / weights | 0 | 102 | May 4, 2023
Wonder if _update_learning_rates is properly implemented | 0 | 37 | April 19, 2023
Why is the Trainer instance saved inside the DataModule during checkpoint save? | 2 | 96 | April 11, 2023
Trainer.validate/test with ckpt_path does not resume global_step | 3 | 44 | April 7, 2023
Is gradient clipping done before or after gradients accumulation? | 2 | 71 | April 5, 2023
Multiple dataloaders and epoch calculation | 0 | 45 | April 1, 2023
How does `LightningOptimizer.zero_grad()` work? | 2 | 63 | March 31, 2023
Number of steps drifts for `val_check_interval` when gradient accumulation turned on | 0 | 49 | March 26, 2023
Global_step increased at new epoch regardless of gradient accumulation | 2 | 68 | March 26, 2023
Incorrect batch size being inferred using trainer.fit(), correct batch size in dataloader? What could be going wrong? [PyLightning] | 1 | 65 | March 26, 2023
Model Works on CPU but Error out while running on GPU | 1 | 454 | March 25, 2023
How to continue training for more epochs? | 1 | 253 | March 25, 2023
Changing batch size during training | 3 | 240 | March 20, 2023
Issue during test stage when load_from_checkpoint | 4 | 1416 | March 14, 2023
LR Finder MNIST | 1 | 153 | March 12, 2023
Modifying the Trainer when calling Trainer.fit() multiple times | 2 | 742 | February 18, 2023
Error while training simclr model | 0 | 71 | February 12, 2023
Question about auto_lr_find() | 1 | 1703 | January 31, 2023
How do I prevent initial validation run in Trainer 1.9.0? | 1 | 92 | January 24, 2023
Save_last and monitor in ModelCheckpoint | 0 | 67 | January 23, 2023