Fine-tuning the learning rate and batch size
Detectron2 provides several configuration options for its default stochastic gradient descent (SGD) solver (optimizer). The main hyperparameters are the following:
- The cfg.SOLVER.IMS_PER_BATCH hyperparameter sets the batch size, or the number of images per training iteration.
- The cfg.SOLVER.BASE_LR hyperparameter sets the base learning rate.
- The cfg.SOLVER.MOMENTUM hyperparameter stores the momentum value.
- The cfg.SOLVER.NESTEROV hyperparameter dictates whether to use Nesterov's implementation of momentum.
- The cfg.SOLVER.WARMUP_ITERS hyperparameter stores the number of warm-up iterations.
- The cfg.SOLVER.STEPS hyperparameter sets the iterations at which the learning rate is reduced by cfg.SOLVER.GAMMA (another hyperparameter).
- The cfg.SOLVER.MAX_ITER hyperparameter sets the maximum number of training iterations. Note that iterations are counted in batches, not epochs.