The performance of deep learning models is driven by many parameters, but tuning all of them anew for every dataset is a heuristic practice. In this paper, unlike the common practice of decaying the learning rate, we propose a step-wise training strategy in which the learning rate and the batch size are tuned based on the dataset size. The amount of training data is progressively increased during training to boost network performance and to avoid the saturation of the learning curve that is typically seen after a certain number of epochs. We conducted extensive experiments on multiple networks and datasets to validate the proposed training strategy. The experimental results support our hypothesis that the learning rate, the batch size, and the data size are interrelated, and that network accuracy can be improved when a suitable progressive step-wise training strategy is applied. The proposed strategy also reduces the overall training cost compared to the baseline approach.
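To make the idea of a progressive step-wise schedule concrete, the following is a minimal sketch and not the exact procedure evaluated in the paper. It assumes, purely for illustration, that the training subset grows linearly over a fixed number of stages and that the batch size and the learning rate are scaled in proportion to the subset size. The names `Stage` and `progressive_schedule`, and the linear scaling itself, are hypothetical choices.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    """One training stage: how much data to use and with which settings."""
    num_samples: int
    batch_size: int
    learning_rate: float


def progressive_schedule(
    total_samples: int,
    num_stages: int,
    final_batch_size: int,
    final_lr: float,
) -> List[Stage]:
    """Build a step-wise schedule (illustrative assumption: linear growth).

    Stage k of num_stages uses a fraction k / num_stages of the data.
    The batch size is scaled by the same fraction, and the learning rate
    is kept proportional to the batch size.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")

    stages = []
    for k in range(1, num_stages + 1):
        fraction = k / num_stages
        num_samples = max(1, round(total_samples * fraction))
        batch_size = max(1, round(final_batch_size * fraction))
        # Assumed rule: learning rate proportional to batch size.
        learning_rate = final_lr * batch_size / final_batch_size
        stages.append(Stage(num_samples, batch_size, learning_rate))
    return stages


if __name__ == "__main__":
    for stage in progressive_schedule(50_000, 4, 128, 0.1):
        print(stage)
```

In a training loop, each stage would train for some number of epochs on the first `num_samples` examples with the given batch size and learning rate before moving on to the next stage. The number of epochs per stage and the actual scaling rule are left open here, since this sketch does not fix them.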
This research is supported by the Institute for Infocomm Research (I2R), A*STAR. Grant number is not applicable.