Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
This paper introduces the Parametric Rectified Linear Unit (PReLU) and a new initialization method for deep neural networks, achieving state-of-the-art results on ImageNet classification, surpassing human-level performance.
Abstract Parametric Rectified Linear Units (PReLU) and a robust initialization method enable the training of extremely deep rectified models, surpassing human-level performance on ImageNet classification with 4.94% top-5 error.
1. Introduction Convolutional neural networks achieve human-comparable accuracy in visual recognition tasks, and this work presents a method that surpasses human-level performance on the ImageNet dataset by utilizing Parametric Rectified Linear Units and a novel initialization strategy.
2.1. Parametric Rectifiers Parametric Rectified Linear Units (PReLU) adaptively learn the slope of the negative part of the activation, improving classification accuracy over standard ReLU with negligible extra computational cost and little additional risk of overfitting.
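The activation itself is simple: f(y) = y for y > 0 and f(y) = a·y otherwise, where a is a learned coefficient (the paper initializes it at 0.25). A minimal NumPy sketch of the forward pass and the gradient with respect to a (the function names here are illustrative, not from the paper's code):

```python
import numpy as np

def prelu(y, a):
    # PReLU: f(y) = y if y > 0, else a * y, with a learned per channel
    # (or shared across channels in the "channel-shared" variant).
    return np.where(y > 0, y, a * y)

def prelu_grad_a(y, a):
    # Gradient of f w.r.t. the learnable coefficient a:
    # df/da = y when y <= 0, and 0 when y > 0.
    return np.where(y > 0, 0.0, y)

y = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(prelu(y, 0.25))        # [-0.5  -0.125  0.  1.  3. ]
print(prelu_grad_a(y, 0.25)) # [-2.  -0.5  0.  0.  0. ]
```

Because df/da is nonzero only for negative pre-activations, the extra cost per layer is a single scalar (or per-channel) parameter update, which is why the overhead is negligible.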
2.1. Comparison Experiments PReLU activation functions consistently improve classification accuracy over ReLU across various models; the learned coefficients show that early layers retain both positive and negative filter responses, while deeper layers learn smaller coefficients and thus become increasingly nonlinear.
2.2. Initialization of Filter Weights for Rectifiers A robust initialization method is derived by considering rectifier nonlinearities, enabling the training of extremely deep rectifier networks that would otherwise stall with standard initialization techniques.
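The derived condition is a zero-mean Gaussian with variance 2/n_l for ReLU layers, where n_l is the number of input connections to a layer; for PReLU the factor generalizes to 2/((1 + a²)·n_l). A minimal sketch of this initialization (a hypothetical helper, not the paper's code):

```python
import numpy as np

def rectifier_init(fan_in, fan_out, a=0.0, rng=None):
    # Zero-mean Gaussian with std = sqrt(2 / ((1 + a^2) * fan_in)),
    # the paper's variance condition for rectifier layers.
    # a is the negative slope: 0 for plain ReLU, e.g. 0.25 for PReLU.
    rng = rng if rng is not None else np.random.default_rng(0)
    std = np.sqrt(2.0 / ((1.0 + a ** 2) * fan_in))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

W = rectifier_init(512, 256)
print(W.std())  # close to sqrt(2/512) = 0.0625
```

Keeping the product of these per-layer variance factors at 1 is what prevents the forward signal (and, symmetrically, the backpropagated gradient) from vanishing or exploding as depth grows.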
2.2. Comparisons with "Xavier" Initialization The proposed initialization method, by accounting for rectifier nonlinearities, allows extremely deep models to converge, outperforming "Xavier" initialization, which fails to converge for models beyond a certain depth (e.g., 30 layers in the paper's experiments).
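The difference is visible even in a toy forward pass: "Xavier" scaling (Var = 1/n) ignores that ReLU zeroes half the pre-activations, so the signal magnitude shrinks by roughly half per layer, while the rectifier-aware scaling (Var = 2/n) keeps it stable. A small simulation sketch (layer widths and depth here are arbitrary choices, not the paper's architectures):

```python
import numpy as np

def final_signal_std(depth, width, weight_std, seed=0):
    # Push a random input through `depth` fully connected ReLU layers
    # and report the standard deviation of the final activations.
    rng = np.random.default_rng(seed)
    x = rng.normal(size=width)
    for _ in range(depth):
        W = rng.normal(0.0, weight_std, size=(width, width))
        x = np.maximum(W @ x, 0.0)  # ReLU
    return x.std()

width, depth = 256, 30
xavier = final_signal_std(depth, width, np.sqrt(1.0 / width))
rectifier = final_signal_std(depth, width, np.sqrt(2.0 / width))
print(xavier, rectifier)  # Xavier-scaled signal decays toward 0; 2/n scaling stays O(1)
```

With 1/n scaling the activation variance is halved at every ReLU, so after 30 layers it has collapsed by a factor of about 2^30, which is why gradient-based training stalls; the 2/n factor exactly cancels the halving.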
2.2. Discussion on Rectifiers The asymmetric, non-zero-mean response of rectifiers necessitates algorithmic changes that account for this property, affecting both analyses based on the Fisher Information Matrix and the design of initialization strategies.
4. Experiments on ImageNet Experiments on ImageNet demonstrate that PReLU consistently improves accuracy over ReLU at almost no extra computational cost; the best single model achieves 5.71% top-5 error, surpassing all multi-model results from ILSVRC 2014, and the multi-model ensemble reaches 4.94% top-5 error, exceeding the reported human-level performance of 5.1%.