Enhanced Second-Order Optimization Techniques for Tackling Non-Convex Challenges in Machine Learning

Ravindra Kumar Sharma, Chitra Singh, Anshu Singh

Abstract

In recent years, second-order optimization techniques have received increasing interest for their ability to improve convergence rates and performance on non-convex machine learning problems. Second-order methods, which make use of curvature information, have been shown to offer faster convergence and better solutions in situations where gradient information alone yields poor results, such as high-dimensional problems and complex loss landscapes. Techniques based on Newton's method and its variants use the Hessian matrix or its approximations to adjust the optimization trajectory, giving a better-informed path around poor local minima and saddle points, which are common pitfalls in non-convex optimization. However, the computational cost and memory requirements of second-order methods have limited their use in large-scale machine learning tasks. New algorithms that reduce the computational burden of Hessian calculations, together with recent advances in approximations such as quasi-Newton methods, have made second-order optimization more feasible for deep learning and other large-scale machine learning applications. In this paper, we investigate several enhanced second-order optimization methods, implement them in a machine learning context, and compare them with classic first-order techniques. We show how these methods can be applied to non-convex problems with improved convergence speed and solution accuracy, and we discuss current challenges and future research directions for further optimizing them for large-scale non-convex machine learning tasks.
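To make the role of curvature concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical two-dimensional test function f(x, y) = x^2 - y^2 + 0.25 y^4, which has a saddle point at the origin and minima at (0, ±√2). It compares a plain Newton step with one Hessian-based variant chosen here for illustration: a "saddle-free" step that rescales the gradient by the absolute values of the Hessian eigenvalues. The function, starting point, step count, and eigenvalue floor `eps` are all assumptions made for this example.

```python
import numpy as np

# Hypothetical non-convex test function (an assumption for illustration):
# saddle point at (0, 0), minima at (0, +sqrt(2)) and (0, -sqrt(2)).
def f(p):
    x, y = p
    return x**2 - y**2 + 0.25 * y**4

def grad(p):
    x, y = p
    return np.array([2.0 * x, -2.0 * y + y**3])

def hessian(p):
    _, y = p
    return np.array([[2.0, 0.0],
                     [0.0, -2.0 + 3.0 * y**2]])

def newton(p0, steps=20):
    """Plain Newton iteration: p <- p - H^{-1} g.

    It is attracted to any stationary point, including saddle points.
    """
    p = np.array(p0, dtype=float)
    for _ in range(steps):
        p = p - np.linalg.solve(hessian(p), grad(p))
    return p

def saddle_free_newton(p0, steps=20, eps=1e-6):
    """Newton variant that uses |eigenvalues| of the Hessian.

    Directions of negative curvature are followed downhill instead of
    being pulled back toward the saddle. eps is an assumed floor that
    avoids division by near-zero curvature.
    """
    p = np.array(p0, dtype=float)
    for _ in range(steps):
        eigvals, eigvecs = np.linalg.eigh(hessian(p))
        scale = np.maximum(np.abs(eigvals), eps)
        g = grad(p)
        # Rescale the gradient in the eigenbasis, then map back.
        p = p - eigvecs @ ((eigvecs.T @ g) / scale)
    return p

start = (1.0, 0.1)  # assumed starting point close to the saddle
p_newton = newton(start)
p_sfn = saddle_free_newton(start)
print("plain Newton ends at:      ", p_newton, "f =", f(p_newton))
print("saddle-free Newton ends at:", p_sfn, "f =", f(p_sfn))
```

From this starting point the plain Newton iteration settles on the saddle at the origin, while the absolute-eigenvalue variant moves away from it and reaches one of the two minima. Forming and decomposing the full Hessian, as done here, is exactly the cost that becomes prohibitive at scale, which is why approximations such as quasi-Newton methods matter for large models.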

Article Details

How to Cite
Ravindra Kumar Sharma. (2023). Enhanced Second-Order Optimization Techniques for Tackling Non-Convex Challenges in Machine Learning. International Journal on Recent and Innovation Trends in Computing and Communication, 11(11s), 824–829. Retrieved from https://ijritcc.org/index.php/ijritcc/article/view/11209