Optimization of neural networks


Various techniques have been used for optimizing the weights of neural networks:

  • Stochastic gradient descent (SGD)
  • Momentum
  • Nesterov accelerated gradient (NAG)
  • Adaptive gradient (Adagrad)
  • Adadelta
  • RMSprop
  • Adaptive moment estimation (Adam)
  • Limited memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)

In practice, Adam is a good default choice; we will be covering its working methodology in this section. If you can afford full-batch updates, then try out L-BFGS, since it is a full-batch method and does not work well with mini-batch or stochastic updates.
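As a preview of the methodology covered later in this section, the following is a minimal NumPy sketch of a single Adam update step. The function name and its usage are illustrative rather than the book's own code; the hyperparameter defaults follow the original Adam paper:

    import numpy as np

    def adam_update(theta, grad, m, v, t,
                    lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        # Exponentially decaying average of past gradients (first moment)
        m = beta1 * m + (1 - beta1) * grad
        # Exponentially decaying average of past squared gradients (second moment)
        v = beta2 * v + (1 - beta2) * grad ** 2
        # Bias correction, since m and v are initialized at zero (t starts at 1)
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Per-dimension update, scaled by the second-moment estimate
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        return theta, m, v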

Stochastic gradient descent - SGD

Gradient descent is a way to minimize an objective function J(θ), parameterized by a model's parameters θ ∈ R^d, by updating the parameters in the direction opposite to the gradient of the objective function with respect to the parameters: θ = θ − η · ∇θ J(θ). The learning rate η determines the size of the steps taken to reach the minimum. The variants differ in how many training observations are used to compute each update (a sketch contrasting them follows the list):

  • Batch gradient descent (all training observations utilized in each iteration)
  • SGD (one observation per iteration)
  • Mini batch gradient descent (a small batch of about 50 training observations per iteration)
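The following is a minimal NumPy sketch contrasting the three variants on a toy linear regression; all names and the synthetic data are illustrative, not from the book. Setting batch_size to 1 gives SGD, to len(X) gives batch gradient descent, and to an intermediate value such as 50 gives mini batch gradient descent:

    import numpy as np

    rng = np.random.RandomState(42)
    X = rng.randn(200, 3)                      # 200 observations, 3 features
    true_w = np.array([1.5, -2.0, 0.5])
    y = X @ true_w + 0.1 * rng.randn(200)      # noisy linear targets

    def gradient(w, X_batch, y_batch):
        # Gradient of the mean squared error J(w) with respect to w
        return X_batch.T @ (X_batch @ w - y_batch) / len(y_batch)

    w = np.zeros(3)
    lr = 0.1                                   # learning rate (step size)
    batch_size = 50                            # 1 -> SGD, 200 -> batch GD
    for epoch in range(50):
        idx = rng.permutation(len(X))          # shuffle each epoch
        for start in range(0, len(X), batch_size):
            batch = idx[start:start + batch_size]
            w -= lr * gradient(w, X[batch], y[batch])

    print(w)                                   # approaches true_w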