Optimization Methods: A Summary

Adaptive gradient methods, such as AdaGrad (Duchi et al., 2011), RMSProp (Tieleman & Hinton, 2012), and Adam (Kingma & Ba, 2014), have become a default method of choice for training feed-forward and recurrent neural networks.

| Optimizer | Description | Formula | Use cases | Notes |
|---|---|---|---|---|
| Adadelta | Adapts per-parameter learning rates from running averages of squared gradients and squared updates | | | |
| Adagrad | Scales each learning rate by the inverse square root of the accumulated squared gradients (Duchi et al., 2011) | | | |
| Adam | Combines momentum (first-moment estimate) with RMSprop-style second-moment scaling (Kingma & Ba, 2014) | | | |
| SparseAdam | Lazy variant of Adam that updates moment estimates only for parameters whose gradients are present; intended for sparse gradients | | | |
| Adamax | Adam variant based on the infinity norm of the gradients | | | |
| ASGD | Averaged SGD; maintains a running average of the iterates | | | |
| LBFGS | Limited-memory BFGS quasi-Newton method; approximates curvature from recent gradient history | | | |
| RMSprop | Divides the learning rate by a moving average of recent squared gradients (Tieleman & Hinton, 2012) | | | |
| Rprop | Resilient backpropagation; adapts per-parameter step sizes from the sign of the gradient | | | |
| SGD | Stochastic gradient descent, optionally with momentum | | | |
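
The names in the table match the optimizer classes found in PyTorch's torch.optim package. Assuming that is the intended context, the minimal sketch below shows how any of them plugs into the same training-loop interface; the toy model, data, and hyperparameter values are illustrative assumptions, not recommendations.

```python
# Minimal usage sketch, assuming the optimizers above refer to torch.optim classes.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                          # toy model
x, y = torch.randn(64, 10), torch.randn(64, 1)    # toy batch
loss_fn = nn.MSELoss()

# All of the listed optimizers share the same construct / zero_grad / step API;
# swapping one for another only changes this constructor line.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# e.g. torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
#      torch.optim.RMSprop(model.parameters(), lr=1e-3)
#      torch.optim.Adagrad(model.parameters(), lr=1e-2)

for step in range(100):
    optimizer.zero_grad()            # clear gradients from the previous step
    loss = loss_fn(model(x), y)      # forward pass
    loss.backward()                  # backward pass populates .grad
    optimizer.step()                 # apply this optimizer's update rule
```

Two of the listed classes deviate slightly from this pattern: torch.optim.LBFGS expects step() to receive a closure that re-evaluates the model and returns the loss, and torch.optim.SparseAdam only accepts sparse gradients (e.g., from nn.Embedding(..., sparse=True)).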

References