Optimization Methods: A Summary

Adaptive gradient methods, such as AdaGrad (Duchi et al., 2011), RMSProp (Tieleman & Hinton, 2012), and Adam (Kingma & Ba, 2014), have become a default method of choice for training feed-forward and recurrent neural networks.

| Optimizer | Description | Formula | Use cases | Notes |
|---|---|---|---|---|
| Adadelta | Adapts per-parameter learning rates from running averages of squared gradients and squared updates | | | |
| Adagrad | Scales each learning rate by the inverse square root of the accumulated squared gradients (Duchi et al., 2011) | | | |
| Adam | Combines momentum (first-moment estimate) with RMSprop-style second-moment scaling (Kingma & Ba, 2014) | | | |
| SparseAdam | Lazy variant of Adam that updates moment estimates only for parameters whose gradients are present; intended for sparse gradients | | | |
| Adamax | Adam variant based on the infinity norm of the gradients | | | |
| ASGD | Averaged SGD; maintains a running average of the iterates | | | |
| LBFGS | Limited-memory BFGS quasi-Newton method; approximates curvature from recent gradient history | | | |
| RMSprop | Divides the learning rate by a moving average of recent squared gradients (Tieleman & Hinton, 2012) | | | |
| Rprop | Resilient backpropagation; adapts per-parameter step sizes from the sign of the gradient | | | |
| SGD | Stochastic gradient descent, optionally with momentum | | | |
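
The names in the table match the optimizer classes found in PyTorch's torch.optim package. Assuming that is the intended context, the minimal sketch below shows how any of them plugs into the same training-loop interface; the toy model, data, and hyperparameter values are illustrative assumptions, not recommendations.

```python
# Minimal usage sketch, assuming the optimizers above refer to torch.optim classes.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                          # toy model
x, y = torch.randn(64, 10), torch.randn(64, 1)    # toy batch
loss_fn = nn.MSELoss()

# All of the listed optimizers share the same construct / zero_grad / step API;
# swapping one for another only changes this constructor line.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# e.g. torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
#      torch.optim.RMSprop(model.parameters(), lr=1e-3)
#      torch.optim.Adagrad(model.parameters(), lr=1e-2)

for step in range(100):
    optimizer.zero_grad()            # clear gradients from the previous step
    loss = loss_fn(model(x), y)      # forward pass
    loss.backward()                  # backward pass populates .grad
    optimizer.step()                 # apply this optimizer's update rule
```

Two of the listed classes deviate slightly from this pattern: torch.optim.LBFGS expects step() to receive a closure that re-evaluates the model and returns the loss, and torch.optim.SparseAdam only accepts sparse gradients (e.g., from nn.Embedding(..., sparse=True)).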

References