Training Restricted Boltzmann Machines

dc.contributor.author: Fischer, Asja
dc.date.accessioned: 2018-01-08T09:18:05Z
dc.date.available: 2018-01-08T09:18:05Z
dc.date.issued: 2015
dc.description.abstract: Restricted Boltzmann Machines (RBMs), two-layered probabilistic graphical models that can also be interpreted as feed-forward neural networks, are widely used for pattern analysis and generation. Training RBMs, however, is challenging: it is based on likelihood maximization, but the likelihood and its gradient are computationally intractable. Therefore, training algorithms such as Contrastive Divergence (CD) and learning based on Parallel Tempering (PT) rely on Markov chain Monte Carlo (MCMC) methods to approximate the gradient. The presented thesis contributes to the understanding of RBM training methods by giving an empirical and theoretical analysis of the bias of the CD approximation and a bound on the mixing rate of PT. Furthermore, the thesis improves RBM training by proposing a new transition operator that leads to faster-mixing Markov chains, by investigating a different parameterization of the RBM model class referred to as centered RBMs, and by exploring estimation techniques from statistical physics to approximate the likelihood. Finally, an analysis of the representational power of deep belief networks with real-valued visible variables is given.
dc.identifier.pissn: 1610-1987
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/11488
dc.publisher: Springer
dc.relation.ispartof: KI - Künstliche Intelligenz: Vol. 29, No. 4
dc.relation.ispartofseries: KI - Künstliche Intelligenz
dc.subject: Contrastive divergence
dc.subject: Deep belief networks
dc.subject: Likelihood estimation
dc.subject: Mixing rate
dc.subject: Parallel tempering
dc.subject: Restricted Boltzmann machines
dc.title: Training Restricted Boltzmann Machines
dc.type: Text/Journal Article
gi.citation.endPage: 444
gi.citation.startPage: 441
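
The abstract notes that the RBM likelihood gradient is intractable and that Contrastive Divergence approximates it with a short Markov chain. Below is a minimal sketch of a single CD-1 update for a binary RBM, written in Python with numpy; all names here (sigmoid, cd1_update, W, b, c, lr) are illustrative assumptions, not code from the thesis.

    # Minimal CD-1 sketch for a binary RBM (illustrative, not from the thesis).
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_update(W, b, c, v0, lr=0.01):
        """One CD-1 step on a batch v0 of shape (batch, n_visible).

        W: weights (n_visible, n_hidden); b: visible bias; c: hidden bias.
        """
        # Positive phase: hidden probabilities and samples given the data.
        ph0 = sigmoid(v0 @ W + c)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)

        # One Gibbs step: reconstruct visibles, then recompute hidden probabilities.
        pv1 = sigmoid(h0 @ W.T + b)
        v1 = (rng.random(pv1.shape) < pv1).astype(float)
        ph1 = sigmoid(v1 @ W + c)

        # CD-1 replaces the intractable model expectation in the
        # log-likelihood gradient with one-step reconstruction statistics.
        batch = v0.shape[0]
        W += lr * (v0.T @ ph0 - v1.T @ ph1) / batch
        b += lr * (v0 - v1).mean(axis=0)
        c += lr * (ph0 - ph1).mean(axis=0)
        return W, b, c

    # Illustrative usage on random binary data:
    n_visible, n_hidden = 6, 4
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b, c = np.zeros(n_visible), np.zeros(n_hidden)
    data = (rng.random((100, n_visible)) < 0.5).astype(float)
    for epoch in range(10):
        W, b, c = cd1_update(W, b, c, data)

Because the chain is run for only one step from the data, the resulting gradient estimate is biased; characterizing that bias (and the mixing behavior of PT as an alternative) is precisely the subject of the thesis summarized above.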
