Training a deep policy gradient-based neural network with asynchronous learners on a simulated robotic problem

dc.contributor.author: Lötzsch, Winfried
dc.contributor.author: Vitay, Julien
dc.contributor.author: Hamker, Fred
dc.contributor.editor: Eibl, Maximilian
dc.contributor.editor: Gaedke, Martin
dc.date.accessioned: 2017-08-28T23:47:39Z
dc.date.available: 2017-08-28T23:47:39Z
dc.date.issued: 2017
dc.description.abstract: Recent advances in deep reinforcement learning methods have attracted a lot of attention, because of their ability to use raw signals such as video streams as inputs, instead of pre-processed state variables. However, the most popular methods (value-based methods, e.g. deep Q-networks) focus on discrete action spaces (e.g. the left/right buttons), while realistic robotic applications usually require a continuous action space (for example the joint space). Policy gradient methods, such as stochastic policy gradient or deep deterministic policy gradient, propose to overcome this problem by allowing continuous action spaces. Despite their promises, they suffer from long training times as they need huge numbers of interactions to converge. In this paper, we investigate in how far a recent asynchronously parallel actor-critic approach, initially proposed to speed up discrete RL algorithms, could be used for the continuous control of robotic arms. We demonstrate the capabilities of this end-to-end learning algorithm on a simulated 2 degrees-of-freedom robotic arm and discuss its applications to more realistic scenarios. [en]
dc.identifier.doi: 10.18420/in2017_214
dc.identifier.isbn: 978-3-88579-669-5
dc.identifier.pissn: 1617-5468
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik, Bonn
dc.relation.ispartof: INFORMATIK 2017
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-275
dc.subject: deep learning
dc.subject: reinforcement learning
dc.subject: robotics
dc.subject: policy gradient methods
dc.subject: actor critic
dc.subject: asynchronous learning
dc.subject: continuous action space
dc.title: Training a deep policy gradient-based neural network with asynchronous learners on a simulated robotic problem [en]
gi.citation.endPage: 2154
gi.citation.startPage: 2143
gi.conference.date: 25.-29. September 2017
gi.conference.location: Chemnitz
gi.conference.sessiontitle: Deep Learning in heterogenen Datenbeständen

Files

Original bundle
Name: B29-1.pdf
Size: 541.64 KB
Format: Adobe Portable Document Format