Item type |
itemtype_ver1(1) |
Release date |
2022-05-09 |
Title |
|
|
Title |
Path Integral Policy Improvement With Population Adaptation |
|
Language |
en |
Authors |
Yamamoto, Kosuke
Ariizumi, Ryo
Hayakawa, Tomohiro
Matsuno, Fumitoshi
|
Access rights |
|
|
Access rights |
open access |
|
Access rights URI |
http://purl.org/coar/access_right/c_abf2 |
Rights |
|
|
Language |
en |
|
Rights information |
“© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.” |
Description |
|
|
Description |
Path integral policy improvement (PI^2) is known to be an efficient reinforcement learning algorithm, particularly if the target system is a high-dimensional dynamical system. However, PI^2 and its existing extensions have adjustable parameters on which the efficiency depends significantly. This article proposes an extension of PI^2 that adjusts all of the critical parameters automatically. Motion acquisition tasks for three different types of simulated legged robots were performed to test the efficacy of the proposed algorithm. The results show that the proposed method can not only eliminate the burden on the user to set the parameters appropriately but also improve the optimization performance significantly. For one of the acquired motions, a real robot experiment was conducted to show the validity of the motion. |
|
Language |
en |
|
Description type |
Abstract |
Publisher |
|
|
Language |
en |
|
Publisher |
IEEE |
Language |
|
|
Language |
eng |
Resource type |
|
|
Resource type resource |
http://purl.org/coar/resource_type/c_6501 |
|
Type |
journal article |
Publication type |
|
|
Publication type |
AM |
|
Publication type resource |
http://purl.org/coar/version/c_ab4af688f83e57aa |
Related information |
|
|
Relation type |
isVersionOf |
|
|
Identifier type |
DOI |
|
|
Related identifier |
https://doi.org/10.1109/TCYB.2020.2983923 |
Source identifier |
|
|
Source identifier type |
PISSN |
|
Source identifier |
2168-2267 |
Bibliographic information |
en : IEEE Transactions on Cybernetics
Vol. 52,
No. 1,
pp. 312-322,
Issue date 2022-01
|