2024-03-29T06:46:51Z
https://nagoya.repo.nii.ac.jp/oai
oai:nagoya.repo.nii.ac.jp:02002686
2023-01-16T05:08:01Z
320:321:322
Path Integral Policy Improvement With Population Adaptation
Yamamoto, Kosuke
Ariizumi, Ryo
Hayakawa, Tomohiro
Matsuno, Fumitoshi
Path integral policy improvement (PI^2) is known to be an efficient reinforcement learning algorithm, particularly when the target system is a high-dimensional dynamical system. However, PI^2 and its existing extensions have adjustable parameters on which the efficiency depends significantly. This article proposes an extension of PI^2 that adjusts all of the critical parameters automatically. Motion acquisition tasks for three different types of simulated legged robots were performed to test the efficacy of the proposed algorithm. The results show that the proposed method can not only eliminate the user's burden of setting the parameters appropriately but also significantly improve the optimization performance. For one of the acquired motions, a real robot experiment was conducted to show the validity of the motion.
journal article
IEEE
2022-01
application/pdf
IEEE Transactions on Cybernetics
1
52
312
322
2168-2267
https://nagoya.repo.nii.ac.jp/record/2002686/files/FINAL_VERSION.pdf
eng
https://doi.org/10.1109/TCYB.2020.2983923
“© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.”