Imitation From Observation Using Behavioral Learning
Author: Medric B. Djeafea Sonwa
Release: 2022
Imitation from observation (IfO) is a learning paradigm in which autonomous agents are trained in a Markov Decision Process (MDP) by observing an expert's demonstrations, without access to the expert's actions. These demonstrations can be sequences of environment states or raw visual observations of the environment. Although recent approaches obtain convincing results in the low-dimensional state setting, the use of visual observations remains an important challenge in IfO. One of the most common ways to solve the IfO problem is to learn a reward function from the demonstrations, but the need to understand the environment and the expert's moves from videos in order to reward the learning agent appropriately increases the complexity of the problem. We approach this problem with a method that focuses on representing the agent's behaviors in a latent space using demonstration videos. Our approach exploits recent contrastive learning techniques for images and videos and uses a bootstrapping algorithm to progressively train a trajectory encoding function from variations of the agent's policy. Simultaneously, this function rewards the imitating agent through a Reinforcement Learning (RL) algorithm. Our method uses a limited number of demonstration videos and does not require access to any expert policy. In experiments, our imitating agents perform convincingly on a set of control tasks and demonstrate that learning a behavior encoding function from videos allows building an efficient reward function in an MDP.
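The general idea of rewarding an agent through a trajectory encoder can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not specify the encoder, the loss, or the reward formula. Everything named here is an assumption made for illustration, including the mean-pooling `encode_trajectory` stand-in (a real system would use a learned video encoder), the InfoNCE-style contrastive loss in `info_nce`, the cosine-similarity reward in `imitation_reward`, and the `temperature` parameter.

```python
import math


def encode_trajectory(frames):
    """Hypothetical stand-in for a learned trajectory encoder.

    Mean-pools per-frame feature vectors and L2-normalises the result.
    """
    dim = len(frames[0])
    mean = [sum(f[i] for f in frames) / len(frames) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0  # avoid dividing by zero
    return [x / norm for x in mean]


def cosine(u, v):
    """Dot product; equals cosine similarity when u and v are unit-norm."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss for a single anchor embedding.

    Low when the anchor is closer to the positive than to the negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


def imitation_reward(agent_frames, expert_embeddings):
    """Assumed reward: similarity to the closest expert embedding."""
    z = encode_trajectory(agent_frames)
    return max(cosine(z, e) for e in expert_embeddings)


# Toy usage with 2-D "frame features" (made-up numbers, not real data).
expert = encode_trajectory([[1.0, 0.0], [0.9, 0.1]])
reward = imitation_reward([[1.0, 0.1], [0.8, 0.0]], [expert])
```

In a loop of the kind the abstract describes, the encoder would be retrained as the agent's policy changes while an RL algorithm maximises this reward; the sketch leaves both the training loop and the RL update out.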