RAL 2017: Guiding trajectory optimization by demonstrated distributions

Trajectory optimization is an essential tool for motion planning of robotic manipulators under multiple constraints. Optimization-based methods can explicitly optimize a trajectory by leveraging prior knowledge of the system and have been used in various applications such as collision avoidance. However, these methods often require a hand-coded cost function to achieve the desired behavior. Specifying such a cost function for a complex desired behavior, e.g., disentangling a rope, is a nontrivial task and often infeasible. Learning from demonstration (LfD) methods offer an alternative way to program robot motion. LfD methods are less dependent on analytical models and instead learn the behavior of experts implicitly from the demonstrated trajectories. However, the problem of adapting the demonstrations to new situations, e.g., avoiding newly introduced obstacles, has not been fully investigated in the literature. In this paper, we present a motion planning framework that combines the advantages of optimization-based and demonstration-based methods. We learn a distribution of trajectories demonstrated by human experts and use it to guide the trajectory optimization process. The resulting trajectory maintains the demonstrated behaviors, which are essential to performing the task successfully, while adapting to avoid obstacles. In simulated experiments and with a real robotic system, we verify that our approach optimizes the trajectory to avoid obstacles and encodes the demonstrated behavior in the resulting trajectory.
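The general idea of guiding trajectory optimization with a demonstrated distribution can be illustrated with a minimal sketch. This is not the paper's algorithm; all function names, weights, and the toy setup below are hypothetical. It fits an independent Gaussian to each waypoint of a set of demonstrations, then runs gradient descent on a cost that trades off deviation from the demonstrated distribution against a soft obstacle-repulsion penalty.

```python
import numpy as np

# Hedged sketch (not the paper's method): guide a trajectory optimizer
# with a distribution learned from demonstrations. Names, weights, and
# the toy setup are hypothetical.

def fit_demo_distribution(demos):
    """Fit an independent Gaussian to each waypoint of the demonstrations.

    demos: (n_demos, T, d) array of demonstrated trajectories.
    Returns per-waypoint means (T, d) and regularized variances (T, d).
    """
    return demos.mean(axis=0), demos.var(axis=0) + 1e-2

def optimize(traj, mean, var, obstacle, radius=0.3, w_obs=50.0,
             iters=200, lr=0.005):
    """Gradient descent on a cost trading off closeness to the
    demonstrated distribution (a Mahalanobis-style term) against a
    soft repulsion from a spherical obstacle."""
    traj = traj.copy()
    for _ in range(iters):
        g_demo = (traj - mean) / var  # pull toward the demonstrations
        diff = traj - obstacle
        dist = np.linalg.norm(diff, axis=1, keepdims=True)
        inside = (dist < radius).astype(float)
        # Unit-vector push away from the obstacle, active only inside it.
        g_obs = -inside * diff / np.maximum(dist, 1e-9)
        traj -= lr * (g_demo + w_obs * g_obs)
    return traj

# Toy usage: noisy straight-line demos in 2D, obstacle placed on the path.
np.random.seed(0)
T = 20
t = np.linspace(0.0, 1.0, T)[:, None]
demos = np.stack([t * np.array([1.0, 0.0]) + 0.01 * np.random.randn(T, 2)
                  for _ in range(5)])
mean, var = fit_demo_distribution(demos)
obstacle = np.array([0.5, 0.05])
result = optimize(mean, mean, var, obstacle)
```

The equilibrium between the two gradient terms controls how far the trajectory may deviate from the demonstrations to clear the obstacle; waypoints far from the obstacle stay on the demonstrated mean, which loosely mirrors the behavior the abstract describes.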

  • T. Osa, A. G. M. Esfahani, R. Stolkin, R. Lioutikov, J. Peters, and G. Neumann, “Guiding trajectory optimization by demonstrated distributions,” IEEE Robotics and Automation Letters (RA-L), vol. 2, iss. 2, pp. 819-826, 2017.


    @article{lirolem26731,
    author = {Takayuki Osa and Amir M. Ghalamzan Esfahani and Rustam Stolkin and Rudolf Lioutikov and Jan Peters and Gerhard Neumann},
    journal = {IEEE Robotics and Automation Letters (RA-L)},
    year = {2017},
    pages = {819--826},
    title = {Guiding trajectory optimization by demonstrated distributions},
    number = {2},
    volume = {2},
    publisher = {IEEE},
    month = {January},
    abstract = {Trajectory optimization is an essential tool for motion planning of robotic manipulators under multiple constraints. Optimization-based methods can explicitly optimize a trajectory by leveraging prior knowledge of the system and have been used in various applications such as collision avoidance. However, these methods often require a hand-coded cost function to achieve the desired behavior. Specifying such a cost function for a complex desired behavior, e.g., disentangling a rope, is a nontrivial task and often infeasible. Learning from demonstration (LfD) methods offer an alternative way to program robot motion. LfD methods are less dependent on analytical models and instead learn the behavior of experts implicitly from the demonstrated trajectories. However, the problem of adapting the demonstrations to new situations, e.g., avoiding newly introduced obstacles, has not been fully investigated in the literature. In this paper, we present a motion planning framework that combines the advantages of optimization-based and demonstration-based methods. We learn a distribution of trajectories demonstrated by human experts and use it to guide the trajectory optimization process. The resulting trajectory maintains the demonstrated behaviors, which are essential to performing the task successfully, while adapting to avoid obstacles. In simulated experiments and with a real robotic system, we verify that our approach optimizes the trajectory to avoid obstacles and encodes the demonstrated behavior in the resulting trajectory},
    url = {http://eprints.lincoln.ac.uk/26731/}
    }

Video: