Home

My name is Gerhard Neumann and I am Professor of Robotics and Autonomous Systems at the University of Lincoln. I head the Computational Learning for Autonomous Systems (CLAS) team, which is part of the Lincoln Centre for Autonomous Systems (LCAS). Part of the team is still based at TU Darmstadt, my former affiliation, and we continue to cooperate closely with other teams from TUDa, such as the IAS team of Prof. Jan Peters and the group of Prof. Johannes Fuernkranz. We work on machine learning methods for autonomous systems, with a focus on reinforcement learning, policy search and imitation learning. Our group's research focus is the use of domain knowledge from robotics to develop new data-driven machine learning algorithms that scale favorably with the complexity of robotic tasks.

This site is still under construction. More content will be added soon.

Research Fields

Our research concentrates on the following sub-fields of machine learning:

Applications

We focus on a wide range of applications where machine learning methods could provide a huge benefit in the future. We work on specific applications (agriculture, nuclear robotics) where robots will be urgently needed in the coming years, while other application areas serve as proof-of-concept studies for evaluating our algorithms. Our application fields include:

  • Grasping and Manipulation
  • Agricultural Robotics
  • Sorting and Segregation of Nuclear Waste
  • Dynamic Motor Games: Table Tennis, Beer-Pong, …
  • Robot Swarms

News

  • 2 new IROS papers accepted!
    • J. Pajarinen, V. Kyrki, M. Koval, S. Srinivasa, J. Peters, and G. Neumann, “Hybrid control trajectory optimization under uncertainty,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
      [BibTeX] [Abstract] [Download PDF]

      Trajectory optimization is a fundamental problem in robotics. While optimization of continuous control trajectories is well developed, many applications require both discrete and continuous, i.e. hybrid, controls. Finding an optimal sequence of hybrid controls is challenging due to the exponential explosion of discrete control combinations. Our method, based on Differential Dynamic Programming (DDP), circumvents this problem by incorporating discrete actions inside DDP: we first optimize continuous mixtures of discrete actions, and subsequently force the mixtures into fully discrete actions. Moreover, we show how our approach can be extended to partially observable Markov decision processes (POMDPs) for trajectory planning under uncertainty. We validate the approach in a car driving problem where the robot has to switch discrete gears and in a box pushing application where the robot can switch the side of the box to push. The pose and the friction parameters of the pushed box are initially unknown and only indirectly observable.

      @inproceedings{lirolem28257,
      booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
      month = {September},
      title = {Hybrid control trajectory optimization under uncertainty},
      author = {J. Pajarinen and V. Kyrki and M. Koval and S. Srinivasa and J. Peters and G. Neumann},
      year = {2017},
      url = {http://eprints.lincoln.ac.uk/28257/},
      abstract = {Trajectory optimization is a fundamental problem in robotics. While optimization of continuous control trajectories is well developed, many applications require both discrete and continuous, i.e. hybrid controls. Finding an optimal sequence of hybrid controls is challenging due to the exponential explosion of discrete control combinations. Our method, based on Differential Dynamic Programming (DDP), circumvents this problem by incorporating discrete actions inside DDP: we first optimize continuous mixtures of discrete actions, and, subsequently force the mixtures into fully discrete actions. Moreover, we show how our approach can be extended to partially observable Markov decision processes (POMDPs) for trajectory planning under uncertainty. We validate the approach in a car driving problem where the robot has to switch discrete gears and in a box pushing application where the robot can switch the side of the box to push. The pose and the friction parameters of the pushed box are initially unknown and only indirectly observable.}
      }
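The core relaxation idea, optimizing continuous mixtures of discrete actions and then forcing them to become fully discrete, can be illustrated with a small toy sketch. This is not the paper's DDP implementation: the function `soft_to_hard` and the per-action cost vector are hypothetical, and inside DDP the mixture weights would enter the dynamics and be optimized jointly with the continuous controls.

```python
import numpy as np

def soft_to_hard(costs, steps=300, lr=1.0):
    """Relax a discrete choice among actions into a softmax mixture,
    optimize the expected cost by gradient descent on the logits,
    then harden the mixture back into a single discrete action."""
    costs = np.asarray(costs, dtype=float)
    logits = np.zeros_like(costs)
    for _ in range(steps):
        w = np.exp(logits - logits.max())
        w /= w.sum()                      # mixture weights (softmax)
        J = w @ costs                     # expected cost of the mixture
        logits -= lr * w * (costs - J)    # gradient of J w.r.t. the logits
    return int(np.argmax(logits))         # force the mixture to be discrete
```

The gradient step moves probability mass toward below-average-cost actions, so the final argmax recovers the cheapest discrete action.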

    • A. Paraschos, R. Lioutikov, J. Peters, and G. Neumann, “Probabilistic prioritization of movement primitives,” IEEE Robotics and Automation Letters, vol. PP, iss. 99, 2017.
      [BibTeX] [Abstract] [Download PDF]

      Movement prioritization is a common approach to combine controllers of different tasks for redundant robots, where each task is assigned a priority. The priorities of the tasks are often hand-tuned or the result of an optimization, but seldom learned from data. This paper combines Bayesian task prioritization with probabilistic movement primitives to prioritize full motion sequences that are learned from demonstrations. Probabilistic movement primitives (ProMPs) can encode distributions of movements over full motion sequences and provide control laws to exactly follow these distributions. The probabilistic formulation allows for a natural application of Bayesian task prioritization. We extend the ProMP controllers with an additional feedback component that accounts for inaccuracies in following the distribution and allows for a more robust prioritization of primitives. We demonstrate how the task priorities can be obtained from imitation learning and how different primitives can be combined to solve even unseen task combinations. Due to the prioritization, our approach can efficiently learn a combination of tasks without requiring individual models per task combination. Further, our approach can adapt an existing primitive library by prioritizing additional controllers, for example, for implementing obstacle avoidance. Hence, the need of retraining the whole library is avoided in many cases. We evaluate our approach on reaching movements under constraints with redundant simulated planar robots and two physical robot platforms, the humanoid robot 'iCub' and a KUKA LWR robot arm.

      @article{lirolem27901,
      volume = {PP},
      number = {99},
      month = {July},
      author = {Alexandros Paraschos and Rudolf Lioutikov and Jan Peters and Gerhard Neumann},
      booktitle = {Proceedings of the International Conference on Intelligent Robot Systems, and IEEE Robotics and Automation Letters (RA-L)},
      title = {Probabilistic prioritization of movement primitives},
      publisher = {IEEE},
      year = {2017},
      journal = {IEEE Robotics and Automation Letters},
      url = {http://eprints.lincoln.ac.uk/27901/},
      abstract = {Movement prioritization is a common approach
      to combine controllers of different tasks for redundant robots,
      where each task is assigned a priority. The priorities of the
      tasks are often hand-tuned or the result of an optimization,
      but seldomly learned from data. This paper combines Bayesian
      task prioritization with probabilistic movement primitives to
      prioritize full motion sequences that are learned from demonstrations.
      Probabilistic movement primitives (ProMPs) can
      encode distributions of movements over full motion sequences
      and provide control laws to exactly follow these distributions.
      The probabilistic formulation allows for a natural application of
      Bayesian task prioritization. We extend the ProMP controllers
      with an additional feedback component that accounts for inaccuracies
      in following the distribution and allows for a more
      robust prioritization of primitives. We demonstrate how the
      task priorities can be obtained from imitation learning and
      how different primitives can be combined to solve even unseen
      task-combinations. Due to the prioritization, our approach can
      efficiently learn a combination of tasks without requiring individual
      models per task combination. Further, our approach can
      adapt an existing primitive library by prioritizing additional
      controllers, for example, for implementing obstacle avoidance.
      Hence, the need of retraining the whole library is avoided in
      many cases. We evaluate our approach on reaching movements
      under constraints with redundant simulated planar robots and
      two physical robot platforms, the humanoid robot 'iCub' and
      a KUKA LWR robot arm.}
      }
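A minimal sketch of the Bayesian view behind prioritization: if each task proposes a Gaussian over the control, combining them as a product of Gaussians weights every task by its precision, so confident (low-variance) tasks dominate. The helper `prioritized_combination` below is an illustrative toy under that assumption, not the paper's ProMP controller:

```python
import numpy as np

def prioritized_combination(means, covs):
    """Combine Gaussian task distributions as a product of Gaussians:
    the precision (inverse covariance) of each task acts as a soft
    priority, so low-variance tasks dominate the combined command."""
    prec = sum(np.linalg.inv(S) for S in covs)
    cov = np.linalg.inv(prec)
    mean = cov @ sum(np.linalg.inv(S) @ m for m, S in zip(means, covs))
    return mean, cov
```

For example, a task with variance 0.25 pulls the combined mean four times as strongly as a task with variance 1.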

  • New IJCAI paper: Contextual CMA-ES
    • A. Abdolmaleki, B. Price, N. Lau, P. Reis, and G. Neumann, “Contextual CMA-ES,” in International Joint Conference on Artificial Intelligence (IJCAI), 2017.
      [BibTeX] [Abstract] [Download PDF]

      Many stochastic search algorithms are designed to optimize a fixed objective function to learn a task, i.e., if the objective function changes slightly, for example, due to a change in the situation or context of the task, relearning is required to adapt to the new context. For instance, if we want to learn a kicking movement for a soccer robot, we have to relearn the movement for different ball locations. Such relearning is undesired as it is highly inefficient and many applications require a fast adaptation to a new context/situation. Therefore, we investigate contextual stochastic search algorithms that can learn multiple, similar tasks simultaneously. Current contextual stochastic search methods are based on policy search algorithms and suffer from premature convergence and the need for parameter tuning. In this paper, we extend the well-known CMA-ES algorithm to the contextual setting and illustrate its performance on several contextual tasks. Our new algorithm, called contextual CMA-ES, leverages contextual learning while it preserves all the features of standard CMA-ES, such as stability, avoidance of premature convergence, step size control and a minimal amount of parameter tuning.

      @inproceedings{lirolem28141,
      booktitle = {International Joint Conference on Artificial Intelligence (IJCAI)},
      month = {August},
      title = {Contextual CMA-ES},
      author = {A. Abdolmaleki and B. Price and N. Lau and P. Reis and G. Neumann},
      year = {2017},
      url = {http://eprints.lincoln.ac.uk/28141/},
      abstract = {Many stochastic search algorithms are designed to optimize a fixed objective function to learn a task, i.e., if the objective function changes slightly, for example, due to a change in the situation or context of the task, relearning is required to adapt to the new context. For instance, if we want to learn a kicking movement for a soccer robot, we have to relearn the movement for different ball locations. Such relearning is undesired as it is highly inefficient and many applications require a fast adaptation to a new context/situation. Therefore, we investigate contextual stochastic search algorithms
      that can learn multiple, similar tasks simultaneously. Current contextual stochastic search methods are based on policy search algorithms and suffer from premature convergence and the need for parameter tuning. In this paper, we extend the well known CMA-ES algorithm to the contextual setting and illustrate its performance on several contextual
      tasks. Our new algorithm, called contextual CMAES, leverages from contextual learning while it preserves all the features of standard CMA-ES such as stability, avoidance of premature convergence, step size control and a minimal amount of parameter tuning.}
      }
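The contextual setting can be illustrated with a deliberately simplified stochastic search loop: the search-distribution mean is a linear function of the context, refit to each generation's elite samples. This is a toy stand-in (the actual contextual CMA-ES also adapts a full covariance and step size); the function `contextual_search` and its cost signature are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def contextual_search(cost, context_dim, param_dim, iters=60, pop=50):
    """Toy contextual stochastic search: the mean of the sampling
    distribution is a linear function of the context, fitted by least
    squares to the elite half of each generation's samples."""
    W = np.zeros((param_dim, context_dim + 1))        # linear context policy
    sigma = 1.0
    for _ in range(iters):
        C = rng.uniform(-1, 1, (pop, context_dim))    # random contexts
        Phi = np.hstack([C, np.ones((pop, 1))])       # features [context, 1]
        X = Phi @ W.T + sigma * rng.standard_normal((pop, param_dim))
        f = np.array([cost(c, x) for c, x in zip(C, X)])
        elite = np.argsort(f)[: pop // 2]             # keep the better half
        W = np.linalg.lstsq(Phi[elite], X[elite], rcond=None)[0].T
        sigma *= 0.95                                 # simple step-size decay
    return W
```

On the toy cost ||x - 2c||^2 the learned linear policy approaches the optimal context mapping x = 2c without relearning per context.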

  • New JMLR paper accepted: “Non-parametric Policy Search with Limited Information Loss.”
    • H. van Hoof, G. Neumann, and J. Peters, “Non-parametric policy search with limited information loss,” Journal of Machine Learning Research, 2017.
      [BibTeX] [Abstract] [Download PDF]

      Learning complex control policies from non-linear and redundant sensory input is an important challenge for reinforcement learning algorithms. Non-parametric methods that approximate value functions or transition models can address this problem, by adapting to the complexity of the dataset. Yet, many current non-parametric approaches rely on unstable greedy maximization of approximate value functions, which might lead to poor convergence or oscillations in the policy update. A more robust policy update can be obtained by limiting the information loss between successive state-action distributions. In this paper, we develop a policy search algorithm with policy updates that are both robust and non-parametric. Our method can learn non-parametric control policies for infinite horizon continuous Markov decision processes with non-linear and redundant sensory representations. We investigate how we can use approximations of the kernel function to reduce the time requirements of the demanding non-parametric computations. In our experiments, we show the strong performance of the proposed method, and how it can be approximated efficiently. Finally, we show that our algorithm can learn a real-robot underpowered swing-up task directly from image data.

      @article{lirolem28020,
      month = {December},
      title = {Non-parametric policy search with limited information loss},
      author = {Herke van Hoof and Gerhard Neumann and Jan Peters},
      publisher = {Journal of Machine Learning Research},
      year = {2017},
      journal = {Journal of Machine Learning Research},
      url = {http://eprints.lincoln.ac.uk/28020/},
      abstract = {Learning complex control policies from non-linear and redundant sensory input is an important
      challenge for reinforcement learning algorithms. Non-parametric methods that
      approximate values functions or transition models can address this problem, by adapting
      to the complexity of the dataset. Yet, many current non-parametric approaches rely on
      unstable greedy maximization of approximate value functions, which might lead to poor
      convergence or oscillations in the policy update. A more robust policy update can be obtained
      by limiting the information loss between successive state-action distributions. In this
      paper, we develop a policy search algorithm with policy updates that are both robust and
      non-parametric. Our method can learn non-parametric control policies for infinite horizon
      continuous Markov decision processes with non-linear and redundant sensory representations.
      We investigate how we can use approximations of the kernel function to reduce the
      time requirements of the demanding non-parametric computations. In our experiments, we
      show the strong performance of the proposed method, and how it can be approximated
      efficiently. Finally, we show that our algorithm can learn a real-robot underpowered swing-up
      task directly from image data.}
      }
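The key ingredient, bounding the information loss of the policy update, can be sketched as a sample re-weighting whose softmax temperature is chosen so that the KL divergence to the previous sample distribution (here assumed uniform) hits a bound epsilon. The function `kl_bounded_weights` is an illustrative toy in that spirit, not the paper's kernel-based algorithm:

```python
import numpy as np

def kl_bounded_weights(returns, epsilon=0.5):
    """Re-weight samples by exp(R / eta), with the temperature eta chosen
    by bisection so that KL(weighted || uniform) equals epsilon. This
    limits the information loss of the induced update."""
    R = np.asarray(returns, dtype=float)
    def weights(eta):
        z = np.exp((R - R.max()) / eta)
        return z / z.sum()
    def kl(eta):
        w = weights(eta)
        return float(np.sum(w * np.log(w * len(w) + 1e-12)))
    lo, hi = 1e-3, 1e6              # kl(eta) decreases monotonically in eta
    for _ in range(100):            # geometric bisection for kl(eta) = epsilon
        mid = np.sqrt(lo * hi)
        if kl(mid) > epsilon:
            lo = mid
        else:
            hi = mid
    return weights(np.sqrt(lo * hi))
```

A tight epsilon keeps the weights close to uniform (a conservative update); a loose one lets them concentrate greedily on the best samples.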

  • New IJRR Paper accepted! “Learning Movement Primitive Libraries through Probabilistic Segmentation.”
    • R. Lioutikov, G. Neumann, G. Maeda, and J. Peters, “Learning movement primitive libraries through probabilistic segmentation,” International Journal of Robotics Research (IJRR), vol. 36, iss. 8, pp. 879-894, 2017.
      [BibTeX] [Abstract] [Download PDF]

      Movement primitives are a well-established approach for encoding and executing movements. While the primitives themselves have been extensively researched, the concept of movement primitive libraries has not received similar attention. Libraries of movement primitives represent the skill set of an agent. Primitives can be queried and sequenced in order to solve specific tasks. The goal of this work is to segment unlabeled demonstrations into a representative set of primitives. Our proposed method differs from current approaches by taking advantage of the often neglected, mutual dependencies between the segments contained in the demonstrations and the primitives to be encoded. By exploiting this mutual dependency, we show that we can improve both the segmentation and the movement primitive library. Based on probabilistic inference, our novel approach segments the demonstrations while learning a probabilistic representation of movement primitives. We demonstrate our method on two real robot applications. First, the robot segments sequences of different letters into a library, explaining the observed trajectories. Second, the robot segments demonstrations of a chair assembly task into a movement primitive library. The library is subsequently used to assemble the chair in an order not present in the demonstrations.

      @article{lirolem28021,
      volume = {36},
      number = {8},
      month = {July},
      author = {Rudolf Lioutikov and Gerhard Neumann and Guilherme Maeda and Jan Peters},
      title = {Learning movement primitive libraries through probabilistic segmentation},
      publisher = {SAGE},
      year = {2017},
      journal = {International Journal of Robotics Research (IJRR)},
      pages = {879--894},
      url = {http://eprints.lincoln.ac.uk/28021/},
      abstract = {Movement primitives are a well established approach for encoding and executing movements. While the primitives
      themselves have been extensively researched, the concept of movement primitive libraries has not received similar
      attention. Libraries of movement primitives represent the skill set of an agent. Primitives can be queried and sequenced
      in order to solve specific tasks. The goal of this work is to segment unlabeled demonstrations into a representative
      set of primitives. Our proposed method differs from current approaches by taking advantage of the often neglected,
      mutual dependencies between the segments contained in the demonstrations and the primitives to be encoded. By
      exploiting this mutual dependency, we show that we can improve both the segmentation and the movement primitive
      library. Based on probabilistic inference our novel approach segments the demonstrations while learning a probabilistic
      representation of movement primitives. We demonstrate our method on two real robot applications. First, the robot
      segments sequences of different letters into a library, explaining the observed trajectories. Second, the robot segments
      demonstrations of a chair assembly task into a movement primitive library. The library is subsequently used to assemble the chair in an order not present in the demonstrations.}
      }
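As a much-simplified illustration of segmenting demonstrations, a dynamic program can split a 1-D trajectory into segments by trading fit against a per-segment penalty. The paper's method is probabilistic and couples segmentation with primitive learning; the `segment` function below is a hypothetical toy that only shows the segmentation half:

```python
import numpy as np

def segment(y, penalty=1.0):
    """Toy changepoint segmentation by dynamic programming: split a 1-D
    trajectory into piecewise-constant segments, minimizing squared fit
    error plus a fixed penalty per segment. Returns interior cut indices."""
    n = len(y)
    cost = np.full(n + 1, np.inf)
    cost[0] = 0.0
    back = np.zeros(n + 1, dtype=int)   # best start of the last segment
    for t in range(1, n + 1):
        for s in range(t):
            seg = y[s:t]
            c = cost[s] + np.sum((seg - seg.mean()) ** 2) + penalty
            if c < cost[t]:
                cost[t], back[t] = c, s
    cuts = []
    t = n
    while t > 0:                        # backtrack the optimal segmentation
        cuts.append(t)
        t = back[t]
    return sorted(cuts)[:-1]            # drop the trailing endpoint n
```

Raising the penalty yields fewer, coarser segments; in the paper the analogous trade-off is resolved by probabilistic inference over segments and primitives jointly.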

  • New ICML Paper: Local Bayesian Optimization
    • R. Akrour, D. Sorokin, J. Peters, and G. Neumann, “Local Bayesian optimization of motor skills,” in International Conference on Machine Learning (ICML), 2017.
      [BibTeX] [Abstract] [Download PDF]

      Bayesian optimization is renowned for its sample efficiency but its application to higher dimensional tasks is impeded by its focus on global optimization. To scale to higher dimensional problems, we leverage the sample efficiency of Bayesian optimization in a local context. The optimization of the acquisition function is restricted to the vicinity of a Gaussian search distribution which is moved towards high value areas of the objective. The proposed information-theoretic update of the search distribution results in a Bayesian interpretation of local stochastic search: the search distribution encodes prior knowledge on the optimum's location and is weighted at each iteration by the likelihood of this location's optimality. We demonstrate the effectiveness of our algorithm on several benchmark objective functions as well as a continuous robotic task in which an informative prior is obtained by imitation learning.

      @inproceedings{lirolem27902,
      booktitle = {International Conference on Machine Learning (ICML)},
      month = {August},
      title = {Local Bayesian optimization of motor skills},
      author = {R. Akrour and D. Sorokin and J. Peters and G. Neumann},
      year = {2017},
      url = {http://eprints.lincoln.ac.uk/27902/},
      abstract = {Bayesian optimization is renowned for its sample
      efficiency but its application to higher dimensional
      tasks is impeded by its focus on global
      optimization. To scale to higher dimensional
      problems, we leverage the sample efficiency of
      Bayesian optimization in a local context. The
      optimization of the acquisition function is restricted
      to the vicinity of a Gaussian search distribution
      which is moved towards high value areas
      of the objective. The proposed information-theoretic
      update of the search distribution results
      in a Bayesian interpretation of local stochastic
      search: the search distribution encodes prior
      knowledge on the optimum's location and is
      weighted at each iteration by the likelihood of
      this location's optimality. We demonstrate the
      effectiveness of our algorithm on several benchmark
      objective functions as well as a continuous
      robotic task in which an informative prior is obtained
      by imitation learning.}
      }
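The local restriction can be sketched as follows: a Gaussian-process surrogate is fit to all evaluated points, but candidates are drawn only from the vicinity of a Gaussian search distribution that is moved toward the incumbent best point. This is an illustrative toy with a greedy posterior-mean acquisition, not the paper's information-theoretic update; `local_bo` and `rbf` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(A, B, ls=0.3):
    """Squared-exponential kernel between two sets of points."""
    d = A[:, None, :] - B[None, :, :]
    return np.exp(-0.5 * np.sum(d ** 2, axis=-1) / ls ** 2)

def local_bo(f, dim, iters=50):
    """Toy local Bayesian optimization (minimization): candidates are
    sampled only near a Gaussian search distribution, which shifts
    toward the best observation and shrinks over time."""
    mean, sigma = np.zeros(dim), 1.0
    X = mean + sigma * rng.standard_normal((5, dim))
    y = np.array([f(x) for x in X])
    for _ in range(iters):
        K = rbf(X, X) + 1e-4 * np.eye(len(X))          # jittered Gram matrix
        alpha = np.linalg.solve(K, y - y.mean())
        cand = mean + sigma * rng.standard_normal((100, dim))
        mu = rbf(cand, X) @ alpha + y.mean()           # GP posterior mean
        x_new = cand[np.argmin(mu)]                    # greedy local acquisition
        X = np.vstack([X, x_new])
        y = np.append(y, f(x_new))
        mean = mean + 0.5 * (X[np.argmin(y)] - mean)   # shift search distribution
        sigma *= 0.97                                  # shrink the local region
    return X[np.argmin(y)]
```

Because the acquisition is only ever optimized inside the shrinking local region, the surrogate never has to model the objective globally, which is what makes the approach scale to higher dimensions.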

  • New AURO Paper: Using Probabilistic Movement Primitives in Robotics

    Well deserved, Alex!

    • A. Paraschos, C. Daniel, J. Peters, and G. Neumann, “Using probabilistic movement primitives in robotics,” Autonomous Robots, 2017.
      [BibTeX] [Abstract] [Download PDF]

      Movement Primitives are a well-established paradigm for modular movement representation and generation. They provide a data-driven representation of movements and support generalization to novel situations, temporal modulation, sequencing of primitives and controllers for executing the primitive on physical systems. However, while many MP frameworks exhibit some of these properties, there is a need for a unified framework that implements all of them in a principled way. In this paper, we show that this goal can be achieved by using a probabilistic representation. Our approach models trajectory distributions learned from stochastic movements. Probabilistic operations, such as conditioning, can be used to achieve generalization to novel situations or to combine and blend movements in a principled way. We derive a stochastic feedback controller that reproduces the encoded variability of the movement and the coupling of the degrees of freedom of the robot. We evaluate and compare our approach on several simulated and real robot scenarios.

      @article{lirolem27883,
      month = {December},
      title = {Using probabilistic movement primitives in robotics},
      author = {Alexandros Paraschos and Christian Daniel and Jan Peters and Gerhard Neumann},
      publisher = {Springer Verlag},
      year = {2017},
      journal = {Autonomous Robots},
      url = {http://eprints.lincoln.ac.uk/27883/},
      abstract = {Movement Primitives are a well-established
      paradigm for modular movement representation and
      generation. They provide a data-driven representation
      of movements and support generalization to novel situations,
      temporal modulation, sequencing of primitives
      and controllers for executing the primitive on physical
      systems. However, while many MP frameworks exhibit
      some of these properties, there is a need for a unified
      framework that implements all of them in a principled
      way. In this paper, we show that this goal can be
      achieved by using a probabilistic representation. Our
      approach models trajectory distributions learned from
      stochastic movements. Probabilistic operations, such as
      conditioning can be used to achieve generalization to
      novel situations or to combine and blend movements in
      a principled way. We derive a stochastic feedback controller
      that reproduces the encoded variability of the
      movement and the coupling of the degrees of freedom
      of the robot. We evaluate and compare our approach
      on several simulated and real robot scenarios.}
      }
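The probabilistic operation at the heart of ProMPs, conditioning the trajectory distribution on a via-point, is plain Gaussian conditioning on the weight vector. A minimal sketch, using the usual ProMP notation (weights with mean mu_w and covariance Sigma_w, basis activations Phi_t at time t); the helper name is hypothetical:

```python
import numpy as np

def condition_promp(mu_w, Sigma_w, Phi_t, y_t, sigma_y=1e-4):
    """Condition a Gaussian distribution over trajectory weights on
    passing through observation y_t at basis activation Phi_t
    (standard Gaussian conditioning, as used for ProMP adaptation)."""
    S = Phi_t @ Sigma_w @ Phi_t.T + sigma_y * np.eye(len(y_t))
    K = Sigma_w @ Phi_t.T @ np.linalg.inv(S)       # Kalman-style gain
    mu_new = mu_w + K @ (y_t - Phi_t @ mu_w)
    Sigma_new = Sigma_w - K @ Phi_t @ Sigma_w
    return mu_new, Sigma_new
```

Conditioning pulls the mean through the via-point and collapses the variance along the observed direction, while the unobserved degrees of freedom keep their learned variability.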

  • New RAL Paper: “Probabilistic Prioritization of Movement Primitives”

    Alex’s last journal paper for his PhD has been accepted! Congratulations!

    • A. Paraschos, R. Lioutikov, J. Peters, and G. Neumann, “Probabilistic prioritization of movement primitives,” IEEE Robotics and Automation Letters, vol. PP, iss. 99, 2017.
      [BibTeX] [Abstract] [Download PDF]

      Movement prioritization is a common approach to combine controllers of different tasks for redundant robots, where each task is assigned a priority. The priorities of the tasks are often hand-tuned or the result of an optimization, but seldom learned from data. This paper combines Bayesian task prioritization with probabilistic movement primitives to prioritize full motion sequences that are learned from demonstrations. Probabilistic movement primitives (ProMPs) can encode distributions of movements over full motion sequences and provide control laws to exactly follow these distributions. The probabilistic formulation allows for a natural application of Bayesian task prioritization. We extend the ProMP controllers with an additional feedback component that accounts for inaccuracies in following the distribution and allows for a more robust prioritization of primitives. We demonstrate how the task priorities can be obtained from imitation learning and how different primitives can be combined to solve even unseen task combinations. Due to the prioritization, our approach can efficiently learn a combination of tasks without requiring individual models per task combination. Further, our approach can adapt an existing primitive library by prioritizing additional controllers, for example, for implementing obstacle avoidance. Hence, the need of retraining the whole library is avoided in many cases. We evaluate our approach on reaching movements under constraints with redundant simulated planar robots and two physical robot platforms, the humanoid robot 'iCub' and a KUKA LWR robot arm.

      @article{lirolem27901,
      volume = {PP},
      number = {99},
      month = {July},
      author = {Alexandros Paraschos and Rudolf Lioutikov and Jan Peters and Gerhard Neumann},
      booktitle = {Proceedings of the International Conference on Intelligent Robot Systems, and IEEE Robotics and Automation Letters (RA-L)},
      title = {Probabilistic prioritization of movement primitives},
      publisher = {IEEE},
      year = {2017},
      journal = {IEEE Robotics and Automation Letters},
      url = {http://eprints.lincoln.ac.uk/27901/},
      abstract = {Movement prioritization is a common approach
      to combine controllers of different tasks for redundant robots,
      where each task is assigned a priority. The priorities of the
      tasks are often hand-tuned or the result of an optimization,
      but seldomly learned from data. This paper combines Bayesian
      task prioritization with probabilistic movement primitives to
      prioritize full motion sequences that are learned from demonstrations.
      Probabilistic movement primitives (ProMPs) can
      encode distributions of movements over full motion sequences
      and provide control laws to exactly follow these distributions.
      The probabilistic formulation allows for a natural application of
      Bayesian task prioritization. We extend the ProMP controllers
      with an additional feedback component that accounts for inaccuracies
      in following the distribution and allows for a more
      robust prioritization of primitives. We demonstrate how the
      task priorities can be obtained from imitation learning and
      how different primitives can be combined to solve even unseen
      task-combinations. Due to the prioritization, our approach can
      efficiently learn a combination of tasks without requiring individual
      models per task combination. Further, our approach can
      adapt an existing primitive library by prioritizing additional
      controllers, for example, for implementing obstacle avoidance.
      Hence, the need of retraining the whole library is avoided in
      many cases. We evaluate our approach on reaching movements
      under constraints with redundant simulated planar robots and
      two physical robot platforms, the humanoid robot 'iCub' and
      a KUKA LWR robot arm.}
      }