In machine learning, systems that employ offline learning do not change their approximation of the target function once the initial training phase has been completed.[1]

Whereas in online learning only the set of possible elements is known in advance, in offline learning the learner also knows the identity of the elements and the order in which they will be presented.[2]
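
The difference can be made concrete with a minimal sketch (an illustration in Python, not drawn from the cited sources; the class names are hypothetical). An offline learner fixes its hypothesis after one pass over a batch that is fully known in advance, while an online learner revises its estimate as each element arrives:

    class OfflineMeanEstimator:
        """Fits once on a complete batch; the approximation is then frozen."""
        def __init__(self):
            self.mean = None

        def fit(self, batch):
            # The whole batch (elements and their order) is known up front.
            self.mean = sum(batch) / len(batch)

        def predict(self):
            return self.mean  # never changes after fit()


    class OnlineMeanEstimator:
        """Updates its estimate with every new element it sees."""
        def __init__(self):
            self.mean, self.n = 0.0, 0

        def update(self, x):
            self.n += 1
            self.mean += (x - self.mean) / self.n  # incremental update


    data = [2.0, 4.0, 6.0]
    offline = OfflineMeanEstimator()
    offline.fit(data)             # single training phase, then fixed
    online = OnlineMeanEstimator()
    for x in data:                # elements arrive one at a time
        online.update(x)
    print(offline.predict(), online.mean)  # both 4.0, but only the online one keeps adapting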

Applications in robotics control

Robot learning can be understood as filling a table with values. One way to fill it is programming by demonstration, in which a human teacher provides the values. The demonstration is given either as a direct numerical control policy, which corresponds to a trajectory, or as an indirect objective function that is specified in advance.[3]
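
The two demonstration formats can be sketched as follows (hypothetical Python; names such as direct_demo and objective are chosen only for illustration). A direct demonstration is simply a recorded trajectory, while an indirect demonstration supplies a scoring function that the robot optimizes itself:

    from typing import List, Tuple

    State = Tuple[float, float]  # e.g. an (x, y) position

    # Direct format: a numerical control policy given as a trajectory.
    direct_demo: List[State] = [(0.0, 0.0), (0.5, 0.1), (1.0, 0.1)]

    # Indirect format: an objective function specified in advance;
    # the robot must derive its own behavior by optimizing it.
    def objective(state: State) -> float:
        target = (1.0, 0.1)
        return -((state[0] - target[0]) ** 2 + (state[1] - target[1]) ** 2)

    def follow(trajectory: List[State]) -> None:
        for state in trajectory:
            print("move to", state)  # stand-in for a motor command

    follow(direct_demo)
    # Toy use of the indirect form: rank candidate states by the objective.
    print(max(direct_demo, key=objective))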

Offline learning works in batch mode: in step 1 the task is demonstrated and stored in the table, and in step 2 the robot reproduces the task.[4] This pipeline is slow and inefficient because there is a delay between the demonstration of the behavior and the replay of the skill.[5][6]
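
A skeleton of this two-step pipeline might look as follows (illustrative names, not an API from the cited papers); note that step 2 cannot begin until step 1 has finished in its entirety, which is the source of the delay:

    skill_table = {}  # the robot's internal table

    def demonstrate(samples):
        """Step 1: store demonstrated (situation, action) pairs in the table."""
        for situation, action in samples:
            skill_table[situation] = action

    def replay(situation):
        """Step 2: reproduce the task by looking up the stored table."""
        return skill_table.get(situation)

    # The entire demonstration is recorded first...
    demonstrate([("near_wall", "steer_away"), ("clear", "go_straight")])
    # ...and only afterwards does the robot replay the skill.
    print(replay("near_wall"))  # -> "steer_away"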

A short example helps to illustrate the idea. Suppose a robot should learn a wall-following task and its internal table is empty. Before the robot can be activated in replay mode, a human demonstrator has to teach the behavior: they control the robot by teleoperation, and during this learning step the skill table is generated. The process is called offline because the robot's control software is doing nothing while the human operator uses the device as a pointing device to drive along the wall.[6]
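
A hedged sketch of this wall-following scenario follows (the sensor model, thresholds, and command names are assumptions made for illustration, not taken from the cited experiments). The operator's teleoperation commands populate the skill table, keyed by a discretized distance-to-wall reading; in replay mode the robot merely looks up the stored behavior:

    def discretize(distance_to_wall: float) -> str:
        """Map a raw sensor reading onto a coarse situation label."""
        if distance_to_wall < 0.3:
            return "too_close"
        if distance_to_wall > 0.7:
            return "too_far"
        return "ok"

    skill_table = {}

    # Learning step: the human drives along the wall by teleoperation;
    # the control software only records what the operator did in each
    # sensed situation.
    teleop_log = [(0.2, "turn_away"), (0.5, "forward"), (0.9, "turn_toward")]
    for distance, command in teleop_log:
        skill_table[discretize(distance)] = command

    # Replay mode: the robot acts on its own from the filled table.
    for distance in (0.25, 0.55, 0.8):
        print(distance, "->", skill_table[discretize(distance)])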

References

  1. Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. New York: Springer. ISBN 978-0-387-31073-2.
  2. Ben-David, Shai; Kushilevitz, Eyal; Mansour, Yishay (1997). "Online Learning versus Offline Learning". Machine Learning. 29 (1): 45–63. doi:10.1023/A:1007465907571. ISSN 0885-6125.
  3. Bajcsy, Andrea; Losey, Dylan P.; O’Malley, Marcia K.; Dragan, Anca D. (2017). "Learning robot objectives from physical human interaction". Proceedings of Machine Learning Research. 78: 217–226.
  4. Meyer-Delius, Daniel; Beinhofer, Maximilian; Burgard, Wolfram (2012). "Occupancy grid models for robot mapping in changing environments". Twenty-Sixth AAAI Conference on Artificial Intelligence.
  5. Peternel, Luka; Oztop, Erhan; Babic, Jan (2016). "A shared control method for online human-in-the-loop robot learning based on Locally Weighted Regression". 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE. doi:10.1109/iros.2016.7759574.
  6. Jun, Li; Duckett, Tom (2003). "Robot behavior learning with a dynamically adaptive RBF network: Experiments in offline and online learning". Proc. 2nd International Conference on Computational Intelligence, Robotics and Autonomous Systems (CIRAS).

