Name of Paper

The Development of Shared Experience Learning in a Group of Mobile Robots (PhD Thesis)


Author

I.D. Kelly


Published

Department of Cybernetics, The University of Reading, Whiteknights, Berkshire, RG6 6AY, ENGLAND. April.


Abstract

Methods of increasing the rates of learning within groups of reinforcement learning agents are investigated. In particular, the sharing of experiences between groups of learning autonomous mobile robots is shown to produce faster learning rates and more robust solutions than learning without experience sharing. Shared experience learning produces these improvements by providing each agent with auxiliary sources of information. In shared experience learning, each agent transmits the condition it is in, the action it tried and the reward it received. Upon receiving this information, other agents update their decision-making policy as if they had tried these actions and received the associated reward themselves. It is shown that in the reinforcement learning algorithm a combination of reward and punishment clearly produces better results than either one alone. Likewise, only rewarding the robot produces much better results than only punishing it. A low-cost active infrared localisation and communication system using standard radio frequency, narrow-band frequency-modulated technology is described. To overcome the problems of data collision in a multi-agent communication system, and to allow rapid position localisation, frequency division multiplexing is used. This system is used for communications in the experiments on shared experience learning and for relative robot position localisation in a study of flocking. The effects of leadership in a group of reactive flocking robots are also considered. It is shown that without any form of leadership (or global destination) the robots clump, that is, aggregate, since they have no motive to go anywhere. When dynamic leadership is introduced, it is shown that this tendency for the robots to clump together is overcome, and that true flocking behaviour occurs.
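
The experience-sharing step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the tabular value store, the running-average update rule, the learning rate, and all names (Experience, Agent, share, the condition and action strings) are assumptions made only for illustration. The abstract states only that each agent transmits its condition, action and reward, and that receivers update their policy as if the experience were their own.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Experience:
    """One transmitted message: condition, action tried, reward received."""
    condition: str
    action: str
    reward: float


@dataclass
class Agent:
    name: str
    learning_rate: float = 0.5  # assumed value, not taken from the thesis
    # Estimated value of each (condition, action) pair; unseen pairs start at 0.
    values: dict = field(default_factory=lambda: defaultdict(float))

    def update(self, exp: Experience) -> None:
        # Assumed update rule: move the estimate a fraction of the way
        # towards the observed reward (positive = reward, negative = punishment).
        key = (exp.condition, exp.action)
        self.values[key] += self.learning_rate * (exp.reward - self.values[key])

    def best_action(self, condition: str, actions: list) -> str:
        # Greedy choice among the candidate actions for this condition.
        return max(actions, key=lambda a: self.values[(condition, a)])


def share(sender: Agent, receivers: list, exp: Experience) -> None:
    """The sender learns from its own experience, then every receiver
    applies the same update as if it had tried the action itself."""
    sender.update(exp)
    for receiver in receivers:
        receiver.update(exp)


# Usage: robot_a has one experience; robot_b benefits without acting.
robot_a = Agent("a")
robot_b = Agent("b")
share(robot_a, [robot_b], Experience("obstacle_left", "turn_right", 1.0))
share(robot_a, [robot_b], Experience("obstacle_left", "turn_left", -1.0))
choice = robot_b.best_action("obstacle_left", ["turn_left", "turn_right"])
```

In this sketch robot_b ends up preferring the rewarded action for a condition it never encountered itself, which is the sense in which sharing gives each agent auxiliary sources of information.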


Electronic copy

0contents.ps.Z (23K)
1intro.ps.Z (167K)
2robots.ps.Z (310K)
3flock.ps.Z (3.4M)
4rlearn.ps.Z (900K)
5mutual.ps.Z (585K)
6conc.ps.Z (20K)
7refs.ps.Z (27K)