Agent task: Pick up plastic from the shoreline and sort it into colour-coded bins.
-
Spawn position: Random along a base line.
-
Terrain: Very low friction, difficult to drive on. Obstacles include plants, rocks and a walkway. Must climb sand dune before first observing target objects.
-
Rewards: Trash pickup, trash in bin, trash in correct bin.
-
Penalties: Hitting obstacles, rear axle underwater, dropping objects, time.
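
The reward and penalty events listed above could be combined into a per-step reward roughly as follows. This is a minimal sketch, not the scheme actually used: the event names and every numeric magnitude are hypothetical placeholders, since the source lists only the event types.

```python
from typing import Iterable

# Hypothetical magnitudes: the source names these events but not their values.
EVENT_REWARDS = {
    "pickup": 0.2,            # trash picked up
    "in_bin": 0.5,            # trash placed in any bin
    "correct_bin": 0.5,       # bonus on top of in_bin when colours match
    "hit_obstacle": -0.1,     # collided with a plant, rock or the walkway
    "axle_underwater": -0.1,  # rear axle below the waterline
    "dropped": -0.2,          # carried object was dropped
}
TIME_PENALTY = -0.001  # hypothetical value, applied on every step


def step_reward(events: Iterable[str]) -> float:
    """Return the time penalty plus the reward of each event seen this step."""
    return TIME_PENALTY + sum(EVENT_REWARDS[e] for e in events)
```

With placeholder values like these, placing trash in any bin already earns most of the available reward. That is one possible reason an agent might ignore a colour-matching bonus, but the source does not state the cause.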
-
Observations: Velocity, distance to trash, local position, trash & bin colours, obstacles.
-
Behaviours: 1 discrete action branch with 7 actions.
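
A single discrete branch with 7 actions means the policy outputs one integer in the range 0 to 6 per decision. The sketch below shows one way to decode that integer. The action labels are hypothetical, chosen only for illustration; the source does not say what the 7 actions are.

```python
from enum import IntEnum


class Action(IntEnum):
    # Hypothetical labels: the source only states that the branch has 7 actions.
    IDLE = 0
    FORWARD = 1
    REVERSE = 2
    TURN_LEFT = 3
    TURN_RIGHT = 4
    GRAB = 5
    RELEASE = 6


# One discrete branch whose size equals the number of actions.
BRANCH_SIZES = (len(Action),)


def decode(branch_values):
    """Map the policy output (one integer per branch) to an Action."""
    (index,) = branch_values  # exactly one branch is expected
    return Action(index)
```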
-
Training steps: 2 million
-
Result: Keeps the beach clean if the plastic spawn rate is below 1 item per 200 seconds. The agent never took advantage of the bonus reward for matching bin and trash colours.
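
One reading of the spawn-rate threshold is that the trained agent clears roughly one item every 200 seconds, so a faster spawn rate makes uncollected items pile up. That reading is an inference, not something the source states. The sketch below uses a simple average-rate approximation to show the effect; the function name and the default `clear_interval_s` are assumptions.

```python
def backlog_after(duration_s: float, spawn_interval_s: float,
                  clear_interval_s: float = 200.0) -> float:
    """Approximate number of uncollected items after duration_s seconds.

    Assumes items spawn and are cleared at steady average rates
    (one per spawn_interval_s and one per clear_interval_s respectively).
    """
    spawned = duration_s / spawn_interval_s
    cleared = duration_s / clear_interval_s
    return max(0.0, spawned - cleared)
```

Under these assumptions, `backlog_after(3600, 300)` gives 0.0 (the beach stays clean), while `backlog_after(3600, 100)` gives 18.0 items left after an hour.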