ML AGENTS

Trained ML Agents

Loader Driver

 Agent task:  Pick up plastic from the shoreline and sort it into colour-coded bins.


  • Spawn position:  Random along a base line.
  • Terrain: Very low friction, difficult to drive on. Obstacles include plants, rocks and a walkway. Must climb sand dune before first observing target objects.
  • Rewards: Trash pickup, trash in bin, trash in correct bin.
  • Penalties:  Hitting obstacles, rear axle underwater, dropping objects, time.
  • Observations:  Velocity, distance to trash, local position, trash and bin colours, obstacles.
  • Behaviours:  1 discrete action branch with 7 actions
  • Training steps:  2 million
  • Result:  Keeps the beach clean if plastic items spawn at a rate below 1 per 200 seconds. Never took advantage of the bonus reward for matching bin and trash colours.
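
The reward and penalty events listed above could be tallied per step roughly as follows. This is only a sketch: the source names the events but not their magnitudes, so every constant, the `StepEvents` class and the `step_reward` function are hypothetical and not taken from the trained agent.

```python
from dataclasses import dataclass

# Hypothetical magnitudes: the source lists the events, not their values.
PICKUP_REWARD = 0.2
BIN_REWARD = 0.5
CORRECT_BIN_BONUS = 0.5
OBSTACLE_PENALTY = -0.1
AXLE_UNDERWATER_PENALTY = -0.1
DROP_PENALTY = -0.2
TIME_PENALTY = -0.001  # applied on every step


@dataclass
class StepEvents:
    """Flags for what happened during one simulation step (assumed names)."""
    picked_up: bool = False
    binned: bool = False
    bin_colour_matches: bool = False
    hit_obstacle: bool = False
    rear_axle_underwater: bool = False
    dropped: bool = False


def step_reward(events: StepEvents) -> float:
    """Sum the rewards and penalties triggered in a single step."""
    reward = TIME_PENALTY
    if events.picked_up:
        reward += PICKUP_REWARD
    if events.binned:
        reward += BIN_REWARD
        # The colour bonus only applies on top of a successful drop in a bin.
        if events.bin_colour_matches:
            reward += CORRECT_BIN_BONUS
    if events.hit_obstacle:
        reward += OBSTACLE_PENALTY
    if events.rear_axle_underwater:
        reward += AXLE_UNDERWATER_PENALTY
    if events.dropped:
        reward += DROP_PENALTY
    return reward
```

Calling `step_reward(StepEvents(binned=True, bin_colour_matches=True))` returns more than the same call without the colour match, which is the bonus the trained agent never exploited.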



Helicopter Pilot

 Agent task:  Fly to landing pad.


  • Spawn position:  Random orientation from static tower.
  • Terrain: Clear skies, no wind. Obstacles include a couple of tall trees and the launch tower. Target visible from spawn.
  • Rewards: Principal rewards for hovering above the launch and landing pads.
  • Penalties:  Hitting obstacles, excessive yaw, time.
  • Observations:  37. Possibly way too many.
  • Behaviours:  4 discrete action branches.
  • Training steps:  2 million
  • Result:  Crashes only. Far too many intermediate rewards and penalties. Needs a restart and simplification, concentrating first on stable hovering above the launch pad.
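
A simplified restart could replace the many intermediate terms with a single hover reward. The sketch below is one possible shape for it, not the approach actually used: the `hover_reward` function, the y-up coordinate convention, and the `target_height` and `tolerance` values are all assumptions made for illustration.

```python
import math


def hover_reward(position, pad_position, target_height=5.0, tolerance=3.0):
    """Return a reward in [0, 1] that peaks at the hover point above the pad.

    position and pad_position are (x, y, z) tuples with y pointing up
    (an assumed convention). target_height and tolerance are hypothetical.
    """
    hover_point = (
        pad_position[0],
        pad_position[1] + target_height,
        pad_position[2],
    )
    distance = math.dist(position, hover_point)
    # Falls off linearly with distance and is zero beyond the tolerance radius.
    return max(0.0, 1.0 - distance / tolerance)
```

A single dense term like this gives the agent one signal to follow while learning to hover; a small time penalty or a crash penalty could be added back once hovering is stable.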

