ML AGENTS

Trained ML Agents

Loader Driver

Agent task: Pick up plastic from the shoreline and sort it into colour-coded bins.

Spawn position: Random along a base line.
Terrain: Very low friction, difficult to drive on. Obstacles include plants, rocks and a walkway. Must climb sand dune before first observing target objects.
Rewards: Trash pickup, trash in bin, trash in correct bin.
Penalties: Hitting obstacles, rear axle underwater, dropping objects, time.
Observations: Velocity, distance to trash, local position, trash & bin colours, obstacles.
Behaviours: 1 discrete action branch with 7 actions
Training steps: 2 Million
Result: Keeps the beach clean provided plastic items spawn at a rate below 1 per 200 seconds. The agent never took advantage of the bonus reward for matching bin and trash colours.
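
The rewards and penalties listed above could be combined into a per-step signal roughly as follows. This is a minimal sketch, not the code used in training: the `StepEvents` structure, the function name and every magnitude are hypothetical placeholders chosen only to show the shape of the signal.

```python
from dataclasses import dataclass

# Hypothetical magnitudes; the values used in training are not given in the text.
PICKUP_REWARD = 0.2
BIN_REWARD = 0.5
CORRECT_BIN_BONUS = 0.5
OBSTACLE_PENALTY = -0.1
AXLE_UNDERWATER_PENALTY = -0.05
DROP_PENALTY = -0.2
TIME_PENALTY = -0.001


@dataclass
class StepEvents:
    """Events detected during one decision step (hypothetical structure)."""
    picked_up_trash: bool = False
    trash_in_bin: bool = False
    bin_colour_matches: bool = False
    hit_obstacle: bool = False
    rear_axle_underwater: bool = False
    dropped_object: bool = False


def step_reward(events: StepEvents) -> float:
    """Sum the listed rewards and penalties for a single step."""
    reward = TIME_PENALTY  # small cost applied on every step
    if events.picked_up_trash:
        reward += PICKUP_REWARD
    if events.trash_in_bin:
        reward += BIN_REWARD
        # The colour bonus only applies when trash actually lands in a bin.
        if events.bin_colour_matches:
            reward += CORRECT_BIN_BONUS
    if events.hit_obstacle:
        reward += OBSTACLE_PENALTY
    if events.rear_axle_underwater:
        reward += AXLE_UNDERWATER_PENALTY
    if events.dropped_object:
        reward += DROP_PENALTY
    return reward
```

With a structure like this, the colour-matching bonus is reachable only after the agent has already learned to pick up and bin an item, so it sits at the end of the longest chain of behaviour.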

Agent task: Fly to landing pad.

Spawn position: Random orientation from static tower.
Terrain: Clear skies, no wind. Obstacles include a couple of tall trees and launch tower. Target visible from spawn.
Rewards: Hovering above the launch and landing pads as principal rewards.
Penalties: Hitting obstacles, excessive yaw, time.
Observations: 37. Possibly way too many.
Behaviours: 4 discrete action branches.
Training steps: 2 Million
Result: Crashes only. Far too many intermediate rewards/penalties. Needs a restart and simplification, concentrating first on stable hovering above the launch pad.
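
One possible shape for that simplified first stage is a single dense hover reward in place of the many intermediate terms. The sketch below is an assumption, not the planned implementation: `hover_reward`, its parameters, the y-up coordinate convention and the default target height are all hypothetical.

```python
import math


def hover_reward(offset, speed, target_height=2.0,
                 position_scale=1.0, speed_scale=1.0):
    """Return a reward in (0, 1] that peaks when hovering still above the pad.

    offset: (dx, dy, dz) of the agent relative to the pad centre, y pointing up.
    speed:  magnitude of the agent's velocity.
    All parameters and defaults are hypothetical.
    """
    dx, dy, dz = offset
    horizontal_error = math.hypot(dx, dz)    # distance from the pad's vertical axis
    height_error = abs(dy - target_height)   # distance from the desired hover height
    position_term = math.exp(-(horizontal_error + height_error) / position_scale)
    stillness_term = math.exp(-speed / speed_scale)  # favours low velocity
    return position_term * stillness_term
```

For example, `hover_reward((0.0, 2.0, 0.0), 0.0)` gives the maximum of 1.0, and the value decays smoothly as the agent drifts sideways, changes height or picks up speed.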

Cerdanya Valley, Spain
