L3 · PAI-110

Full Autonomy in Python

Combine perception and control into one robust control(obs) that reliably drives the differential-drive rover to the goal pad within tolerance on real MuJoCo physics, fixing a partial starter controller.

01
Challenge

Try this first — before any explanation.

A partial controller is pre-loaded: it runs, but it drives straight at full throttle and never reads the bearing. The pad is off to the side at (1.2, 1.2), so the rover sails past it and the run times out short of goal. That failure is the whole lesson: a real controller has to fold what it senses (goal_bearing, goal_dist) back into what it commands (left, right) every single tick. The question the miss forces: what is the one closed loop that turns toward the pad AND closes the distance, in one control(obs)?

The Bench

Fix the partial controller: fold goal_bearing and goal_dist into one control(obs) that steers and drives the rover onto the green pad — real MuJoCo physics.

never turns toward the off-axis pad.\n fwd = 0.7\n left = fwd\n right = fwd\n return (left, right)\n","task":{"goal":[1.2,1.2],"time":12,"tol":0.2},"title":"Full Autonomy in Python"}">
MUJOCO · REAL PHYSICS · IN-BROWSER

Full Autonomy in Python

Fix the partial controller: fold goal_bearing and goal_dist into one control(obs) that steers and drives the rover onto the green pad — real MuJoCo physics.

02
Model

The idea, built visually.

The rover already runs your control(obs) fifty times a second on real physics — but right now it only knows one move: drive straight. The pad is off to the side, so straight is exactly wrong; it glides past and the clock runs out.

Every autonomous robot closes one loop: sense, then act, on the same beat. Here the senses are two numbers in obs — goal_bearing, how far off your nose is from the pad, and goal_dist, how far is left. Acting is one pair, (left, right). Steer by feeding bearing into the difference between the wheels: positive bearing means the pad is to your left, so spin the right wheel faster and you rotate toward it. Drive by feeding distance into the common forward push, easing off as you arrive so you settle on the pad instead of skating over it. Turn and drive aren't two behaviors fighting — they're one command, (fwd - turn, fwd + turn), recomputed every tick. That single line is the whole of autonomy.

▣ Stage animation: The straight-driving rover sails past the off-axis pad and a timeout bar fills; rewind. obs lights up as a small dict with goal_bearing and goal_dist glowing. A bearing arrow bends into a 'turn' term that pushes the two wheel bars apart; a distance arrow bends into a shared 'fwd' term that lifts both bars together. The two combine into (fwd - turn, fwd + turn); the wheels differential-steer the rover onto a smooth arc that curves straight onto the green pad, easing to a stop inside the tolerance ring.

03
Guided practice

Build it up, step by step.

Step A (read the contract): obs gives you obs['goal_bearing'] (rad, + = pad is LEFT) and obs['goal_dist'] (m). control(obs) must return (left, right), each in [-1.0, 1.0]. Step B (steer): turn = clip(K * obs['goal_bearing'], -1, 1) with K around 1.5 — proportional, so it turns hard when badly aimed and eases as it lines up. Step C (drive + settle): pick a forward push fwd (about 0.5), and slow it (about 0.2) once obs['goal_dist'] is small so you settle on the pad; optionally set fwd = 0 while |bearing| is large so you rotate before committing. Step D (combine): return (fwd - turn, fwd + turn) — one line, every tick — and watch the rover curve onto the pad inside the time budget.

04
Feedback

How the Bench grades your run.

PASS WHEN control(obs) folds goal_bearing into a turn term and goal_dist into a forward term, returns (left, right) in range each tick, and the rover reaches the goal pad within 0.2 m inside the 12.0 s budget.

  • FAIL: the rover drove past and timed out short of the pad — control(obs) ignores obs['goal_bearing']. Add a turn term (e.g. turn = 1.5 * obs['goal_bearing']) and return (fwd - turn, fwd + turn).
  • FAIL: the rover spins in place and never closes the distance — fwd is 0 (or turn saturates both wheels the same way). Keep a positive forward push so it translates while it steers.
  • FAIL: the rover steered AWAY from the pad — your turn sign is flipped. Positive goal_bearing means the pad is to the LEFT, so the RIGHT wheel must go faster: (fwd - turn, fwd + turn).
  • FAIL: the rover reached the pad but skated over it without settling — ease the forward push (e.g. fwd = 0.2 once obs['goal_dist'] is small) so it stops inside the tolerance ring.
05
Retrieve & space

Bring back what you've already mastered.

  • From 2.1: obs['goal_bearing'] and obs['goal_dist'] are the perception outputs you filtered — name which is steering input and which is throttle input. (bearing -> steer, dist -> drive.)
  • From 2.2: the turn term turn = K * goal_bearing is which controller term, and what does raising K do? (Proportional / P; sharper turn, risks overshoot.)
  • Sign check: goal_bearing = +0.4 means the pad is to your ____, so the ____ wheel must spin faster. (LEFT; RIGHT.)
06
Mastery gate

What you must demonstrate to advance.

One control(obs) reaches the goal pad within 0.2 m inside 12.0 s on the seeded course and held-out goals, by combining a proportional turn on goal_bearing with a distance-eased forward push, returning a single in-range (left, right) per tick (L3: fuse perception and control into one robust closed loop on real physics).

07
Project

How this feeds your build.

IS the first half of the capstone — a complete autonomous rover whose control(obs) closes the perceive->act loop on real MuJoCo physics, and the direct input to 5.2, which profiles and optimizes this exact control loop toward the metal.