Learning to Drive Smoothly in Minutes

Learning to drive smoothly in minutes, using a reinforcement learning algorithm -- Soft Actor-Critic (SAC) -- and a Variational AutoEncoder (VAE) in the Donkey Car simulator.

Blog post on Medium: link

Note: the pretrained agents must be saved in the logs/sac/ folder. To use a pretrained agent, pass --exp-id 6, where 6 is the index of the folder.

Quick Start

  1. Download the simulator here or build it from source
  2. Install the dependencies (see requirements.txt)
  3. (Optional but recommended) Download a pre-trained VAE: VAE Level 0, VAE Level 1
  4. Train a control policy for 5000 steps using Soft Actor-Critic (SAC)
python --algo sac -vae path-to-vae.pkl -n 5000
  5. Enjoy the trained agent for 2000 steps
python --algo sac -vae path-to-vae.pkl --exp-id 0 -n 2000

To train on a different level, you need to change LEVEL = 0 to LEVEL = 1 in

Train the Variational AutoEncoder (VAE)

  1. Collect images using the teleoperation mode:
python -m teleop.teleop_client --record-folder path-to-record/folder/
  2. Train a VAE:
python -m vae.train --n-epochs 50 --verbose 0 --z-size 64 -f path-to-record/folder/
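For readers unfamiliar with what --z-size controls, here is a minimal, standard-library-only sketch of the two VAE ingredients that depend on the latent size: sampling a latent vector with the reparameterization trick, and the KL term of the loss for a diagonal Gaussian. It is an illustration under stated assumptions, not the code in vae.train; the function names sample_latent and kl_divergence are hypothetical, and the zero-valued encoder outputs are placeholders.

```python
import math
import random

Z_SIZE = 64  # matches --z-size 64 in the command above


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Placeholder encoder outputs (hypothetical): a real encoder would
# compute these from a camera image.
rng = random.Random(0)
mu = [0.0] * Z_SIZE
log_var = [0.0] * Z_SIZE

z = sample_latent(mu, log_var, rng)   # latent vector of length Z_SIZE
kl = kl_divergence(mu, log_var)       # zero when the posterior equals the prior
```

The latent vector z is what a control policy would consume in place of raw pixels, which is why the VAE is trained before the SAC agent.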

Train in Teleoperation Mode

python --algo sac -vae logs/vae.pkl -n 5000 --teleop

Test in Teleoperation Mode

python -m teleop.teleop_client --algo sac -vae logs/vae.pkl --exp-id 0

Explore Latent Space

python -m vae.enjoy_latent -vae logs/level-0/vae-8.pkl

Reproducing Results

To reproduce the results shown in the video, you have to check different values in

Level 0

MAX_STEERING_DIFF = 0.15 # 0.1 for very smooth control, but it requires more steps
MAX_THROTTLE = 0.6 # MAX_THROTTLE = 0.5 is fine, but we can go faster
MAX_CTE_ERROR = 2.0 # only used in normal mode, set it to 10.0 when using teleoperation mode
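To make the role of these constants concrete, the sketch below shows one plausible way they could be applied: the steering command is only allowed to change by MAX_STEERING_DIFF per step, and an episode is considered failed once the absolute cross-track error (CTE) exceeds MAX_CTE_ERROR. The helper names limit_steering and is_off_track, and the assumed steering range of [-1, 1], are illustrative assumptions and may not match how the repository implements these limits.

```python
MAX_STEERING_DIFF = 0.15  # value from the Level 0 settings
MAX_CTE_ERROR = 2.0       # value from the Level 0 settings (normal mode)
MAX_STEERING = 1.0        # assumption: steering is normalized to [-1, 1]


def limit_steering(prev_steering, requested, max_diff=MAX_STEERING_DIFF):
    """Clamp the per-step change in steering to +/- max_diff (hypothetical helper)."""
    diff = max(-max_diff, min(max_diff, requested - prev_steering))
    new_steering = prev_steering + diff
    # Keep the result inside the assumed steering range.
    return max(-MAX_STEERING, min(MAX_STEERING, new_steering))


def is_off_track(cte, max_cte_error=MAX_CTE_ERROR):
    """Return True when the cross-track error exceeds the allowed limit."""
    return abs(cte) > max_cte_error


# The policy asks for full right lock, but the command moves gradually.
steering = 0.0
history = []
for _ in range(3):
    steering = limit_steering(steering, 1.0)
    history.append(steering)
```

A smaller max_diff gives smoother trajectories but slows how quickly the agent can correct course, which is consistent with the note that 0.1 needs more training steps.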

Train in normal mode (smooth control). This takes ~5-10 minutes:

python --algo sac -n 8000 -vae logs/vae-level-0-dim-32.pkl

Train in normal mode (very smooth control with MAX_STEERING_DIFF = 0.1). This takes ~20 minutes:

python --algo sac -n 20000 -vae logs/vae-level-0-dim-32.pkl

Train in teleoperation mode (MAX_CTE_ERROR = 10.0). This takes ~5-10 minutes:

python --algo sac -n 8000 -vae logs/vae-level-0-dim-32.pkl --teleop

Level 1

Note: only teleoperation mode is available for level 1

MAX_THROTTLE = 0.5 # MAX_THROTTLE = 0.6 can work but it's harder to train due to the sharpest turn

Train in teleoperation mode. This takes ~10 minutes:

python --algo sac -n 15000 -vae logs/vae-level-1-dim-64.pkl --teleop

Note: the VAE latent size differs between level 0 (32) and level 1 (64), but this is not an important factor.


Related Paper: "Learning to Drive in a Day".