
Learning to Drive Smoothly in Minutes

Learning to drive smoothly in minutes, using a reinforcement learning algorithm -- Soft Actor-Critic (SAC) -- and a Variational AutoEncoder (VAE) in the Donkey Car simulator.

Blog post on Medium: link

Level-0                   | Level-1
result                    | result
Download VAE              | Download VAE
Download pretrained agent | Download pretrained agent

Note: the pretrained agents must be saved in the logs/sac/ folder. To use a pretrained agent, pass --exp-id 6 (the index of the folder).

Quick Start

  1. Download the simulator here or build it from source
  2. Install the dependencies (cf. requirements.txt)
  3. (Optional but recommended) Download a pre-trained VAE: VAE Level 0 or VAE Level 1
  4. Train a control policy for 5000 steps using Soft Actor-Critic (SAC):
python train.py --algo sac -vae path-to-vae.pkl -n 5000
  5. Enjoy the trained agent for 2000 steps:
python enjoy.py --algo sac -vae path-to-vae.pkl --exp-id 0 -n 2000

To train on a different level, change LEVEL = 0 to LEVEL = 1 in config.py.

Train the Variational AutoEncoder (VAE)

  1. Collect images using the teleoperation mode:
python -m teleop.teleop_client --record-folder path-to-record/folder/
  2. Train a VAE:
python -m vae.train --n-epochs 50 --verbose 0 --z-size 64 -f path-to-record/folder/
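The --z-size option presumably sets the dimension of the latent vector that the VAE produces for each image. As a rough illustration of what that latent vector is, the sketch below samples one with the standard reparameterization trick and computes the usual KL regularization term. It is not the code in vae.train: the function names are hypothetical, and the zero-valued mean and log-variance are stand-ins for a real encoder's output.

```python
import numpy as np

Z_SIZE = 64  # matches --z-size 64 in the command above


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over the latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)


rng = np.random.default_rng(0)
mu = np.zeros(Z_SIZE)       # stand-in for the encoder's predicted mean
log_var = np.zeros(Z_SIZE)  # stand-in for the encoder's predicted log-variance
z = sample_latent(mu, log_var, rng)
print(z.shape)
```

A larger z-size gives the encoder more room to describe each frame, but also gives the control policy a larger input to learn from.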

Train in Teleoperation Mode

python train.py --algo sac -vae logs/vae.pkl -n 5000 --teleop

Test in Teleoperation Mode

python -m teleop.teleop_client --algo sac -vae logs/vae.pkl --exp-id 0

Explore Latent Space

python -m vae.enjoy_latent -vae logs/level-0/vae-8.pkl

Reproducing Results

To reproduce the results shown in the video, set the values in config.py as listed for each level.

Level 0

config.py:

MAX_STEERING_DIFF = 0.15 # 0.1 for very smooth control, but it requires more steps
MAX_THROTTLE = 0.6 # MAX_THROTTLE = 0.5 is fine, but we can go faster
MAX_CTE_ERROR = 2.0 # only used in normal mode, set it to 10.0 when using teleoperation mode
LEVEL = 0
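To make the effect of MAX_STEERING_DIFF concrete, here is a minimal sketch of a steering limiter. It assumes the value is the maximum absolute change in steering allowed between two consecutive steps, and that steering lies in [-1, 1]. Neither assumption is stated in this README, the project may scale the limit differently, and the function name is hypothetical.

```python
MAX_STEERING_DIFF = 0.15  # value from config.py above
MIN_STEERING, MAX_STEERING = -1.0, 1.0  # assumed steering range (not stated in this README)


def smooth_steering(prev_steering, new_steering, max_diff=MAX_STEERING_DIFF):
    """Limit how far the steering command may move in a single step."""
    # Clamp the requested change to [-max_diff, +max_diff].
    diff = max(-max_diff, min(max_diff, new_steering - prev_steering))
    # Keep the result inside the assumed valid steering range.
    return max(MIN_STEERING, min(MAX_STEERING, prev_steering + diff))


# The policy asks for a hard right turn, but only a small change is applied.
print(smooth_steering(0.0, 1.0))
```

Under this reading, a smaller limit such as 0.1 yields smoother trajectories but restricts how quickly the agent can correct its course, which is consistent with the note that it requires more training steps.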

Train in normal mode (smooth control); this takes ~5-10 minutes:

python train.py --algo sac -n 8000 -vae logs/vae-level-0-dim-32.pkl

Train in normal mode (very smooth control with MAX_STEERING_DIFF = 0.1); this takes ~20 minutes:

python train.py --algo sac -n 20000 -vae logs/vae-level-0-dim-32.pkl

Train in teleoperation mode (MAX_CTE_ERROR = 10.0); this takes ~5-10 minutes:

python train.py --algo sac -n 8000 -vae logs/vae-level-0-dim-32.pkl --teleop
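The role of MAX_CTE_ERROR can be sketched as a simple threshold check. This assumes CTE stands for cross-track error (the car's distance from the centre of the track) and that an episode ends once the threshold is exceeded. The README states neither, and the function name is hypothetical.

```python
MAX_CTE_ERROR = 2.0  # normal mode; the README suggests 10.0 in teleoperation mode


def is_off_track(cte, max_cte_error=MAX_CTE_ERROR):
    """Return True when the car has drifted further from the track centre than allowed."""
    return abs(cte) > max_cte_error


print(is_off_track(2.5))                      # checked against the normal-mode threshold
print(is_off_track(2.5, max_cte_error=10.0))  # checked against the teleoperation threshold
```

Under this assumption, the looser teleoperation threshold would leave it to the human operator, rather than the automatic check, to decide when to intervene.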

Level 1

Note: only teleoperation mode is available for level 1.

config.py:

MAX_STEERING_DIFF = 0.15
MAX_THROTTLE = 0.5 # MAX_THROTTLE = 0.6 can work but it's harder to train due to the sharpest turn
LEVEL = 1

Train in teleoperation mode; this takes ~10 minutes:

python train.py --algo sac -n 15000 -vae logs/vae-level-1-dim-64.pkl --teleop

Note: the VAE latent size differs between level 0 (32 dimensions) and level 1 (64 dimensions), but this is not an important factor.
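The per-level settings above can be collected in one place. The sketch below is a hypothetical helper, not part of the project: it stores the values from the Level 0 and Level 1 sections and builds the matching train.py command, using only flags that appear in this README. The config.py values must still be edited by hand, because the helper only prints the command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelPreset:
    level: int
    max_steering_diff: float
    max_throttle: float
    vae_path: str
    n_steps: int
    teleop: bool


# Values copied from the Level 0 (normal mode) and Level 1 sections above.
PRESETS = {
    0: LevelPreset(0, 0.15, 0.6, "logs/vae-level-0-dim-32.pkl", 8000, False),
    1: LevelPreset(1, 0.15, 0.5, "logs/vae-level-1-dim-64.pkl", 15000, True),
}


def train_command(preset):
    """Build the train.py argument list for a preset."""
    cmd = ["python", "train.py", "--algo", "sac",
           "-n", str(preset.n_steps), "-vae", preset.vae_path]
    if preset.teleop:
        cmd.append("--teleop")
    return cmd


for preset in PRESETS.values():
    print(f"# config.py: LEVEL = {preset.level}, "
          f"MAX_STEERING_DIFF = {preset.max_steering_diff}, "
          f"MAX_THROTTLE = {preset.max_throttle}")
    print(" ".join(train_command(preset)))
```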

Credits

Related Paper: "Learning to Drive in a Day".