To combine the robustness of classical control strategies with the adaptive ability of reinforcement learning, this repository proposes a hierarchical control framework. A neural network sits at the high level (HL): given the observations, it decides which classical control method to adopt. At the strategy level (SL), the strategy selected by the high-level control signal is executed by the corresponding classical controller (a minimal sketch of this wrapper appears after the list below).
Compared with an end-to-end design, this hierarchical design has the following advantages:
- Easy to train. For the high-level network, the action space is reduced to a discrete set, which greatly shrinks the exploration space of the agent.
- Interpretability. The lower layer of the framework is not a neural network but a classical control algorithm, so the agent's behaviour can be analysed.
- Multiple timescales. The lower controller runs at the base frequency (e.g. the simulated world clock), while the upper layer updates the strategy at a slower rate.
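
As a concrete illustration of the hierarchy, the sketch below wraps a low-level environment so that a discrete high-level action selects one of the classical controllers, which then runs for several base-frequency steps per high-level decision. It is only a sketch under assumed interfaces; the names `HierarchicalNavEnv`, `controllers`, and `steps_per_decision` are hypothetical and not taken from this repository.

```python
# Minimal sketch of the hierarchical wrapper idea (names are hypothetical).
import gym
from gym import spaces


class HierarchicalNavEnv(gym.Env):
    """High-level agent picks a classical controller; the chosen controller
    runs at the base simulation frequency for several steps per decision."""

    def __init__(self, base_env, controllers, steps_per_decision=10):
        super().__init__()
        self.base_env = base_env                  # low-level simulated world
        self.controllers = controllers            # e.g. [dwa_step, potential_field_step]
        self.steps_per_decision = steps_per_decision
        # Discrete HL action space: one action per classical controller.
        self.action_space = spaces.Discrete(len(controllers))
        self.observation_space = base_env.observation_space

    def reset(self):
        return self.base_env.reset()

    def step(self, action):
        controller = self.controllers[action]
        total_reward, done, obs, info = 0.0, False, None, {}
        # Lower layer runs at the base frequency; HL updates more slowly.
        for _ in range(self.steps_per_decision):
            low_level_cmd = controller(self.base_env)   # classical control law
            obs, reward, done, info = self.base_env.step(low_level_cmd)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, info
```
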
For the specific task considered here, i.e. robot pursuit-evasion, two classical navigation algorithms are adopted: the dynamic window approach (DWA) and the artificial potential field approach.
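
For reference, the snippet below is a generic, textbook-style sketch of the potential field idea (attraction towards the goal plus repulsion from nearby obstacles). It is not the code used in this repository; the function name, gains, and influence radius are assumptions.

```python
# Generic artificial potential field step (hypothetical helper, not the
# repository's implementation).
import numpy as np


def potential_field_step(robot_pos, goal_pos, obstacles,
                         k_att=1.0, k_rep=100.0, influence_radius=2.0):
    """Return a desired velocity vector: attraction to the goal plus
    repulsion from obstacles within the influence radius."""
    robot_pos = np.asarray(robot_pos, dtype=float)
    goal_pos = np.asarray(goal_pos, dtype=float)
    force = k_att * (goal_pos - robot_pos)            # attractive term
    for obstacle in obstacles:
        diff = robot_pos - np.asarray(obstacle, dtype=float)
        dist = np.linalg.norm(diff)
        if 0.0 < dist < influence_radius:
            # Repulsive gradient: grows as the robot approaches the obstacle,
            # vanishes outside the influence radius.
            force += k_rep * (1.0 / dist - 1.0 / influence_radius) * diff / dist**3
    return force
```
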
The demo is based on gym and stable_baselines. Prerequisites:

```
pip install gym==0.14.0
pip install tensorflow==1.14
pip install stable-baselines
```

Then run:

```
python training_robot.py
```
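
For orientation, a training script of this kind typically looks roughly like the sketch below, using the stable-baselines 2.x API (which matches the tensorflow 1.14 dependency). The actual `training_robot.py` may differ; `RobotPursuitEnv` and the module `robot_env` are hypothetical names used only for illustration.

```python
# Rough outline of a training script for the high-level policy
# (RobotPursuitEnv / robot_env are hypothetical, not from this repo).
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy
from stable_baselines.common.vec_env import DummyVecEnv

from robot_env import RobotPursuitEnv   # hypothetical hierarchical env

env = DummyVecEnv([lambda: RobotPursuitEnv()])   # stable-baselines expects a VecEnv
model = PPO2(MlpPolicy, env, verbose=1)          # PPO over the discrete HL action space
model.learn(total_timesteps=100000)              # train the high-level policy
model.save("hl_policy")
```
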