Humans-on-the-loop deep reinforcement learning with application to multi-UAV missions in next-generation emergency response systems

Interdisciplinary Areas: Data and Engineering Applications, Autonomous and Connected Systems, Smart City, Infrastructure, Transportation, Human-Machine/Computer Interaction, Human Factors, Human-Centered Design

Project Description

Using a swarm of Unmanned Aerial Vehicles (UAVs) to support first responders in emergencies has attracted significant interest due to advances in robotics and Artificial Intelligence (AI). The UAVs of next-generation emergency response systems are expected to be capable of sensing, planning, reasoning, collaborating, and acting to accomplish their tasks. These UAVs will not require humans-in-the-loop to make all key decisions; rather, they will make independent decisions, with humans-on-the-loop setting goals and supervising the multi-UAV missions.
Despite advances in AI and robotics autonomy models, deployment of multi-UAV systems remains challenging due to uncertainties in the outcomes of AI models, rapid changes in environmental conditions, and emerging requirements for how a swarm of autonomous UAVs can best support emergency responders during time-critical missions. One major challenge in designing such a semi-autonomous system is identifying how the UAVs can collaborate to achieve a common goal. Even with a good system design, another critical challenge for optimal deployment is the learning efficiency of reinforcement-learning-based methods. The recruited postdoctoral fellow is expected to spearhead the development of a humans-on-the-loop, meta-model-driven deep reinforcement learning (DRL) solution in which humans maintain oversight while intelligent UAVs are empowered to autonomously make planning and enactment decisions.

Start Date


Postdoctoral Qualifications

Recent PhD graduates in AAE, ECE, OR/IE, and CS with research expertise in reinforcement learning, optimal control, or stochastic dynamic programming. Candidates with research experience in or exposure to multi-UAV control, humanitarian logistics, or human-in-the-loop modeling and simulation are preferred.


Nan Kong, BME
Dengfeng Sun, AAE

Outside Collaborators

Xiaoqian Wang, Purdue ECE
Nicole Adams, Purdue Nursing


1. Xiaoquan Gao, Nan Kong, and Paul M Griffin (2020). “Dynamic Optimization of Drone Dispatch for Substance Overdose Rescue.” In Proceedings of the 2020 Winter Simulation Conference (WSC 2020).
2. Bin Du, Inseok Hwang, and Dengfeng Sun. Distributed State Estimation for Stochastic Linear Hybrid Systems with Finite-Time Fusion, IEEE Transactions on Aerospace and Electronic Systems, Accepted in Mar. 2021.
3. Bin Du, Kun Qian, Christian Claudel, and Dengfeng Sun. Parallelized Active Information Gathering using Multi-Sensor Network for Environment Monitoring, IEEE Transactions on Control Systems Technology, Accepted in Feb. 2021.
4. Bin Du, Jun Chen, Dengfeng Sun, Satyanarayana Manyam, and David Casbeer. UAV trajectory planning with probabilistic geo-fence via iterative chance-constrained optimization, IEEE Transactions on Intelligent Transportation Systems, Accepted in Jan. 2021.
5. Xiaoqian Wang, Yijun Huang, Ji Liu, Heng Huang (2018). New Balanced Active Learning Model and Optimization Algorithm. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI-18).