This project presents a deep reinforcement learning (DRL) framework for autonomous mobile robot navigation in dynamic environments using multi-sensor perception. The system is implemented in a realistic simulation based on ROS2 and Gazebo, using a TurtleBot3 Waffle equipped with LiDAR, an IMU, and wheel encoders. These heterogeneous sensors are fused into a normalized observation space, yielding a compact and informative state representation for robust decision-making. The Soft Actor-Critic (SAC) algorithm is used for continuous control due to its stability and sample efficiency in high-dimensional action spaces. The agent outputs linear and angular velocity commands to reach target goals while avoiding collisions with static and dynamic obstacles.

A curriculum learning strategy progressively increases task difficulty: starting from simple goal-reaching in empty environments, then introducing random goals and static obstacles, and finally highly dynamic scenarios with multiple continuously moving obstacles governed by a velocity-based motion model. A key contribution is the extension to a pursuit–evasion navigation task, in which the goal is not stationary but actively escapes the agent. When the robot approaches, the target moves away, forcing the agent to learn predictive interception and cornering strategies to trap and reach the goal despite its evasive behavior.

The reward function encourages efficient goal reaching, smooth control, and safe navigation. A real-time training dashboard monitors success rate, episode return, and SAC internal losses. Experimental results show consistently high performance across all stages: over 90% success overall, up to 98–99% in simpler settings, and strong robustness in complex dynamic environments.
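The fusion of heterogeneous sensors into a normalized observation space can be sketched roughly as follows. The function name, sector count, and scaling constants here are illustrative assumptions, not the project's actual observation layout:

```python
import numpy as np

def fuse_observations(lidar_ranges, yaw, goal_dist, goal_angle,
                      max_range=3.5, max_goal_dist=10.0, n_bins=24):
    """Fuse LiDAR, IMU heading, and goal geometry into one normalized vector.

    All names and constants are illustrative; `max_range` matches a typical
    TurtleBot3 LiDAR, and the rest are assumed normalization bounds.
    """
    ranges = np.clip(np.asarray(lidar_ranges, dtype=np.float32), 0.0, max_range)
    # Downsample by taking the minimum range per angular sector
    # (conservative: keeps the nearest obstacle in each sector).
    bins = np.array_split(ranges, n_bins)
    lidar_feat = np.array([b.min() for b in bins]) / max_range
    # Heading and goal angle scaled to [-1, 1]; goal distance to [0, 1].
    state = np.concatenate([
        lidar_feat,
        [yaw / np.pi,
         min(goal_dist, max_goal_dist) / max_goal_dist,
         goal_angle / np.pi],
    ])
    return state.astype(np.float32)
```

Keeping every feature in a fixed, bounded range like this is what lets the SAC networks treat LiDAR, IMU, and odometry-derived goal information as a single homogeneous input.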
Overall, the results demonstrate the effectiveness of combining DRL, curriculum learning, and multi-sensor fusion for robust navigation in complex, interactive environments.
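As one concrete illustration of the evasive-goal behavior described above, a velocity-based update rule for the escaping target might look like this sketch. All parameter names and values (trigger radius, escape speed, arena size) are assumptions; the project's actual motion model is not specified here:

```python
import numpy as np

def step_evasive_goal(goal_xy, robot_xy, dt,
                      trigger_radius=1.5, escape_speed=0.4, arena_half=4.0):
    """One velocity-based update of an evasive goal (illustrative sketch).

    When the robot closes within `trigger_radius`, the goal flees directly
    away from it at `escape_speed`. Arena walls clamp its position, which is
    what makes cornering a viable strategy for the learned policy.
    """
    goal = np.asarray(goal_xy, dtype=float)
    robot = np.asarray(robot_xy, dtype=float)
    delta = goal - robot
    dist = np.linalg.norm(delta)
    if 1e-9 < dist < trigger_radius:
        # Move along the robot-to-goal direction, away from the pursuer.
        goal = goal + (delta / dist) * escape_speed * dt
    # Keep the goal inside a square arena; corners become trap points.
    return np.clip(goal, -arena_half, arena_half)
```

Because the target only reacts when the robot is close, a purely greedy pursuit loops forever; the agent must instead learn to predict the escape direction and pin the goal against an arena boundary.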