Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal

Charalambos Theodorou, Vladan Velisavljevic, Vladimir Dyo

Research output: Contribution to journal › Article › peer-review


For a Visual Simultaneous Localization and Mapping (vSLAM) system to operate in dynamic indoor environments, moving objects must be taken into account, because they can destabilize the system’s visual odometry and degrade its position-estimation accuracy. vSLAM can work from feature points or a sequence of images, and the camera is the only input source that can perform localization while simultaneously building a map of the environment. This paper proposes a vSLAM system based on ORB-SLAM3 and YOLOR. In combination with an object detection model (YOLOX) applied to the extracted feature points, the proposed system achieves 2–4% better accuracy than VPS-SLAM and DS-SLAM. Static feature points, such as signs and benches, were used to compute the camera position, while dynamic moving objects were eliminated through the tracking thread. The method was validated and evaluated on a custom dataset containing indoor and outdoor RGB-D images of train stations with dynamic objects and a high density of people, together with ground truth data, sequence data, video recordings of the stations, and X, Y, Z data. The results show that ORB-SLAM3 with YOLOR for object detection achieves 89.54% accuracy in dynamic indoor environments, outperforming previous systems such as VPS-SLAM.
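The core filtering idea described above can be sketched as follows. This is a hypothetical illustration, not the authors' code: it assumes a detector (such as YOLOR) returns labeled bounding boxes, and it discards any extracted feature point that falls inside a box belonging to an assumed set of dynamic classes.

```python
# Hypothetical sketch of dynamic feature-point removal (not the paper's
# implementation). Assumptions: detections arrive as (label, box) pairs
# and DYNAMIC_CLASSES is an illustrative label set.

DYNAMIC_CLASSES = {"person", "bicycle", "car"}

def filter_dynamic_points(keypoints, detections):
    """Keep only feature points that lie outside dynamic-object boxes.

    keypoints  -- list of (x, y) pixel coordinates
    detections -- list of (label, (x1, y1, x2, y2)) bounding boxes
    """
    dynamic_boxes = [box for label, box in detections
                     if label in DYNAMIC_CLASSES]

    def is_static(pt):
        x, y = pt
        return not any(x1 <= x <= x2 and y1 <= y <= y2
                       for x1, y1, x2, y2 in dynamic_boxes)

    return [pt for pt in keypoints if is_static(pt)]

# Example: the point on the detected person is dropped; the point on
# the (static) bench is kept and can be used for pose estimation.
kps = [(50, 60), (200, 220)]
dets = [("person", (180, 200, 260, 300)), ("bench", (40, 40, 120, 120))]
print(filter_dynamic_points(kps, dets))  # [(50, 60)]
```

In a full pipeline, the surviving static points would then feed the tracking thread for camera-pose estimation, as the abstract describes.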
Original language: English
Article number: 7553
Journal: Sensors (Switzerland)
Issue number: 19
Publication status: Published - 5 Oct 2022


  • visual SLAM
  • object detection
  • simultaneous localization and mapping (SLAM)
