Region-Based CNNs for Pedestrian Gender Recognition in Visual Surveillance Environments

Authors: Yaghoubi, Ehsan; Alirezazadeh, Pendar; Assunção, Eduardo; Neves, João C.; Proença, Hugo
Editors: Brömme, Arslan; Busch, Christoph; Dantcheva, Antitza; Rathgeb, Christian; Uhl, Andreas
Dates: 2020-09-15; 2020-09-15; 2019
ISBN: 978-3-88579-690-9
URL: https://dl.gi.de/handle/20.500.12116/34226

Abstract: Inferring soft biometric labels in totally uncontrolled outdoor environments, such as surveillance scenarios, remains a challenge due to the low resolution of the data and to covariates that might seriously compromise performance (e.g., occlusions and the subjects' pose). On this kind of data, even state-of-the-art deep-learning frameworks (such as ResNet) working in a holistic way attain relatively poor performance, which was the main motivation for the work described in this paper. In particular, having noticed the main effect of the subjects' "pose" factor, we describe a method that uses the body keypoints to estimate the subject's pose and to define a set of regions of interest (e.g., head, torso, and legs). This information is used to learn appropriate classification models, specialized in different poses and body parts, which contributes to solid improvements in performance. This conclusion is supported by the experiments we conducted in multiple real-world outdoor scenarios, using data acquired from advertising panels placed in crowded urban environments.

Language: en
Keywords: Pedestrian attribute recognition; skeleton detection; pose estimation; segmentation
Type: Text/Conference Paper
ISSN: 1617-5468
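
The region definition mentioned in the abstract (body keypoints grouped into head, torso, and legs) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not say which keypoint detector or layout was used, so the COCO-style 17-keypoint ordering, the function name `region_boxes`, and the `min_conf` and `margin` parameters are all hypothetical choices made here.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float, float]       # (x, y, confidence)
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

# Assumption: COCO-style ordering of 17 keypoints
# (0 nose, 1-2 eyes, 3-4 ears, 5-6 shoulders, 7-8 elbows,
#  9-10 wrists, 11-12 hips, 13-14 knees, 15-16 ankles).
REGION_KEYPOINTS: Dict[str, List[int]] = {
    "head": [0, 1, 2, 3, 4],
    "torso": [5, 6, 11, 12],
    "legs": [11, 12, 13, 14, 15, 16],
}


def region_boxes(
    keypoints: List[Point],
    min_conf: float = 0.3,   # hypothetical confidence threshold
    margin: float = 0.1,     # hypothetical relative padding around each box
) -> Dict[str, Optional[Box]]:
    """Return one bounding box per body region, or None if the region
    has fewer than two confidently detected keypoints (e.g. occlusion)."""
    boxes: Dict[str, Optional[Box]] = {}
    for name, indices in REGION_KEYPOINTS.items():
        visible = [keypoints[i] for i in indices if keypoints[i][2] >= min_conf]
        if len(visible) < 2:
            boxes[name] = None
            continue
        xs = [p[0] for p in visible]
        ys = [p[1] for p in visible]
        x0, x1 = min(xs), max(xs)
        y0, y1 = min(ys), max(ys)
        pad_x = margin * (x1 - x0)
        pad_y = margin * (y1 - y0)
        boxes[name] = (x0 - pad_x, y0 - pad_y, x1 + pad_x, y1 + pad_y)
    return boxes
```

Each non-empty box could then be used to crop the pedestrian image before passing the crop to a classifier specialized for that region; regions returned as `None` would simply be skipped.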