Abstract: To address the missed and false detections that existing pose estimation algorithms produce in dense crowds, an improved YOLOv8sPose algorithm for dense crowd pose estimation, named YOLOv8Pose-Dense Crowd (YOLOv8Pose-DC), is proposed. First, a centralized intrinsic adjustment feature pyramid network is designed that combines a deformable attention mechanism and CASPPF in parallel. It focuses on and adjusts the pyramid network globally in a top-down manner, increasing the spatial weight of global representations within the network, so that the improved algorithm obtains comprehensive and discriminative feature representations. Second, a multi-scale dual detection head structure is proposed, which reduces computational complexity while improving detection efficiency. In addition, the DySample module is used to improve the model's upsampling efficiency. Finally, a context-aware module is added to strengthen the model's ability to associate global information and to suppress irrelevant background features, thereby highlighting human characteristics. Experimental results show that, compared with the baseline model, YOLOv8Pose-DC increases mAP@0.5 by 3.1% and recall by 4.2%. The proposed algorithm significantly improves performance and meets the needs of production use.
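As a reading aid for the reported metrics, the following minimal sketch shows how missed detections (false negatives) and false detections (false positives) enter the standard definitions of recall and precision. The function name `precision_recall` and all counts are hypothetical illustrations, not values or code from this work.

```python
def precision_recall(tp, fp, fn):
    """Standard detection metrics from raw counts.

    tp: correctly detected people (true positives)
    fp: false detections (false positives)
    fn: missed detections (false negatives)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for a crowd of 100 people (illustration only).
baseline = precision_recall(tp=80, fp=20, fn=20)
# Recovering 10 previously missed people raises recall;
# precision also rises slightly because tp grows while fp is unchanged.
fewer_misses = precision_recall(tp=90, fp=20, fn=10)

print("baseline     (precision, recall):", baseline)
print("fewer misses (precision, recall):", fewer_misses)
```

Under these definitions, reducing missed detections raises recall directly, while reducing false detections raises precision.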