RETINANET-BASED DIRECTED TARGET DETECTION FOR RECYCLABLE WASTE
-
Abstract: Sorting recyclable waste is a core task at waste treatment plants, but manual sorting is inefficient, costly, and carried out in a harsh working environment. Vision-based automatic detection of recyclable waste is therefore important for automating the sorting process. Traditional horizontal-box detection algorithms tend to lose the orientation of the target, produce heavily overlapping bounding boxes, and cannot recover the target's true length and width, all of which hinder subsequent sorting. To address these shortcomings, a RetinaNet-based directed (oriented) target detection algorithm is proposed. The algorithm improves RetinaNet in four ways: an angle prediction module is added to the detection head; the PSC angle encoder is used to alleviate the boundary problem in angle regression; the Balanced L1 loss is introduced to balance the gradient contributions of easy and hard samples; and the backbone is replaced with Swin Transformer to strengthen feature extraction. With angle prediction, the network localizes waste more accurately. The improved network reaches a mean average precision (mAP) of 78.4%, 12 percentage points higher than the original algorithm, and PSC outperforms the other angle encoders it was compared against.
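The boundary problem mentioned in the abstract arises because a rotated box at an angle just below 90° and one just above -90° are almost the same box, yet their regression targets are far apart. The following is a minimal sketch of phase-shifting encoding and decoding as it is commonly described, not the authors' implementation. The function names `psc_encode` and `psc_decode`, the three-step setting, and the angle range of [-π/2, π/2) are assumptions made for illustration.

```python
import numpy as np


def psc_encode(theta, n_steps=3):
    """Encode angles (radians, assumed in [-pi/2, pi/2)) as n_steps cosines.

    The angle is doubled so that the pi-periodicity of a rectangle maps
    onto one full 2*pi period of the phase.
    """
    phases = 2.0 * np.pi * np.arange(n_steps) / n_steps
    phi = 2.0 * np.asarray(theta, dtype=float)
    return np.cos(phi[..., None] + phases)


def psc_decode(codes, n_steps=3):
    """Recover the angle from its phase-shifted codes (needs n_steps >= 3)."""
    phases = 2.0 * np.pi * np.arange(n_steps) / n_steps
    codes = np.asarray(codes, dtype=float)
    # sum(x_n * sin(a_n)) = -(N/2) * sin(phi), sum(x_n * cos(a_n)) = (N/2) * cos(phi)
    s = np.sum(codes * np.sin(phases), axis=-1)
    c = np.sum(codes * np.cos(phases), axis=-1)
    phi = np.arctan2(-s, c)
    return phi / 2.0


if __name__ == "__main__":
    angles = np.linspace(-np.pi / 2 + 1e-3, np.pi / 2 - 1e-3, 7)
    recovered = psc_decode(psc_encode(angles))
    print(np.max(np.abs(recovered - angles)))
```

In this sketch the codes for angles on either side of the ±90° boundary are nearly identical, so a regression loss applied to the codes does not see a discontinuity there.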
-
Key words:
- directed target detection
- deep learning
- waste detection
- angle encoder
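To illustrate how the Balanced L1 loss named in the abstract balances easy and hard samples, here is a minimal NumPy sketch of the formulation published with Libra R-CNN (Pang et al.). The hyperparameter values `alpha=0.5`, `gamma=1.5`, and `beta=1.0` are the defaults commonly used with that loss and are assumptions here; the abstract does not state the settings used in this work.

```python
import numpy as np


def balanced_l1_loss(pred, target, alpha=0.5, gamma=1.5, beta=1.0):
    """Element-wise Balanced L1 loss (hyperparameters are assumed defaults).

    Errors below `beta` (easy samples) get a log-shaped loss whose gradient
    is larger than that of smooth L1; errors above `beta` (hard samples)
    get a linear loss with gradient capped at `gamma`.
    """
    diff = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    # b is chosen so that the gradient is continuous at diff == beta
    b = np.exp(gamma / alpha) - 1.0
    inlier = (alpha / b) * (b * diff + 1.0) * np.log(b * diff / beta + 1.0) - alpha * diff
    outlier = gamma * diff + gamma / b - alpha * beta
    return np.where(diff < beta, inlier, outlier)


if __name__ == "__main__":
    errors = np.array([0.0, 0.1, 0.5, 1.0, 2.0, 4.0])
    print(balanced_l1_loss(errors, np.zeros_like(errors)))
```

For large errors the loss in this sketch grows with a constant slope of `gamma`, so outliers cannot dominate the regression gradient, while small errors still contribute a meaningful gradient.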