
Video Abnormal Behavior Detection with Dual-Scale Serial Network
Cite this article: WU Degang, ZHAO Liping, CHEN Qianhui, ZHANG Yubo. Video Abnormal Behavior Detection with Dual-Scale Serial Network[J]. Guangxi Sciences, 2023, 30(3): 575-586.
Authors: WU Degang  ZHAO Liping  CHEN Qianhui  ZHANG Yubo
Institution: College of Mechanical Engineering, Shangqiu Institute of Technology, Shangqiu, Henan 476000, China; College of Information and Electronic Engineering, Shangqiu Institute of Technology, Shangqiu, Henan 476000, China; College of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, Henan 450000, China
Funding: Supported by the Training Program for Young Backbone Teachers in Higher Education Institutions of Henan Province (2018GGJS190) and the 2022 Scientific Research Project of Shangqiu Institute of Technology (2022KYXM02).
Received: 2022/10/31
Revised: 2023/1/9
Abstract: To address the poor performance and high time overhead of traditional video abnormal behavior detection models, a Dual-Scale Serial Network (DSS-Net) is constructed that works along the spatial and temporal dimensions. First, the VGG-16 network is modified with depthwise separable convolution, and the modified feature extractor is used to extract features along the spatial dimension; the resulting reduction in parameter count lowers the model's time overhead. An attention mechanism is then introduced on top of this extractor to strengthen the representation of target features. Finally, a Long Short-Term Memory (LSTM) network extracts the temporal context between the frames of a video along the temporal dimension. Tests were carried out on the mainstream UCSD Ped1 and Ped2 datasets and on the more challenging UCF dataset. On these three datasets, DSS-Net reaches Area Under Curve (AUC) values of the Receiver Operating Characteristic (ROC) curve of 95.30%, 96.80% and 80.60%, and Equal Error Rates (EER) of 10.60%, 12.60% and 18.50%, respectively, while also offering better real-time performance. Compared with the classical One-class Neural Network (ONN) and Aggregation of Ensembles (AOE) models, DSS-Net improves the AUC by 0.42% and 0.94% on the Ped1 and Ped2 datasets, respectively. DSS-Net was also tested for generalization ability and robustness on the UMN, ShanghaiTech and CUHK Avenue datasets, where its results are competitive with current mainstream models.
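The parameter saving behind the modified feature extractor can be illustrated with a quick count. The sketch below compares a standard 3×3 convolution with its depthwise separable counterpart (a 3×3 depthwise convolution followed by a 1×1 pointwise convolution) over the convolution layers of the standard VGG-16 configuration. The abstract does not say which layers DSS-Net actually replaces, so replacing all of them is an assumption made here for illustration; biases are ignored and the function names are hypothetical.

```python
# (in_channels, out_channels) of the 13 convolution layers in standard VGG-16,
# all with 3x3 kernels.
# Assumption: every layer is made depthwise separable; the abstract does not
# specify which layers DSS-Net modifies.
VGG16_CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]


def standard_conv_params(c_in, c_out, k=3):
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k=3):
    """Weights of a depthwise separable convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def total_params(layers, count_fn):
    """Sum the weight count of every layer under the given counting rule."""
    return sum(count_fn(c_in, c_out) for c_in, c_out in layers)


standard_total = total_params(VGG16_CONV_LAYERS, standard_conv_params)
separable_total = total_params(VGG16_CONV_LAYERS, separable_conv_params)

print(f"standard conv weights:  {standard_total:,}")
print(f"separable conv weights: {separable_total:,}")
print(f"reduction factor:       {standard_total / separable_total:.2f}x")
```

Under these assumptions the convolution weights shrink by roughly a factor of nine, which is the kind of saving the abstract refers to when it links fewer parameters to lower time overhead. Actual speed-ups depend on hardware and on the layers the authors chose to replace.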
Keywords: video abnormal behavior detection  spatial dimension  temporal dimension  depthwise separable convolution  attention mechanism
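For readers less familiar with the two metrics quoted in the abstract, the following sketch computes an ROC curve, its AUC and the EER from per-frame anomaly scores. The scores and labels are made-up toy values, not data from the paper, and the function names are illustrative; the paper's own evaluation protocol is not described in this record.

```python
def roc_points(scores, labels):
    """Return ROC points (fpr, tpr), sweeping the threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        area += (f1 - f0) * (t0 + t1) / 2.0
    return area


def equal_error_rate(points):
    """Error rate at the point where FPR equals FNR (= 1 - TPR)."""
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        d0 = f0 - (1.0 - t0)  # FPR - FNR at the segment start
        d1 = f1 - (1.0 - t1)  # FPR - FNR at the segment end
        if d0 <= 0.0 <= d1:
            if d1 == d0:
                return f0
            w = -d0 / (d1 - d0)  # linear interpolation along the segment
            return f0 + w * (f1 - f0)
    raise ValueError("need at least one positive and one negative sample")


# Toy data: label 1 marks an abnormal frame, 0 a normal one (hypothetical values).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
toy_auc = auc(points)
toy_eer = equal_error_rate(points)
print(f"AUC = {toy_auc:.4f}, EER = {toy_eer:.4f}")
```

On this toy data the AUC is 0.8125 and the EER is 0.25. A higher AUC and a lower EER both indicate better separation of abnormal from normal frames, which is how the percentages in the abstract should be read.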