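Event Detection by Fusing Multimodal Objects Using HMM
Cite this article: ZHANG Yu-zhen, DING Si-jie, WANG Jian-yu, DAI Yue-wei, CHEN Qian. Event Detection by Fusing Multimodal Objects Using HMM[J]. Journal of System Simulation, 2012, 24(8): 1638-1642.
Authors: ZHANG Yu-zhen  DING Si-jie  WANG Jian-yu  DAI Yue-wei  CHEN Qian
Affiliations: 1. School of Electric and Optical Engineering, Nanjing University of Science & Technology, Nanjing 210094 / Jiangsu Key Laboratory of Spectral Imaging & Intelligent Sense, Nanjing 210094 / Key Laboratory of Photoelectronic Imaging Technology and System (Ministry of Education), Beijing 100081
2. School of Automation, Nanjing University of Science & Technology, Nanjing 210094
3. Jiangsu University of Science and Technology, Zhenjiang 212003
Funding: 2012 Development Fund of the Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education (2012OEIOF03)
Abstract: Event detection is a major challenge in video semantic analysis. Video carries several semantically rich modalities, and fusing multimodal information helps retrieve the desired events accurately. This paper proposes an HMM-based method for semantic analysis of soccer video that effectively fuses multimodal objects. The audio stream is first extracted from the video and classified with a continuous HMM (CHMM). The audio objects are then fused with the video stream according to their temporal correspondence, and within the corresponding video shots a discrete HMM (DHMM) fuses objects from multiple modalities to detect highlight events such as shots on goal and fouls, as well as general events. For shot-on-goal events, the appearance of the score caption is additionally used to detect goals. The structure of the DHMM, the initial values of its parameters and, in particular, the constraints on its parameters are described in detail. Experiments confirm that the proposed algorithm achieves good efficiency.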

Keywords: multimodal fusion  video semantic analysis  hidden Markov model  event detection
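
The abstract says that audio objects are fused with the video stream according to their temporal correspondence, but gives no procedure. One plausible reading, sketched below, is to give each video shot the audio class that overlaps it longest in time. The function name `label_shots`, the tuple layouts and the audio class names are illustrative assumptions, not details taken from the paper.

```python
def label_shots(shots, audio_segments):
    """Assign each video shot the audio label with the largest temporal overlap.

    shots: list of (start_sec, end_sec) tuples, one per video shot.
    audio_segments: list of (start_sec, end_sec, label) tuples, e.g. the
        output of an audio classifier (label names here are hypothetical).
    Returns one label per shot, or None when no audio segment overlaps it.
    """
    labels = []
    for shot_start, shot_end in shots:
        overlap = {}  # label -> total overlapping duration in seconds
        for seg_start, seg_end, label in audio_segments:
            duration = min(shot_end, seg_end) - max(shot_start, seg_start)
            if duration > 0:
                overlap[label] = overlap.get(label, 0.0) + duration
        labels.append(max(overlap, key=overlap.get) if overlap else None)
    return labels


# Toy data: three shots and three classified audio segments.
shots = [(0.0, 4.0), (4.0, 9.0), (9.0, 12.0)]
audio = [(0.0, 3.0, "commentary"), (3.0, 8.0, "cheering"), (8.0, 12.0, "whistle")]
print(label_shots(shots, audio))  # ['commentary', 'cheering', 'whistle']
```

Majority overlap is only one option; a real system might instead keep every overlapping audio class as a separate observation for the later fusion step.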

Event Detection by Fusing Multimodal Objects Using HMM
ZHANG Yu-zhen, DING Si-jie, WANG Jian-yu, DAI Yue-wei, CHEN Qian. Event Detection by Fusing Multimodal Objects Using HMM[J]. Journal of System Simulation, 2012, 24(8): 1638-1642.
Authors:ZHANG Yu-zhen  DING Si-jie  WANG Jian-yu  DAI Yue-wei  CHEN Qian
Institution: 1. School of Electric and Optical Engineering, Nanjing University of Science & Technology, Nanjing 210094, China; 2. Jiangsu Key Laboratory of Spectral Imaging & Intelligent Sense, Nanjing 210094, China; 3. Key Laboratory of Photoelectronic Imaging Technology and System (Ministry of Education of China), Beijing Institute of Technology, Beijing 100081, China; 4. School of Automation, Nanjing University of Science & Technology, Nanjing 210094, China; 5. Jiangsu University of Science and Technology, Zhenjiang 212003, China
Abstract: Automatic detection of semantic events in sports video is a challenging task. Video contains semantic information in several modalities, and fusing this multimodal information helps a computer accurately retrieve the events people need. An efficient sports video event detection method that integrates multimodal objects based on the hidden Markov model (HMM) is proposed. First, the audio stream is extracted from the video and classified using a continuous HMM (CHMM). Then, according to their temporal correspondence, the audio objects are fused with the video stream, and highlight events such as shots on goal and fouls, as well as general events, are detected by fusing multimodal objects within the corresponding video shots using a discrete HMM (DHMM). Among the detected shots on goal, scoring events are identified from the appearance of the score caption. In addition, the structure, initialization and parameter constraints of the DHMM are described in detail. Experiments demonstrate the high efficiency of the proposed method.
Keywords:fusing multimodal objects  semantic analysis of video  hidden Markov model  event detection
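
The DHMM-based detection step can be illustrated with a common pattern for HMM classification: train one discrete HMM per event type, score an observation sequence under each with the forward algorithm, and pick the most likely model. The sketch below assumes that pattern; the abstract does not confirm it is the paper's exact decision rule. The two toy models, their parameter values and the meaning of the observation symbols are invented for illustration.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    obs: non-empty list of symbol indices.
    start[i]: initial probability of state i.
    trans[i][j]: probability of moving from state i to state j.
    emit[i][k]: probability that state i emits symbol k.
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_p += math.log(scale)
    return log_p


def classify(obs, models):
    """Return the name of the model under which obs is most likely."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical two-state models over three symbols; the symbols could stand
# for fused audio/visual objects. All values are assumptions, not from the paper.
models = {
    "shot_on_goal": (
        [0.8, 0.2],
        [[0.6, 0.4], [0.0, 1.0]],  # left-to-right structure
        [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]],
    ),
    "general": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1]],
    ),
}

print(classify([0, 2, 2, 1], models))
print(classify([0, 0, 0, 0], models))
```

The zero entry in the `shot_on_goal` transition matrix is one example of a parameter constraint: a transition fixed at zero at initialization stays at zero under Baum-Welch re-estimation, so the left-to-right structure survives training.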
This article is indexed in CNKI, Wanfang Data and other databases.