Welcome to the Journal of Dalian University of Technology

Article abstract

ZHU Zhiguo (朱志国). A method for Web user session identification based on URL semantic analysis[J].,2011,(3):440-446

基于URL语义分析的Web用户会话识别方法

A method for Web user session identification based on URL semantic analysis

DOI: 10.7511/dllgxb201103022

English keywords: data mining; Web usage mining; data preprocessing; user session identification

Funding: Supported by the National Natural Science Foundation of China (Grant No. 70671016).

Author	Affiliation
ZHU Zhiguo (朱志国)

Abstract views: 1305

Full-text downloads: 1194

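Abstract (translated from Chinese):

Because the classical time-based and referrer-based session identification methods are limited when mining complex Web usage patterns, a new user session identification method based on URL semantic analysis is proposed. With the help of a Web directory service, the method attaches semantic information to each URL record in the Web log and defines several measures for evaluating the semantic similarity between URLs. Static and dynamic (streaming) Web logs are analyzed separately, and two semantic outlier detection methods, SOA_s and SOA_d, are given to segment and identify user sessions in the respective cases. Finally, the proposed method is compared experimentally with existing classical methods; the results show that the precision and recall of session identification are improved.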

English abstract:

Because classical session identification methods based on timeout and referrer heuristics are limited in their ability to discover complex patterns in Web usage mining, a new method for identifying user sessions based on URL semantic analysis is presented. In this method, every URL in the Web log files is assigned certain semantic information with the aid of a Web directory, and several measures are defined to quantify the semantic distance between URLs. For static and dynamic Web logs, two semantic outlier detection methods, SOA_s and SOA_d, are presented respectively to segment user sessions. Finally, comparison experiments between classical session identification methods and the proposed method are conducted, and the results show that the precision and recall of session identification are improved.
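
The abstract does not give the similarity measures or the details of the two outlier detection procedures, so the following Python sketch only illustrates the general idea of cutting a click sequence where a URL is semantically far from the session so far. Everything in it is an assumption made for illustration: the URL-to-category table `CATEGORY_OF` (standing in for a Web directory lookup), the shared-prefix measure `path_similarity`, the function `split_sessions`, and the threshold value. It is not the paper's method.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a Web directory lookup: each URL maps to a
# category path. A real system would query a directory service instead.
CATEGORY_OF: Dict[str, Tuple[str, ...]] = {
    "/sports/football/scores": ("Sports", "Football"),
    "/sports/football/news": ("Sports", "Football"),
    "/sports/tennis/rankings": ("Sports", "Tennis"),
    "/computers/python/tutorial": ("Computers", "Programming", "Python"),
    "/computers/python/docs": ("Computers", "Programming", "Python"),
}


def path_similarity(a: Tuple[str, ...], b: Tuple[str, ...]) -> float:
    """Shared-prefix length divided by the longer path length (0.0 to 1.0).

    This measure is an illustrative assumption, not one from the paper.
    """
    if not a or not b:
        return 0.0
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


def split_sessions(urls: List[str], threshold: float = 0.3) -> List[List[str]]:
    """Start a new session when a URL's mean similarity to the URLs already
    in the current session falls below `threshold` (treated as an outlier)."""
    sessions: List[List[str]] = []
    current: List[str] = []
    for url in urls:
        category = CATEGORY_OF.get(url, ())
        if current:
            mean_sim = sum(
                path_similarity(category, CATEGORY_OF.get(u, ()))
                for u in current
            ) / len(current)
            if mean_sim < threshold:
                sessions.append(current)
                current = []
        current.append(url)
    if current:
        sessions.append(current)
    return sessions


if __name__ == "__main__":
    clicks = [
        "/sports/football/scores",
        "/sports/football/news",
        "/sports/tennis/rankings",
        "/computers/python/tutorial",
        "/computers/python/docs",
    ]
    for session in split_sessions(clicks):
        print(session)
```

In this sketch the three sports pages stay together because they share a top-level category, while the first programming page has zero similarity to them and opens a new session. A timeout-based heuristic would need a time gap to make the same cut.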
