The risk classification of BBS posts is important to the evaluation of the societal risk level within a period. Using posts collected from the Tianya forum as the data source, the authors adopt the societal risk indicators from socio psychology and conduct document-level multiple societal risk classification of BBS posts. To capture the semantics and word order of documents effectively, a shallow neural network, Paragraph Vector, is applied to obtain distributed vector representations of the posts. Based on the document vectors, the authors apply the KNN classification method to identify the societal risk category of each post. The experimental results show that, in document-level societal risk classification, Paragraph Vector trains much faster than Bag-of-Words and improves F-measures by at least 10%. Furthermore, Paragraph Vector also outperforms edit distance and a Lucene-based search method. The present work is the first attempt to combine a document embedding method with socio psychology research results in the area of public opinion.
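The classification step described above (KNN over document vectors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tiny two-dimensional vectors stand in for real Paragraph Vector embeddings, and the function names, the value of `k`, and the use of cosine similarity with majority voting are all assumptions made for the example.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_predict(query, train_vectors, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine_similarity(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy "document vectors"; real embeddings have far more dimensions.
train_vectors = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.8)]
train_labels = ["daily life", "daily life",
                "government management", "government management"]

print(knn_predict((0.95, 0.15), train_vectors, train_labels, k=3))
```

In practice the vectors would come from a trained document embedding model, and `k` would be tuned on held-out posts.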

Ensemble of multiple kNN classifiers for societal risk classification   Total citations: 1, self-citations: 1, citations by others: 0
Societal risk classification is a fundamental and complex issue for societal risk perception. To conduct societal risk classification, Tianya Forum posts are selected as the data source, and four kinds of representations of BBS posts are applied: string representation, term-frequency representation, TF-IDF representation, and distributed representation. Using edit distance or cosine similarity as the distance metric, four k-Nearest Neighbor (kNN) classifiers based on the different representations are developed and compared. Owing to the strength of the neural network model Paragraph Vector in capturing word order and semantics, the kNN classifier based on the distributed representation generated by Paragraph Vector (kNN-PV) is effective for societal risk classification. To improve performance further, kNN-PV is combined with the other three kNN classifiers in a weighted ensemble model, and the optimal weights are assigned to the classifiers by brute-force grid search. The experimental results show that the Macro-F of the ensemble method is significantly higher than that of kNN-PV alone.
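The weighted ensemble with brute-force grid search can be sketched roughly as below. The abstract does not say how the weights are constrained or how votes are combined, so this example assumes weights on a coarse grid that sum to 1, weighted label voting, and Macro-F1 on labelled validation data as the selection criterion. All function names and the toy predictions are hypothetical.

```python
from itertools import product


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def ensemble_predict(predictions, weights):
    """predictions[c][i] is classifier c's label for sample i; weighted vote per sample."""
    result = []
    for i in range(len(predictions[0])):
        tally = {}
        for preds, w in zip(predictions, weights):
            tally[preds[i]] = tally.get(preds[i], 0.0) + w
        # Ties go to the alphabetically first label so the result is deterministic.
        result.append(max(sorted(tally), key=lambda label: tally[label]))
    return result


def grid_search_weights(predictions, y_true, step=0.25):
    """Try every weight tuple on the grid that sums to 1 (an assumed constraint)."""
    n_steps = int(round(1 / step))
    grid = [i * step for i in range(n_steps + 1)]
    best_weights, best_score = None, -1.0
    for weights in product(grid, repeat=len(predictions)):
        if abs(sum(weights) - 1.0) > 1e-9:
            continue
        score = macro_f1(y_true, ensemble_predict(predictions, weights))
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


# Hypothetical validation labels and the outputs of three base classifiers.
y_true = ["a", "b", "a", "b", "a", "b"]
predictions = [
    ["a", "b", "a", "a", "a", "b"],
    ["a", "a", "a", "b", "b", "b"],
    ["b", "b", "a", "b", "a", "a"],
]
best_weights, best_score = grid_search_weights(predictions, y_true)
print(best_weights, best_score)
```

Because a weight tuple such as (1, 0, 0) is on the grid, the selected ensemble is never worse on the validation data than the best single classifier; whether it generalizes depends on how the validation data is chosen.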

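China is at a critical stage of economic transformation and upgrading. The principal contradiction in society has undergone a historic change, and societal risk events occur more frequently than before, endangering social stability. With the public's online search and attention data mapped to potential societal risk events, this paper focuses on how to label risk events automatically and effectively, and how to describe them intuitively and clearly. The paper attempts to define a 5W framework for risk events that describes societal risk in a structured way, covering location (where), time (when), people (who), cause (why), and content (what). Extracting the 5W of a risk event can be cast as several machine learning tasks, including named entity recognition, risk classification, and keyword extraction; effective extraction methods are then explored for each task. Automatic 5W extraction turns the wicked problem of real-world societal risk into a structured problem for analysis, offers a new perspective for studying societal risk, and is of significance for government departments conducting public opinion analysis and risk monitoring.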

Online media have brought tremendous changes to civic life, public opinion, and government administration. Compared with traditional media, online media not only allow individuals to browse news and express their views more freely, but also accelerate the transmission of opinions and expand their influence. As public opinion may arouse societal unrest, it is worth detecting the primary topics and uncovering the evolution trends of public opinion for societal administration. Various algorithms have been developed to deal with the huge volume of unstructured online media data. In this study, the dynamic topic model is employed to explore the evolution of topic content and topic prevalence using the original posts published from 2013 to 2017 on the Tianya Zatan Board of Tianya Club, one of the most popular BBS in China. Based on semantic similarities, topics are grouped into three themes: family life, societal affairs, and government administration. The evolution of topic prevalence and content is affected by emergent incidents. Topics on family life become popular, while the themes "societal affairs" and "government administration", which have larger standard deviations, are more likely to be influenced by emergent hot events. Representing content evolution as a monthly pairwise distance matrix makes change points in topic content easy to find.

Nearly 30 years of miraculous economic growth have brought a large gap between rich and poor, creating strong pressure for social transformation in China today. In the last 10 years especially, China's top leaders have come to treat raising people's income, improving quality of life, and constructing a "harmonious society" as key missions. How to measure a harmonious society is an important topic, as different measures may lead to different development policies. This paper outlines over 10 indices relevant to measuring a harmonious society. Some are global indicators, while others were contributed by domestic researchers and have aroused debate. Most of these indicators require surveys of social attitudes at the micro level, which are time-consuming and prone to data quality problems. As advances in Internet technology make it convenient to record and disseminate fresh community ideas and thoughts, detecting topics or emotions from on-line public opinions is becoming a trend, or at least a supplementary way to overcome these data acquisition problems. This paper discusses one approach to on-line societal risk perception using hot search words and BBS posts. The trial aims to provide an approach to societal risk perception different from those of traditional socio psychology studies. Challenges are also indicated.

China is experiencing a variety of social conflicts as it enters a new era in which the principal contradiction has changed, and monitoring social stability is a heavy workload. Online societal risk perception is acquired by mapping on-line public concerns into societal risk events, including national security, economy & finance, public morals, daily life, social stability, government management, and resources & environment. It thus provides one kind of measurement of the state of society. Stable and harmonious social conditions are a basic guarantee for the healthy development of the stock market, so we ask whether variations in societal risk are related to stock market volatility. We study the relationship in two steps: first between search trends and societal risk perception, and then between societal risk perception and stock volatility. The weekend and holiday effects in the Chinese stock market are taken into consideration. Three different econometric methods are explored to observe the impact of variations in societal risk on the Shanghai Composite Index and the Shenzhen Composite Index. There are three major findings. First, there are causal relations between the Baidu Index and societal risk perception. Second, the perception of finance & economy, social stability, and government management has distinguishable effects on the volatility of both the Shanghai Composite Index and the Shenzhen Composite Index. Third, the weekend and holiday effects of societal risk perception on the stock market are verified. The research demonstrates that capturing societal risk from on-line public concerns is feasible and meaningful.

Topics and trends of the on-line public concerns based on Tianya forum   Total citations: 1, self-citations: 1, citations by others: 0
Many social events spread fast through the Internet and arouse wide community discussion. These on-line public opinions develop into diverse topics over time, and the strength of the topics fluctuates. Catching both the primary topics and their trends in shifting on-line discussions is of theoretical importance for scientific research and of practical importance for societal management, especially in China today. Applying cutting-edge text analytics to unstructured on-line public opinions to support social problem-solving in the big data era is a worthwhile endeavour. This paper applies the dynamic topic model (DTM) to explore the changing topics of new posts collected from the Tianya Zatan Board of Tianya Club, the most influential Chinese BBS in mainland China. By analysing trends in hot and cold terms, we catch the shift in topics of main on-line concern, illustrated with the school bus and environment topics in December 2011. An algorithm is proposed to compute the strength fluctuation of each topic. Through visualized analysis of the main topics in several months of 2012, some patterns of topic fluctuation on the board are summarized.

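A multi-criteria hierarchical classification decision method is proposed in which the aggregation function is quadratic and a training set is available. The method uses the decision maker's classification of the training set (belonging or not belonging to the highest class) to build a nonlinear programming model, which a series of transformations converts into a linear programming model. Solving the linear program yields the preference values of the criterion values of the training alternatives and the corresponding parameters. The preference values of each alternative under each criterion are then obtained by linear or spline interpolation, and for each alternative in the alternative set the difference between its consistency index with the class and its consistency index with the classes below it is computed to decide whether the alternative belongs to the class. Next, the training alternatives that do not belong to the highest class are classified and a new model is built. The process continues until every alternative in the set has been classified. Finally, an example illustrates the effectiveness and feasibility of the method.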

Driven by the challenge of integrating large amounts of experimental data, classification has emerged as one of the major and popular tools in computational biology and bioinformatics research. Machine learning methods, especially kernel methods with Support Vector Machines (SVMs), are very popular and effective tools. From the perspective of the kernel matrix, a technique called Eigen-matrix translation has been introduced for protein data classification. The Eigen-matrix translation strategy has many useful properties that deserve more exploration. This paper investigates the role of Eigen-matrix translation in classification. The authors propose that its importance lies in the dimension reduction of predictor attributes within the data set, which matters when the dimension of the features is huge. Numerical experiments on real biological data sets show that the proposed framework is effective in improving classification accuracy. This can therefore serve as a novel perspective for future research on dimension reduction problems.

1. INTRODUCTION There are lots of multi-criteria classification problems in economic and social life. At present there are many methods for solving multi-criteria classification problems [1~2]; ELECTRE TRI and UTA/UTADIS are the useful and efficient ones among them. In ELECTRE TRI, a criterion is a pseudo-criterion, and an outranking relation is defined for each criterion. The concordance index and non-concordance index are defined according to ascertained weight so that reliability of …

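For a multi-user, multi-mode transportation network, a stochastic user equilibrium model is established using the paired combinatorial Logit model, which accounts for the correlation between pairs of alternatives; in the model, route choice follows the Wardrop principle and mode choice follows the Logit model. For multiple user classes with different values of time, an equivalent mathematical programming problem is constructed for the case where the link impedance functions across modes satisfy the symmetry condition. The equivalence between this mathematical program and the multi-user, multi-mode stochastic user equilibrium conditions based on paired combinatorial Logit is proved, and the conditions for existence and uniqueness of the optimal solution are further established. Finally, a simple numerical example demonstrates the correctness and feasibility of the model.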

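Research on a classification method for the ways of combining simulation systems and expert systems   Total citations: 4, self-citations: 0, citations by others: 4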
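This paper first describes the similarities and differences between simulation systems and expert systems, and then classifies the ways of combining them from the viewpoint of system classification. With this classification, current applications of expert systems in simulation systems, or of simulation systems in expert systems, can be assigned to one category or a combination of several categories; the classification also exposes problems and shortcomings in the existing ways of combining the two. Finally, an example shows how the classification helps people find and solve problems.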

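This paper studies the allocation of indicator weights based on the difference characteristics of grey classes. Grey relational clustering is used to divide the evaluation objects into grey classes according to the natural difference characteristics that reflect the essence of different categories. Since natural difference characteristics are similar within the same grey class and differ between grey classes, a maximum-entropy allocation model for the objective weights of evaluation indicators is built to reflect the difference characteristics of the grey classes. A case study and a comparison with other methods demonstrate the feasibility and effectiveness of the model, which offers a new approach to assigning objective indicator weights in multi-attribute decision making.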

Yan Manfu, Yang Zhimin. Systems Engineering (《系统工程》), 2004, 22(11): 12-14
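This paper studies how to construct a support vector machine when the outputs of the training points are fuzzy numbers. The fuzzy classification problem is first transformed into a chance-constrained programming problem with fuzzy decisions, which is solved using fuzzy simulation and a genetic algorithm based on fuzzy simulation. On this basis, a fuzzy support vector machine (algorithm) is constructed. Finally, a definition is given of the fuzzy support vector set, which reveals the characteristics of the fuzzy support vector machine.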

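A multi-criteria classification method with incompletely certain information based on fuzzy clustering   Total citations: 1, self-citations: 0, citations by others: 1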
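For multi-criteria classification decision problems with a training set and incompletely certain weight information, a classification method based on fuzzy clustering is proposed. Taking the classification of the training set into account, the method combines the incompletely certain criterion weight information to build a fuzzy clustering model. The resulting optimization model is solved by a genetic algorithm to obtain the criterion weights and cluster centers; the membership degree of each alternative in each class is then computed, which gives the classification of the whole alternative set. An example illustrates the effectiveness and feasibility of the method.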

Diffusion theory, which deals with the rules governing the market growth of new products, is an important research method in marketing. This study addresses two subjects: the competition-complementarity effect among products of different categories in the process of diffusion, and the competition-substitution effect among products of different generations. Using Kim's model, the study indicates that the number of potential adopters of a product is influenced both by the adopters of products in different categories and by the relative prices of different generations within the same category. Based on this analysis, a multi-generation products model is proposed, and an empirical study is conducted on telecommunication products. The results indicate that the model fits the limited historical data well, and the parameter estimates help explain the diffusion of telecommunication products. Moreover, the multi-generation products diffusion model is better than other models in terms of forecast accuracy. We conclude that the multi-generation products diffusion model based on competition is effective.

DSVM: a distributed data mining model based on support vector machines   Total citations: 1, self-citations: 1, citations by others: 0
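To meet the requirements of data mining in distributed environments, DSVM, a distributed data mining model based on support vector machines, is proposed. The concept of the feature multi-way tree in DSVM is defined, and a method is described for building the tree by having mobile agents visit the distributed data sets. The paper explains how the feature multi-way tree reflects the overall attribute characteristics of the data sets in a distributed environment. Using this data structure and the properties of support vector machines, an incremental distributed SVM algorithm based on hull vectors is proposed to correct and refine the feature multi-way tree, ultimately achieving global data mining in the distributed environment. Experimental results show that the model effectively addresses the high storage overhead, poor execution efficiency, and low security and privacy of other mining algorithms in distributed environments.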

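BBS forums are characterized by sparse member information, random connections, and fuzzy relationships between members. To address this, two threshold-based methods, one using reply relations and one using co-reply relations, are proposed for building BBS member interaction networks, and the complex-network properties of the networks built by the two methods are analyzed and discussed. To examine members' behavioral characteristics and interaction patterns, models of member similarity and association degree are also discussed. Experimental results show that the threshold-based online networks of BBS members exhibit small-world and scale-free properties.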
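The two threshold-based constructions might look roughly like the sketch below. The abstract does not define the relations precisely, so this example assumes that a reply edge links two members who have replied to each other's posts at least `threshold` times in total, and that a co-reply edge links two members who have replied in at least `threshold` common threads. The function names, input shapes, and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def reply_network(replies, threshold=2):
    """replies: iterable of (replier, original_poster) pairs.

    Returns undirected edges between members with at least `threshold`
    replies between them; self-replies are ignored.
    """
    counts = Counter(frozenset(pair) for pair in replies if pair[0] != pair[1])
    return {tuple(sorted(edge)) for edge, n in counts.items() if n >= threshold}


def co_reply_network(thread_repliers, threshold=2):
    """thread_repliers: dict mapping thread id to the set of members who replied in it.

    Returns undirected edges between members who replied in at least
    `threshold` of the same threads.
    """
    counts = Counter()
    for users in thread_repliers.values():
        for u, v in combinations(sorted(users), 2):
            counts[(u, v)] += 1
    return {edge for edge, n in counts.items() if n >= threshold}


# Hypothetical sample data.
replies = [("u1", "u2"), ("u2", "u1"), ("u3", "u1"), ("u1", "u1")]
threads = {"t1": {"u1", "u2", "u3"}, "t2": {"u1", "u2"}, "t3": {"u2", "u3"}}

print(reply_network(replies, threshold=2))
print(co_reply_network(threads, threshold=2))
```

Raising the threshold prunes weak, possibly accidental interactions, which is one plausible reason to use it when member relationships are fuzzy.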

Lin Suifang, Zhang Haiying, Pan Yongxiang. Journal of System Simulation (《系统仿真学报》), 2005, 17(8): 1959-1961, 1965
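A speech recognition method based on dynamic time warping (DTW) and a learning vector quantization (LVQ) neural network is proposed. The method first time-aligns the speech signal with the DTW algorithm and then classifies the speech with the LVQ neural network. The paper introduces the basic methods of speech recognition with DTW and with LVQ, then presents the system structure and learning algorithm of the hybrid DTW/LVQ model, and finally reports experimental results for the three speech recognition algorithms. Extensive experiments show that the recognition rate of the hybrid model is clearly higher than that of DTW alone or LVQ alone.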
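For readers unfamiliar with the time-alignment step, the classic DTW distance can be sketched as below. This is the textbook dynamic-programming formulation, not the paper's system: it assumes scalar sequences and absolute difference as the local cost, whereas speech front ends typically compare frames of feature vectors. The function name is hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same shape spoken "more slowly" aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The warping lets sequences of different lengths be compared, which is why it is a natural preprocessing step before a fixed-input classifier such as LVQ.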

A fuzzy classifier based on support vector machines   Total citations: 2, self-citations: 0, citations by others: 2
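A fuzzy classifier based on support vector machine learning (FCBSVM) is proposed. The basic idea and structure of FCBSVM are introduced, the influence of the membership function parameters and the penalty parameter C on rule generation and classification performance is analyzed, and a method for determining the parameters is given. To build the classifier, suitable membership functions are first chosen and used to construct a kernel function. Fuzzy partitions are then formed with the training patterns as centers, and one fuzzy IF-THEN classification rule is created for each partition. Finally, support vector machine learning is used to obtain the support vectors and the rule parameters. The classifier combines the strengths of support vector machines and fuzzy set theory and generates the fuzzy partitions and fuzzy classification rules automatically. Its performance was evaluated experimentally on the two-spirals data and typical data sets, which verified its effectiveness.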
