Similar literature
 Found 20 similar documents; search took 15 ms
1.
Societal risk classification is the fundamental issue in online societal risk monitoring. To show both the challenge and the feasibility of societal risk classification of BBS posts, this paper presents an empirical analysis. Following an effectiveness analysis, a Support Vector Machine based on Bag-of-Words (BOW-SVM) is adopted to validate the challenge, and the distributed document embeddings of BBS posts generated by Paragraph Vector are applied to the feasibility study. Using BOW-SVM, cross-validations are conducted on BBS posts labeled by different groups and annotators. The large fluctuation in the cross-validation results indicates differences in individual risk perception, which makes societal risk classification more challenging. Furthermore, based on the distributed document embeddings, the pairwise similarities of more than 300 thousand BBS posts from different societal risk categories are compared. The higher similarities among BBS posts in the same societal risk category reveal that such posts share more features than posts in different categories, which demonstrates the feasibility of societal risk classification of BBS posts and also suggests that the performance of societal risk monitoring can be improved.

2.
Ensemble of multiple kNN classifiers for societal risk classification   Total citations: 1, self-citations: 1, citations by others: 0
Societal risk classification is a fundamental and complex issue in societal risk perception. To conduct societal risk classification, Tianya Forum posts are selected as the data source, and four representations of BBS posts are applied: string representation, term-frequency representation, TF-IDF representation, and distributed representation. Using edit distance or cosine similarity as the distance metric, four k-Nearest Neighbor (kNN) classifiers based on the different representations are developed and compared. Because the neural network model Paragraph Vector captures word order and semantics, kNN based on the distributed representation generated by Paragraph Vector (kNN-PV) proves effective for societal risk classification. Furthermore, to improve classification performance, kNN-PV is combined with the other three kNN classifiers, each with its own weight, into an ensemble model. The optimal weights are assigned to the kNN classifiers by brute-force grid search. The experimental results show that the Macro-F of the ensemble method is significantly higher than that of kNN-PV alone.
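
The weighted combination and brute-force weight search described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the per-classifier score dictionaries, and the evenly spaced weight grid are all hypothetical, and Macro-F is taken here to be the unweighted mean of per-class F1 scores.

```python
from itertools import product


def ensemble_predict(votes, weights):
    """Combine per-classifier class scores using one weight per classifier.

    votes: list (one entry per classifier) of dicts mapping label -> score.
    """
    combined = {}
    for clf_votes, w in zip(votes, weights):
        for label, score in clf_votes.items():
            combined[label] = combined.get(label, 0.0) + w * score
    # Iterating over sorted labels makes tie-breaking deterministic.
    return max(sorted(combined), key=lambda label: combined[label])


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the labels present in y_true."""
    f1s = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)


def grid_search_weights(all_votes, y_true, steps=5):
    """Try every weight combination on an evenly spaced grid in [0, 1].

    all_votes[i] is the list of per-classifier score dicts for sample i.
    The grid resolution (steps) is an assumption made for illustration.
    """
    n_clf = len(all_votes[0])
    grid = [i / (steps - 1) for i in range(steps)]
    best_score, best_w = -1.0, None
    for w in product(grid, repeat=n_clf):
        if sum(w) == 0:
            continue  # all-zero weights carry no information
        preds = [ensemble_predict(v, w) for v in all_votes]
        score = macro_f1(y_true, preds)
        if score > best_score:
            best_score, best_w = score, w
    return best_w, best_score
```

The search cost grows as steps raised to the number of classifiers, which stays manageable for four classifiers but would not scale to large ensembles.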

3.
Nearly 30 years of miraculous economic growth have brought a tremendous gap between rich and poor, creating strong drives for social transformation in China today. China's top leaders have made raising people's incomes, improving quality of life, and constructing a "harmonious society" key missions, especially in the last 10 years. How to measure a harmonious society is an important topic, as different measures may lead to different development policies. This paper outlines over 10 indices relevant to measuring a harmonious society. Some are global indicators, while others have been contributed by domestic researchers and have aroused debate. Most of these indicators require surveys of social attitudes at the micro level, which are time-consuming and subject to data quality problems. As advances in Internet technology provide ways to record and disseminate fresh community ideas and thoughts conveniently, detecting topics or emotions from online public opinion is becoming a trend, or a supplementary way to overcome these data acquisition problems. This paper discusses one approach to online societal risk perception using hot search words and BBS posts. The trial aims to provide a route to societal risk perception that differs from those of traditional social psychology studies. Challenges are also indicated.

4.
Online media have brought tremendous changes to civic life, public opinion, and government administration. Compared with traditional media, online media not only allow individuals to browse news and express their views more freely, but also accelerate the transmission of opinions and expand their influence. As public opinion may arouse societal unrest, it is worth detecting the primary topics and uncovering the evolution trends of public opinion for societal administration. Various algorithms have been developed to deal with the huge volume of unstructured online media data. In this study, a dynamic topic model is employed to explore the evolution of topic content and topic prevalence, using the original posts published from 2013 to 2017 on the Tianya Zatan Board of Tianya Club, one of the most popular BBS in China. Based on semantic similarities, topics are grouped into three themes: family life, societal affairs, and government administration. The evolution of topic prevalence and content is affected by emergent incidents. Topics on family life become popular, while the themes "societal affairs" and "government administration", which have larger standard deviations, are more likely to be influenced by emergent hot events. Representing content evolution by a monthly pairwise distance matrix makes change points in topic content easy to find.

5.
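China is at a critical stage of economic transformation and upgrading. The principal contradiction in society has undergone a historic change, and societal risk events occur more frequently than before, endangering social stability. This paper maps the public's online search and attention data to potential societal risk events and focuses on how to label risk events automatically and effectively and how to describe them intuitively and clearly. It attempts to define a 5W framework that describes societal risk in a structured way, comprising location (where), time (when), persons (who), cause (why), and content (what). Extracting the 5W of a risk event can be cast as several machine learning tasks, including named entity recognition, risk classification, and keyword extraction, and effective extraction methods are explored for each task. Automatic 5W extraction turns the wicked problem of real-world societal risk into a structured problem for analysis, offers a new perspective for studying societal risk, and is of significance to government departments conducting public opinion analysis and risk monitoring.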

6.
This paper studies the problem of radar target recognition based on a radar cross section (RCS) observation sequence. First, the authors compute the discrete wavelet transform of the RCS observation sequence and extract a valid statistical feature vector with five components, which represent five different features of the radar target. Second, the authors establish a set-valued model to represent the relation between the feature vector and the authenticity of the radar target. Using the set-valued identification method, the authors estimate the system parameter, on which the recognition criterion is based. To illustrate the efficiency of the proposed recognition method, extensive simulations are presented, assuming that the true target is a cone frustum and that the RCS of the false target is normally distributed. The results show that the set-valued identification method achieves a higher recognition rate than the traditional fuzzy classification method and the evidential reasoning method.

7.
Topics and trends of the on-line public concerns based on Tianya forum   Total citations: 1, self-citations: 1, citations by others: 0
Many social events spread quickly through the Internet and arouse wide community discussion. These on-line public opinions develop into diverse topics over time, and the strength of the topics fluctuates. Catching both the primary topics and their trends in shifting on-line discussions is of theoretical importance for scientific research and of practical importance for societal management, especially in China today. It is a worthwhile endeavour to apply cutting-edge text analytics to unstructured on-line public opinion and so support social problem-solving in the big data era. This paper applies the dynamic topic model (DTM) to explore the changing topics of new posts collected from the Tianya Zatan Board of Tianya Club, the most influential BBS in mainland China. By analysing the trends of hot and cold terms, we capture the shift in topics of the main on-line concerns, illustrated by the topics of school buses and the environment in December 2011. An algorithm is proposed to compute the strength fluctuation of each topic. Through visual analysis of the main topics in several months of 2012, some patterns of topic fluctuation on the board are summarized.

8.
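With the rapid development of network and information technology, a large volume of electronic text has been produced on the Internet, and computing the similarity between texts is an important means of text processing. For large text collections, the vector space model (VSM) is usually used for text representation, but it suffers from high-dimensional text vectors and from the difficulty of measuring semantic similarity between texts. An improved text similarity computation method is proposed. Representative metadata feature vector elements are selected from the large feature space to reduce the dimensionality of the vector space. A domain concept tree is constructed and a text similarity algorithm based on it is designed to handle the synonyms that are widespread among domain concepts, thereby improving the measurement of semantic similarity between texts. Experimental results show that dimensionality reduction and concept similarity computation improve the performance of text similarity computation.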
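
The effect of handling synonyms before measuring similarity in a vector space model can be illustrated with a small sketch. It makes strong simplifying assumptions: a flat, hypothetical synonym table (`CONCEPTS`) stands in for the domain concept tree, raw term counts stand in for the selected feature weights, and the function names are illustrative rather than taken from the paper.

```python
import math
from collections import Counter

# Hypothetical synonym table: each surface term maps to one concept label.
# A real domain concept tree would also encode parent/child relations.
CONCEPTS = {"car": "vehicle", "automobile": "vehicle", "auto": "vehicle"}


def to_concept_vector(tokens, concepts=CONCEPTS):
    """Build a term-count vector after replacing synonyms by their concept."""
    return Counter(concepts.get(t, t) for t in tokens)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


doc_a = ["car", "repair"]
doc_b = ["automobile", "repair"]
plain = cosine(Counter(doc_a), Counter(doc_b))
merged = cosine(to_concept_vector(doc_a), to_concept_vector(doc_b))
```

In this toy case `merged` is larger than `plain`, because the two synonyms collapse onto one dimension; the mapping also shrinks the vocabulary, which is one simple form of dimensionality reduction.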

9.
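To address the shortcomings of existing aerial target recognition methods, such as high empirical risk and low recognition rates, error-correcting codes are designed for this problem according to the classification principles of aerial targets and the design principles of error-correcting codes, and a classifier is trained for each code bit. A support-vector-machine-based algorithm for classifying aerial targets into broad categories is then given. The method uses the multi-class classification technique of error-correcting-code support vector machines, which reduces empirical risk, corrects errors automatically, and effectively improves recognition rate and speed. Finally, a worked example confirms the effectiveness of the algorithm, and a comparison with similar algorithms is given.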
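
The error-correcting step of such a scheme can be sketched as nearest-codeword decoding. This is only an illustration under assumptions: the class names and 6-bit codewords below are hypothetical, and the bit vector passed to `decode` stands in for the outputs of the per-bit classifiers (one SVM per code position in the abstract), which are not modeled here.

```python
# Hypothetical codebook: class names and codewords are illustrative only.
# Every pair of codewords differs in 4 positions, so one wrong bit
# still leaves the received vector closest to the correct codeword.
CODEBOOK = {
    "fighter":    (0, 0, 0, 0, 0, 0),
    "bomber":     (1, 1, 1, 1, 0, 0),
    "helicopter": (0, 0, 1, 1, 1, 1),
    "transport":  (1, 1, 0, 0, 1, 1),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits, codebook=CODEBOOK):
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(codebook, key=lambda label: hamming(bits, codebook[label]))


# The second bit classifier is assumed to have made a mistake here.
predicted_bits = (1, 0, 1, 1, 0, 0)
label = decode(predicted_bits)
```

With a minimum pairwise distance of 4, any single bit error is corrected; two errors can produce a tie between codewords, which this sketch resolves by dictionary order.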

10.
A New Wavelet-Based Document Image Segmentation Scheme

11.
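Freeway AID model based on multi-class GA-SVM   Total citations: 5, self-citations: 2, citations by others: 3
Intelligent detection systems have provided an effective approach to freeway traffic incident detection. To understand freeway traffic operating states in finer detail and to provide more efficient and reliable decision support for emergency response to incidents, the two-class support vector machine is extended to multi-class classification. According to how a traffic incident unfolds, traffic is divided into three states: free flow, worsening congestion, and dissipating congestion. Raw data sets are collected from VISSIM simulations of each stage of a traffic incident, principal component analysis is used to reduce the dimensionality of the traffic input features, and a multi-class support vector machine incident detection model is built. Finally, a genetic algorithm is used to select the parameters of the support vector machine model, and satisfactory detection results are obtained.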

12.
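To address the problems of SVM in pattern classification with a large number of classes, a multi-class SVM classification method based on fuzzy kernel clustering is proposed, together with an efficient semi-fuzzy kernel clustering algorithm. The method generates fuzzy classes by fuzzy kernel clustering and combines multiple SVMs in a tree structure to perform multi-class classification. Fuzzy kernel clustering not only yields more accurate clusters but also uncovers information such as the periphery of each fuzzy class and the overlap between fuzzy classes, which can be used to improve classifier performance effectively. Experiments show that the proposed method is faster and more accurate than traditional methods.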

13.
Modern China is experiencing a variety of social conflicts as a new era arrives and the principal contradiction in society is transformed. Monitoring social stability is therefore a heavy workload. Online societal risk perception is acquired by mapping on-line public concerns into societal risk events, covering national security, economy & finance, public morals, daily life, social stability, government management, and resources & environment, and thus provides one kind of measurement of the state of society. A stable and harmonious social situation is a basic guarantee of the healthy development of the stock market, so we ask whether variations in societal risk are related to stock market volatility. We study the relationship in two steps: first the relationship between search trends and societal risk perception, then the relationship between societal risk perception and stock volatility. The weekend and holiday effects in the Chinese stock market are taken into consideration. Three different econometric methods are explored to observe the impact of variations in societal risk on the Shanghai Composite Index and the Shenzhen Composite Index. Three major findings are reported. First, there are causal relations between the Baidu Index and societal risk perception. Second, the perceptions of economy & finance, social stability, and government management have distinct effects on the volatility of both the Shanghai Composite Index and the Shenzhen Composite Index. Third, the weekend and holiday effects of societal risk perception on the stock market are verified. The research demonstrates that capturing societal risk from on-line public concerns is feasible and meaningful.

14.
Text mining, also known as knowledge discovery from text, has emerged as a possible solution to the current information explosion and refers to the process of extracting non-trivial and useful patterns from unstructured text. Among the general tasks of text mining, such as text clustering and summarization, text classification is a subtask of intelligent information processing that employs supervised learning to construct a classifier from training text, which is then used to predict the class of unlabeled text. Because of its simplicity and objectivity in performance evaluation, text classification is often used as a standard tool to determine the strengths or weaknesses of a text processing method, such as text representation or text feature selection. In this paper, text classification is carried out to classify the Web documents collected from the XSSC Website (http://www.xssc.ac.cn). The performance of support vector machine (SVM) and back propagation neural network (BPNN) is compared on this task. Specifically, binary and multi-class text classification were conducted on the XSSC documents. Moreover, the classification results of both methods are combined to improve classification accuracy. An experiment shows that BPNN can compete with SVM in binary text classification, but that SVM performs much better in multi-class text classification. Furthermore, the combined method improves classification in both the binary and the multi-class setting.

15.
This paper provides a systematic method for enumerating various permutation symmetric Boolean functions. The results play a crucial role in the search for permutation symmetric Boolean functions with good cryptographic properties. The proposed method is algebraic in nature. As a by-product, the authors correct and generalize the corresponding results of Stănică and Maitra (2008). Further, the authors give a complete classification of block-symmetric bent functions based on the results of Zhao and Li (2006); this is the only classification of a class of permutation symmetric bent functions since the classification of symmetric bent functions proposed by Savicky (1994).

16.
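In practice, the samples of creditworthy and defaulting users on P2P lending platforms are imbalanced, and investors tolerate different types of classification error to different degrees. This paper uses a bilateral-weighted error measurement method and mapping distance to choose the membership degrees of the error terms for positive and negative samples, and builds a credit risk assessment model for P2P borrowers based on an imbalanced fuzzy proximal support vector machine (DFPSVM). A method for scoring and rating borrowers' credit is then proposed. Finally, an empirical analysis is carried out with borrower credit information from the Renrendai platform. The results show that, compared with other models, the proposed model adapts better and classifies more accurately, effectively reduces the effect of sample imbalance on the classification results, and significantly increases the classification accuracy for negative-class samples. The resulting distributions of credit scores, credit grades, and default rates for Renrendai borrowers can help the platform control default risk and help investors make decisions.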

17.
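Based on the singular value decomposition of the third-order cumulant matrix of the one-dimensional range profile, a singular value vector formed from the nonzero singular values is used as the input to the canonical subspace method, and a radar target recognition method for one-dimensional range profiles is proposed to classify and recognize targets. The method uses third-order cumulants to improve noise robustness and uses the vector of nonzero singular values to reduce storage and computation. Simulation results show that at low signal-to-noise ratios the recognition rate of this method is higher than that of the eigen-subspace method.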

18.
19.

The algebraic methods represented by Wu’s method have made significant breakthroughs in the field of geometric theorem proving. Algebraic proofs usually involve large amounts of calculation, which makes them difficult to understand intuitively. However, viewed from the perspective of identities, Wu’s method can be understood easily and can be used to generate new geometric propositions. To make geometric reasoning simpler, more expressive, and richer in geometric meaning, the authors establish a geometric algebraic system (point geometry, built on nearly 20 basic properties and formulas about operations on points) that keeps the advantages of the coordinate method, vector method, and particle geometry method while avoiding their disadvantages. Geometric relations in the propositions and conclusions of a geometric problem are expressed as identical equations of vector polynomials according to point geometry. A proof method that retains the essence of Wu’s method is then introduced to find the relationships between these equations. A test on more than 400 geometry statements shows that the proposed proof method, which is based on identical equations of vector polynomials, is simple and effective. Furthermore, when solving the original problem, this proof method can also help the authors recognize the relationships between the propositions of the problem and generate new geometric propositions.


20.
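To address the difficulty of multi-source data fusion in e-health service management, a data fusion method based on multi-task support vector machines (multi-task support vector machine for data fusion, mSVMDF) is proposed, using artificial intelligence techniques and combining multi-task learning theory with support vector machine theory. The method builds an SVM-based fusion model for feature vectors that share the same data sources, and combines structural sparsity with the relatedness of the models within a multi-task learning framework, so that multi-source data with differing numbers of data sources can be fused. Taking the fusion of multi-source imaging data and routine laboratory test data as an example, numerical experiments are carried out to verify the method. The results show that mSVMDF can effectively fuse multi-source data with differing numbers of sources and that it achieves good classification performance and structural sparsity.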

