首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 380 毫秒
1.
针对数控机床(computer numerical control,CNC)故障领域命名实体识别方法中存在实体规范不足及有效实体识别模型缺乏等问题,制定了领域内实体标注策略,提出了一种基于双向转换编码器(bidirectional encoder representations from transformers,BERT)的数控机床故障领域命名实体识别方法。采用BERT编码层预训练,将生成向量输入到双向长短期记忆网络(bidirectional long short-term memory,BiLSTM)交互层以提取上下文特征,最终通过条件随机域(conditional random field,CRF)推理层输出预测标签。实验结果表明,BERT-BiLSTM-CRF模型在数控机床故障领域更具优势,与现有模型相比,F1值提升大于1.85%。  相似文献   

2.
针对于目前传统的命名实体识别模型在食品案件纠纷裁判文书领域的准确率不足的问题,在双向长短时记忆网络的基础上提出一种基于双向编码器表示模型(bidirectional encoder representations from transformers,Bert)和注意力机制的命名实体识别模型.模型通过Bert层进行字向量预训练,根据上下文语意生成字向量,字向量序列输入双向长短期记忆网络(bi-directional long short-term memory,BiLSTM)层和Attention层提取语义特征,再通过条件随机场(conditional random field,CRF)层预测并输出字的最优标签序列,最终得到食品案件纠纷裁判文书中的实体.实验表明,该模型在食品纠纷法律文书上面的准确率和F1值分别达到了92.56%和90.25%,准确率相较于目前应用最多的BiLSTM-CRF模型提升了6.76%.Bert-BiL-STM-Attention-CRF模型通过对字向量的预训练,充分结合上下文语意,能够有效克服传统命名实体识别模型丢失字的多义性的问题,提高了食品案件纠纷裁判文书领域命名实体识别的准确率.  相似文献   

3.
文本数据中的实体和关系抽取是领域知识图谱构建和更新的来源.针对金融科技领域中文本数据存在重叠关系、训练数据缺乏标注样本等问题,提出一种融合主动学习思想的实体关系联合抽取方法.首先,基于主动学习,以增量的方式筛选出富有信息量的样本作为训练数据;其次,采用面向主实体的标注策略将实体关系联合抽取问题转化为序列标注问题;最后,基于改进的BERT-BiGRU-CRF模型实现领域实体与关系的联合抽取,为知识图谱构建提供支撑技术,有助于金融从业者根据领域知识进行分析、投资、交易等操作,从而降低投资风险.针对金融领域文本数据进行实验测试,实验结果表明,本文所提出的方法有效,验证了该方法后续可用于金融知识图谱的构建.  相似文献   

4.
在数据匮乏的领域,命名实体识别效果受限于欠拟合的字词特征表达,引入常规的多任务学习方法可以有所改善,但需要额外的标注成本.针对这一问题,提出了一种基于多粒度认知的命名实体识别方法,在不产生额外标注成本的前提下,增强字特征信息,提高命名实体识别效果.该方法从多粒度认知理论出发,以BiLSTM和CRF为基础模型,将字粒度下的命名实体识别任务与句子全局粒度下的实体数量预测任务相联合,共同优化字嵌入表达.三个不同类型的数据集上的多组实验表明,引入多粒度认知的方法有效地提升了命名实体识别效果.  相似文献   

5.
近年来,深度学习方法被广泛地应用于命名实体识别任务中,并取得了良好的效果.但是主流的命名实体识别都是基于序列标注的方法,这类方法依赖于足够的高质量标注语料.然而序列数据的标注成本高昂,导致命名实体识别训练集规模往往较小,这严重地限制了命名实体识别模型的最终性能.为了在不增加人工成本的前提下扩大命名实体识别的训练集规模,本文分别提出了基于EDA(Easy Data Augmentation)、基于远程监督、基于Bootstrap(自展法)的命名实体识别数据增强技术.通过在本文给出的FIND-2019数据集上进行的实验表明,这几种数据增强技术及其它们的组合能够低成本地增加训练集的规模,从而显著地提升命名实体识别模型的性能.  相似文献   

6.
提出了基于条件随机场(conditional random fields,CRF)的网页动态关系抽取算法.给出了动态关系的定义,建立了动态关系的表示模型,并用一个六维结构来表达动态关系.与传统关系抽取中基于规则或者基于分类的解决方法不同,本文认为可以将动态关系识别问题转化为一个标注问题,并提出了基于CRF的句子层面的关系标注和抽取方法.在本算法中,首先将一个句子通过语义角色标注(semantic role labeling,SRL)系统进行成分识别,然后将语义角色标注结果以及词的POS类型、词组的命名实体类型等作为CRF的训练特征,对句子成分进行标注.最后测试了大量的真实新闻网页,实验结果表明了本文提出算法的实用性和有效性.  相似文献   

7.
针对武器装备领域复杂实体的特点, 提出一种融合多特征后挂载武器装备领域知识的复杂命名实体识别方法。首先, 使用BERT 模型对武器装备领域数据进行预训练, 得到数据向量, 使用Word2Vec模型学习郑码、五笔、拼音和笔画的上下位特征, 获取特征向量。然后, 将数据向量与特征向量融合, 利用Bi-LSTM模型进行编码, 使用CRF解码得到标签序列。最后, 基于武器装备领域知识, 对标签序列进行复杂实体的触发检测, 完成复杂命名实体识别。使用环球军事网数据作为语料进行实验, 分析不同的特征组合、不同神经网络模型下的识别效果, 并提出适用于评价复杂命名实体识别结果的计算方法。实验结果表明, 提出的挂载领域知识且融合多特征的武器装备复杂命名实体识别方法的F1值达到95.37%, 优于现有方法。  相似文献   

8.
针对双向长短时记忆网络-条件随机场(bi-directional long short-term memory-conditional random field,BiLSTM-CRF)模型存在准确率低和向量无法表示上下文的问题,提出一种改进的中文命名实体识别模型。利用裁剪的双向编码器表征模型(bidirectional encoder representations from transformers,BERT)得到包含上下文信息的语义向量;输入双向门控循环单元(bidirectional gated recurrent unit,BiGRU)网络及多头自注意力层捕获序列的全局和局部特征;通过条件随机场(conditional random field,CRF)层进行序列解码标注,提取出命名实体。在人民日报和微软亚洲研究院(Microsoft research Asia,MSRA)数据集上的实验结果表明,改进模型在识别效果和速度方面都有一定提高;对BERT模型内在机理的分析表明,BERT模型主要依赖从低层和中层学习到的短语及语法信息完成命名实体识别(named entity recognition,NER)任务。  相似文献   

9.
针对利用远程监督标注文本实体过程中存在实体类别标注错误问题导致模型难以有效区分各实体的类别特征,影响模型精准度的问题,本文提出一种利用原型网络过滤训练语料中标注错误样本的远程监督命名实体识别方法,利用预训练的原型网络编码正确标注实体生成类别原型表示,过滤语料中距类别原型较远的样本.实验表明,使用原型网络有效地提高了语料的标注质量,提升了模型性能.  相似文献   

10.
互联网公开数据蕴含着大量高价值的军事情报,成为获取开源军事情报的重要数据源之一。军事领域命名实体识别是进行军事领域信息提取、问答系统、知识图谱等工作的基础性关键任务。相比较于其他领域的命名实体,军事领域命名实体边界模糊,界定困难;互联网媒体中军事术语表达不规范,随意性的简化表达现象较普遍;现阶段面向军事领域的公开语料鲜见。该文提出一种考虑实体模糊边界的标注策略,结合领域专家知识,构建了基于微博数据的军事语料集MilitaryCorpus;提出一种多神经网络协作的军事领域命名实体识别模型,该模型通过基于Transformer的双向编码器(bidirectional encoder representations from transformers, BERT)的字向量表达层获得字级别的特征,通过双向长短时记忆神经网络(bi-directional long short-term memory, BiLSTM)层抽取上下文特征形成特征矩阵,最后由条件随机场层(conditional random field, CRF)生成最优标签序列。实验结果表明:相较于基于CRF的实体识别模型,应用该文提出的BERT-BiLSTM-CRF模型召回率提高28.48%,F值提高18.65%;相较于基于BiLSTM-CRF的实体识别模型,该文模型召回率提高13.91%,F值提高8.69%;相较于基于CNN (convolutional neural networks)-BiLSTM-CRF的实体识别模型,该文模型召回率提高7.08%,F值提高5.15%。  相似文献   

11.
The discovery of the prolific Ordovician Red River reservoirs in 1995 in southeastern Saskatchewan was the catalyst for extensive exploration activity which resulted in the discovery of more than 15 new Red River pools. The best yields of Red River production to date have been from dolomite reservoirs. Understanding the processes of dolomitization is, therefore, crucial for the prediction of the connectivity, spatial distribution and heterogeneity of dolomite reservoirs.The Red River reservoirs in the Midale area consist of 3~4 thin dolomitized zones, with a total thickness of about 20 m, which occur at the top of the Yeoman Formation. Two types of replacement dolomite were recognized in the Red River reservoir: dolomitized burrow infills and dolomitized host matrix. The spatial distribution of dolomite suggests that burrowing organisms played an important role in facilitating the fluid flow in the backfilled sediments. This resulted in penecontemporaneous dolomitization of burrow infills by normal seawater. The dolomite in the host matrix is interpreted as having occurred at shallow burial by evaporitic seawater during precipitation of Lake Almar anhydrite that immediately overlies the Yeoman Formation. However, the low δ18O values of dolomited burrow infills (-5.9‰~ -7.8‰, PDB) and matrix dolomites (-6.6‰~ -8.1‰, avg. -7.4‰ PDB) compared to the estimated values for the late Ordovician marine dolomite could be attributed to modification and alteration of dolomite at higher temperatures during deeper burial, which could also be responsible for its 87Sr/86Sr ratios (0.7084~0.7088) that are higher than suggested for the late Ordovician seawaters (0.7078~0.7080). The trace amounts of saddle dolomite cement in the Red River carbonates are probably related to "cannibalization" of earlier replacement dolomite during the chemical compaction.  相似文献   

12.
AcomputergeneratorforrandomlylayeredstructuresYUJia shun1,2,HEZhen hua2(1.TheInstituteofGeologicalandNuclearSciences,NewZealand;2.StateKeyLaboratoryofOilandGasReservoirGeologyandExploitation,ChengduUniversityofTechnology,China)Abstract:Analgorithmisintrod…  相似文献   

13.
本文叙述了对海南岛及其毗邻大陆边缘白垩纪到第四纪地层岩石进行古地磁研究的全部工作过程。通过分析岩石中剩余磁矢量的磁偏角及磁倾角的变化,提出海南岛白垩纪以来经历的构造演化模式如下:早期伴随顺时针旋转而向南迁移,后期伴随逆时针转动并向北运移。联系该地区及邻区的地质、地球物理资料,对海南岛上述的构造地体运动提出以下认识:北部湾内早期有一拉张作用,主要是该作用使湾内地壳显著伸长减薄,形成北部湾盆地。从而导致了海南岛的早期构造运动,而海南岛后期的构造运动则主要是受南海海底扩张的影响。海南地体运动规律的阐明对于了解北部湾油气盆地的形成演化有重要的理论和实际意义。  相似文献   

14.
Various applications relevant to the exciton dynamics,such as the organic solar cell,the large-area organic light-emitting diodes and the thermoelectricity,are operating under temperature gradient.The potential abnormal behavior of the exicton dynamics driven by the temperature difference may affect the efficiency and performance of the corresponding devices.In the above situations,the exciton dynamics under temperature difference is mixed with  相似文献   

15.
The elongation method,originally proposed by Imamura was further developed for many years in our group.As a method towards O(N)with high efficiency and high accuracy for any dimensional systems.This treatment designed for one-dimensional(ID)polymers is now available for three-dimensional(3D)systems,but geometry optimization is now possible only for 1D-systems.As an approach toward post-Hartree-Fock,it was also extended to  相似文献   

16.
17.
The explosive growth of the Internet and database applications has driven database to be more scalable and available, and able to support on-line scaling without interrupting service. To support more client's queries without downtime and degrading the response time, more nodes have to be scaled up while the database is running. This paper presents the overview of scalable and available database that satisfies the above characteristics. And we propose a novel on-line scaling method. Our method improves the existing on-line scaling method for fast response time and higher throughputs. Our proposed method reduces unnecessary network use, i.e. , we decrease the number of data copy by reusing the backup data. Also, our on-line scaling operation can be processed parallel by selecting adequate nodes as new node. Our performance study shows that our method results in significant reduction in data copy time.  相似文献   

18.
R-Tree is a good structure for spatial searching. But in this indexing structure,either the sequence of nodes in the same level or sequence of traveling these nodes when queries are made is random. Since the possibility that the object appears in different MBR which have the same parents node is different, if we make the subnode who has the most possibility be traveled first, the time cost will be decreased in most of the cases. In some case, the possibility of a point belong to a rectangle will shows direct proportion with the size of the rectangle. But this conclusion is based on an assumption that the objects are symmetrically distributing in the area and this assumption is not always coming into existence. Now we found a more direct parameter to scale the possibility and made a little change on the structure of R-tree, to increase the possibility of founding the satisfying answer in the front sub trees. We names this structure probability based arranged R-tree (PBAR-tree).  相似文献   

19.
The geographic information service is enabled by the advancements in general Web service technology and the focused efforts of the OGC in defining XML-based Web GIS service. Based on these models, this paper addresses the issue of services chaining,the process of combining or pipelining results from several interoperable GIS Web Services to create a customized solution. This paper presents a mediated chaining architecture in which a specific service takes responsibility for performing the process that describes a service chain. We designed the Spatial Information Process Language (SIPL) for dynamic modeling and describing the service chain, also a prototype of the Spatial Information Process Execution Engine (SIPEE) is implemented for executing processes written in SIPL. Discussion of measures to improve the functionality and performance of such system will be included.  相似文献   

20.
Advances in wireless technologies and positioning technologies and spread of wireless devices, an interest in LBS (Location Based Service) is arising. To provide location based service, tracking data should have been stored in moving object database management system (called MODBMS) with proper policies and managed efficiently. So the methods which acquire the location information at regular time intervals then, store and manage have been studied. In this paper, we suggest tracking data management techniques using topology that is corresponding to the moving path of moving object. In our techniques, we update the MODBMS when moving object arrived at a street intersection or a curved road which is represented as the node in topology and predict the location at past and future with attribute of topology and linear function. In this technique, location data that are corresponding to the node in topology are stored, thus reduce the number of update and amount of data. Also in case predicting the location,because topology are used as well as existing location information, accuracy for prediction is increased than applying linear function or spline function.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号