首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 51 毫秒
1.
兼语结构是汉语中常见的一种动词结构,由述宾短语与主谓短语共享兼语,结构复杂,给句法分析造成困难,因此兼语识别工作对于语义解析及下游任务都具有重要意义。但现存兼语语料库较少,面向中文抽象语义表示(AMR)标注体系的兼语语料库构建仍处于空白阶段。针对这一现状,该文总结出一套兼语语料库标注规范,构建了包含4 760个兼语句的面向中文AMR标注体系的兼语语料库。基于构建的语料库,采用LA-BiLSTM-CRF模型识别兼语结构,达到了86.06%的F1,并分析了识别结果,提出了改进方向。  相似文献   

2.
该文提出了一种基于Viterbi解码的中文合成音库韵律短语边界自动标注方法,以降低大语料库单元拼接合成系统的构建成本。该方法分为模型训练和韵律标注两阶段:模型训练阶段得到频谱、基频和音素时长的上下文相关隐Markov模型(hidden Markov model,HMM);标注阶段借助训练得到的模型采用Viterbi解码完成韵律短语自动标注。实验结果表明:该方法进行韵律短语边界标注时的F-score值达到77.64%,超过了人工标注时不同标注人员之间的一致性水平;另外该方法可以方便地增加待标注韵律属性,具有良好的扩展性。  相似文献   

3.
为了进一步提高完全句法分析标注的准确率,对人工修正的完全句法分析语料进行剖析,从分词、词性和句法结构三个层面检验一致性,总结标注结果不一致的类型,并提出基于分层的自动发现不一致现象的方法及相应的消解策略。实验表明,利用该方法可使语料库标注的准确率提高2.5%。  相似文献   

4.
基于关联规则挖掘的汉语语义搭配规则获取方法   总被引:1,自引:0,他引:1  
针对自然语言处理系统在短语分析时的词汇排歧和结构排歧需要,本文提出了一种基于语料库的汉语短语语义搭配规则自动获取方法.该方法以《知网》为语义知识资源,在标注了句法语义信息的汉语短语熟语料库基础上,先采用数据挖掘中元规则制导的交叉层关联规则挖掘方法,自动发现汉语短语的语义搭配规律,再根据统计结果自动优选后生成语义搭配规则库.实验结果表明该方法是切实可行的.运用该方法自动获取的语义搭配规则具有较好的排歧效果.  相似文献   

5.
近些年来语料库语言学的发展较为迅速,语料库的建设成为一项重要的工作.在对语料加工的过程中,保证词性标注的一致性也成为建设高质量语料库的重要问题.目前国内外对汉语语料库词性标注结果的校对,还停留在人工校对上,对词性标注结果不一致现象尚未进行系统的研究.对于词性标注方法不是很成熟的维吾尔语语料库来说,词性校对方面的研究工作更少.首先概要介绍了一种维吾尔语的标注方法,并受一些文献的启发,根据维吾尔语的特点对其进行词性标注自动校对的研究,并分析其适用于维吾尔语词性校对的可行性,进而提高维吾尔语词性标注的正确率.  相似文献   

6.
朝鲜语中存在大量特殊短语结构,因此在朝汉翻译中,如何准确翻译这些特殊短语显得尤为重要,此举有利于提高机器翻译的精度与效率。本文基于韩国"世宗计划"标注语料库,通过对特殊短语结构进行语言特征分析,构建规则库,以迭代方式提取特殊短语结构及其分布,并以中心词为""的特殊短语为例,进行自动提取实验,取得满意的效果。  相似文献   

7.
以现代哈萨克语短语识别与短语块库构建技术研究工程为背景,以NP和VP结构的歧义类型研究及消除为目的,提取统计方法来处理NP和VP结构的歧义问题.该方法在已经统计与分析出的哈萨克语短语基础上,对哈萨克语NP和VP短语组合结构歧义做全面分析和整理.用互信息方法解决NP和VP的歧义问题准确率(72%)并不高.为了达到更好的准确率就需要数量较大的训练语料库,但是目前实验环境并没有足够的语料.因此,基于规则方法标注好语料并采用人工方式完善训练语料库,再使用最大熵方法来处理歧义问题.实验结果表明,基于统计方法解决NP和VP结构的歧义问题是有效的,其准确率在封闭测试中达到了80.1%.  相似文献   

8.
为了提高词性标注模型训练语料的质量,设计了一种利用FP-Growth算法从训练语料库中自动获取词性标注规则的方法,并将该方法与基于Apriori算法的词性标注规则获取方法进行了对比实验.实验结果显示,对于0.1万、0.2万和1万词级的小规模语料库,2种方法获取的词性标注规则条数均相同,但基于FP-Growth算法的时间耗费分别仅为基于Apriori算法的0.013 866%,0.010 399%,0.003 132%;对于10万、100万词级的训练语料库,基于Apriori算法无法获取任何规则,但基于FP-Growth算法依然可以在合理时间内获取有效的规则.这说明,基于FP-Growth算法的词性标注规则获取方法是可行且高效的,满足在优化训练语料库时能从不同规模的语料库中自动获取词性标注规则的实际需求.  相似文献   

9.
通过本项研究,我们对100万词级现代蒙古语语料库做了短语标注,建立了现代蒙古语基本短语库。这一成果。对今后建立一个面向信息处理的、具有较强通用性的蒙古语语义分类和描述体系,提供了必要的前提条件。局部测试结果表明,标注软件对简单句子标注基本短语的召回率和准确率分别达到了92.93%和86.79%。今后有必要深入研究语义信息的获取、语法信息的细化以及蒙古语短语的歧义结构种类、产生歧义结构的原因等问题。  相似文献   

10.
借鉴并利用基于短语的因子化机器翻译方法,结合基于隐马尔科夫模型的词性标注系统实现了蒙古文的自动词性标注.首先使用基于短语的因子化机器翻译方法对词表词进行标注,然后用基于隐马尔科夫模型的词性标注方法对生词进行标注.实验结果表明,采取的蒙古文词性标注方法的准确率达到97.91%.最后,将该方法标注的词性融入到蒙汉统计机器翻译系统后,译文质量有了较大提高,进一步证明该方法的有效性和实用性.  相似文献   

11.
There are numerous geometric objects stored in the spatial databases. An importance function in a spatial database is that users can browse the geometric objects as a map efficiently. Thus the spatial database should display the geometric objects users concern about swiftly onto the display window. This process includes two operations:retrieve data from database and then draw them onto screen. Accordingly, to improve the efficiency, we should try to reduce time of both retrieving object and displaying them. The former can be achieved with the aid of spatial index such as R-tree, the latter require to simplify the objects. Simplification means that objects are shown with sufficient but not with unnecessary detail which depend on the scale of browse. So the major problem is how to retrieve data at different detail level efficiently. This paper introduces the implementation of a multi-scale index in the spatial database SISP (Spatial Information Shared Platform) which is generalized from R-tree. The difference between the generalization and the R-tree lies on two facets: One is that every node and geometric object in the generalization is assigned with a importance value which denote the importance of them, and every vertex in the objects are assigned with a importance value,too. The importance value can be use to decide which data should be retrieve from disk in a query. The other difference is that geometric objects in the generalization are divided into one or more sub-blocks, and vertexes are total ordered by their importance value. With the help of the generalized R-tree, one can easily retrieve data at different detail levels.Some experiments are performed on real-life data to evaluate the performance of solutions that separately use normal spatial index and multi-scale spatial index. The results show that the solution using multi-scale index in SISP is satisfying.  相似文献   

12.
The discovery of the prolific Ordovician Red River reservoirs in 1995 in southeastern Saskatchewan was the catalyst for extensive exploration activity which resulted in the discovery of more than 15 new Red River pools. The best yields of Red River production to date have been from dolomite reservoirs. Understanding the processes of dolomitization is, therefore, crucial for the prediction of the connectivity, spatial distribution and heterogeneity of dolomite reservoirs.The Red River reservoirs in the Midale area consist of 3~4 thin dolomitized zones, with a total thickness of about 20 m, which occur at the top of the Yeoman Formation. Two types of replacement dolomite were recognized in the Red River reservoir: dolomitized burrow infills and dolomitized host matrix. The spatial distribution of dolomite suggests that burrowing organisms played an important role in facilitating the fluid flow in the backfilled sediments. This resulted in penecontemporaneous dolomitization of burrow infills by normal seawater. The dolomite in the host matrix is interpreted as having occurred at shallow burial by evaporitic seawater during precipitation of Lake Almar anhydrite that immediately overlies the Yeoman Formation. However, the low δ18O values of dolomited burrow infills (-5.9‰~ -7.8‰, PDB) and matrix dolomites (-6.6‰~ -8.1‰, avg. -7.4‰ PDB) compared to the estimated values for the late Ordovician marine dolomite could be attributed to modification and alteration of dolomite at higher temperatures during deeper burial, which could also be responsible for its 87Sr/86Sr ratios (0.7084~0.7088) that are higher than suggested for the late Ordovician seawaters (0.7078~0.7080). The trace amounts of saddle dolomite cement in the Red River carbonates are probably related to "cannibalization" of earlier replacement dolomite during the chemical compaction.  相似文献   

13.
AcomputergeneratorforrandomlylayeredstructuresYUJia shun1,2,HEZhen hua2(1.TheInstituteofGeologicalandNuclearSciences,NewZealand;2.StateKeyLaboratoryofOilandGasReservoirGeologyandExploitation,ChengduUniversityofTechnology,China)Abstract:Analgorithmisintrod…  相似文献   

14.
Instead of following Fock’s expansion,we solve the Schrodinger equation for some quantum mechanical manybody systems such as electrons in atoms and charged excitons in quantum wells in a similar way in hyperspherical coordinates by expanding the wave functions into orthonormal complete basis sets of the hyperspherical hannonics(HHs)of hyperangles and generalized Laguerre polynomials(GLPs)of the hyperradius.This leads the equation to  相似文献   

15.
Future mobile communication systems aim at providing very high data transmission rates, even in high-mobility scenarios such as high-speed wheel-track trains, maglev trains, highway vehicles, airplanes, guided missiles or spacecraft. A particularly important commercial application is the strong and increasing worldwide demand for high- speed broadband wireless communications (up to 574.8 km/ h test speeds or 380 km/h commercial speeds) in railways, providing data, voice and video services for applications such as onboard entertainment services to passengers, train control, train dispatch, train sensor status handling and sur- veillance. In such high-mobility scenarios, there are a number of communication challenges, including fast hand- over, location updating, high-speed channel modeling, estimation and equalization, anti-Doppler spreading tech- niques, fast power control, and dedicated network architec- ture. Because signal transmission in very high-speed scenarios will inevitably experience serious deterioration, it is imperative to develop key broadband mobile communi- cation techniques for such high-speed vehicles.  相似文献   

16.
17.
本文叙述了对海南岛及其毗邻大陆边缘白垩纪到第四纪地层岩石进行古地磁研究的全部工作过程。通过分析岩石中剩余磁矢量的磁偏角及磁倾角的变化,提出海南岛白垩纪以来经历的构造演化模式如下:早期伴随顺时针旋转而向南迁移,后期伴随逆时针转动并向北运移。联系该地区及邻区的地质、地球物理资料,对海南岛上述的构造地体运动提出以下认识:北部湾内早期有一拉张作用,主要是该作用使湾内地壳显著伸长减薄,形成北部湾盆地。从而导致了海南岛的早期构造运动,而海南岛后期的构造运动则主要是受南海海底扩张的影响。海南地体运动规律的阐明对于了解北部湾油气盆地的形成演化有重要的理论和实际意义。  相似文献   

18.
Being the primary media of geographical information and the elementary objects manipulated, almost all of maps adopt the layer-based model to represent geographic information in the existent GIS. However, it is difficult to extend the map represented in layer-based model. Furthermore, in Web-Based GIS, It is slow to transmit the spatial data for map viewing. In this paper, for solving the questions above, we have proposed a new method for representing the spatial data. That is scale-based model. In this model we represent maps in three levels: scale-view, block, and spatial object, and organize the maps in a set of map layers, named Scale-View, which associates some given scales.Lastly, a prototype Web-Based GIS using the proposed spatial data representation is described briefly.  相似文献   

19.
Various applications relevant to the exciton dynamics,such as the organic solar cell,the large-area organic light-emitting diodes and the thermoelectricity,are operating under temperature gradient.The potential abnormal behavior of the exicton dynamics driven by the temperature difference may affect the efficiency and performance of the corresponding devices.In the above situations,the exciton dynamics under temperature difference is mixed with  相似文献   

20.
The elongation method,originally proposed by Imamura was further developed for many years in our group.As a method towards O(N)with high efficiency and high accuracy for any dimensional systems.This treatment designed for one-dimensional(ID)polymers is now available for three-dimensional(3D)systems,but geometry optimization is now possible only for 1D-systems.As an approach toward post-Hartree-Fock,it was also extended to  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号