首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 387 毫秒
1.
Q-learning作为一种无模型的值迭代强化学习算法,被广泛应用于移动机器人在非结构环境下的导航任务中。针对Q-learning在移动机器人导航中环境探索和利用存在矛盾关系导致收敛速度慢的问题,该文在Q-learning算法的基础上,受啮齿类动物可以利用嗅觉线索来进行空间定向和导航的启发,提出一种基于气味奖励引导的Q-learning环境认知策略。该算法通过改善Q-learning中的动作选择策略来减少对环境的无用探索,在动作选择策略中融入了环境气味奖励的引导,并提出了嗅觉因子来平衡动作选择策略中Q-learning和气味奖励引导的权重关系。为了验证算法的有效性,在Tolman老鼠实验所用的迷宫环境中进行了仿真实验,动态仿真结果表明,相比Q-learning算法,基于气味奖励引导的Q-learning算法在环境认知过程中,可减少对环境的无用探索,并增强对环境的认知学习能力,且提高算法的收敛速度。  相似文献   

2.
经典Q-learning强化学习模型中学习率为一固定参数,无法有效反映认知学习的动态过程。提出了一种将学习速率表征为时变参数的Q-Learning强化学习模型,给出了利用近期历史行为数据估计阶段性学习速率的方法。为了评估验证该模型的性能,设计了条件刺激与操作行为奖励无关→相关→无关三个阶段动态试验范式,用以观察和分析鸽子在随机强化、固定强化,以及固定强化关系消退等不同条件下的学习行为变化过程,采用动物触屏行为系统完成了3只鸽子颜色刺激-啄屏抉择认知训练,利用训练过程中不同session的行为数据对动态学习率进行了最小二乘估计。分析结果表明:可以获得更小的行为预测误差,误差下降收敛的速度更快,同时学习率的动态变化过程可以有效的反映动物认知行为训练过程中的内在学习状态。  相似文献   

3.
在卫星通信系统中,频率和信道是十分珍稀的资源,针对如何利用可靠且高效的方法来进行资源的开发这一亟需解决的难题,提出了一种基于Q-learning深度强化学习的动态卫星信道分配算法DRL-DCA,该算法将卫星和环境交互建模为马尔科夫决策过程,通过环境的反馈提升卫星的决策能力,实现用户业务请求的高效应答,提升卫星通信的服务质量,降低通信阻塞发生概率。仿真分析表明该算法能够有效地提升通信的吞吐量,降低通信的阻塞率。  相似文献   

4.
针对道路长期性能养护决策中庞大的数据分析问题,将深度确定性策略梯度(deep deterministic policy gradient, DDPG)强化学习模型引入到了养护决策分析中,将道路性能的提升及养护资金的有效利用作为机器学习的奖励目标,建立了一套科学有效的沥青路面长期性能养护决策方法,经过与DQN(deep Q-learning network)算法和Q-Learning算法进行对比,DDPG算法所需要的采样数据更少、收敛速度更快,表现更为优异,可有效提升道路服役性能的评估效率,对沥青路面多目标长期养护决策方案的制定起着重要的推动作用。  相似文献   

5.
针对现有合作学习算法存在频繁通信、能量消耗过大等问题,应用目标跟踪建立任务模型,文章提出一种基于Q学习和TD误差(Q-learning and TD error,QT)的传感器节点任务调度算法。具体包括将传感器节点任务调度问题映射成Q学习可解决的学习问题,建立邻居节点间的协作机制以及定义延迟回报、状态空间等基本学习元素。在协作机制中,QT使得传感器节点利用个体和群体的TD误差,通过动态改变自身的学习速度来平衡自身利益和群体利益。此外,QT根据Metropolis准则提高节点学习前期的探索概率,优化任务选择。实验结果表明:QT具备根据当前环境进行动态调度任务的能力;相比其他任务调度算法,QT消耗合理的能量使得单位性能提高了17.26%。  相似文献   

6.
基于情感和认知的学习与决策模型是一个新的强化学习算法,该模型将来自情感的内在奖励和来自认知的外部奖励作为学习和决策的动机.将该模型引入到基于行为的机器人控制体系中,构造了一个新的移动机器人导航控制系统.通过基于在线的情感认知学习,形成合理的行为协调机制,从而使机器人实现自主导航任务.仿真实验结果表明:将情感系统引入到机器人系统中能提高学习速度,所提出的控制策略能有效地改善机器人在未知的环境中自主导航的能力.  相似文献   

7.
通过多属性决策的理论对自主能力进行评估,实现了混合主动交互的可变自主系统.结合水面无人系统的导航任务,给出了基于多属性决策理论的自主能力确定模型.利用该模型,能够评价当前形势下可行的自主等级,并基于用户偏好做出选择,解决了未知环境下无人系统自主等级合理确定的问题,具有较好的效果.研究结果表明基于多属性决策的自主等级确定...  相似文献   

8.
在总结油价预测模型的基础上,视原油价格动态过程为马尔科夫链,利用数理统计方法计算得到马尔科夫链状态转移概率;根据马尔科夫链的性质,在其状态转移概率基础上,得到其长游程的稳态概率;继之利用动态规划技术,进行原油进口时机决策.利用这种方法对实际油价进行原油进口时机决策.结果表明,新决策方法能够有效规避"买涨不买跌",且简单易行,有效提高数据利用率.  相似文献   

9.
基于神经网络增强学习算法的工艺任务分配方法   总被引:1,自引:0,他引:1  
在任务分配问题中,如果Markov决策过程模型的状态-动作空间很大就会出现"维数灾难".针对这一问题,提出一种基于BP神经网络的增强学习策略.利用BP神经网络良好的泛化能力,存储和逼近增强学习中状态-动作对的Q值,设计了基于Q学习的最优行为选择策略和Q学习的BP神经网络模型与算法.将所提方法应用于工艺任务分配问题,经过Matlab软件仿真实验,结果证实了该方法具有良好的性能和行为逼近能力.该方法进一步提高了增强学习理论在任务分配问题中的应用价值.  相似文献   

10.
目前,多目标跟踪算法仍面临诸多挑战,例如遮挡、快速运动等所造成的影响难以完全规避.为了解决上述问题,提出一种基于马尔科夫决策过程的多目标跟踪算法.该算法将每个目标建模成一个马尔科夫决策过程,通过最大化奖励函数来驱动状态间的转移,并将强化学习训练用于数据关联相似度函数,有效地解决了目标遮挡问题.同时,为了解决物体快速运动...  相似文献   

11.
Language markedness is a common phenomenon in languages, and is reflected from hearing, vision and sense, i.e. the variation in the three aspects such as phonology, morphology and semantics. This paper focuses on the interpretation of markedness in language use following the three perspectives, i.e. pragmatic interpretation, psychological interpretation and cognitive interpretation, with an aim to define the function of markedness.  相似文献   

12.
The Williston Basin is a significant petroleum province, containing oil production zones that include the Middle Cambrian to Lower Ordovician, Upper Ordovician, Middle Devonian, Upper Devonian and Mississippian and within the Jurassic and Cretaceous. The oils of the Williston Basin exhibit a wide range of geochemical characteristics defined as "oil families", although the geochemical signature of the Cambrian Deadwood Formation and Lower Ordovician Winnipeg reservoired oils does not match any "oil family". Despite their close stratigraphic proximity, it is evident that the oils of the Lower Palaeozoic within the Williston Basin are distinct. This suggests the presence of a new "oil family" within the Williston Basin. Diagnostic geochemical signatures occur in the gasoline range chromatograms, within saturate fraction gas chromatograms and biomarker fingerprints. However, some of the established criteria and cross-plots that are currently used to segregate oils into distinct genetic families within the basin do not always meet with success, particularly when applied to the Lower Palaeozoic oils of the Deadwood and Winnipeg Formation.  相似文献   

13.
王慧 《科技信息》2008,(10):240-240
Wuthering Heights, Emily Bronte's only novel, was published in December of 1847 under the pseudonym Ellis Bell. The book did not gain immediate success, but it is now thought one of the finest novels in the English language. Catherine is the key character of this masterpiece, because everybody and everything center on her though she had a short life. We can understand this masterpiece better if we know Catherine well.  相似文献   

14.
The discovery of the prolific Ordovician Red River reservoirs in 1995 in southeastern Saskatchewan was the catalyst for extensive exploration activity which resulted in the discovery of more than 15 new Red River pools. The best yields of Red River production to date have been from dolomite reservoirs. Understanding the processes of dolomitization is, therefore, crucial for the prediction of the connectivity, spatial distribution and heterogeneity of dolomite reservoirs.The Red River reservoirs in the Midale area consist of 3~4 thin dolomitized zones, with a total thickness of about 20 m, which occur at the top of the Yeoman Formation. Two types of replacement dolomite were recognized in the Red River reservoir: dolomitized burrow infills and dolomitized host matrix. The spatial distribution of dolomite suggests that burrowing organisms played an important role in facilitating the fluid flow in the backfilled sediments. This resulted in penecontemporaneous dolomitization of burrow infills by normal seawater. The dolomite in the host matrix is interpreted as having occurred at shallow burial by evaporitic seawater during precipitation of Lake Almar anhydrite that immediately overlies the Yeoman Formation. However, the low δ18O values of dolomited burrow infills (-5.9‰~ -7.8‰, PDB) and matrix dolomites (-6.6‰~ -8.1‰, avg. -7.4‰ PDB) compared to the estimated values for the late Ordovician marine dolomite could be attributed to modification and alteration of dolomite at higher temperatures during deeper burial, which could also be responsible for its 87Sr/86Sr ratios (0.7084~0.7088) that are higher than suggested for the late Ordovician seawaters (0.7078~0.7080). The trace amounts of saddle dolomite cement in the Red River carbonates are probably related to "cannibalization" of earlier replacement dolomite during the chemical compaction.  相似文献   

15.
Location based services is promising due to its novel working style and contents.A software platform is proposed to provide application programs of typical location based services and support new applications developing efficiently. The analysis shows that this scheme is easy implemented, low cost and adapt to all kinds of mobile nework system.  相似文献   

16.
以AC-13级配为基础,将橡胶颗粒代替部分集料掺入混合料中,以低温弯曲试验为评价方法对不同橡胶颗粒掺量下沥青混合料的低温抗裂性进行研究,并引入应变能密度值对混合料的低温抗裂性进行综合评价.试验结果表明:橡胶颗粒沥青混合料试件的破坏微应变均超过2 300,满足冬寒区的技术指标;无论是否掺加橡胶颗粒,随着温度的下降,沥青混合料破坏时的最大弯拉强度增大,弯拉应变降低,劲度模量增大;弯曲应变能密度在胶粒掺量为1%左右时具有较大的弯曲应变能密度值,此时橡胶颗粒沥青混合料具有较好的低温抗裂性.  相似文献   

17.
AcomputergeneratorforrandomlylayeredstructuresYUJia shun1,2,HEZhen hua2(1.TheInstituteofGeologicalandNuclearSciences,NewZealand;2.StateKeyLaboratoryofOilandGasReservoirGeologyandExploitation,ChengduUniversityofTechnology,China)Abstract:Analgorithmisintrod…  相似文献   

18.
理论推导与室内实验相结合,建立了低渗透非均质砂岩油藏启动压力梯度确定方法。首先借助油藏流场与电场相似的原理,推导了非均质砂岩油藏启动压力梯度计算公式。其次基于稳定流实验方法,建立了非均质砂岩油藏启动压力梯度测试方法。结果表明:低渗透非均质砂岩油藏的启动压力梯度确定遵循两个等效原则。平面非均质油藏的启动压力梯度等于各级渗透率段的启动压力梯度关于长度的加权平均;纵向非均质油藏的启动压力梯度等于各渗透率层的启动压力梯度关于渗透率与渗流面积乘积的加权平均。研究成果可用于有效指导低渗透非均质砂岩油藏的合理井距确定,促进该类油藏的高效开发。  相似文献   

19.
Quality traits in wheat (Triticum aestirum L.) were studied by quantitative trait locus (QTL) analysis in a recombinant inbred line (RIL) population, a set of 131 lines derived from Chuan 35050 × Shannong 483 cross (ChSh). Grains from RILs were assayed for 21 quality traits related to protein and starch. A total of 35 putative QTLs for 19 traits with a single QTL explaining 7.99-40.52% of phenotypic variations were detected on 10 chromosomes, 1D, 2A, 2D, 3B, 3D, 5A, 6A, 6B, 6D, and 7B. The additive effects of 30 QTLs were positive, contributed by Chuan 35050, the remaining 5 QTLs were negative with the additive effect contributed by Shannong 483. For protein traits, 15 QTLs were obtained and most of them were located on chromosomes 1 D, 3B and 6D, while 20 QTLs for starch traits were detected and most of them were located on chromosomes 3D, 6B and 7B. Only 7 QTLs for protein and starch traits were co-located in three regions on chromosomes 1D, 2A and 2D. These protein and starch trait QTLs showed a distinct distribution pattern in certain regions and chromosomes. Twenty-two QTLs were clustered in 6 regions of 5 chromosomes. Two QTL clusters for protein traits were located on chromosomes 1D and 3B, respectively, three clusters for starch traits on chromosomes 3D, 6B and 7B, and one cluster including protein and starch traits on chromosome 1D.  相似文献   

20.
As an American modern novelist who were famous in the literary world, Hemingway was not a person who always followed the trend but a sharp observer. At the same time, he was a tragedy maestro, he paid great attention on existence, fate and end-result. The dramatis personae's tragedy of his works was an extreme limit by all means tragedy on the meaning of fearless challenge that failed. The beauty of tragedy was not produced on the destruction of life, but now this kind of value was in the impact activity. They performed for the reader about the tragedy on challenging for the limit and the death.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号