1.

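王金田 唐昊 程文娟 毕翔 《系统工程学报》2011,26(5)

This paper studies asynchronous dynamic pricing for two sellers in an electronic retail market who exchange no information with each other. Based on performance potential theory, Q-learning and WoLF-PHC algorithms are developed for asynchronous pricing policies that apply under both the average and the discounted optimization criteria, and a numerical example compares the learning and optimization performance of the algorithms. Simulation results show that both Q-learning and WoLF-PHC handle the asynchronous dynamic pricing problem well, but because WoLF-PHC uses mixed strategies and a variable learning rate, it adapts better to environmental changes and achieves better learning and optimization performance.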
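
The mixed strategy and variable learning rate mentioned in this abstract can be illustrated with a minimal tabular sketch of the textbook WoLF-PHC update under the discounted criterion. This is not the authors' performance-potential formulation: the class name, the step sizes `delta_win` and `delta_lose`, the `epsilon` exploration rate, and the state and action encoding are all assumptions made for illustration.

```python
import random


class WoLFPHC:
    """Tabular WoLF-PHC learner for a single agent (illustrative sketch)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04, epsilon=0.05, seed=0):
        self.nA = n_actions
        self.alpha, self.gamma = alpha, gamma
        # "Win or learn fast": small policy step when winning, larger when losing.
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.epsilon = epsilon
        self.Q = [[0.0] * n_actions for _ in range(n_states)]
        self.pi = [[1.0 / n_actions] * n_actions for _ in range(n_states)]
        self.avg_pi = [[1.0 / n_actions] * n_actions for _ in range(n_states)]
        self.visits = [0] * n_states
        self.rng = random.Random(seed)

    def act(self, s):
        """Sample an action from the mixed policy, with a little exploration."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.nA)
        return self.rng.choices(range(self.nA), weights=self.pi[s])[0]

    def update(self, s, a, r, s_next):
        # 1. Ordinary discounted Q-learning step.
        target = r + self.gamma * max(self.Q[s_next])
        self.Q[s][a] += self.alpha * (target - self.Q[s][a])

        # 2. Running average of the policies used so far in state s.
        self.visits[s] += 1
        for b in range(self.nA):
            self.avg_pi[s][b] += (self.pi[s][b] - self.avg_pi[s][b]) / self.visits[s]

        # 3. Variable learning rate: "winning" means the current policy has a
        #    higher expected value than the average policy.
        v_pi = sum(p * q for p, q in zip(self.pi[s], self.Q[s]))
        v_avg = sum(p * q for p, q in zip(self.avg_pi[s], self.Q[s]))
        delta = self.delta_win if v_pi > v_avg else self.delta_lose

        # 4. Policy hill climbing: shift probability mass to the greedy action.
        best = max(range(self.nA), key=lambda b: self.Q[s][b])
        for b in range(self.nA):
            if b != best:
                step = min(self.pi[s][b], delta / (self.nA - 1))
                self.pi[s][b] -= step
                self.pi[s][best] += step
```

In a pricing setting, each seller would hold its own learner, call `act` to choose a price index, and call `update` with its own observed profit; plain Q-learning corresponds to keeping only step 1 and acting greedily on `Q`.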

2.

Multi-agent reinforcement learning based on policies of global objective

张化祥 黄上腾 《系统工程与电子技术(英文版)》2005,16(3)

1. INTRODUCTION Because an agent's reward is a function of all agents' joint action, when applying RL [1] to multi-agent domains, some fundamental change should be made. By adapting single-agent Q-learning [2] to Markov games, several algorithms have been proposed, such as Littman's minimax Q-learning (minimax-Q) [3], Hu et al.'s Nash Q-learning (Nash-Q) [4,5], Claus et al.'s cooperative multi-agent Q-learning [6], Bowling et al.'s multi-agent Q-learning using a variable learning rate [7-9], …

3.

A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning

MA Ye CHANG Tianqing FAN Wenhui 《系统工程与电子技术(英文版)》2021,(3):642-657

In the evolutionary game of the same task for groups, the changes in game rules, personal interests, the crowd size, and external supervision cause uncertain effect...

4.

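Whole-process swing-up and balance control of an inverted pendulum based on reinforcement learning Total citations: 4, self-citations: 0, citations by others: 4

张荣 陈卫东 《系统工程与电子技术》2004,26(1):72-76

Inverted pendulum control is a typical nonlinear control problem. The goal of this paper is to achieve whole-process control of swing-up and balancing with a reinforcement learning controller, assuming no knowledge of any inverted pendulum model. To improve learning efficiency, task decomposition is used: the overall control task is split into two subtasks, swing-up and balancing, and a different reinforcement learning algorithm is applied to each subtask according to its characteristics. Simulation experiments in Matlab/Simulink show that the method can learn a successful control policy within a reasonable time.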

5.

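Research on a controllable network security system based on multiple agents

周剑岚 刘先荣 宋四新 《系统工程与电子技术》2008,30(6)

Mobile security agents scan client hosts for vulnerabilities and collect audit logs that record abnormal activity, providing security assurance both before and after an incident, but the security of the mobile agents' own communication and migration is equally important. First, a hardware-characteristic key is combined with user information to implement an agent-based multi-factor authentication system; on top of this authentication, asymmetric encryption and key management secure agent communication and migration. Because agents are software and therefore vulnerable to external tampering, detection agents are employed: through agent cooperation, the Address Resolution Protocol is used to scan nodes within the network, turning a wide-area-network scanning mechanism into simple intranet scanning and thereby ensuring reliable deployment of the authentication agents on client hosts. Experimental results show that the system is efficient, scalable, and general.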

6.

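Modeling of computer generated forces based on multi-agent systems Total citations: 1, self-citations: 0, citations by others: 1

陈坚 廖守亿 邓方林 《系统工程与电子技术》2008,30(10)

Multi-agent system (MAS) theory is introduced into research on computer generated forces (CGF), and, based on object-oriented Petri nets (OPN), a general formal MAS model suited to CGF, ArmyMAS, is established. ArmyMAS describes three units: combat entity agents, management agents, and configuration. It vividly captures the structural and behavioral characteristics of CGF, and the model can be analyzed and verified with Petri-net analysis methods and tools. Finally, ArmyMAS is used to model and analyze a CGF system for ballistic missile attack-defense confrontation, which verifies the effectiveness of the model.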

7.

Performance evaluation for damping controllers of power systems based on multi-agent models

Ancheng Xue Yiguang Hong 《系统科学与复杂性》2009,22(1):77-87

This paper proposes a multi-layer multi-agent model for the performance evaluation of power systems, which is different from the existing multi-agent ones. To describe the impact of the structure of the networked power system, the proposed model consists of three kinds of agents that form three layers: control agents such as the generators and associated controllers, information agents to exchange the information based on the wide area measurement system (WAMS) or transmit control signals to the power system stabilizers (PSSs), and network-node agents such as the generation nodes and load nodes connected with transmission lines. An optimal index is presented to evaluate the performance of damping controllers to the system's inter-area oscillation with respect to the information-layer topology. Then, the authors show that the inter-area information exchange is more powerful than the exchange within a given area to control the inter-area low frequency oscillation based on simulation analysis. This work was supported in part by the National Natural Science Foundation of China under Grants Nos. 50707035, 50595411, 60425307, 60221301, and 50607005, in part by the 111 project (B08013) and Program for Changjiang Scholars and Innovative Research Team in University (IRT0515) and in part by the Program for New Century Excellent Talents in University (NCET-05-0216).

8.

UAV maneuvering decision-making algorithm based on deep reinforcement learning under the guidance of expert experience

ZHAN Guang;ZHANG Kun;LI Ke;PIAO Haiyin 《系统工程与电子技术(英文版)》2024,(3):644-665

Autonomous unmanned aerial vehicle (UAV) manipulation is necessary for the defense department to execute tactical missions given by commanders in the future unmanned battlefield. A large amount of research has been devoted to improving the autonomous decision-making ability of UAVs in an interactive environment, where finding the optimal maneuvering decision-making policy has become one of the key issues for enabling UAV intelligence. In this paper, we propose a maneuvering decision-making algorithm for autonomous air-delivery based on deep reinforcement learning under the guidance of expert experience. Specifically, we refine the guidance-towards-area and guidance-towards-specific-point tasks for the air-delivery process based on traditional air-to-surface fire control methods. Moreover, we construct the UAV maneuvering decision-making model based on Markov decision processes (MDPs). We then present a reward shaping method for the two guidance tasks using a potential-based function and expert-guided advice. The proposed algorithm can accelerate the convergence of the maneuvering decision-making policy and increase the stability of the policy output during the later stage of the training process. The effectiveness of the proposed maneuvering decision-making policy is illustrated by the curves of training parameters and extensive experimental results for testing the trained policy.
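
The potential-based reward shaping named in this abstract is commonly written as r' = r + γΦ(s') − Φ(s). The sketch below shows that general form only. The potential used here (negative distance to a target point) and the helper names are hypothetical stand-ins; the paper's actual potential function and expert advice are not given in the abstract.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]


def negative_distance_to(target: Point) -> Callable[[Point], float]:
    """Hypothetical potential: states closer to the target have higher potential."""
    def phi(p: Point) -> float:
        return -math.hypot(p[0] - target[0], p[1] - target[1])
    return phi


def make_shaped_reward(potential: Callable[[Point], float], gamma: float):
    """Return a function computing r + gamma * phi(s') - phi(s)."""
    def shaped(reward: float, state: Point, next_state: Point) -> float:
        return reward + gamma * potential(next_state) - potential(state)
    return shaped
```

With `gamma=1.0` and a target at the origin, a step from (3, 4) to the target adds a bonus of 5 to the environment reward, while the reverse step subtracts 5, so the shaping term rewards progress toward the target without favoring loops.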

9.

Novel ensemble learning based on multiple section distribution in distributed environment

Fang Min 《系统工程与电子技术(英文版)》2008,19(2):377-380

Most ensemble learning algorithms use a centralized model that requires all training instances to be gathered on a single station, which is difficult in a distributed environment. A distributed ensemble learning algorithm is proposed which has two kinds of weight genes of instances that denote the global distribution and the local distribution. Instead of the repeated sampling method of standard ensemble learning, non-balanced sampling from each station is used to train the base classifier set of each station. The concept of the effective nearby region for the local integration classifier is proposed and is used for the dynamic integration of multiple classifiers in a distributed environment. The experiments show that the proposed distributed ensemble learning algorithm effectively reduces the time needed to train the base classifiers while keeping classification performance on par with the centralized learning method.

10.

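A multi-agent learning algorithm based on influence diagrams

钟麟 陈丽娟 佟明安 张圣云 《系统工程学报》2008,23(3):377-380

A multi-agent learning algorithm is proposed. Influence diagrams are used to represent agents: given an initial model of an agent and its behavior history, a new model is built by learning its capabilities, beliefs, and preferences. The learning method takes the behavior history of other agents as the training set and uses a neural network, together with decision knowledge and expert knowledge, to revise the connections among the nodes of the influence diagram. Where the model is inconsistent with an agent's behavior history, the inconsistency is treated as a random deviation of the utility function, which is simulated with the Markov chain Monte Carlo technique in order to adjust the utility function. Finally, cooperative air combat by a multi-aircraft formation is used as an example to illustrate the practicality of the algorithm.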

11.

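Fuzzy Markov game control and simulation for partner selection Total citations: 1, self-citations: 0, citations by others: 1

王惠 符策 谢益武 许瑞雪 杨小佳 《系统仿真学报》2007,19(15):3572-3576

To address partner selection decisions under uncertainty, adaptive fuzzy control system theory and neural network theory are introduced into Markov games, and a multi-agent fuzzy control model for partner selection is proposed. The model introduces a fuzzy neural network (FNN) based on ANFIS and neural networks, yielding a new gradient-descent Q-learning algorithm with value function approximation. The model is then applied to the partner selection problem: the FNN learns from multiple influencing factors, and its output is used as the input of a standard Markov game model to obtain the resulting policy. Finally, an application example is studied, and the modeling method and the model are validated and analyzed with actual historical data.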

12.

Incremental support vector machine algorithm based on multi-kernel learning

Zhiyu Li Junfeng Zhang Shousong Hu 《系统工程与电子技术(英文版)》2011,22(4):702-706

A new incremental support vector machine (SVM) algorithm based on multiple kernel learning is proposed. By introducing multiple kernel learning into SVM incremental learning, large-scale data set learning problems can be solved effectively. Furthermore, different penalties are applied to the training subset and the acquired support vectors, which may help to improve the performance of the SVM. Simulation results indicate that the proposed algorithm not only solves the model selection problem in SVM incremental learning but also improves classification or prediction precision.

13.

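Iterative learning control method based on shifted Legendre orthogonal polynomials

张丽萍 杨富文 《系统工程与电子技术》2005,27(3):483-485

Iterative learning control is considered for the terminal control problem of a class of linear time-varying continuous systems. Using the expansion technique of shifted Legendre orthogonal polynomials, together with their orthogonality and boundary conditions, the differential equations of the linear time-varying system are converted into algebraic equations, which avoids solving for the state transition matrix of the linear time-varying system when checking the error convergence condition. A high-order learning law is used to obtain the shifted Legendre coefficient vector of the control input, and a simulation example verifies the effectiveness of the method.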

14.

Iterative learning based fault diagnosis for discrete linear uncertain systems

Wei Cao Ming Sun 《系统工程与电子技术(英文版)》2014,(3):496-501

To detect and estimate faults in discrete linear time-varying uncertain systems, a discrete iterative learning strategy is applied to fault diagnosis, and a novel fault detection and estimation algorithm is proposed. A threshold-limiting technique is also adopted in the proposed algorithm. Within the chosen optimal time region, residual signals are used to correct the introduced virtual faults with iterative learning rules, so that the virtual faults approach those occurring in the practical system. The same method is repeated in the remaining optimal time regions, thereby achieving fault diagnosis. The proposed algorithm not only completes fault detection and estimation for discrete linear time-varying uncertain systems, but also improves the reliability of fault detection and decreases the false alarm rate. The final simulation results verify the validity of the proposed algorithm.

15.

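PID controller design based on iterative learning control Total citations: 4, self-citations: 0, citations by others: 4

张怀相 原魁 邹伟 《系统工程与电子技术》2006,28(8):1225-1228

A new PID parameter tuning algorithm is proposed as an alternative to traditional empirical PID tuning methods. The algorithm first uses PD-type iterative learning control to track the desired trajectory; then, from the input and output data sequences of the iterative learning control, a strong tracking filter performs parameter identification to obtain optimized PID control parameters corresponding to the desired trajectory. The convergence condition of the iterative learning control is given, along with how the strong tracking filter is used for parameter identification. Simulation and experimental results show that a PID controller designed with this algorithm gives the controlled system good dynamic performance and strong robustness.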
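
The first stage described in this abstract, PD-type iterative learning control for trajectory tracking, can be sketched on a toy plant. Everything below is an assumption made for illustration: the first-order discrete plant, the gains `kp` and `kd`, and the function names are not taken from the paper, and the strong-tracking-filter identification stage is not shown.

```python
import math


def simulate(u, a=0.5, b=1.0):
    """Assumed plant x(t+1) = a*x(t) + b*u(t) with x(0) = 0; returns x(0..N)."""
    x = [0.0]
    for ut in u:
        x.append(a * x[-1] + b * ut)
    return x


def pd_ilc(reference, iterations=30, kp=0.3, kd=0.2):
    """Repeat the same trial, refining the whole input sequence after each run.

    Update law (discrete PD type):
        u_next(t) = u(t) + kp * e(t+1) + kd * (e(t+1) - e(t))
    where e is the tracking error of the trial that just finished.
    """
    n = len(reference) - 1
    u = [0.0] * n
    max_errors = []  # largest absolute tracking error seen in each trial
    for _ in range(iterations):
        y = simulate(u)
        e = [r - yt for r, yt in zip(reference, y)]
        max_errors.append(max(abs(v) for v in e))
        u = [u[t] + kp * e[t + 1] + kd * (e[t + 1] - e[t]) for t in range(n)]
    return u, max_errors


# Desired trajectory: one period of a sine wave, starting at the plant's x(0) = 0.
reference = [math.sin(2 * math.pi * t / 20) for t in range(21)]
u_learned, max_errors = pd_ilc(reference)
```

For this plant the error contracts from trial to trial because |1 − (kp + kd)·b| = 0.5 < 1, which is the kind of convergence condition the abstract refers to. The learned input and the resulting output sequence are the data a later identification stage would consume.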

16.

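Remote sensing image classification based on a hybrid learning vector quantization algorithm

崔宝侠 刘伟 《系统工程与电子技术》2005,27(6):1090-1092

Based on an analysis of the Kohonen self-organizing feature map (SOFM) network and the learning vector quantization (LVQ) algorithm, a hybrid learning vector quantization (HLVQ) method built on an improved SOFM algorithm and the LVQ2 algorithm is proposed, and a general HLVQ-based model for unsupervised and supervised classification of remote sensing images is established. Compared with traditional statistical classification methods and the LVQ2 network classifier, the HLVQ classifier achieves better overall classification performance and a higher recognition rate.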

17.

Immune multi-agent model using vaccine for cooperative air-defense system of systems for surface warship formation based on danger theory

Jun Wang Xiaozhe Zhao Beiping Xu Wei Wang Zhiyong Niu 《系统工程与电子技术(英文版)》2013,(6):946-953

To address cooperative air defense of a surface warship formation, this paper maps the cooperative air-defense system of systems (SoS) for surface warship formation (CASoSSWF) to the biological immune system (BIS), based on the similarity of their defense mechanisms and characteristics. It then designs the component models and the architecture for a monitoring agent, a regulating agent, a killer agent, a pre-warning agent and a communicating agent, using the theories and methods of the artificial immune system, the multi-agent system (MAS), the vaccine and the danger theory (DT). Moreover, a new immune multi-agent model using vaccine based on DT (IMMUVBDT) for the cooperative air-defense SoS is advanced. The immune response and immune mechanism of the CASoSSWF are analyzed. The model is capable of memory, evolution, self-learning and good adaptation to dynamic environments, and it adequately embodies the cooperative air-defense mechanism of the CASoSSWF. It therefore offers a novel idea for the CASoSSWF and can provide conceptual models for a surface warship formation operation simulation system.

18.

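D-type closed-loop iterative learning control based on 2-D system theory

丁伟东 孙志毅 吴聚华 阎学文 《系统仿真学报》2002,14(11):1528-1530

The theory of 2-D linear continuous-discrete systems is applied to continuous linear iterative learning control systems, and a mathematical model that describes the iterative learning control process well, the Roesser model of a 2-D linear continuous-discrete system, is given. On the basis of 2-D system theory, the convergence of the D-type closed-loop iterative learning control law is proved. A closed-loop iterative learning controller designed according to this theory is subject to fewer restrictions.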

19.

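Multidimensional function approximation theory and learning algorithm based on functional networks Total citations: 7, self-citations: 1, citations by others: 7

周永权 赵斌 焦李成 《系统工程与电子技术》2005,27(5):906-909

A functional network method for multidimensional function approximation is proposed, a class of separable functional networks for function approximation is designed, and a functional-network-based learning algorithm for function approximation is given. The parameters of the functional network are obtained by solving a system of equations, and the networks can approximate a given function to a prescribed accuracy. Simulation results show that the approach is simple and feasible, with fast convergence and good approximation performance.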

20.

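A programming model method for learning from examples and feature selection

李敏强 寇纪淞 戴林 《系统工程学报》2000,15(2):163-167,207

Based on extension matrix theory, a programming-model solution method is proposed using mathematical programming theory, which better supports concept learning and feature extraction. Compared with traditional heuristic algorithms, the programming model solved with a genetic algorithm can find multiple global optimal solutions as well as feasible solutions. Example computations demonstrate the effectiveness of the method.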