Similar articles
 Found 20 similar articles; search took 15 ms
1.
Iterative methods that take advantage of efficient block operations and block communications are popular research topics in parallel computation. These methods are especially important on Massively Parallel Processors (MPPs). This paper presents a block variant of the GMRES method for solving general unsymmetric linear systems. It is shown that the new algorithm with block size s, denoted BVGMRES(s,m), is theoretically equivalent to the GMRES(s·m) method. The numerical results show that this algorithm can be more efficient than the standard GMRES method on a cache-based single-CPU computer with optimized BLAS kernels. Furthermore, the gain in efficiency is more significant on MPPs because both block operations and block data communications are efficient. Our numerical results also show that, compared with the standard GMRES method, the BVGMRES(s,m) algorithm becomes more efficient as more PEs are used on an MPP.

2.
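Parallel programs can fully exploit hardware computing power and improve performance, but writing them for multicore clusters is very complex. This paper proposes Horde, a parallel programming framework for multicore clusters. Horde provides a set of easy-to-use message-passing interfaces and an event-driven programming model that help programmers express the parallelism latent in an algorithm's logic and decouple the decomposition of the computation from the underlying hardware structure. This reduces the complexity of writing parallel programs and lets them be mapped flexibly onto clusters with different underlying structures while maintaining good performance. Horde also provides an effective task-object migration mechanism that enables dynamic load balancing and online fault tolerance. Experiments on a 128-core cluster show that Horde executes parallel programs effectively and migrates task objects efficiently.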

3.
This paper reports on progress made in the first 3 years of ATR's “CAM-Brain” Project, which aims to use “evolutionary engineering” techniques to build/grow/evolve a RAM-and-cellular-automata based artificial brain consisting of thousands of interconnected neural network modules inside special hardware such as MIT's Cellular Automata Machine “CAM-8” or NTT's Content Addressable Memory System “CAM-System”. The states of a billion (later a trillion) 3D cellular automata cells, and millions of cellular automata rules that govern their state changes, can be stored relatively cheaply in giga(tera)bytes of RAM. After 3 years' work, the CA rules are almost ready. MIT's “CAM-8” (essentially a serial device) can update 200,000,000 CA cells a second. It is possible that NTT's “CAM-System” (essentially a massively parallel device) may be able to update a trillion CA cells a second. Hence all the ingredients will soon be ready to create a revolutionary new technology that will allow thousands of evolved neural network modules to be assembled into artificial brains. This in turn will probably create not only a new research field but, hopefully, a whole new industry, namely “brain building”. Building artificial brains with a billion neurons is the aim of ATR's 8-year “CAM-Brain” research project, ending in 2001.

4.
Eichenbaum and colleagues observed that the same place did or did not activate "goal-approach" cells in the hippocampus depending on whether the place lay on the rat's route to a specific goal. In parallel with this, the present neuroimaging study revealed that the same type of item activated the hippocampus more when it was related to the task at hand than when it was not. Participants were scanned with fMRI while they judged the type of relationship contained in word pairs (e.g., does the word pair "furniture-table" contain a "category-exemplar" relationship?). Event-related analysis revealed that forming "task-related" associations activated the hippocampus more than forming "task-unrelated" ones, even for the same type of items, and that this hippocampal difference was caused neither by the different judgment requirements nor by the effect of a "yes" response. Consistently, the post-judgment cued-recall test showed better retrieval for "task-related" associations than for "task-unrelated" associations of the same type. The results also showed that semantic relatedness between the to-be-associated words (e.g., the related pair "healthy-hospital" versus the unrelated pair "price-way") was not enough to activate the hippocampus when it was "task-unrelated". In general, we propose that, by participating in the formation of "task-related" associations and the consolidation of episodic memory, the hippocampus enables the organism to retain information of great survival value for future use.

5.
The design and implementation of a scalable parallel mining system targeted at big graph analysis has proven to be challenging. In this study, we propose a parallel data mining system for analyzing big graph data, built on the Bulk Synchronous Parallel (BSP) computing model and named BSP-based Parallel Graph Mining (BPGM). The system has four sets of parallel graph mining algorithms programmed in the BSP parallel model and a well-designed workflow engine, optimized for cloud computing, that invokes these algorithms. Experimental results show that the graph mining algorithm components in BPGM are efficient and perform better than the big cloud-based parallel data miner and BC-BSP.

6.
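A scalable block tridiagonal parallel algorithm for boundary value problems of the three-dimensional Poisson equation   Total citations: 1, self-citations: 1, citations by others: 0
To explore parallel methods for solving the boundary value problem of the three-dimensional Poisson equation with Dirichlet boundary conditions, this paper solves the system with a scalable block tridiagonal parallel algorithm. It introduces the degree of parallelism of a difference scheme, a concept that reflects the scheme's inherent parallelism, and uses it to explain the relationship between that inherent parallelism and the performance of the parallel algorithm. Numerical experiments on Shanghai University's "Ziqiang 3000" (自强3000) computer agree with the theoretical analysis: linear speedup is obtained without loss of accuracy, and the parallel efficiency exceeds 90%.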

7.
Parallel frequent pattern discovery algorithms exploit parallel and distributed computing resources to relieve the sequential bottlenecks of current frequent pattern mining (FPM) algorithms. Because parallel FPM algorithms achieve better scalability and performance, they are attracting much attention in the data mining research community. This paper presents a comprehensive survey of state-of-the-art parallel and distributed frequent pattern mining algorithms, with emphasis on pattern discovery from complex data (e.g., sequences and graphs) on various platforms. A review of typical parallel FPM algorithms uncovers the major challenges, methodologies, and research problems in the field, such as workload balancing, finding good data layouts, and data decomposition. The survey also indicates a dramatic shift of research interest from simple parallel frequent itemset mining on traditional parallel and distributed platforms to parallel pattern mining of more complex data on emerging architectures, such as multi-core systems and the increasingly mature grid infrastructure.
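
The data decomposition mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simple count-distribution scheme (each partition counts candidate itemsets locally and the counts are then merged); the function names and the toy data are hypothetical and are not taken from the surveyed paper.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, size):
    """Count every itemset of the given size within one data partition."""
    counts = Counter()
    for t in transactions:
        for itemset in combinations(sorted(set(t)), size):
            counts[itemset] += 1
    return counts


def frequent_itemsets(partitions, size, min_support):
    """Merge per-partition counts, then keep itemsets meeting min_support."""
    total = Counter()
    for part in partitions:  # each call could run on a separate worker
        total.update(local_counts(part, size))
    return {s: c for s, c in total.items() if c >= min_support}


# Hypothetical toy data: five transactions split into two partitions.
partitions = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b"], ["a", "b", "c"]],
]
pairs = frequent_itemsets(partitions, size=2, min_support=3)
```

Here the loop over partitions runs serially; in a parallel setting each `local_counts` call would be handed to a different processor and only the counters would be communicated.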

8.
0 Introduction  Parallel-structure manipulators are attracting more attention from designers and scholars due to their merits of compact structure, high precision, high repeatability, high stiffness, and large load-volume ratio. Research on parallel manipulators has been a hot issue because of their applications in bioengineering [1], precision engineering [2], precision sensor systems [3,4], mechanical manufacturing [5,6], and so on. In fact, it is very difficult for these parallel manipulator systems to attain furth…

9.
Parallel algorithms have been designed for the past 20 years, initially by parallelising existing sequential algorithms for many different parallel architectures. More recently, parallel strategies have been identified and utilised, resulting in many new parallel algorithms. However, analysis of such algorithms reveals that further strategies can be applied to increase the parallelism. One of these, increasing the computational capacity in each processing node, can reduce the congestion/communication in shared memory/distributed memory multiprocessor systems and dramatically improve the performance of the algorithm. Two algorithms are identified and studied. The first is the cyclic reduction method for solving large tridiagonal linear systems, in which the odd/even sequence is increased to a ‘stride of 3’ or more, resulting in an improved algorithm. Similarly, the Gaussian Elimination method for solving linear systems, in which one element is eliminated at a time, can be adapted to a parallel form in which two elements are eliminated simultaneously, resulting in the Parallel Implicit Elimination (P.I.E.) method. Numerical results are presented to support the analyses.

10.
In the K-means clustering algorithm, each data point is uniquely placed into one category. The clustering quality depends heavily on the initial cluster centroids: different initializations can yield different results, and local adjustment cannot rescue the clustering from a poor local optimum. An anomaly in a cluster seriously affects the cluster mean. The K-means clustering algorithm is also only suitable for clusters with convex shapes. We therefore propose a novel clustering algorithm, CARDBK, where "centroid all rank distance" (CARD) means that all centroids are sorted by their distance from a point and "BK" stands for "batch K-means". In CARDBK, a point modifies not only the cluster centroid nearest to it but also multiple neighbouring cluster centroids, and the degree of influence of a point on a cluster centroid depends on the distance between this point and the other, nearer cluster centroids. Experimental results showed that CARDBK outperformed other algorithms on a number of data sets according to the following performance indexes: entropy, purity, F1 value, Rand index, and normalized mutual information (NMI). Our algorithm also proved to be more stable, linearly scalable, and faster.
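
The sensitivity to initial centroids described in this abstract can be seen in a small sketch. It implements plain one-dimensional K-means (Lloyd's iteration), not the CARDBK algorithm itself; the function name and the data are assumptions chosen for illustration.

```python
def kmeans_1d(points, centroids, iterations=20):
    """Plain K-means on scalars: assign to nearest centroid, then re-average."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


points = [0, 1, 10, 11, 20, 21]                # three obvious groups
good = kmeans_1d(points, centroids=[0, 10, 20])
bad = kmeans_1d(points, centroids=[0, 1, 15])  # two seeds in the same group
```

With the second initialization, two centroids stay trapped in the first group and the third centroid averages the remaining four points, which is the kind of poor local optimum the abstract refers to.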

11.
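The early-drop (早倒) technique on the parallel bars is an important basic technique for performing under-bar elements. This paper combines electrical stress measurement with synchronized fixed-point photography to analyze the kinetic and kinematic characteristics of the early-drop technique qualitatively and quantitatively in some detail, and puts forward several new theoretical points: (1) at the weightless instant, the gymnast must adjust the grip of both hands on the bars in time to avoid losing hold; (2) during the weightless phase, body posture must be adjusted quickly and accurately in order to develop difficult new elements; (3) when performing elements such as the backward circle to handstand, the "zhengtao" (正掏) technique from the horizontal bar can be applied approximately. All of this should help gymnasts learn and master the early-drop technique on the parallel bars, develop difficult new elements, and raise their technical level quickly.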

12.
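“C位” ("C position", the center position) is a loanword originating from Korea. It was extremely popular in 2018 and was widely used on social networks and in the mass media. This paper collects relevant language data, briefly describes how “C位” is formed, and examines its development and usage in Chinese. It then analyzes “C位” from the perspective of memetics and, by comparing it with the native word “锦鲤” (koi), explores why it ultimately became a weak language meme.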

13.
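Research on high-efficiency networked parallel FDTD computation in an MPI environment   Total citations: 1, self-citations: 1, citations by others: 0
Using FDTD (Finite Difference Time Domain) to compute complex electromagnetic field problems suffers from long computation times and heavy memory consumption. Parallel computing reduces the workload of each machine and is an effective way to address this difficulty. Taking into account the characteristics of networked parallel systems together with the FDTD algorithm, this paper proposes effective optimization steps and uses the MPI parallel library to implement high-efficiency parallel FDTD computation. An example three-dimensional parallel FDTD computation was completed on a networked parallel computer system made up of 16 microcomputers. The results confirm the correctness of the scheme and show a fairly high parallel efficiency.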
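
A minimal sketch of the domain-decomposition idea behind parallel FDTD follows. It assumes a one-dimensional grid in normalized units rather than the paper's three-dimensional setup, and it simulates the boundary ("halo") exchange with ordinary variables instead of MPI calls; all names here are hypothetical.

```python
def run_single(ez0, steps, c=0.5):
    """Reference 1D FDTD run on one domain; c is the Courant number."""
    ez = list(ez0)
    hy = [0.0] * (len(ez) - 1)           # hy[k] sits between ez[k] and ez[k+1]
    for _ in range(steps):
        for k in range(len(hy)):
            hy[k] += c * (ez[k + 1] - ez[k])
        for k in range(1, len(ez) - 1):  # the two end cells are never updated
            ez[k] += c * (hy[k] - hy[k - 1])
    return ez


def run_split(ez0, m, steps, c=0.5):
    """Same scheme with the grid cut at cell m into two subdomains."""
    n = len(ez0)
    ez_l, hy_l = list(ez0[:m]), [0.0] * m            # ez[0..m-1], hy[0..m-1]
    ez_r, hy_r = list(ez0[m:]), [0.0] * (n - m - 1)  # ez[m..n-1], hy[m..n-2]
    for _ in range(steps):
        halo_ez = ez_r[0]                # "message" from right to left
        for k in range(m - 1):
            hy_l[k] += c * (ez_l[k + 1] - ez_l[k])
        hy_l[m - 1] += c * (halo_ez - ez_l[m - 1])
        for k in range(n - m - 1):
            hy_r[k] += c * (ez_r[k + 1] - ez_r[k])
        halo_hy = hy_l[m - 1]            # "message" from left to right
        for k in range(1, m):
            ez_l[k] += c * (hy_l[k] - hy_l[k - 1])
        ez_r[0] += c * (hy_r[0] - halo_hy)
        for k in range(1, n - m - 1):
            ez_r[k] += c * (hy_r[k] - hy_r[k - 1])
    return ez_l + ez_r


ez0 = [0.0] * 40
ez0[10] = 1.0                            # unit spike as the initial field
single = run_single(ez0, steps=50)
split = run_split(ez0, m=20, steps=50)
```

Each subdomain needs only one field value from its neighbour per half step, so the communication volume grows with the interface size while the computation grows with the subdomain size. The split run reproduces the single-domain result.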

14.
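This paper describes in detail how the DPPEJ (Distributed Parallel Programming Environment for Java) software package works, and implements in Java parallel programs for the iterative and recursive FFT algorithms that run in the DPPEJ environment. Performance analysis of the experimental data shows that, thanks to the simplicity and portability of Java code, DPPEJ-based Java parallel programs can use RMI (remote method invocation) to make good use of computing resources in a network environment for distributed parallel computing, and therefore have some application prospects.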
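
For readers unfamiliar with the recursive FFT mentioned in this abstract, the following is a minimal sketch of the radix-2 Cooley-Tukey recursion. It is written in Python rather than Java, runs serially, and is not the DPPEJ implementation; the function names are assumptions, and the input length is assumed to be a power of two.

```python
import cmath


def fft_recursive(x):
    """Radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_recursive(x[0::2])   # these two calls are independent,
    odd = fft_recursive(x[1::2])    # so they could run on different nodes
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft_naive(x):
    """Direct O(n^2) DFT, used here only to check the recursive version."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


signal = [1, 2, 3, 4, 0, 0, 0, 0]
spectrum = fft_recursive(signal)
```

The two half-size subproblems share no data until the final combine loop, which is what makes the recursive form a natural candidate for distribution across machines.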

15.
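Reliability-based maintenance design of industrial buildings   Total citations: 1, self-citations: 0, citations by others: 1
Guo Yuancheng (郭院成), Lu Zhipeng (鲁志鹏), Zhao Ming (赵明). Henan Science (《河南科学》), 2000, 18(3): 305-308
Industrial building structures operate in corrosive environments, and the load-bearing capacity of their structural members decays at different rates during service, so the control sections of the structural system have different reliability levels at the time of maintenance. Taking into account the series relationship among failure modes and the parallel relationship among control sections, the principle of equal reliability allocation requires a different maintenance reliability increment for each control section. On this basis, the paper studies a reliability-based maintenance design method for industrial building structural systems, which offers important theoretical guidance for achieving the target reliability level of maintained structures.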
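
The series and parallel relationships mentioned in this abstract can be illustrated with the textbook formulas for systems of independent components. This is only a sketch under that independence assumption; the grouping of control sections into failure modes and all numbers are hypothetical, and the paper's own allocation method is not reproduced here.

```python
from math import prod


def parallel(reliabilities):
    """Parallel group: it fails only if every member fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def series(reliabilities):
    """Series chain: it survives only if every member survives."""
    return prod(reliabilities)


def system_reliability(modes):
    """Failure modes in series; control sections within a mode in parallel."""
    return series(parallel(sections) for sections in modes)


# Hypothetical example: one mode with two sections, one mode with one section.
r_system = system_reliability([[0.9, 0.9], [0.95]])
```

Because the series product is dominated by its weakest factor, raising the reliability of the least reliable group gives the largest gain, which is one way to see why the maintenance increments need not be equal across sections.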

16.
To program parallel computing effectively on a NOW (network of workstations), users must be able to evaluate how well the system performs for a given application. In this paper, we present a framework that can be used to evaluate tree-structured computing on a NOW. Based on this framework, we derive a model for the well-known parallel programming paradigm, divide and conquer. We discuss how this model can be used to evaluate performance and to restructure the application to improve performance. Supported by the Foundation of Teaching Reform Facing the 21st Century of the National Education Committee. Zhang Jianjun: born in 1970, Master's student.

17.
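The fuzzy c-means (FCM) clustering algorithm was developed from the hard c-means (HCM) algorithm. Although it improves on the clustering quality of HCM, it also increases the time complexity. This paper proposes a parallel intrusion detection model based on classification by protocol analysis. Protocol analysis is used to divide a large data set into separate data sets; FCM clustering is first applied to each data set, and the result of each FCM clustering is then clustered again with FCM, forming a parallel processing system. Combining protocol analysis with high-speed packet capture, protocol parsing, and related techniques for distributed intrusion detection can increase the speed of intrusion detection.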

18.
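A merged-iteration solver for banded linear systems based on domain decomposition and MPI   Total citations: 1, self-citations: 0, citations by others: 1
Parallel solvers for banded linear systems are often based on two-level iterative domain decomposition methods and implemented with MPI (message passing interface). The resulting large total number of iterations or high interprocess communication overhead can make such solvers inefficient. By studying ways to reduce the number of iterations and lower the interprocess communication overhead, this paper designs an efficient merged-iteration parallel solver suited to domain decomposition and MPI systems. The solver introduces a global convergence acceleration algorithm that merges the two levels of iteration into one, effectively reducing the total number of iterations, and uses a block-parallel technique to lower the interprocess communication overhead of the acceleration algorithm on MPI systems. Experiments show that the merged-iteration parallel solver keeps the total number of iterations roughly equal to that of the serial solver, and that the block-parallel convergence acceleration technique reduces the global interprocess communication time by nearly one half.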

19.
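This paper presents HCPC (Heterogeneous Computers Parallel Computing), a parallel computing environment implemented on an interconnected network of multiple microcomputers and SUN workstations. The functionality of the HCPC system is verified through an analysis of its performance and a simulated implementation of an ART1 neural network on it.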

20.
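This paper analyzes the relationships among the components of the Matlab parallel computing toolbox, configures the key parameters of the distributed parallel computing environment, and builds a parallel computing cluster. Distributed parallel processing based on the Matlab cluster is then introduced into image matching. Taking the gray-level correlation matching algorithm as an example, gray-level image matching is implemented in parallel. The experimental results show that parallelization effectively shortens the matching time, which offers some guidance for further research on parallel image processing.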
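
As a rough illustration of how gray-level correlation matching can be divided among workers, the sketch below scores every template position with normalized cross-correlation and splits the candidate rows into chunks. It is written in Python rather than Matlab, processes the chunks serially, and uses hypothetical names and a toy image; it is not the paper's implementation.

```python
from math import sqrt


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized gray patches."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0   # flat patches get a neutral score


def best_in_rows(image, template, rows):
    """Best (score, position) among template placements starting in rows."""
    th, tw = len(template), len(template[0])
    best = (-2.0, None)
    for r in rows:
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, (r, c))
    return best


def match(image, template, workers=2):
    """Split the candidate rows into chunks, one per (simulated) worker."""
    positions = list(range(len(image) - len(template) + 1))
    chunks = [positions[i::workers] for i in range(workers)]
    results = [best_in_rows(image, template, ch) for ch in chunks]
    return max(results, key=lambda b: b[0])[1]


image = [
    [1, 1, 1, 1, 1],
    [1, 1, 9, 3, 1],
    [1, 1, 2, 8, 1],
    [1, 1, 1, 1, 1],
]
template = [[9, 3], [2, 8]]
location = match(image, template)
```

Every chunk is scored independently and only its best candidate is returned, so the final reduction is a single comparison per worker.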
