Similar Articles (20 results)
1.
The design and implementation of a scalable parallel mining system targeted at big graph analysis has proven challenging. In this study, we propose BSP-based Parallel Graph Mining (BPGM), a parallel data mining system for analyzing big graph data, built on the Bulk Synchronous Parallel (BSP) computing model. The system provides four sets of parallel graph mining algorithms programmed in the BSP model, together with a well-designed workflow engine, optimized for cloud computing, that invokes these algorithms. Experimental results show that the graph mining components of BPGM are efficient and outperform a cloud-based parallel data miner and BC-BSP.

2.
Over the past twenty years, a research group at the University of Science and Technology of China has developed an integrated research method for parallel computing that combines "Architecture-Algorithm-Programming-Application". This method is also called the ecological environment of parallel computing research. In this paper, we survey the current status of this integrated research method and, considering the impact of multi-core systems, cloud computing, and personal high-performance computers, present our outlook on the future development of parallel computing.

3.
Phylogenetic trees are widely used in evolutionary biology to represent the tree-like evolution of a collection of species. However, different data sets and different methods often lead to different phylogenetic trees for the same set of species. Comparing these trees to determine their similarities, or equivalently their dissimilarities, therefore becomes a fundamental issue. The Tree Bisection and Reconnection (TBR) and Subtree Prune and Regraft (SPR) distances have been proposed to facilitate such comparisons. In this paper, we survey the computational complexity, fixed-parameter algorithms, and approximation algorithms for computing the TBR and SPR distances of phylogenetic trees.

4.
Over the past decade, a paradigm shift has led consumers and enterprises to adopt cloud computing services. Even though most adoptions are still in the early stages of transition, there has been a steady increase in the use of the pay-as-you-go or pay-as-you-grow models offered by cloud providers. Whether the cloud is used as an extension of virtual infrastructure, software, or platform as a service, many users are still challenged by estimating adequate resource allocation amid wide variations in pricing. Customers need a simple method of predicting future demand in terms of the number of nodes to be allocated in the cloud environment. In this paper, we review and discuss existing methodologies for estimating the demand for cloud nodes and their corresponding pricing policies. Based on our review, we propose a novel approach that uses a Hidden Markov Model to estimate the acquisition of cloud nodes.
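The Hidden Markov Model idea can be sketched as follows. This is a minimal illustration, not the paper's model: the three demand levels, the transition and emission matrices, and the bucketed node-count observations are all invented for the example.

```python
# Hypothetical 3-state HMM over demand levels; all numbers are illustrative.
STATES = ["low", "medium", "high"]
A = [[0.7, 0.2, 0.1],    # state transition probabilities
     [0.3, 0.5, 0.2],
     [0.1, 0.3, 0.6]]
B = [[0.6, 0.3, 0.1],    # emission probs over observed node-count buckets
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]
PI = [0.5, 0.3, 0.2]     # initial state distribution

def forward(obs):
    """Forward algorithm: filtered state distribution after the observations."""
    alpha = [PI[s] * B[s][obs[0]] for s in range(3)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * A[p][s] for p in range(3)) * B[s][o]
                 for s in range(3)]
    z = sum(alpha)
    return [a / z for a in alpha]

def predict_next_demand(obs):
    """Most likely demand level one step ahead: propagate through A."""
    f = forward(obs)
    nxt = [sum(f[p] * A[p][s] for p in range(3)) for s in range(3)]
    return STATES[nxt.index(max(nxt))]
```

In this sketch, `forward` filters the hidden demand state from a history of observed node-count buckets, and the predicted level would drive the next allocation decision; a real estimator would learn the matrices from usage traces.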

5.
Viewing a visual cue as a kind of stimulus, the study of the visual perception process through probabilistic representations of visual cues has led to a class of statistical integration of multiple visual cues (IMVC) methods, which have been applied widely in perceptual grouping, video analysis, and other basic problems in computer vision. This paper surveys the basic ideas and recent advances of IMVC methods, focusing on the models and algorithms of IMVC for video analysis within the framework of Bayesian estimation. Two typical problems in video analysis, robust visual tracking and the "switching problem" in multi-target tracking (MTT), are taken as test beds to verify a series of Bayesian-based IMVC methods proposed by the authors. Finally, the relations between statistical IMVC and the visual perception process, as well as potential future research directions for IMVC, are discussed.
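Within the Bayesian framework, the simplest instance of statistical cue integration is the fusion of two independent Gaussian cue likelihoods. The closed form below is the standard precision-weighted combination; the function name is ours, not an API from the surveyed work.

```python
def fuse_gaussian_cues(mu1, var1, mu2, var2):
    """Bayesian fusion of two independent Gaussian cue likelihoods:
    the posterior mean is the precision-weighted mean of the cues,
    and the posterior precision is the sum of the cue precisions."""
    p1, p2 = 1.0 / var1, 1.0 / var2      # precisions
    var = 1.0 / (p1 + p2)                # fused variance shrinks
    mu = (mu1 * p1 + mu2 * p2) * var     # reliable cues weigh more
    return mu, var

# e.g. a color cue and a motion cue voting on a target's position
mu, var = fuse_gaussian_cues(0.0, 1.0, 2.0, 1.0)
```

Precision weighting means the more reliable cue (smaller variance) dominates the fused estimate, which is the core intuition behind statistical IMVC for tracking.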

6.
Trusted computing (TC) is an emerging technology that enhances the security of various computing platforms through a dedicated secure chip (TPM/TCM), and it is widely accepted by both industry and academia. This paper sketches the evolution of TC from the perspective of our theoretical and engineering work. In theory, we focus on protocol design and security analysis. We proposed the first ECDAA protocol scheme based on the q-SDH assumption, which highlights a new way to design direct anonymous attestation schemes. On the engineering side, we discuss the key technologies of the trust chain, trusted network connection, and TC testing and evaluation. We have broken through several key technologies, such as trusted boot, OS measurement, and remote attestation, and have implemented a TC system spanning from the TPM/TCM to the network. We also designed and implemented a testing and evaluation system for TC platforms, the first to be put into practical application in China. Finally, with the rapid development of cloud computing and mobile applications, TC is moving in new directions, such as trust in cloud and mobile environments, the new TPM standard, and flexible trust establishment methods for trusted execution environments.

7.
Parallel frequent pattern discovery algorithms exploit parallel and distributed computing resources to relieve the sequential bottlenecks of current frequent pattern mining (FPM) algorithms. Parallel FPM algorithms thus achieve better scalability and performance, and they are attracting much attention in the data mining research community. This paper presents a comprehensive survey of state-of-the-art parallel and distributed FPM algorithms, with emphasis on pattern discovery from complex data (e.g., sequences and graphs) on various platforms. A review of typical parallel FPM algorithms uncovers the major challenges, methodologies, and research problems in the field, such as workload balancing, finding good data layouts, and data decomposition. The survey also indicates a dramatic shift of research interest from simple parallel frequent itemset mining on traditional parallel and distributed platforms to parallel mining of more complex data on emerging architectures, such as multi-core systems and the increasingly mature grid infrastructure.

8.
With the growing popularity of Internet applications and the widespread use of the mobile Internet, Internet traffic has maintained rapid growth over the past two decades. Internet Traffic Archival Systems (ITAS) for packets or flow records are increasingly used in network monitoring, network troubleshooting, and user behavior and experience analysis. Among the three key technologies in ITAS, we focus on bitmap index compression and give a detailed survey in this paper. Current state-of-the-art bitmap index encoding schemes include BBC, WAH, PLWAH, EWAH, PWAH, CONCISE, COMPAX, VLC, DF-WAH, and VAL-WAH. Based on differences in segmentation, chunking, merge compression, and Near Identical (NI) features, we provide a thorough categorization of these algorithms. We also propose some new bitmap index encoding algorithms, such as SECOMPAX, ICX, MASC, and PLWAH+, and present the state diagrams of their encoding procedures. We then evaluate their CPU and GPU implementations with a real Internet trace from CAIDA. Finally, we summarize and discuss the future directions of bitmap index compression algorithms. Beyond applications in network security and network forensics, bitmap index compression, with its fast bitwise-logical operations and reduced search space, is widely used in the analysis of genome data, geographical information systems, graph databases, image retrieval, the Internet of Things, etc. Bitmap index compression, studied since the 1980s, is expected to thrive again in the Big Data era.
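To make the encoding idea concrete, here is a simplified word-aligned run-length scheme in the spirit of WAH. It is a sketch only: real WAH packs fills and literals into 32-bit machine words with flag bits and records the exact bit count, which this toy version replaces with tagged tuples.

```python
def wah_compress(bits, w=31):
    """Compress a bit list into literal and fill words (simplified WAH).

    Each output word is either ('lit', int) holding w raw bits, or
    ('fill', bit, run) encoding `run` consecutive all-0 or all-1 groups.
    """
    # pad to a multiple of w with zeros (real WAH tracks the true length)
    padded = bits + [0] * ((w - len(bits) % w) % w)
    groups = [padded[i:i + w] for i in range(0, len(padded), w)]
    out = []
    for g in groups:
        if all(b == g[0] for b in g):            # homogeneous group -> fill
            if out and out[-1][0] == 'fill' and out[-1][1] == g[0]:
                out[-1] = ('fill', g[0], out[-1][2] + 1)   # extend the run
            else:
                out.append(('fill', g[0], 1))
        else:                                     # mixed group -> literal
            word = 0
            for b in g:
                word = (word << 1) | b
            out.append(('lit', word))
    return out
```

Sparse bitmaps, the common case for bitmap indexes, collapse into a few fill words, and queries can operate on the compressed words directly with bitwise logic, which is what the surveyed schemes compete on.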

9.
High energy consumption is one of the key issues in cloud computing systems. Incoming jobs in cloud computing environments arrive randomly, and compute nodes must be powered on at all times to await incoming tasks, which results in a great waste of energy. This paper proposes an energy-saving task scheduling algorithm based on a vacation queuing model for cloud computing systems. First, we use the vacation queuing model with exhaustive service to model task scheduling in a heterogeneous cloud computing system. Next, based on the busy period and busy cycle under steady state, we analyze the expected task sojourn time and the expected energy consumption of compute nodes. Subsequently, we propose a task scheduling algorithm based on similar tasks to reduce energy consumption. Simulation results show that the proposed algorithm reduces the energy consumption of the cloud computing system effectively while meeting task performance requirements.
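The sojourn-time analysis can be illustrated with the classical decomposition result for an M/M/1 queue with multiple vacations; the paper analyzes a more general heterogeneous system, so the formula below is only the textbook special case.

```python
def mean_sojourn_mm1_vacation(lam, mu, ev, ev2):
    """Mean task sojourn time in an M/M/1 queue with multiple vacations.

    Decomposition result: E[T] = 1/(mu - lam) + E[V^2] / (2 E[V]),
    where V is the server's vacation (sleep) period.
    lam: arrival rate, mu: service rate, ev = E[V], ev2 = E[V^2].
    """
    assert lam < mu, "queue must be stable (lam < mu)"
    return 1.0 / (mu - lam) + ev2 / (2.0 * ev)

# Example: 0.5 tasks/s arriving, 1 task/s service, fixed 2 s sleep periods
# (deterministic V, so E[V^2] = E[V]^2).
t = mean_sojourn_mm1_vacation(0.5, 1.0, ev=2.0, ev2=4.0)
```

The E[V^2]/(2E[V]) term quantifies the trade-off at the heart of vacation-based scheduling: longer sleep periods save idle energy but add a predictable amount of delay to every task.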

10.
The job shop scheduling problem (JSSP) is widely acknowledged to be strongly NP-hard, and it becomes even harder when the no-wait constraint is attached, a constraint that arises in many production processes, such as chemical and metallurgical processes. However, compared with the massive research on the traditional job shop problem, little attention has been paid to the no-wait constraint. In this paper, we deal with this problem by decomposing it into two sub-problems, timetabling and sequencing, within the traditional framework. For the timetabling sub-problem, we propose a new efficient combined non-order timetabling method coordinated with the total-tardiness objective. For the sequencing sub-problem, we present a modified complete local search with memory, combined with a crossover operator and distance counting. The entire algorithm was tested on well-known benchmark problems and compared with several existing algorithms. Computational experiments show that the proposed algorithm performs both effectively and efficiently.

11.
Cloud computing is an emerging paradigm that has drawn extensive attention from both academia and industry, but its security issues are considered a critical obstacle to its rapid development. When data owners store their data as plaintext in the cloud, they lose control over its security because of its arbitrary accessibility, in particular access by the untrusted cloud itself. A promising way to protect the confidentiality of data owners' cloud data is to encrypt the data before storing it in the cloud. However, a straightforward application of traditional encryption algorithms does not solve the problem well, since it is hard for data owners to manage their private keys when they want to securely share their cloud data with others in a fine-grained manner. In this paper, we propose a fine-grained and heterogeneous proxy re-encryption (FH-PRE) system to protect the confidentiality of data owners' cloud data. By applying the FH-PRE system in the cloud, data owners' data can be securely stored and shared in a fine-grained manner. Moreover, the heterogeneity support makes our FH-PRE system more efficient than previous work. Additionally, it provides secure data sharing between two heterogeneous cloud systems equipped with different cryptographic primitives.

12.
The performance of distributed computing systems depends in part on the configuration parameters recorded in configuration files. Evolutionary strategies, with their global view of the structural information, have been shown to improve performance effectively; however, most of these methods consume too much measurement time. This paper introduces an ordinal-optimization-based strategy, combined with a back-propagation neural network, for autotuning configuration parameters. The strategy was first proposed in the automation community for optimizing complex manufacturing systems and is customized here for improving distributed system performance. The method is compared with the covariance matrix algorithm. Tests on a real distributed system with three-tier servers show that the strategy reduces testing time by 40% on average, at a reasonable performance cost.
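The core ordinal optimization idea, rank cheaply first and measure expensively only on a narrowed set, can be sketched as follows. The crude scorer stands in for the back-propagation surrogate model, and all names and numbers are illustrative, not the paper's implementation.

```python
def ordinal_select(configs, crude_score, exact_score, g=10, s=3):
    """Ordinal-optimization-style selection (illustrative sketch):
    rank every configuration with a cheap crude model, keep the g most
    promising candidates, then spend expensive exact measurements only
    on those and return the best s of them."""
    shortlist = sorted(configs, key=crude_score, reverse=True)[:g]
    return sorted(shortlist, key=exact_score, reverse=True)[:s]

# Toy demo: the crude model is the true score plus deterministic "noise",
# so crude ranking is imperfect but ordinally close to the truth.
best = ordinal_select(list(range(100)),
                      crude_score=lambda x: x + (x % 7) - 3,
                      exact_score=lambda x: x)
```

The point of ordinal optimization is that a rough model only has to get the *order* approximately right for the shortlist to contain the true winners, which is why it cuts measurement time so sharply.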

13.
This paper proposes an anomalous behavior detection model based on cloud computing. Virtual Machines (VMs) are one of the key components of cloud Infrastructure as a Service (IaaS), and the security of such VMs is critical to IaaS security. Many studies have addressed cloud computing security, but research into VM security, especially the detection of anomalous VM network traffic behavior, remains inadequate. More and more studies show that communication among internal nodes exhibits complex patterns, yet communication among VMs in cloud computing is invisible to conventional monitoring. Researchers find these issues challenging, and few solutions have been proposed, leaving cloud computing vulnerable to network attacks. This paper proposes a model that uses Software-Defined Networking (SDN) to implement traffic redirection. Our model can capture inter-VM traffic, detect known and unknown anomalous network behaviors, adopt hybrid techniques to analyze VM network behaviors, and control network systems. The experimental results indicate that the effectiveness of our approach exceeds 90%, demonstrating the feasibility of the model.

14.
Ligularia Cass. (Compositae) is a highly diversified genus, more than 100 species of which are distributed in the eastern Qinghai-Tibet Plateau and adjacent areas. Ligularia species have been studied with respect to their secondary metabolites, and many sesquiterpenes of the furanoeremophilane type have been isolated from them. To find correlations among these variations, and ultimately to understand the diversity-generating mechanism of Ligularia species in the Hengduan Mountains, we initiated an extensive study that uses furanoeremophilanes as a chemical index and DNA sequences as a genetic index. Furanoeremophilanes have conventionally been detected by Ehrlich's test, which has been used in searches for novel natural products. For the genetic index, we determined the nucleotide sequence of the atpB-rbcL intergenic region in the present study.

15.
Many-core processors, such as graphics processing units (GPUs), are promising platforms for intrinsically parallel algorithms such as the lattice Boltzmann method (LBM). Although tremendous speedups have been obtained on a single GPU compared with mainstream CPUs, the performance of the LBM on multiple GPUs has not been studied extensively and systematically. In this article, we carry out LBM simulations on a GPU cluster with many nodes, each having multiple Fermi GPUs. Asynchronous execution with CUDA stream functions, OpenMP, and non-blocking MPI communication are incorporated to improve efficiency. The algorithm is tested on two-dimensional Couette flow, and the results are in good agreement with the analytical solution. For both one- and two-dimensional decompositions of space, the algorithm performs well, as most of the communication time is hidden. Direct numerical simulation of a two-dimensional gas-solid suspension containing more than one million solid particles and one billion gas lattice cells demonstrates the potential of this algorithm for large-scale engineering applications. The algorithm can be directly extended to three-dimensional decompositions of space and to other modeling methods, including explicit grid-based methods.

16.
Genome-Wide Association Studies (GWASs) aim to identify genetic variants associated with disease by assaying and analyzing hundreds of thousands of Single Nucleotide Polymorphisms (SNPs). Although traditional single-locus statistical approaches have been standardized and have led to many interesting findings, a substantial number of recent GWASs indicate that for most disorders, individual SNPs explain only a small fraction of the genetic causes. Consequently, exploring multi-SNP interactions in the hope of discovering more significant associations has attracted increasing attention. Because of the huge search space for complicated multi-locus interactions, many fast and effective methods have recently been proposed for detecting disease-associated epistatic interactions in GWAS data. In this paper, we provide a critical review and comparison of eight popular methods for detecting gene-gene interactions among genetic loci: BOOST, TEAM, epiForest, EDCF, SNPHarvester, epiMODE, MECPM, and MIC. Based on their assumed data models and search strategies, we divide the methods into seven categories. Moreover, the evaluation methodologies, including detection power, disease models for simulation, sources of real GWAS data, and the control of the false discovery rate, are elaborated as references for developers of new approaches. Finally, we summarize the methods and discuss future directions in genome-wide association studies for detecting epistatic interactions.
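The exhaustive pairwise scan that methods such as BOOST accelerate can be written naively as below. The 9x2 contingency table and chi-square score are standard, but this sketch deliberately ignores the bitwise genotype encodings and pruning bounds that make the real tools fast enough for genome-wide data.

```python
from itertools import combinations

def pair_chi2(geno_a, geno_b, pheno):
    """Chi-square statistic linking a genotype pair (each coded 0/1/2)
    to a binary phenotype (0 = control, 1 = case) via a 9x2 table."""
    table = [[0.0, 0.0] for _ in range(9)]     # 3x3 genotype combos x 2 classes
    for a, b, y in zip(geno_a, geno_b, pheno):
        table[3 * a + b][y] += 1
    total = sum(sum(r) for r in table)
    cols = [sum(r[0] for r in table), sum(r[1] for r in table)]
    chi2 = 0.0
    for r in table:
        rs = sum(r)
        for c in (0, 1):
            e = rs * cols[c] / total           # expected count under independence
            if e > 0:
                chi2 += (r[c] - e) ** 2 / e
    return chi2

def exhaustive_scan(genotypes, pheno):
    """Score every SNP pair; genotypes maps SNP name -> genotype list.
    Returns (score, snp_i, snp_j) triples, highest score first."""
    return sorted(((pair_chi2(genotypes[i], genotypes[j], pheno), i, j)
                   for i, j in combinations(genotypes, 2)), reverse=True)
```

For M SNPs this scan costs O(M^2) tests, which is exactly the combinatorial burden motivating the specialized search strategies the survey compares.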

17.
The essence of the peer-to-peer design philosophy is to design protocols for end hosts, or "peers", to work in collaboration to achieve a certain design objective, such as the sharing of a large file. From a theoretical perspective, it has been recognized that the peer-to-peer design paradigm resembles gossip protocols and, with appropriate algorithmic design, maximizes network flow rates in multicast sessions. Over the past ten years, research on peer-to-peer computing and systems, a unique and intriguing category of distributed systems, has received tremendous attention from academia and industry alike. Peer-to-peer computing eventually culminated in a number of successful commercial systems, showing the viability of the design philosophy in the Internet. The peer-to-peer paradigm has pushed the design choices of innovative protocols to the edge of the Internet, and in most cases to end hosts themselves. It represents one of the best incarnations of the end-to-end argument, one of the frequently disputed design philosophies that guided the design of the Internet. Yet research on peer-to-peer computing has recently receded from the spotlight, suffering a fall as precipitous as its meteoric rise to the peak of its popularity was dramatic. This article presents a cursory glimpse of results from the past ten years in peer-to-peer computing, with a particular focus on understanding what stimulated its rise in popularity, what contributed to its commercial success, and eventually, what led to the precipitous fall in research attention. The insights in this article may be beneficial as we develop our thoughts on the design paradigm of cloud computing.

18.
Data Center Networks (DCNs) are the fundamental infrastructure for cloud computing. Driven by the massively parallel computing tasks in cloud computing, one-to-many data dissemination has become one of the most important traffic patterns in DCNs. Many architectures and protocols have been proposed to meet this demand, but these proposals either require complicated configurations of switches and servers or fail to deliver optimal performance. In this paper, we propose peer-assisted data dissemination for DCNs. This approach exploits the rich physical connections, with high bandwidth and multiple paths, to facilitate efficient one-to-many data dissemination. We prove that an optimal P2P data dissemination schedule exists for FatTree, a specially designed DCN architecture. We then present a theoretical analysis of this algorithm for the general multi-rooted tree topology, a widely used DCN architecture. Additionally, we explore the performance of an intuitive line structure for data dissemination. Our analysis and experimental results show that this simple structure achieves performance comparable to the optimal algorithm. Since DCN applications rely heavily on virtualization for optimal resource sharing, we present a general implementation method for the proposed algorithms that aims to mitigate the impact of the potentially high churn rate of virtual machines.
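The appeal of the line structure is easy to quantify: with pipelining, a chunked file reaches every peer in little more than the time needed to push a single copy. The back-of-the-envelope model below is ours, not the paper's analysis, and assumes uniform link bandwidth and chunk size.

```python
def line_pipeline_time(chunks, peers, t_chunk):
    """Completion time when a file cut into `chunks` pieces is pushed along
    a line of `peers`: each peer forwards chunk i while receiving chunk
    i+1, so the last peer finishes after (chunks + peers - 1) slots."""
    return (chunks + peers - 1) * t_chunk

def naive_relay_time(chunks, peers, t_chunk):
    """Baseline: each peer receives the whole file before forwarding it."""
    return chunks * peers * t_chunk
```

With many chunks the pipeline term `peers - 1` becomes negligible, so the line structure approaches the one-copy lower bound, which is consistent with the comparable-to-optimal result reported above.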

19.
Rapid advances in hardware, software, and computer networks have facilitated the shift of the computing paradigm from mainframes to cloud computing, in which users can obtain their desired services anytime, anywhere, and by any means. However, cloud computing also presents many challenges, one of which is the difficulty of allowing users to freely obtain desired services, such as heterogeneous OSes and applications, via different lightweight devices. We have proposed a new paradigm, called transparent computing, that spatio-temporally extends the von Neumann architecture to centrally store and manage commodity programs, including OS code, while streaming them to be run on stateless clients. This leads to a service-centric computing environment in which users can select desired services on demand, without concern for their administration, such as installation, maintenance, management, and upgrades. In this paper, we introduce a novel concept, Meta OS, to support such program streaming through a distributed 4VP+ platform. Based on this platform, a pilot system has been implemented that supports Windows and Linux environments. We verify the effectiveness of the platform through both real deployments and testbed experiments. The evaluation results suggest that the 4VP+ platform is a feasible and promising solution for the future computing infrastructure of cloud services.

20.
Data deduplication is an emerging and widely employed method in current storage systems. As this technology is gradually applied in inline scenarios, such as virtual machines and cloud storage systems, this study proposes a novel deduplication architecture called I-sieve. The goal of I-sieve is to realize a high-performance data sieve system based on iSCSI in cloud storage systems. We design the corresponding index and mapping tables and present a multi-level cache using a solid state drive to reduce RAM consumption and optimize lookup performance. A prototype of I-sieve is implemented based on the open source iSCSI target, and many experiments have been conducted using virtual machine images and testing tools. The evaluation results show excellent deduplication and foreground performance. More importantly, I-sieve can coexist with existing deduplication systems as long as they support the iSCSI protocol.
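The core of any such architecture is a fingerprint index that maps block hashes to stored blocks. The toy class below illustrates fixed-size block deduplication in general; the class name, layout, and API are invented for illustration and are not I-sieve's actual design, which also involves iSCSI-level mapping tables and an SSD-backed cache.

```python
import hashlib

class BlockDedup:
    """Toy fixed-size-block deduplicating store (illustrative only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.index = {}      # fingerprint -> id of the unique stored block
        self.blocks = []     # unique block payloads

    def write(self, data):
        """Store data; return (block ids, number of duplicate blocks skipped)."""
        ids, dups = [], 0
        for off in range(0, len(data), self.block_size):
            chunk = data[off:off + self.block_size]
            fp = hashlib.sha256(chunk).hexdigest()   # content fingerprint
            if fp in self.index:
                dups += 1                            # duplicate: store nothing
            else:
                self.index[fp] = len(self.blocks)
                self.blocks.append(chunk)
            ids.append(self.index[fp])
        return ids, dups

    def read(self, ids):
        """Reassemble data from the stored unique blocks."""
        return b"".join(self.blocks[i] for i in ids)
```

Virtual machine images are a natural fit for this scheme because identical OS and library blocks recur across images, so the fingerprint index absorbs most writes without touching the backing store.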

Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.), 京ICP备09084417号