Frequency and Similarity-Aware Partitioning for Cloud Storage Based on Space-Time Utility Maximization Model |
| |
Affiliation: | Jianjiang Li; Jie Wu; Zhanning Ma; Department of Computer Science and Technology, University of Science and Technology Beijing; Department of Computer and Information Sciences, Temple University |
| |
Abstract: | With the rise of various cloud services, redundant data has become an increasingly prominent problem in cloud storage systems. How to assign a set of documents to a distributed file system so as to reduce storage space while preserving access efficiency as far as possible is an urgent problem. Space efficiency relies mainly on data de-duplication technologies, while access efficiency requires gathering highly similar files on a single server. Building on a study of other data de-duplication technologies, especially the Similarity-Aware Partitioning (SAP) algorithm, this paper proposes the Frequency and Similarity-Aware Partitioning (FSAP) algorithm for cloud storage. The FSAP algorithm is a more reasonable data partitioning algorithm than the SAP algorithm. This paper also proposes the Space-Time Utility Maximization Model (STUMM), which helps balance space efficiency against access efficiency. Finally, the paper tests the approach on 100 web files downloaded from CNN. The results show that, compared with the SAP-based algorithms (including the SAP-Space-Delta and SAP-Space-Dedup algorithms), the FSAP algorithm based on STUMM achieves a higher compression ratio and a more balanced distribution of data blocks. |
| |
Keywords: | |
This article is indexed in CNKI, Wanfang Data, and other databases. |
|