博碩士論文 110522007 詳細資訊




以作者查詢圖書館館藏 以作者查詢臺灣博碩士 以作者查詢全國書目 勘誤回報 、線上人數:3 、訪客IP:3.145.42.94
姓名 王伯倫(Po-Lun Wang)  查詢紙本館藏   畢業系所 資訊工程學系
論文名稱
(GraLoc: Preserving Graph Locality to Minimize Read and Write Amplification on NAND Flash Memory)
相關論文
★ 重新思考虛擬記憶體管理的方式以開放通道式固態硬碟最大限度地減少深度學習推薦系統演算法的讀寫流量★ 開啟製程相似檢查方法在組裝超級塊上以最小化額外的寫入延遲
★ LaDy: Enabling Locality-aware Deduplication Technology on Shingled Magnetic Recording Drives★ On Minimizing Writing Overhead to Establish a Low-latency LSM-tree on Skyrmion-based Racetrack Memory
★ WABE: Rethinking B-epsilon-tree to Minimize Write-amplification on NAND Flash Memory★ Rethinking Bϵ tree Indexing Structure over NVM with the Support of Multi-write Modes
★ Prophet’s Insight: Unleashing Deduplication System Performance in Multi-tier Storage Systems★ Freeing the Power of High Parallelism: Accelerating the Bϵ-Tree Indexing Scheme Performance on Open-Channel SSD
★ Applying Content-Defined Chunking to OCSSD-based Deduplication Systems★ On Minimizing Writing Overhead to Establish a Low-latency LSM-tree on Skyrmion-based Racetrack Memory
檔案 [Endnote RIS 格式]    [Bibtex 格式]    [相關文章]   [文章引用]   [完整記錄]   [館藏目錄]   至系統瀏覽論文 (2026-12-1以後開放)
摘要(中) 由於社群網路應用以及和圖形資料庫系統的普及,大規模圖形 (Large-scale Graph) 已經變成了應用在資料挖掘及深度學習中的主流資料結構。然而,現代計算機系統中有限的隨機存取記憶體 (Dynamic Random-access Memory) 空間,是對整個大規模圖形進行同時處理的挑戰。因此,考慮到可用的計算機資源,將大規模圖形分成多個子圖進行並行處理是有其必要性的。不過,現今的圖分割 (Graph Partitioning) 方法,無法在圖分割的過程中,完善的處理讀取和儲存對於儲存裝置所造成的負擔。
為了在圖分割的過程中找到最佳分割邊,以確保在子圖平衡的同時也能保留大規模圖形的原始特性,我們必須常常對儲存裝置進行加載或是存儲,又因為 NAND 快閃記憶體的基本讀寫單元為一個頁面,使得這樣的問題變得更為嚴重。本文介紹了 GraLoc,一種圖分割解決方法,旨在考慮資料擺放時的圖形局部性,使其更適合應用於 NAND 快閃記憶體。為了減少圖分割過程中所造成的讀取放大 (Read Amplification),目標是最大化在單一頁面中的圖形局部性,我們的實驗結果展現出圖分割過程中在儲存效能的顯著進步。
摘要(英) Due to the prevalence of social network applications and graph database systems, large-scale graphs have become a mainstream data structure for data mining and machine learning. However, the challenge arises from the limited size of dynamic random-access memory (DRAM), preventing the simultaneous processing of entire large-scale graphs in modern computer systems. Consequently, the need arises to partition the overall large-scale graph into multiple sub-graphs for parallel processing and to align with available computing resources. Unfortunately, existing graph partitioning solutions do not adequately address the load/store overhead on the storage device during the partitioning process. As the graph partitioning process involves finding the optimal splitting edge to ensure sub-graph balance and preserve the original properties of the large-scale graph, there is frequent loading from and storing to the storage device. This issue is further exacerbated on NAND flash memory due to its basic read/write unit (i.e., a page size). This paper introduces GraLoc, a graph partitioning solution designed to be NAND flash memory-friendly by considering graph locality during data placement. The proposed solution aims to maximize graph locality within a single page, effectively minimizing read amplification during the graph partitioning process. Our experiments demonstrate a significant improvement in storage performance during graph partitioning.
關鍵字(中) ★ 圖形數據
★ NAND 快閃記憶體
★ 局部性
★ 邊分割
關鍵字(英) ★ Graph data
★ NAND flash memory
★ Locality
★ Edge partitioning
論文目次 Contents

1 Introduction 1
2 Technical Background & Motivation 3
2.1 Graph Partitioning 3
2.2 NAND Flash Memory 4
2.3 Related Works 5
2.4 Motivation 6
3 Locality-Aware Partitioning 8
3.1 GraLoc Overview 8
3.2 Detailed Components 9
3.2.1 Locality-aware Graph Pre-processor 9
3.2.2 Subgraph Descriptor 10
3.2.3 Partition Reader 12
3.3 Overhead Analysis 13
4 Experiments 14
4.1 Environmental Settings 14
4.2 Evaluation Results 15
4.2.1 Balance 15
4.2.2 Replication factor 16
4.2.3 Flash memory I/O performance 17
5 Concluding Remarks 20
Bibliography 21
參考文獻 [1] Bonnie Berger, Rohit Singht, and Jinbo Xu. “Graph Algorithms for Biological Systems Analysis”. In: Proceedings of the Nineteenth Annual ACM-SIAM Sympo- sium on Discrete Algorithms. SODA ’08. San Francisco, California: Society for In- dustrial and Applied Mathematics, 2008, 142–151.
[2] Louise Quick, Paul Wilkinson, and David Hardcastle. “Using Pregel-like Large Scale Graph Processing Frameworks for Social Network Analysis”. In: 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 2012, pp. 457–463.
[3] Maciej Besta et al. “Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph Queries”. In: ACM Comput. Surv. 56.2 (2023). ISSN: 0360-0300.
[4] Kiran Kumar Matam, Hanieh Hashemi, and Murali Annavaram. “MultiLogVC: Efficient Out-of-Core Graph Processing Framework for Flash Storage”. In: 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS). 2021, pp. 245–255.
[5] Sang-Woo Jun et al. “GraFBoost: Using Accelerated Flash Storage for External Graph Analytics”. In: 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA). 2018, pp. 411–424.
[6] Kiran Kumar Matam et al. “GraphSSD: Graph Semantics Aware SSD”. In: Pro- ceedings of the 46th International Symposium on Computer Architecture. ISCA ’19. Phoenix, Arizona: Association for Computing Machinery, 2019, 116–128. ISBN: 9781450366694.
[7] Arash Tavakkol et al. “MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices”. In: 16th USENIX Conference on File and Storage Technologies (FAST 18). Oakland, CA: USENIX Association, Feb. 2018, pp. 49–66. ISBN: 978-1-931971-42-3.
[8] Chenzi Zhang et al. “Graph Edge Partitioning via Neighborhood Heuristic”. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Dis- covery and Data Mining. KDD ’17. Halifax, NS, Canada: Association for Comput- ing Machinery, 2017, 605–614. ISBN: 9781450348874.
[9] Fatemeh Rahimian et al. “A Distributed Algorithm for Large-Scale Graph Par- titioning”. In: ACM Trans. Auton. Adapt. Syst. 10.2 (2015). ISSN: 1556-4665.
[10] Robert Ryan McCune, Tim Weninger, and Greg Madey. “Thinking Like a Ver- tex: A Survey of Vertex-Centric Frameworks for Large-Scale Distributed Graph Processing”. In: ACM Comput. Surv. 48.2 (2015). ISSN: 0360-0300.
[11] Teng Ma, Zhitao Li, and Ning Liu. “Log-ROC: Log Structured RAID on Open- Channel SSD”. In: 2022 IEEE 40th International Conference on Computer Design (ICCD). 2022, pp. 332–335.
[12] Matias Bjørling et al. “ZNS: Avoiding the Block Interface Tax for Flash-based SSDs”. In: 2021 USENIX Annual Technical Conference (USENIX ATC 21). USENIX Association, July 2021, pp. 689–703. ISBN: 978-1-939133-23-6.
[13] Da Zheng et al. “FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs”. In: 13th USENIX Conference on File and Storage Technologies (FAST 15). Santa Clara, CA: USENIX Association, Feb. 2015, pp. 45–58. ISBN: 978-1-931971-201.
[14] Amitabha Roy, Ivo Mihailovic, and Willy Zwaenepoel. “X-Stream: Edge- Centric Graph Processing Using Streaming Partitions”. In: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. SOSP ’13. Farminton, Pennsylvania: Association for Computing Machinery, 2013, 472–488. ISBN: 9781450323888.
[15] Tianqi Chen and Carlos Guestrin. “XGBoost: A Scalable Tree Boosting System”. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’16. San Francisco, California, USA: Associa- tion for Computing Machinery, 2016, 785–794. ISBN: 9781450342322.
[16] Hao Yin et al. “Local Higher-Order Graph Clustering”. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’17. Halifax, NS, Canada: Association for Computing Machinery, 2017, 555–564. ISBN: 9781450348874.
[17] Jure Leskovec, Daniel Huttenlocher, and Jon Kleinberg. “Predicting Positive and Negative Links in Online Social Networks”. In: Proceedings of the 19th Inter- national Conference on World Wide Web. WWW ’10. Raleigh, North Carolina, USA: Association for Computing Machinery, 2010, 641–650. ISBN: 9781605587998.
[18] Jure Leskovec et al. “Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters”. In: Internet Mathematics
6.1 (2009), pp. 29–123.
[19] Jure Leskovec, Lada A. Adamic, and Bernardo A. Huberman. “The Dynamics of Viral Marketing”. In: ACM Trans. Web 1.1 (2007), 5–es. ISSN: 1559-1131.
[20] Lars Backstrom et al. “Group Formation in Large Social Networks: Member- ship, Growth, and Evolution”. In: Proceedings of the 12th ACM SIGKDD Interna- tional Conference on Knowledge Discovery and Data Mining. KDD ’06. Philadelphia, PA, USA: Association for Computing Machinery, 2006, 44–54. ISBN: 1595933395.
[21] Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. “Graphs over Time: Den- sification Laws, Shrinking Diameters and Possible Explanations”. In: Proceed- ings of the Eleventh ACM SIGKDD International Conference on Knowledge Discov- ery in Data Mining. KDD ’05. Chicago, Illinois, USA: Association for Computing Machinery, 2005, 177–187. ISBN: 159593135X.
[22] Jaewon Yang and Jure Leskovec. “Defining and Evaluating Network Communi- ties Based on Ground-Truth”. In: MDS ’12. Beijing, China: Association for Com- puting Machinery, 2012. ISBN: 9781450315463.
指導教授 陳增益(Tseng-Yi Chen) 審核日期 2023-12-11
推文 facebook   plurk   twitter   funp   google   live   udn   HD   myshare   reddit   netvibes   friend   youpush   delicious   baidu   
網路書籤 Google bookmarks   del.icio.us   hemidemi   myshare   

若有論文相關問題,請聯絡國立中央大學圖書館推廣服務組 TEL:(03)422-7151轉57407,或E-mail聯絡  - 隱私權政策聲明