Distributed Astronomical Sequential Data Indexing System with Weighted Suffix Tree

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/65819

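Title:	Distributed Astronomical Sequential Data Indexing System with Weighted Suffix Tree (運用權重式字尾樹之分散式天文序列資料索引系統)
Author:	Liou, Shu-Hung (劉書宏)
Contributor:	Department of Computer Science and Information Engineering
Keywords:	distributed system; weighted suffix tree
Date:	2014-08-27
Upload time:	2014-10-15 17:11:02 (UTC+8)
Publisher:	National Central University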
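Abstract (translated from the Chinese):	Storage devices keep growing in capacity, data volumes are growing rapidly, and the memory needed to compute over that data is correspondingly large. This thesis addresses how to organize such large volumes of data into a structure that multiple machines can process without breaking that structure. Advances in technology let astronomers store many observation records as digital data. Because the order of the elements in an astronomical data string carries dependencies, we base our design on the suffix tree and use weights to handle astronomical data specifically. A suffix tree, however, requires a very large amount of memory. For astronomical data measured in terabytes, a single machine cannot hold it, so we aim to solve the problem of the suffix tree becoming too large for such data. We therefore adopt a distributed architecture that spreads the data across machines. Keeping the data continuous across machines while keeping each machine's portion independent and non-overlapping is a central problem of this thesis. With astronomical data as input, the resulting system groups stars with similar variation onto the same branch without manual classification. Because stellar data keeps growing over time and across locations, the distributed architecture lets data be added at any time and from any place, and researchers can search a system built on this work for the information they need. We also implemented a mechanism that lets queries and suffix tree construction run at the same time, so that data arriving in real time does not cause incorrect results. In the future, we hope this architecture can be used to maintain and analyze any data stream with order dependencies.
Abstract (English):	Storage devices keep growing in capacity, so computers need more time to process the data stored on them. This thesis presents a method to address that problem. In astronomy, telescopes record large amounts of data from the universe. Because astronomical data is continuous, we use a special data structure to maintain it: an extended suffix tree that we call the weighted suffix tree. Constructing a suffix tree, however, consumes a large amount of memory. To reduce memory usage, we use files on disk as external memory. External memory increases I/O overhead, which also has to be addressed. We design a weighted suffix tree that can be used on a distributed system. The distributed weighted suffix tree is designed to support the analysis of astronomical data. In the future, we hope this data structure can support any kind of continuous, sequential data.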
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations
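
The abstract describes two ideas that a small example can make concrete: a suffix tree whose nodes carry weights, and a distributed layout in which each machine holds an independent, non-overlapping part of the index. The Python sketch below is illustrative only and is not the thesis's implementation. It assumes (none of this is stated in the abstract) that observations have already been discretized into symbols such as `U`, `D`, and `S` for rising, falling, and steady brightness, that a node's weight is simply the number of indexed suffixes passing through it, and that suffixes are assigned to shards by their first symbol. The names `Node`, `WeightedSuffixTrie`, and `ShardedIndex` are hypothetical, and the structure is an uncompressed suffix trie rather than a true suffix tree, so it uses quadratic space.

```python
class Node:
    """Trie node; `weight` counts how many indexed suffixes pass through it."""

    def __init__(self):
        self.children = {}
        self.weight = 0


class WeightedSuffixTrie:
    """Uncompressed suffix trie with occurrence counts (assumed weighting)."""

    def __init__(self):
        self.root = Node()

    def insert_suffix(self, suffix):
        # Walk down the trie, creating nodes as needed and bumping weights.
        node = self.root
        for sym in suffix:
            node = node.children.setdefault(sym, Node())
            node.weight += 1

    def weight_of(self, pattern):
        # Weight of the node reached by `pattern`, or 0 if it is absent.
        node = self.root
        for sym in pattern:
            node = node.children.get(sym)
            if node is None:
                return 0
        return node.weight


class ShardedIndex:
    """Stand-in for several machines: each shard owns the suffixes whose
    first symbol maps to it, so shards never overlap (assumed scheme)."""

    def __init__(self, num_shards):
        self.shards = [WeightedSuffixTrie() for _ in range(num_shards)]

    def _shard_for(self, first_symbol):
        # Symbols are assumed to be single characters.
        return self.shards[ord(first_symbol) % len(self.shards)]

    def add_sequence(self, symbols):
        # Route every suffix of the sequence to the shard of its first symbol.
        for start in range(len(symbols)):
            self._shard_for(symbols[start]).insert_suffix(symbols[start:])

    def weight_of(self, pattern):
        # A query only needs the one shard that owns the pattern's first symbol.
        if not pattern:
            return 0
        return self._shard_for(pattern[0]).weight_of(pattern)


if __name__ == "__main__":
    index = ShardedIndex(num_shards=2)
    index.add_sequence("UDUDS")  # hypothetical discretized light curve
    index.add_sequence("UDS")
    print(index.weight_of("UD"))  # 3: twice in the first sequence, once in the second
```

Under these assumptions, sequences that share a pattern of variation end up on the same branch, and the weight along that branch indicates how common the pattern is. Routing by first symbol means a query touches a single shard, at the cost of uneven load when some symbols dominate.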

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	384

All items in NCUIR are protected by original copyright.
