    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/74652


    Title: Large Scale Sequential Pattern Mining based on Distributed Hierarchical Suffix Tree (基於分散式階層化字尾樹之大量序列資料探勘)
    Author: Su, Li-Ding (蘇立鼎)
    Contributor: Department of Computer Science and Information Engineering
    Keywords: Distributed System; Distributed Computing; Data Mining; Hierarchical Suffix Tree
    Date: 2017-07-18
    Upload time: 2017-10-27 14:35:17 (UTC+8)
    Publisher: National Central University
    Abstract: Astronomy holds an important place in the sciences. As observation techniques and hardware have improved in recent years, astronomers can carry out more diverse analyses, while the data recorded by astronomical telescopes keeps accumulating and has grown to the petabyte (PB) scale of big data. A single machine cannot handle this volume, so distributed computing is needed to reduce processing and analysis time effectively.
    This thesis proposes a suffix tree system, built on the Hadoop distributed architecture, to help astronomers classify variable stars. The system is designed with two distributed computing frameworks, MapReduce and Spark. In the construction stage, the system converts a large volume of sequence data, each sequence recording how a star's brightness changes over time, into a suffix tree, stores the tree in the distributed file system, and supports appending later observations. The properties of the suffix tree let users query efficiently. In addition, the query stage introduces a hierarchical concept that adjusts the granularity of the data in the tree. This lets the system find similar sequences that differ only because of observation or calculation errors, and also supports broader queries suited to different classification methods. Astronomers can choose the granularity that fits their needs and quickly find the IDs of stars with the same or similar characteristics.
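
    The idea of querying brightness sequences at an adjustable granularity can be illustrated with a small single-machine sketch. It is not the thesis's implementation: the system described above runs on Hadoop with MapReduce and Spark, whereas this example uses a naive in-memory suffix trie. The fixed-width binning in `discretize`, the `SuffixTrie` class, the `build_index` helper, and the sample star IDs and magnitudes are all assumptions made for illustration.

```python
import math


def discretize(magnitudes, step):
    """Map each brightness value to an integer bin of width `step`.

    Assumption: fixed-width binning stands in for the thesis's
    hierarchical granularity; a larger `step` gives a coarser view.
    """
    return tuple(math.floor(m / step) for m in magnitudes)


class SuffixTrie:
    """Naive suffix trie. Each node stores the IDs of the stars whose
    sequence contains the path from the root to that node."""

    def __init__(self):
        self.children = {}
        self.star_ids = set()

    def insert(self, symbols, star_id):
        # Insert every suffix of the sequence (quadratic, fine for a sketch).
        for start in range(len(symbols)):
            node = self
            for sym in symbols[start:]:
                node = node.children.setdefault(sym, SuffixTrie())
                node.star_ids.add(star_id)

    def find(self, pattern):
        # Walk the pattern from the root; a missing edge means no match.
        node = self
        for sym in pattern:
            node = node.children.get(sym)
            if node is None:
                return set()
        return set(node.star_ids)


def build_index(light_curves, step):
    """Build one trie over all light curves at the given granularity."""
    trie = SuffixTrie()
    for star_id, magnitudes in light_curves.items():
        trie.insert(discretize(magnitudes, step), star_id)
    return trie


# Hypothetical light curves: star ID -> brightness values over time.
light_curves = {
    "star_a": [12.01, 12.48, 13.02, 12.51],
    "star_b": [12.04, 12.31, 13.12, 12.62],
    "star_c": [14.02, 14.52, 14.03, 14.51],
}

query = [12.48, 13.02]  # a fragment of star_a's curve

# Fine granularity: only the star with the exact binned subsequence matches.
fine_hits = build_index(light_curves, 0.1).find(discretize(query, 0.1))

# Coarse granularity: small measurement differences fall into the same bins,
# so a star with a similar curve is found as well.
coarse_hits = build_index(light_curves, 0.5).find(discretize(query, 0.5))
```

    In this sketch, later observations can be handled by calling `insert` again with the full extended sequence for the same star ID: the existing paths are reused and only the new nodes are added. A real suffix tree would compress single-child chains and build in linear time, and how the thesis partitions the tree across nodes is not stated in the abstract.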
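    Appears in collections: [Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses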

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     146


    All items in NCUIR are protected by the original copyright.
