Thesis/Dissertation 105522008: Detailed Record
Name 駱鍇頡 (Kai-Jie Lo)   Department Computer Science and Information Engineering
Thesis title Exploration of Sequential Asteroid Trajectories with a Distributed Computing System
(基於分散式運算架構探索時序性小行星軌跡)
Related theses
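★ Applying self-organizing map networks and back-propagation networks to mine potential customers in telecommunication databases
★ Enterprise e-mail classification based on social network features
★ A study on mining multi-level influence propagation in social networks
★ An integrated information management and analysis system based on peer-to-peer technology
★ A study of data locality strategies for different large-scale astronomical applications on distributed cloud platforms
★ A study on applying data warehousing techniques to discover knowledge in peer-to-peer network environments
★ Mining multi-level FP-trees from transaction databases by self-derivation
★ A study on constructing passive storage-capacity migration policies for lifecycle management systems
★ A study on applying service mining to discover composite services
★ Improving intrusion detection systems using frequent event sequences in weighted suffix trees
★ Efficient processing of continuous aggregate queries on data warehouses
★ Intrusion detection systems: using function-based system call sequences
★ Efficient multi-dimensional and multi-level association rule mining on data cubes
★ Course recommendation based on community relations and weights in e-learning
★ Finding inactive users in social network services
★ Finding sequences with similar variations in astronomical observation records using hierarchical weighted suffix trees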
Files Full-text file missing. Please contact the National Central University Library, Information Systems Division, TEL: (03)422-7151 ext. 57422, or by e-mail.
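Abstract (Chinese) Astronomical observation data are enormous: data accumulated over long-term observation often reach the petabyte (PB) scale or beyond. This makes analysis difficult for astronomers and extremely time-consuming. Although computer hardware keeps improving, an ordinary computer still cannot process all of the data on its own, because it may run short of memory or disk space and the computation takes too long. This thesis therefore proposes processing astronomical data with distributed computing algorithms while ensuring that the results satisfy the sequential (time-ordered) property, so that astronomical data can be processed efficiently and accurately. The thesis uses Pan-STARRS (Panoramic Survey Telescope and Rapid Response System) as the source of experimental data. The Hadoop Distributed File System serves as storage, chosen for its scalability and reliability. Apache Spark serves as the distributed computing framework, so that distributed algorithms can search for asteroid trajectories among celestial objects more efficiently. To integrate Spark more tightly with Hadoop, the thesis also uses Hadoop YARN as the cluster resource manager.
In the data preprocessing stage, k-d tree range searches are performed on the raw data to remove noise. Next, a distributed Hough transform algorithm finds lines that may be paths, which serve as the condition for the first grouping. Then, starting from the first grouping result, a sequential pairing method finds pairs of points that may form trajectory segments and computes their velocity and direction, which serve as the condition for the second grouping. The second grouping result is then processed with an adapted Floyd-Warshall algorithm that computes the transitive closure, yielding the maximal patterns of the trajectories. Finally, before the trajectories are output, the similarity of the maximal patterns is evaluated in order to remove the duplicate trajectories produced by the discretized sampling of the Hough transform.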
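As an illustration of the first grouping step, the following is a minimal single-machine sketch of Hough-transform voting, not the thesis's distributed implementation. Each detection votes for every discretized (rho, theta) cell whose line passes near it, and detections sharing a cell form a candidate group. The function name `hough_groups` and the parameters `n_theta`, `rho_step`, and `min_votes` are hypothetical, chosen only for this example.

```python
import math
from collections import defaultdict


def hough_groups(points, n_theta=180, rho_step=1.0, min_votes=3):
    """Group 2-D points by the (rho, theta) accumulator cells they vote for.

    Hypothetical helper for illustration: theta is sampled in n_theta equal
    steps over [0, pi), and rho is quantized into bins of width rho_step.
    """
    cells = defaultdict(set)
    for idx, (x, y) in enumerate(points):
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = x * math.cos(theta) + y * math.sin(theta)
            cells[(round(rho / rho_step), t)].add(idx)
    # Keep only cells that collected enough votes to suggest a line.
    return {cell: sorted(members)
            for cell, members in cells.items()
            if len(members) >= min_votes}


if __name__ == "__main__":
    # Four collinear detections on y = x plus one outlier.
    detections = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (10.0, 0.0)]
    for cell, members in hough_groups(detections, min_votes=4).items():
        print(cell, members)
```

In this sketch the same four collinear points land together in several neighbouring cells, which is the kind of duplication from discretized sampling that the final deduplication step is described as removing.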
Abstract (English) The volume of astronomical observational data is growing rapidly, and long-term observations reach the petabyte (PB) scale. This makes analysis both difficult and time-consuming. Although computer hardware keeps improving, an ordinary personal computer cannot process such data alone: it runs out of memory or disk space, and the computation takes too long. This thesis therefore processes astronomical observational data with distributed algorithms whose results preserve the sequential (time-ordered) property of the observations. The data come from the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS).
The Hadoop Distributed File System (HDFS) is used for storage because of its scalability and reliability. Apache Spark is adopted as the distributed computing framework so that distributed algorithms can explore asteroid trajectories efficiently, and Hadoop YARN serves as the cluster resource manager for the system.
The approach has seven stages. First, range queries on k-dimensional (k-d) trees filter out noise. Second, a distributed Hough transform finds candidate lines, which define the first grouping. Third, detections are filtered by the standard deviation of their magnitudes. Fourth, detections are paired in time order into sequential pairs, and the velocity and direction of each pair are computed as conditions for the next grouping stage. Fifth, pairs are grouped by the Hough transform's rho and theta together with velocity and direction. Sixth, an adapted Floyd-Warshall algorithm computes the transitive closure to establish maximal patterns. Finally, duplicate asteroid trajectories are removed before the result is output.
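The sixth stage can be pictured with a small sketch under stated assumptions. The abstract does not specify how the Floyd-Warshall algorithm is adapted, so the code below uses the textbook Warshall transitive closure on a boolean reachability matrix. It assumes detections are indexed in time order, that each pair (i, j) links an earlier detection to a later one, and that each group forms a single chain. The names `transitive_closure` and `maximal_chains` are hypothetical.

```python
def transitive_closure(n, edges):
    """Warshall's algorithm: reach[i][j] is True if j is reachable from i."""
    reach = [[False] * n for _ in range(n)]
    for i, j in edges:
        reach[i][j] = True
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def maximal_chains(n, edges):
    """Collect, for each detection with no predecessor, everything it reaches.

    Assumption for this sketch: indices follow time order, so listing the
    reachable indices in increasing order gives a time-ordered trajectory.
    """
    reach = transitive_closure(n, edges)
    has_predecessor = {j for _, j in edges}
    starts = [i for i in range(n)
              if i not in has_predecessor and any(reach[i])]
    return [[s] + [j for j in range(n) if reach[s][j]] for s in starts]


if __name__ == "__main__":
    # Pairs 0->1 and 1->2 chain into one trajectory; 3->4 is a separate one.
    pairs = [(0, 1), (1, 2), (3, 4)]
    print(maximal_chains(5, pairs))
```

Starting only from detections without a predecessor keeps sub-chains such as [1, 2] from being reported separately, which is one simple reading of "maximal patterns"; the thesis may define them differently.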
Keywords ★ Big Data
★ Distributed Computing
★ Asteroid Trajectory
★ Hough Transform
★ Transitive Closure
Table of contents
Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Background and Motivation
1-2 Research Objectives
1-3 Thesis Organization
2. Literature Review
2-1 Pan-STARRS
2-2 Main-Belt Asteroids
2-3 Asteroid Trajectory Research
2-4 Apache Hadoop
2-5 Apache Spark
3. Organization and Architecture
3-1 Distributed Computing System
3-2 Exploring Sequential Asteroid Trajectories
3-2-1 Data Preprocessing
3-2-2 First Grouping (Hough Transform)
3-2-3 Noise Point Removal
3-2-4 Generating Sequential Trajectory Segments
3-2-5 Second Grouping (Trajectory Segment Features)
3-2-6 Generating Sequential Trajectories
4. Methodology
4-1 Exploring Sequential Asteroid Trajectories
4-1-1 Preprocessing with k-d Tree Range Queries
4-1-2 First Grouping (Hough Transform)
4-1-2-1 Hough Transform
4-1-2-2 Grouping via Hough Transform under the MapReduce Model
4-1-3 Noise Point Removal
4-1-4 Generating Sequential Trajectory Segments
4-1-5 Regrouping by Trajectory Segment Features
4-1-6 Generating Sequential Asteroid Trajectories
4-2 Deduplicating Sequential Asteroid Trajectories
5. Experiments
5-1 Experimental Data
5-2 Experimental Parameters and Environment
5-3 Overall Results
5-3-1 Preprocessing Stage
5-3-2 First Grouping
5-3-3 Sequential Trajectory Segment Generation and Second Grouping Stages
5-3-4 Trajectory Linking and Trajectory Deduplication Stages
5-4 Execution Time
6. Conclusion
References
Appendix 1: Histograms of Trajectory Length Statistics for 32 Sky Regions
References [1] N. Kaiser, H. Aussel, B. E. Burke, H. Boesgaard, K. Chambers, M. R. Chun, et al., "Pan-STARRS: a large synoptic survey telescope array," in Survey and Other Telescope Technologies and Discoveries, 2002, pp. 154-165.
[2] Pan-STARRS official website. Available: http://pswww.ifa.hawaii.edu/pswww/?page_id=154
[3] C.-S. Huang, M.-F. Tsai, P.-H. Huang, L.-D. Su, and K.-S. Lee, "Distributed asteroid discovery system for large astronomical data," Journal of Network and Computer Applications, vol. 93, pp. 27-37, 2017.
[4] M. Williams. What is the asteroid belt. Available: https://www.universetoday.com/32856/asteroid-belt/
[5] NASA, Asteroid. Available: https://ssd.jpl.nasa.gov/?asteroids
[6] N. Kaiser, H. Aussel, B. E. Burke, H. Boesgaard, K. Chambers, M. R. Chun, et al., "Pan-STARRS: a large synoptic survey telescope array," in Survey and Other Telescope Technologies and Discoveries, 2002, pp. 154-165.
[7] Apache Hadoop official website. Available: http://hadoop.apache.org/
[8] T. White, Hadoop: The Definitive Guide, 3rd Edition: O'Reilly Media, 2012.
[9] Apache Spark official website. Available: https://spark.apache.org/
[10] H. Karau and R. Warren, High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark: O'Reilly Media, 2017.
[11] K. H. Rosen, Discrete Mathematics and Its Applications: McGraw-Hill Education, 2007.
[12] R. O. Duda and P. E. Hart, "Use of the Hough transformation to detect lines and curves in pictures," Communications of the ACM, vol. 15, pp. 11-15, 1972.
[13] Hadoop :- Inside MapReduce ( Process of Shuffling , sorting )–Part II. Available: https://haritbigdata.files.wordpress.com/2015/07/mapreduce.png
[14] Submitting User Applications with spark-submit, AWS Big Data Blog. Available: https://aws.amazon.com/tw/blogs/big-data/submitting-user-applications-with-spark-submit/
[15] J. L. Bentley, "Multidimensional binary search trees used for associative searching," Communications of the ACM, vol. 18, pp. 509-517, 1975.
[16] R. K. Satzoda, S. Suchitra, and T. Srikanthan, "Parallelizing the Hough transform computation," IEEE Signal Processing Letters, vol. 15, pp. 297-300, 2008.
[17] National Solar Observatory, Sacramento Peak. Magnitude. Available: https://web.archive.org/web/20080206074842/http://www.nso.edu/PR/answerbook/magnitude.html
[18] W. Gellert, S. Gottwald, M. Hellwich, H. Kästner, and H. Küstner, "The VNR Concise Encyclopedia of Mathematics.", 1989.
Advisor 蔡孟峰 (Meng-Feng Tsai) Approval date 2018-7-12