Astronomers have traditionally preprocessed and analyzed observation records by hand. With new observatory projects coming online around the world, however, the volume of observation data has exploded, and manually processing several terabytes of data each day is impractical. To keep pace with this growth, we need a new large-scale data management architecture and, just as importantly, efficient new methods for data analysis. This project pursues the following goals:

1. Construct an automatic data preprocessing system. Because of the motion of the Earth and of celestial objects, a complete all-sky observation record can only be assembled by combining data from observatories in different regions. Observations are also constrained by physical factors such as hardware, weather, time, and temperature differences, so heterogeneous data sources must be calibrated to a common scale and cleaned before they are integrated. Given how quickly the data are growing, this preparation has to run automatically and efficiently. The system will retrieve data from the different observatory sites over the network using web retrieval techniques, then calibrate, transform, and consolidate the data automatically based on the historical observing conditions recorded by each observatory. The cleaned data can then be integrated for further analysis and research.

2. Develop astronomical time-series pattern mining and association rule mining methodologies.
Finding similar and distinguishing features among astronomical objects, and classifying the objects accordingly, is an important part of much astronomical research. Manual or semi-automatic comparison is inefficient, cannot handle the data volumes of today's research environment, and has difficulty uncovering hidden, complex, or unknown features. We will therefore introduce data-mining and knowledge-discovery techniques that automatically search large data sets for hidden features and relationships. To cope with the storage and computation involved, the system will be built on a state-of-the-art distributed framework, so that users can access and analyze data quickly without having to worry about the underlying data management and maintenance.

Research period: 10008 ~ 10107
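
As an illustration of the association rule idea named in goal 2, the following is a minimal sketch of the standard support and confidence measures applied to sets of per-object feature labels. It is not the project's method: the function names (`support`, `confidence`, `pair_rules`), the thresholds, and the feature labels are all hypothetical, and the sample data are made up for the example rather than taken from any observation.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent); 0.0 if antecedent never occurs."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    joint = frozenset(antecedent) | frozenset(consequent)
    return support(joint, transactions) / base


def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return single-item rules (lhs, rhs, support, confidence) that
    meet both thresholds. Thresholds are illustrative assumptions."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        # Check the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


# Hypothetical feature labels for four objects (not real observations).
objects = [
    frozenset({"periodic", "short_period", "blue"}),
    frozenset({"periodic", "short_period"}),
    frozenset({"periodic", "long_period", "red"}),
    frozenset({"irregular", "red"}),
]

for lhs, rhs, s, c in pair_rules(objects):
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

This brute-force pairwise scan is only meant to make the two measures concrete. At the data volumes described above, a pruning algorithm such as Apriori or FP-growth running on a distributed framework would be the more realistic choice.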