Thesis/Dissertation 111526006 Detailed Record




Name Pei-Hsin Weng (翁培馨)   Department Computer Science and Information Engineering (資訊工程學系)
Thesis Title 基於因果勝算比推薦設計資料市集所需的索引維度
(Recommend Data-Mart Index Dimensions Based on Causality Odds Ratio Measures)
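  1. The author has agreed to make this electronic thesis openly accessible immediately.
  2. Electronic full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
  3. Please comply with the relevant provisions of the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid violating the law.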

Abstract In recent years, the growth of big data has made institutional research a concern for many schools. To respond to this trend and improve teaching quality, our university established an institutional research unit that integrates data across multiple dimensions, including students' academic performance, course selection, and club participation, into a rich but complex data warehouse. However, it remains a challenge for institutional researchers to organize a suitable data mart from the data warehouse for each research topic. Relying solely on experience or correlation may yield information that looks meaningful but is in fact meaningless. This study argues that the index dimensions of a data mart should reflect analytical perspectives that are potentially causal, so that an imprecise data mart design does not produce inaccurate, hard-to-interpret results that in turn weaken decision support.
This study uses Correlation-based Feature Selection (CFS) to compute and evaluate the merit of feature sets, with Forward Selection (FS) as the concrete search method, to select the feature set that fits a given topic. Causal odds ratio mining is then applied to analyze the topic in depth and, at the same time, to assess whether the topic is suitable for in-depth exploration within the given data range. Using the institutional research data warehouse as the data source, the study examines three topics, "good adaptation in school," "poor adaptation in school," and "diversified learning," and recommends the data mart index dimensions that highlight causal relationships for each topic. The aim is to help institutional researchers make more precise and better-supported decisions on educational policy.
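To make the CFS-plus-forward-selection step in the abstract concrete, the following is a minimal sketch and not the thesis's implementation. It uses Hall's merit heuristic, Merit_S = k * r_cf / sqrt(k + k(k-1) * r_ff), where k is the subset size, r_cf the mean feature-to-target correlation, and r_ff the mean feature-to-feature correlation. As a simplifying assumption it measures correlation with absolute Pearson correlation; the record does not state which correlation measure the thesis uses. The function names (`pearson`, `cfs_merit`, `forward_select`), the `min_gain` threshold, and the toy columns are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(vx * vy)


def cfs_merit(features, target, subset):
    """Hall's CFS merit: k * r_cf / sqrt(k + k*(k-1) * r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Assumption: absolute Pearson correlation stands in for the correlation measure.
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def forward_select(features, target, min_gain=1e-9):
    """Greedy forward selection: repeatedly add the feature that raises the
    merit the most, and stop once no candidate improves it by min_gain."""
    selected, best = [], 0.0
    remaining = set(features)
    while remaining:
        scored = [(cfs_merit(features, target, selected + [f]), f)
                  for f in sorted(remaining)]
        merit, choice = max(scored)
        if merit <= best + min_gain:
            break
        selected.append(choice)
        remaining.remove(choice)
        best = merit
    return selected, best


# Toy data (hypothetical): "b" duplicates the information in "a".
toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, 0, 1, 0]}
outcome = [1, 2, 3, 5]
print(forward_select(toy, outcome))
```

The merit penalizes redundancy: a feature that is perfectly correlated with one already selected leaves the merit unchanged, so the greedy search does not add it.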
Keywords ★ Causal Odds Ratio Mining
★ Data Mart
★ Institutional Research
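Table of Contents Chinese Abstract i
ABSTRACT ii
Acknowledgments iv
Contents v
List of Figures vii
List of Tables viii
1. Introduction 1
1-1. Research Background and Motivation 1
1-2. Research Objectives 2
2. Literature Review 3
2-1. Data Warehouse 3
2-2. Data Mart 3
2-3. Correlation-based Feature Selection (CFS) 4
2-3-1. Forward Selection (FS) 5
2-4. Association Rule Mining 6
2-4-1. Support 6
2-4-2. Confidence 7
2-5. Causal Odds Ratio Mining 7
2-6. Cohort Study 8
2-7. Odds Ratio 9
3. Research Method 11
3-1. System Architecture 11
3-2. Data Source and Preprocessing 11
3-3. Correlation-based Feature Selection Procedure 13
3-4. System Workflow 14
3-5. Causal Odds Ratio Rule Candidates and Validation 16
3-6. Analyzing and Recommending the Dimensions Required for the Data Mart 17
4. Experiments 19
4-1. Experimental Environment and Specifications 19
4-2. Experimental Dataset and Design 19
4-3. Experimental Analysis 20
4-3-1. Topic: Good Adaptation in School 20
4-3-1-1. Causal Odds Ratio Rule Analysis and Discussion: Good Adaptation in School 22
4-3-2. Topic: Poor Adaptation in School 25
4-3-2-1. Causal Odds Ratio Rule Analysis and Discussion: Poor Adaptation in School 28
4-3-3. Topic: Diversified Learning 31
4-3-3-1. Causal Odds Ratio Rule Analysis and Discussion: Diversified Learning 32
4-4. Summary of Experimental Results 33
5. Conclusion and Discussion 35
6. References 37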
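The outline above lists support, confidence, and the odds ratio as the measures behind the rule mining. As a rough illustration only, and not the thesis's causal odds ratio mining procedure, the sketch below computes these three quantities from a 2x2 contingency table, together with a 95% confidence interval for the odds ratio obtained by the standard log (Woolf) method. The function names, the 0.5 continuity correction for empty cells, and the example counts are assumptions made for illustration. Treating a rule as a candidate when the lower confidence bound exceeds 1 is a common convention, assumed here rather than taken from the record.

```python
from math import exp, log, sqrt


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if min(a, b, c, d) == 0:
        # Assumption: add 0.5 to every cell so the ratio stays finite.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    return ratio, exp(log(ratio) - z * se), exp(log(ratio) + z * se)


def support_confidence(a, b, c, d):
    """Support and confidence of the rule "exposed -> outcome"."""
    n = a + b + c + d
    support = a / n                             # share of records matching the whole rule
    confidence = a / (a + b) if a + b else 0.0  # share of exposed records with the outcome
    return support, confidence


# Hypothetical counts, for illustration only.
ratio, lower, upper = odds_ratio(20, 10, 10, 20)
is_candidate = lower > 1  # assumed screening rule: the whole interval lies above 1
print(ratio, lower, upper, is_candidate)
print(support_confidence(20, 10, 10, 20))
```

An odds ratio above 1 with a confidence interval that excludes 1 indicates an association stronger than chance. It does not by itself establish causation, which is why the outline pairs the measure with a cohort-study design.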
References [1] J. L. Saupe, "The functions of institutional research," 1990.
[2] S. T. March and A. R. Hevner, "Integrated decision support systems: A data warehousing perspective," Decision support systems, vol. 43, no. 3, pp. 1031-1043, 2007.
[3] C. Ghezzi, "Designing data marts for data warehouses," ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 10, no. 4, pp. 452-483, 2001.
[4] J. Li et al., "From observational studies to causal rule mining," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 7, no. 2, pp. 1-27, 2015.
[5] T. Dasu and T. Johnson, Exploratory data mining and data cleaning. John Wiley & Sons, 2003.
[6] M. A. Hall, "Correlation-based feature selection of discrete and numeric class machine learning," 2000.
[7] J. Pearl, "Causal inference in statistics: An overview," 2009.
[8] W. H. Inmon, "What is a data warehouse," Prism Tech Topic, vol. 1, no. 1, pp. 1-5, 1995.
[9] A. Jović, K. Brkić, and N. Bogunović, "A review of feature selection methods with applications," in 2015 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2015: IEEE, pp. 1200-1205.
[10] J. Hipp, U. Güntzer, and G. Nakhaeizadeh, "Algorithms for association rule mining—a general survey and comparison," ACM SIGKDD Explorations Newsletter, vol. 2, no. 1, pp. 58-64, 2000.
[11] K. Z. Mao, "Orthogonal forward selection and backward elimination algorithms for feature subset selection," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 34, no. 1, pp. 629-634, 2004.
[12] P. Sedgwick, "Retrospective cohort studies: advantages and disadvantages," BMJ, vol. 348, 2014.
[13] M. Scriven, "A summative evaluation of RCT methodology: An alternative approach to causal research," Journal of multidisciplinary evaluation, vol. 5, no. 9, pp. 11-24, 2008.
[14] M. Szumilas, "Explaining odds ratios," Journal of the Canadian academy of child and adolescent psychiatry, vol. 19, no. 3, p. 227, 2010.
[15] B. Singh, N. Kushwaha, and O. P. Vyas, "A feature subset selection technique for high dimensional data using symmetric uncertainty," Journal of Data Analysis and Information Processing, vol. 2, no. 4, pp. 95-105, 2014.
[16] A. P. U. Siahaan, A. Ikhwan, and S. Aryza, "A novelty of data mining for promoting education based on FP-growth algorithm," 2018.
Advisor Meng-Feng Tsai (蔡孟峰)   Approval Date 2024-7-22

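For questions about this thesis, please contact the National Central University Library, Promotion and Services Section (國立中央大學圖書館推廣服務組), TEL: (03)422-7151 ext. 57407, or by e-mail.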