Abstract (English) |
In recent years, the rise of big data has made institutional research a growing concern for many schools. To respond to this trend and improve teaching quality, our school has established an institutional research unit that integrates data across multiple dimensions, including student academic performance, course selection, and club participation, into a rich and complex data warehouse. However, institutional researchers working on different topics still find it challenging to organize suitable data marts from that warehouse. Relying solely on experience or relevance can produce information that appears meaningful but is not. This study argues that the index dimensions of a data mart should reflect analysis perspectives that are potentially causal. Imprecise data mart design must be avoided because it leads to inaccurate analysis results that are difficult to interpret, which in turn undermines decision support.
This study uses Correlation-based Feature Selection (CFS) to score candidate feature sets and combines it with Forward Selection (FS) to select the feature sets that align with a specific topic. It then applies causal odds ratio mining to analyze each topic in depth and to assess whether the topic is suitable for in-depth exploration within a given data range. Using the institutional research data warehouse as the data source, the study examines three topics: "good adaptability in school," "poor adaptability in school," and "diversified learning." For each topic, it recommends the index dimensions that a data mart needs in order to highlight causal relevance. This helps institutional researchers formulate educational policies more precisely and effectively, thereby supporting decision-making. |
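The combination of CFS scoring, forward selection, and odds ratios described above can be illustrated with a minimal sketch. It is not the study's implementation. It assumes the standard CFS merit, k * r_cf / sqrt(k + k(k-1) * r_ff), and substitutes absolute Pearson correlation for the correlation measure (CFS is often used with symmetric uncertainty on discrete data). The feature names (`grade_score`, `club_hours`, `locker_id`), the toy values, and the 2x2 counts are all hypothetical.

```python
from math import sqrt

def abs_pearson(x, y):
    """Absolute Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return abs(cov / sqrt(vx * vy))

def cfs_merit(features, target, subset):
    """CFS merit: rewards feature-target correlation, penalizes redundancy."""
    k = len(subset)
    r_cf = sum(abs_pearson(features[f], target) for f in subset) / k
    if k == 1:
        return r_cf  # no feature-feature pairs to average
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs_pearson(features[a], features[b]) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)

def forward_select(features, target):
    """Greedy forward selection: add the best feature while merit improves."""
    selected, best = [], 0.0
    remaining = list(features)
    while remaining:
        merit, name = max(
            (cfs_merit(features, target, selected + [f]), f) for f in remaining
        )
        if merit <= best:
            break  # no candidate improves the subset, so stop
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best

def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table.

    a: exposed with outcome, b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    if b * c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)

# Hypothetical toy data: 8 students, target = 1 if "good adaptability".
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "grade_score": [1, 2, 1, 2, 3, 4, 3, 4],
    "club_hours": [1, 1, 2, 2, 3, 3, 4, 4],
    "locker_id": [1, 2, 2, 1, 1, 2, 2, 1],  # unrelated to the target
}

selected, merit = forward_select(features, target)
ratio = odds_ratio(30, 10, 15, 45)  # hypothetical counts
```

In this toy data the two informative features are kept and the unrelated one is rejected, because adding it lowers the merit. A selected feature would then be binarized and cross-tabulated against the topic outcome to obtain the counts passed to `odds_ratio`.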