Abstract (English) |
System log data record the operations that users perform on a system. By analyzing log data, we can obtain valuable information about system efficiency, users' habitual behaviors and interests, and so on. This information helps us understand system users better and can even help in setting up appropriate strategies.
Many systems and technologies for analyzing log records are built from the viewpoint of the data. However, the person who needs to analyze the data should be the one who decides what the data for analysis contain. When log records are analyzed from the viewpoint of the data and the data that users require do not exist in the system, users usually bear an extra burden of data rearrangement and recalculation.
Hence we design an on-line analytical processing system that analyzes data from the viewpoints of system users. First, through the split process, users define various kinds of data features that give cube data more sensible meaning. For this step we offer several split functions, with which users can model data features and increase the credibility of their feature definitions. Then, through the replacing process, users construct a feature space to obtain feature-related data. The system also provides Boolean operations, so users can combine multiple conditions when defining features or constructing the feature space. Through data features, users combine their viewpoints with cube data and obtain the required data directly.
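The split, Boolean, and replacing steps described above can be illustrated with a minimal Python sketch. It is not the system's implementation: the abstract does not specify the split functions, so the threshold-based split, the helper names (`split_by_threshold`, `both`, `either`, `negate`, `replace_with_features`), and the sample cube cells are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]            # one cube cell: measure name -> value (assumed layout)
Feature = Callable[[Row], bool]   # a feature is a predicate over a cube cell


def split_by_threshold(column: str, low: float, high: float) -> Feature:
    """Hypothetical split function: true when low <= row[column] < high."""
    return lambda row: low <= row[column] < high


def both(a: Feature, b: Feature) -> Feature:
    """Boolean AND of two features."""
    return lambda row: a(row) and b(row)


def either(a: Feature, b: Feature) -> Feature:
    """Boolean OR of two features."""
    return lambda row: a(row) or b(row)


def negate(a: Feature) -> Feature:
    """Boolean NOT of a feature."""
    return lambda row: not a(row)


def replace_with_features(rows: List[Row],
                          features: Dict[str, Feature]) -> List[Dict[str, bool]]:
    """Hypothetical replacing step: map each cell into the feature space."""
    return [{name: f(row) for name, f in features.items()} for row in rows]


# Made-up cube cells standing in for aggregated log data.
log_cells = [
    {"visits": 2, "minutes": 1.5},
    {"visits": 14, "minutes": 42.0},
    {"visits": 9, "minutes": 3.0},
]

# User-defined features: the thresholds encode the user's viewpoint.
frequent = split_by_threshold("visits", 8, float("inf"))
long_stay = split_by_threshold("minutes", 10, float("inf"))
features = {
    "frequent": frequent,
    "long_stay": long_stay,
    "engaged": both(frequent, long_stay),  # multiple conditions combined
}

feature_space = replace_with_features(log_cells, features)
```

In this sketch a feature is only a named predicate, so Boolean combinations are themselves features and can be reused when constructing the feature space.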
After obtaining the data, we analyze them, and the analytical results carry these feature definitions. Besides data mining technologies, we also adopt statistical methods for data analysis. Statistical testing is widely accepted, so the use of statistical methods lets users place more trust in the analytical results. |