dc.description.abstract | Online Analytical Processing (OLAP) is a common solution that modern enterprises use to generate, monitor, share, and administer their analysis reports. When OLAP operators generate or publish daily, weekly, or monthly reports, all analysis of the report contents is left to the report readers. To discover hidden rules, similar reports, or trends within a potentially huge number of reports, readers can rely only on their own visual inspection.
Data mining is a well-developed field for finding hidden knowledge in data. However, few techniques focus on finding knowledge using OLAP reports as a major data source.
Therefore, this research proposes an approach for mining knowledge from OLAP reports, called OLAP report mining. The thesis proposes three methods (OLAP_MDS, OLAP_CLU, and OLAP_OUT), which apply traditional multi-dimensional scaling, clustering, and outlier analysis methods, respectively, to OLAP reports. The work includes (1) defining the comparability relationship between two OLAP reports, (2) designing the similarity measurement for OLAP reports, (3) explaining how to apply traditional data mining methods to find knowledge in OLAP reports, and (4) providing individual and integrative knowledge presentation methods.
Two kinds of experiments are conducted to verify the solution. The first is based on cognitive science and validates the proposed definition of semantic distance between two OLAP reports; its results support the rationale behind this definition. The second applies the proposed methods to a popular commercial OLAP database (Foodmart 2000) to verify their applicability. The results confirm that all the proposed methods can sufficiently and efficiently find and represent similarity-based knowledge of OLAP reports.
| en_US |