dc.description.abstract | A data warehouse is built to answer queries efficiently. The view selection problem is to choose a set of views to materialize under given constraints while minimizing the sum of query processing cost and view maintenance cost. The update policy decides when to refresh the data in a data warehouse. Previous research has dealt with these two problems independently; in practice, however, they are correlated. It is therefore important to determine view selection and update policy simultaneously when designing a data warehouse. Moreover, previous research assumes that query arrival rates and update frequencies are deterministic, which cannot reflect the uncertain query demand of real situations and can lead to incorrect results. Stochastic arrivals should therefore be considered.
In this research, we propose a mathematical model to minimize the total cost when the set of materialized views is known. The model adopts a stochastic view maintenance frequency, which has not been considered in previous research. It also captures uncertain queries and uncertain updates with Poisson processes, which are common in practice. The mean system response time is formulated with an M/G/1 queueing model and is constrained to stay within a specified threshold with a desired probability.
For application, we consider several special cases to implement the mathematical model and a greedy algorithm. A computational analysis explores the impact of different constraints and system parameters on view selection. In addition, we design experiments to evaluate the differences among view selections and their solutions. Finally, these experiments indicate that the proposed mathematical model and algorithm are correct and reliable. | en_US |