Thesis record 104426005: detailed information




Author  Ting-Yu Yeh (葉亭佑)   Department  Graduate Institute of Industrial Management
Thesis title  Determination of Materialized View Selection and Maintenance Policy with Stochastic Query and Response Time Constraints in Column-based Data Warehouse System
(Original title: 列式資料倉儲在隨機查詢需求及回覆時間限制下之資料儲存與維護策略)
Full text  Not available in the system (never open to the public)
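Abstract (Chinese)  In recent years, big data has been discussed in many fields. A data warehouse is key to processing and analyzing such large volumes of data effectively. Many studies have shown that a column-based data warehouse performs better than a traditional row-based data warehouse, so the column-based data warehouse has become the storage architecture used by many current database systems, such as SAP HANA. In addition, user queries in a data warehouse system incur substantial cost. The view selection problem decides which query results should be stored in the data warehouse in advance, and the view maintenance policy decides when to update the data stored there.
In this research, we build a new MVPP model that can represent the query process in a column-based data warehouse. By modifying the cost model proposed by Liu et al. (2008), we build a model that accounts for stochastic queries and data updates under a limit on the system's query response time. To match real-world conditions, the model assumes that stochastic query arrivals follow a Poisson distribution and uses an M/G/1 model to constrain the system's query response time. The mathematical model is built in the AMPL/MINOS environment to compute the associated costs and decisions. In addition, we design several cases to evaluate and compare the column-based data warehouse with the traditional row-based data warehouse storage architecture.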
Abstract (English)
In recent years, Big Data has been discussed in many fields. The data warehouse is a key component for analyzing such large amounts of information. Many studies show that a column-based data warehouse performs better than a row-based data warehouse, and the column-based data warehouse has become a popular storage architecture in database systems such as SAP HANA. In a data warehouse, the view selection problem is to select a set of views to materialize while minimizing the total of query processing cost and view maintenance cost. The update policy decides when to refresh the data in the data warehouse.
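The trade-off behind view selection can be illustrated with a small sketch. It enumerates every subset of candidate views and picks the one with the lowest sum of query processing cost and maintenance cost. All names and numbers here (`BASE_COST`, `QUERY_FREQ`, `COST_WITH_VIEW`, `MAINTENANCE_COST`) are hypothetical and chosen only for illustration; the thesis itself formulates the problem on an MVPP and solves it with AMPL/MINOS, not by brute force.

```python
from itertools import combinations

# Hypothetical toy data, not taken from the thesis.
# BASE_COST[q]: cost of answering query q from base tables only.
BASE_COST = {"q1": 100.0, "q2": 80.0}
# QUERY_FREQ[q]: how often query q is issued per period.
QUERY_FREQ = {"q1": 5.0, "q2": 2.0}
# COST_WITH_VIEW[v][q]: cost of answering q when view v is materialized.
COST_WITH_VIEW = {
    "v1": {"q1": 20.0, "q2": 80.0},
    "v2": {"q1": 100.0, "q2": 10.0},
}
# MAINTENANCE_COST[v]: cost per period of keeping view v up to date.
MAINTENANCE_COST = {"v1": 150.0, "v2": 200.0}


def total_cost(selected):
    """Return query processing cost plus maintenance cost for a view set."""
    query_part = 0.0
    for query, freq in QUERY_FREQ.items():
        # Each query uses the cheapest available source.
        options = [BASE_COST[query]]
        options += [COST_WITH_VIEW[view][query] for view in selected]
        query_part += freq * min(options)
    maintenance_part = sum(MAINTENANCE_COST[view] for view in selected)
    return query_part + maintenance_part


def best_view_set():
    """Exhaustively search all subsets of candidate views."""
    views = sorted(MAINTENANCE_COST)
    candidates = [
        frozenset(subset)
        for size in range(len(views) + 1)
        for subset in combinations(views, size)
    ]
    return min(candidates, key=total_cost)
```

With these toy numbers, materializing only `v1` is cheapest: it cuts the cost of the frequent query `q1` by more than its maintenance cost, whereas `v2` does not pay for itself. Exhaustive search is only feasible for a handful of views, which is one reason a mathematical programming formulation or a heuristic is normally used instead.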
In this research, we propose a new multiple view processing plan (MVPP) model that can represent the operations in a column-based data warehouse. We modify the cost model of Liu et al. (2008) and propose a cost model that accounts for stochastic query arrivals and stochastic updates under a specified response time limit. To reflect realistic conditions, we assume that query arrivals follow a Poisson process, and we formulate the system response time constraint with an M/G/1 model. We use AMPL/MINOS to implement and solve the mathematical model. In addition, we design several cases to evaluate the differences in view selection and total cost between the column-based and row-based data warehouses.
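To make the M/G/1 response time constraint concrete, here is a minimal sketch. It uses the standard Pollaczek-Khinchine result for the mean time in system of an M/G/1 queue, W = E[S] + λE[S²] / (2(1 − ρ)) with ρ = λE[S]. The function names and the form of the limit check are assumptions for illustration; the record does not show how the thesis actually writes the constraint.

```python
def mg1_mean_response_time(arrival_rate, mean_service, second_moment_service):
    """Mean time in system (waiting + service) for an M/G/1 queue.

    arrival_rate: Poisson arrival rate (lambda).
    mean_service: E[S], mean service time.
    second_moment_service: E[S^2], second moment of the service time.
    """
    utilization = arrival_rate * mean_service  # rho
    if not 0.0 <= utilization < 1.0:
        raise ValueError("queue is unstable: utilization must be in [0, 1)")
    # Pollaczek-Khinchine formula for the mean waiting time in queue.
    mean_wait = arrival_rate * second_moment_service / (2.0 * (1.0 - utilization))
    return mean_service + mean_wait


def meets_response_limit(arrival_rate, mean_service, second_moment_service, limit):
    """Hypothetical check: is the mean response time within the limit?"""
    response = mg1_mean_response_time(
        arrival_rate, mean_service, second_moment_service
    )
    return response <= limit
```

As a sanity check, exponential service with mean 0.5 has E[S²] = 0.5, so with λ = 1 the formula gives 1.0, matching the M/M/1 result 1/(μ − λ) for μ = 2. In a view selection model, materializing more views would typically lower the service time moments, which is how such a constraint interacts with the selection decision.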
Keywords (Chinese) ★ Column-based data warehouse
★ Data warehouse
★ Data maintenance policy
Keywords (English) ★ Column-based data warehouse
★ View selection
★ View maintenance policy
★ View Materialization
★ Multiple view processing plan
★ AMPL/MINOS
Table of Contents
Chinese Abstract
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research background and motivation
1.2 Problem description
1.3 Research objectives
1.4 Research methodology and procedure
Chapter 2 Literature Review
2.1 Column-based data warehouse
2.2 Framework for view selection
2.3 View selection problem
2.4 View maintenance policy
Chapter 3 Mathematical Model
3.1 Multiple View Processing Plan (MVPP)
3.1.1 MVPP symbols
3.1.2 Differences between Row-based and Column-based MVPP
3.2 Definitions and assumptions
3.2.1 Definitions
3.2.2 Assumptions
3.3 Notations
3.4 The mathematic model
3.4.1 Basic objective function
3.4.2 Modeling of view maintenance policy
3.4.3 Modeling of executing AND operation
3.4.4 Modeling of storage space constraint
3.4.5 Modeling of view set selecting restriction
3.4.6 Modeling of response time constraint
Chapter 4 Model Application
4.1 The greedy algorithm for view selection in MVPP
4.2 Validation and evaluation
4.2.1 Validation
4.2.2 Evaluation
Chapter 5 Conclusion
5.1 Research Contribution
5.2 Future Research
Reference
Appendix A. Mathematical model in AMPL format
Appendix B. Greedy algorithm in AMPL format
Appendix C. The relational matrix for cases in AMPL format
References

Abadi, D., Madden, S., & Ferreira, M. (2006, June). Integrating compression and execution in column-oriented database systems. In Proceedings of the 2006 ACM SIGMOD international conference on Management of data (pp. 671-682). ACM.

Abadi, D. J., Myers, D. S., DeWitt, D. J., & Madden, S. R. (2007, April). Materialization strategies in a column-oriented DBMS. In Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on (pp. 466-475). IEEE.

Abadi, D. J., Madden, S. R., & Hachem, N. (2008, June). Column-stores vs. row-stores: How different are they really?. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data (pp. 967-980). ACM.

Abadi, D., Boncz, P., Harizopoulos, S., Idreos, S., & Madden, S. (2013). The design and implementation of modern column-oriented database systems. Foundations and Trends® in Databases, 5(3), 197-280.

Allen, A. O. (1978). Probability, statistics, and queueing theory with computer science applications. Academic Press.

Andurkar, A. D. (2012). Implementation of column-oriented database in PostgreSQL for optimization of read-only queries. Computer Science and Information Technology, 2(3), 437-452.

Baralis, E., Paraboschi, S., & Teniente, E. (1997, August). Materialized Views Selection in a Multidimensional Database. In VLDB (Vol. 97, pp. 156-165).

Boncz, P. A., Zukowski, M., & Nes, N. (2005, January). MonetDB/X100: Hyper-Pipelining Query Execution. In CIDR (Vol. 5, pp. 225-237).

Bachmaier, M., & Krutov, I. (2016). In-memory Computing with SAP HANA on IBM EX5 and X6 Systems. IBM Redbooks.

Copeland, G. P., & Khoshafian, S. N. (1985, May). A decomposition storage model. In ACM SIGMOD Record (Vol. 14, No. 4, pp. 268-279). ACM.

Colby, L. S., Kawaguchi, A., Lieuwen, D. F., Mumick, I. S., & Ross, K. A. (1997, June). Supporting multiple view maintenance policies. In ACM SIGMOD Record (Vol. 26, No. 2, pp. 405-416). ACM.

Elmasri, R., & Navathe, S. (1989). Fundamentals of database systems.

Färber, F., Cha, S. K., Primsch, J., Bornhövd, C., Sigg, S., & Lehner, W. (2012). SAP HANA database: data management for modern business applications. ACM Sigmod Record, 40(4), 45-51.

Gupta, H. (1997, January). Selection of views to materialize in a data warehouse. In International Conference on Database Theory (pp. 98-112). Springer Berlin Heidelberg.

Gupta, H., & Mumick, I. S. (1999, January). Selection of views to materialize under a maintenance cost constraint. In International Conference on Database Theory (pp. 453-470). Springer Berlin Heidelberg.

Harinarayan, V., Rajaraman, A., & Ullman, J. D. (1996, June). Implementing data cubes efficiently. In ACM SIGMOD Record (Vol. 25, No. 2, pp. 205-216). ACM.

Harizopoulos, S., Liang, V., Abadi, D. J., & Madden, S. (2006, September). Performance tradeoffs in read-optimized databases. In Proceedings of the 32nd international conference on Very large data bases (pp. 487-498). VLDB Endowment.

Halverson, A., Beckmann, J. L., Naughton, J. F., & Dewitt, D. J. (2006, June). A comparison of c-store and row-store in a common framework. In Proc. of the 32nd VLDB Conference.

Kanade, A. S., & Gopal, A. (2013, February). Choosing right database system: Row or column-store. In Information Communication and Embedded Systems (ICICES), 2013 International Conference on (pp. 16-20). IEEE.

Liang, W., Wang, H., & Orlowska, M. E. (2001). Materialized view selection under the maintenance time constraint. Data & Knowledge Engineering, 37(2), 203-216.

Liu, Y. C., Hsu, P. Y., Sheen, G. J., Ku, S., & Chang, K. W. (2008). Simultaneous determination of view selection and update policy with stochastic query and response time constraints. Information Sciences, 178(18), 3491-3509.

Mistry, H., Roy, P., Sudarshan, S., & Ramamritham, K. (2001, May). Materialized view selection and maintenance using multi-query optimization. In ACM SIGMOD Record (Vol. 30, No. 2, pp. 307-318). ACM.

MacNicol, R., & French, B. (2004, August). Sybase IQ multiplex-designed for analytics. In Proceedings of the Thirtieth international conference on Very large data bases-Volume 30 (pp. 1227-1230). VLDB Endowment.

Muller, S., Butzmann, L., Howelmeyer, K., Klauck, S., & Plattner, H. (2013, September). Efficient view maintenance for enterprise applications in columnar in-memory databases. In Enterprise Distributed Object Computing Conference (EDOC), 2013 17th IEEE International (pp. 249-258). IEEE.

Ordonez, C., Cabrera, W., & Gurram, A. (2017). Comparing columnar, row and array DBMSs to process recursive queries on graphs. Information Systems, 63, 66-79.

Patel, J. K., & Read, C. B. (1982). The normal distribution.

Plattner, H. (2009, June). A common database approach for OLTP and OLAP using an in-memory column database. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data (pp. 1-2). ACM.

Ross, S. M. (2014). Introduction to probability models. Academic press.

Segev, A., & Fang, W. (1991). Optimal update policies for distributed materialized views. Management Science, 37(7), 851-870.

Shukla, A., Deshpande, P., & Naughton, J. F. (1998, August). Materialized view selection for multidimensional datasets. In VLDB (Vol. 98, pp. 488-499).

Slazinski, E. D. (2004). Structured Query Language (SQL). The Internet Encyclopedia.

Stonebraker, M., Abadi, D. J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., ... & O'Neil, P. (2005, August). C-store: a column-oriented DBMS. In Proceedings of the 31st international conference on Very large data bases (pp. 553-564). VLDB Endowment.

Sorjonen, S. (2012, December). OLAP query performance in column-oriented databases. In Seminar: Column Databases.

Theodoratos, D., & Bouzeghoub, M. (2000, November). A general framework for the view selection problem for data warehouse design and evolution. In Proceedings of the 3rd ACM international workshop on Data warehousing and OLAP (pp. 1-8). ACM.

Theodoratos, D., & Xu, W. (2004, November). Constructing search spaces for materialized view selection. In Proceedings of the 7th ACM international workshop on Data warehousing and OLAP (pp. 112-121). ACM.

Yang, J., Karlapalem, K., & Li, Q. (1997, May). A framework for designing materialized views in data warehousing environment. In Distributed Computing Systems, 1997., Proceedings of the 17th International Conference on (pp. 458-465). IEEE.

Yang, J., Karlapalem, K., & Li, Q. (1997, August). Algorithms for materialized view design in data warehousing environment. In VLDB (Vol. 97, pp. 136-145).

Yu, J. X., Yao, X., Choi, C. H., & Gou, G. (2003). Materialized view selection as constrained evolutionary optimization. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 33(4), 458-467.

Zhuge, Y., Garcia-Molina, H., Hammer, J., & Widom, J. (1995). View maintenance in a warehousing environment. ACM SIGMOD Record, 24(2), 316-327.

Zhang, C., Yao, X., & Yang, J. (1999). Evolving materialized views in data warehouse. In Evolutionary Computation, 1999. CEC 99. Proceedings of the 1999 Congress on (Vol. 2, pp. 823-829). IEEE.

Zhang, C., Yao, X., & Yang, J. (2001). An evolutionary approach to materialized views selection in a data warehouse environment. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 31(3), 282-294.

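古傑煥 (2004). Simultaneous determination of the data to store and the data maintenance policy for a data warehouse under stochastic query demand and response time constraints (Master's thesis). National Central University.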
Advisor  沈國基   Approval date  2017-7-24