A Study of an Efficient Storage Structure for Multi-version XML Documents: The Case of OpenOffice.org

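Name

Chih-Chun Chang (張志君)

Department

Department of Business Administration

Thesis Title

A Study of an Efficient Storage Structure for Multi-version XML Documents: The Case of OpenOffice.org
(Effective Storage Structure for Multi-version XML Documents)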

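This electronic thesis is authorized for immediate open access.
Electronic full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid breaking the law.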

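Abstract (Chinese)

Since the 1980s, enterprises have adopted electronic processes, which indirectly led to the rise of office automation software; most companies now use office software to process data and documents. The two main office software groups, OpenOffice.org and Microsoft Office, have both recently introduced XML-based data storage formats, which are gradually becoming format standards. However, the characteristics of the data are still not handled properly. As the number of accesses grows, documents on the same topic may end up in different versions because people read and save them. These historical versions may differ only slightly, yet each is stored as a separate, independent file. This wastes storage space and also causes unnecessary trouble for future information retrieval.
This study therefore makes use of the open storage architecture of XML documents to develop algorithms for processing multi-version documents and to find an efficient way of storing them. Beyond saving storage space for office documents, the approach must also preserve their integrity. The managerial implication for enterprises is that internal electronic documents can be managed as efficiently as possible while the information contained in every document is retained, so that the data remain reusable.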

Abstract (English)

Since the beginning of the 1980s, the way data and documents are processed has changed greatly. Office applications such as OpenOffice.org and Microsoft Office are widely used for everyday document work. As the demand for information exchange and retrieval has grown, XML has become a standard for these purposes. With both office suites adopting XML, storing historical versions of office documents efficiently has become a growing issue.
This paper introduces an efficient way to process multi-version XML documents. The approach not only reduces the storage space required but also preserves the integrity of the original documents. It minimizes changes to the data values and structure of historical XML documents. The goal is to help enterprises manage their electronic documents well while preserving all the information those documents contain.
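
As a rough illustration of the idea in the abstract (historical versions that differ only slightly need not be stored as separate full files), the following Python sketch keeps the first version in full and records only line-level differences for each later version. It is not the thesis's f_list, XSS, or Data_recovery algorithm, whose details this record does not give. `VersionStore`, `make_delta`, and `apply_delta` are hypothetical names, and the line-based diff from Python's standard `difflib` module is an assumption standing in for a real XML-aware comparison.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Return the edit operations that turn old_lines into new_lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Keep only changed regions: the old span and its replacement lines.
            delta.append((i1, i2, new_lines[j1:j2]))
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the newer version from the older one plus a delta."""
    result = []
    pos = 0
    for i1, i2, replacement in delta:
        result.extend(old_lines[pos:i1])  # unchanged lines before the change
        result.extend(replacement)        # lines introduced by the change
        pos = i2                          # skip the old lines that were replaced
    result.extend(old_lines[pos:])
    return result


class VersionStore:
    """Hypothetical store: one full base version plus a delta per later version."""

    def __init__(self, first_version):
        # Assumption: versions are compared line by line, not as XML trees.
        self.base = first_version.splitlines()
        self.deltas = []
        self._latest = list(self.base)

    def add_version(self, text):
        new_lines = text.splitlines()
        self.deltas.append(make_delta(self._latest, new_lines))
        self._latest = new_lines

    def get_version(self, index):
        """Recover version `index` (0 is the base) by replaying deltas."""
        lines = list(self.base)
        for delta in self.deltas[:index]:
            lines = apply_delta(lines, delta)
        return "\n".join(lines)


if __name__ == "__main__":
    v1 = "<doc>\n<p>Hello</p>\n<p>World</p>\n</doc>"
    v2 = "<doc>\n<p>Hello</p>\n<p>World!</p>\n</doc>"
    store = VersionStore(v1)
    store.add_version(v2)
    print(store.get_version(1) == v2)
```

Every version remains recoverable, which mirrors the abstract's requirement that space savings must not cost document integrity. The trade-off in this sketch is that recovering a late version replays every earlier delta, and trailing newlines are not preserved because the text is split into lines.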

Keywords (Chinese)

★ Storage structure
★ Historical version
★ XML
★ OpenOffice.org

Keywords (English)

★ Storage structure
★ Historical document
★ XML
★ OpenOffice.org

Table of Contents

Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Motivation
1-2 Research Objectives
2. Literature Review
2-1 Introduction to XML
2-1-1 The Open Storage Structure of XML
2-1-2 Overview of OpenOffice.org
2-2 Related Work on XML Document Storage
2-2-1 Storage of Structure
2-2-2 Text Processing
3. Algorithms
3-1 Characteristics of Historical Document Versions
3-2 Data Structure
3-3 Document Processing Algorithms
3-3-1 The f_list Algorithm
3-3-2 The XSS Algorithm
3-3-3 The Data_recovery Algorithm
4. Empirical Analysis
4-1 Experimental Design
4-2 Experimental Results and Analysis
5. Conclusions and Suggestions for Future Research
5-1 Conclusions
5-2 Practical Application Example
5-3 Suggestions for Future Research
References

Advisor

Ping-yu Hsu (許秉瑜)

Approval Date

2008-6-23
