中大機構典藏-NCU Institutional Repository-提供博碩士論文、考古題、期刊論文、研究計畫等下載:Item 987654321/13366
English  |  正體中文  |  简体中文  |  Items with full text/Total items : 80990/80990 (100%)
Visitors : 41272787      Online Users : 72
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.
Scope Tips:
  • please add "double quotation mark" for query phrases to get precise results
  • please goto advance search for comprehansive author search
  • Adv. Search
    HomeLoginUploadHelpAboutAdminister Goto mobile version


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/13366


    Title: 以PAT;tree 為基礎發展之快速搜尋索引樹A fast access indexing tree structure developed based on PAT tree
    Authors: 張楷鑫;Gush Chang
    Contributors: 資訊管理研究所
    Keywords: 資訊檢索;存取方式 資料結構;information retrieval;access method;Patricia trie;data structure
    Date: 2009-06-17
    Issue Date: 2009-09-22 15:30:05 (UTC+8)
    Publisher: 國立中央大學圖書館
    Abstract: 在網路普及和分散式多媒體運算的時代,資料量的成長一日千里。如何快速的存取資料,是當代迫切的需求。PAT tree 資料結構提供極快速的搜尋能力。藉由它的幫助,我們可以在O(log n)的時間內,執行任意的字首搜尋,不用受限於實際資料量的大小。事實上, 查詢字彙的長度通常小於 log n,所以真正在執行搜尋時,所耗費的時間是決定於查詢字彙的長度。雖然PAT tree 是如此的快速,我們認為它可以更快。我們憑藉PAT tree 的基礎,發展了DPAT tree。它的搜尋模式是隨機存取,不受限於查詢字彙的長度。DPAT tree 是一個trie 資料結構的變形,它能支援任何的語言索引功能,它同時也是一個雜湊表。我們使用了一個編碼程序來產生雜湊值,所有的搜尋都透過此雜湊值來查詢,來達到隨機存取的目的,確保查詢花費時間不會超過O(1)。在Cranfield II 資料測試集的實驗中,平均一個查詢花費25ms,相較於PAT tree 平均所花費的1443ms,它所提供的速度快上許多。 In the era of the Internet and distributed multimedia computing, data increasesrapidly in astonishing rate. How to retrieve data efficiently has become an urgentissue. The PAT tree is a data structure that allows very efficient searching withpreprocessing. We can do arbitrary prefix searching in O(log n) time with PAT tree,independent of the size of the answer. In practice, the length of the query is less thanO(log n), and the searching time is proportional to the query length. To improve thesearching efficiency, we have developed a modified PAT tree called decimal PAT tree(DPAT tree) with which the search could be done in a random access manner,regardless of the length of the query. The DPAT tree structure is a variation of triestructure that can support any language indexing functionality. It also is a hash tablethat maps every sistring to where it resides. We have employed an encoding procedureas the hash function to generate the hash values. The procedure is guaranteed to createa hash table, in which search can be done in random access manner. Thus searchingcould be done in no more than O(1). The experiment tests Cranfield II documentcollection and yields 25ms response time per query in average in contrast to 1443msresponse time per query in average in PAT tree retrieval.
    Appears in Collections:[Graduate Institute of Information Management] Electronic Thesis & Dissertation

    Files in This Item:

    File SizeFormat


    All items in NCUIR are protected by copyright, with all rights reserved.

    社群 sharing

    ::: Copyright National Central University. | 國立中央大學圖書館版權所有 | 收藏本站 | 設為首頁 | 最佳瀏覽畫面: 1024*768 | 建站日期:8-24-2009 :::
    DSpace Software Copyright © 2002-2004  MIT &  Hewlett-Packard  /   Enhanced by   NTU Library IR team Copyright ©   - 隱私權政策聲明