NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/61780


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/61780


    Title: An automatic classification algorithm for a large number of patent categorization (大量專利類別自動分類演算法研究)
    Authors: Chang, Yuan-Che (張元哲)
    Contributors: Department of Information Management
    Keywords: Patent classification; Vector space model (VSM); IPC taxonomy; Support vector machines (SVM); K-means; K nearest neighbors (KNN)
    Date: 2013-10-21
    Issue Date: 2013-11-27 11:33:17 (UTC+8)
    Publisher: National Central University
    Abstract: An automatic patent categorization system would be invaluable to individual inventors and patent attorneys, saving them time and effort by quickly identifying possible conflicts with existing patents. In recent years, it has become increasingly common to classify patent documents using the International Patent Classification (IPC), a complex hierarchical classification system comprising 8 sections, 128 classes, 648 subclasses, about 7,200 main groups, and approximately 72,000 subgroups. Although some studies have addressed automatic IPC classification, no existing method is suited to classifying patents down to the subgroup level (the bottom level of the IPC). This dissertation therefore presents a novel categorization method, the three phase categorization (TPC) algorithm, which classifies patents down to the subgroup level with reasonable accuracy. The method consists of three phases: the first two use support vector machine (SVM) classification to predict candidate categories, and the last uses a clustering algorithm to determine the final subgroup label. Experiments on the WIPO-alpha collection from the World Intellectual Property Organization show that the TPC algorithm achieves 36.07% accuracy at the subgroup level, approximately a 26,020-fold improvement over randomly guessing a subgroup label. In addition, we collected 96,654 patent documents from the Internet that do not overlap with the WIPO-alpha collection and combined them with it. On this merged collection, the accuracy of the TPC algorithm at the subgroup level rose to 38.01%.
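    The fold-improvement figure in the abstract can be checked with a little arithmetic. The sketch below is not taken from the thesis: it assumes that a "random guess" means choosing one subgroup label uniformly at random, with exactly one correct label per patent, and the function names are illustrative only. Under that assumption the rounded count of 72,000 subgroups gives a slightly lower ratio than the reported one, which suggests the author used a more exact label count of roughly 72,100.

```python
# Counts per IPC level as quoted in the abstract.
# The main group and subgroup counts are approximate ("about 7,200", "approximately 72,000").
IPC_LEVEL_COUNTS = {
    "section": 8,
    "class": 128,
    "subclass": 648,
    "main group": 7_200,
    "subgroup": 72_000,
}


def chance_accuracy(num_labels: int) -> float:
    """Probability that a uniform random guess hits the single correct label (assumption)."""
    return 1.0 / num_labels


def fold_improvement(accuracy: float, num_labels: int) -> float:
    """How many times better an accuracy is than the uniform random-guess baseline."""
    return accuracy / chance_accuracy(num_labels)


def implied_label_count(accuracy: float, fold: float) -> float:
    """Label count that would make the given accuracy equal to `fold` times chance."""
    return fold / accuracy


if __name__ == "__main__":
    # Random-guess baseline at every level of the hierarchy.
    for level, count in IPC_LEVEL_COUNTS.items():
        print(f"{level:>10}: chance accuracy = {chance_accuracy(count):.6%}")

    reported_accuracy = 0.3607  # subgroup-level accuracy on WIPO-alpha, from the abstract
    reported_fold = 26_020      # fold improvement stated in the abstract

    approx_fold = fold_improvement(reported_accuracy, IPC_LEVEL_COUNTS["subgroup"])
    print(f"fold improvement with 72,000 subgroups: {approx_fold:,.1f}")
    print(f"label count implied by the reported fold: "
          f"{implied_label_count(reported_accuracy, reported_fold):,.0f}")
```

    The same helper applies to the 38.01% result on the merged collection, provided the number of candidate subgroup labels is unchanged, which the abstract does not state.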
    Appears in Collections:[Graduate Institute of Information Management] Electronic Thesis & Dissertation



    All items in NCUIR are protected by copyright, with all rights reserved.
