Thesis/Dissertation 974403001: Full Metadata Record

DC field  Value  Language
DC.contributor資訊管理學系zh_TW
DC.creator張元哲zh_TW
DC.creatorYuan-Che Changen_US
dc.date.accessioned2013-10-21T07:39:07Z
dc.date.available2013-10-21T07:39:07Z
dc.date.issued2013
dc.identifier.urihttp://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=974403001
dc.contributor.department資訊管理學系zh_TW
DC.description國立中央大學zh_TW
DC.descriptionNational Central Universityen_US
dc.description.abstract自動專利分類系統可以快速比對識別現有專利的可能衝突,對發明者以及專利律師而言,可幫他們節省許多人工比對成本與時間,因此是相當有價值的研究。近年來,使用國際專利分類(IPC)來進行專利文件的分類已日益普遍,而此一國際專利分類則是一個複雜的階層式分類系統,它包含了8個部(section)、128個主類(class)、648個次類(subclass),約有7,200個主目(main group)及72,000個次目(subgroup)。儘管已有一些研究著眼於IPC的自動分類,但截至目前為止,並沒有任何分類方法適合用來進行次目層級的自動分類(IPC的底層分類),因此,本研究提出一個全新的分類方法,稱之為三階段分類演算法(簡稱為TPC演算法),它可以進行次目層級的自動分類,並獲得合理的正確率。此一方法是由三個階段所組成,前兩個階段運用了支持向量機進行可能類別的預測,而最後一個階段則運用分群演算法決定最終的次目標籤。本研究使用世界智慧財產權組織的WIPO-alpha專利資料集進行實驗,其結果顯示TPC演算法可以在次目層級的自動分類上,達到36.07%的正確率,此一數據若與隨機猜測一個次目標籤的機率相比,約已提升了26,020倍的正確率。此外,我們額外搜集96,654份與WIPO-alpha專利資料集不重複的專利文件,再與WIPO-alpha專利資料集合併進行測試,實驗結果顯示正確率提升至38.01%。zh_TW
dc.description.abstractAn automatic patent categorization system would be invaluable to individual inventors and patent attorneys, saving them time and effort by quickly identifying conflicts with existing patents. In recent years, it has become increasingly common to classify patent documents using the International Patent Classification (IPC), a complex hierarchical classification system comprising 8 sections, 128 classes, 648 subclasses, about 7,200 main groups, and approximately 72,000 subgroups. So far, however, no patent categorization method has been developed that can classify patents down to the subgroup level (the bottom level of the IPC). This dissertation therefore presents a novel categorization method, the three-phase categorization (TPC) algorithm, which classifies patents down to the subgroup level with reasonable accuracy. The method consists of three phases: the first two use SVM classification to predict candidate categories, and the last uses clustering to determine the final subgroup label. Experimental results on the WIPO-alpha collection indicate that the TPC algorithm achieves 36.07% accuracy at the subgroup level, approximately a 26,020-fold improvement over randomly guessing a subgroup label. In addition, we collected 96,654 further patent documents from the Internet, none of which duplicate the WIPO-alpha collection, and combined them with it. Evaluated on the combined collection, the TPC algorithm achieved an accuracy of 38.01% at the subgroup level.en_US
DC.subject專利分類zh_TW
DC.subject向量空間模型zh_TW
DC.subject國際專利分類法zh_TW
DC.subject支持向量機zh_TW
DC.subjectK-質心法分群法zh_TW
DC.subjectk-近鄰法zh_TW
DC.subjectPatent classificationen_US
DC.subjectVector space model (VSM)en_US
DC.subjectIPC taxonomyen_US
DC.subjectSupport vector machines (SVM)en_US
DC.subjectK-meansen_US
DC.subjectK nearest neighbors (KNN)en_US
DC.title大量專利類別自動分類演算法研究zh_TW
dc.language.isozh-TWzh-TW
DC.titleAn automatic classification algorithm for a large number of patent categorizationen_US
DC.type博碩士論文zh_TW
DC.typethesisen_US
DC.publisherNational Central Universityen_US
