Research on Improving the Efficiency and Quality of Business Information Retrieval

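Name

鄭詩鋒 (Shih-fong Cheng)

Department

Executive Master's Program, Department of Information Management

Thesis Title

Research on Improving the Efficiency and Quality of Business Information Retrieval
(The Improvement of Business Information Retrieving)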

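Related Theses

★ A Decision Support System for Creating SMS Rules to Prevent Credit Card Fraud
★ A Comparison of the Effectiveness of Different Retrieval Strategies
★ An Investigation of Factors Influencing the Knowledge-Sharing Process
★ Construction and Evaluation of a Retrieval Agent System with Sharing Functions
★ A Study of Computer Attitudes and Learning Self-Efficacy among Juvenile Offenders
★ A Study of Software Metrics Issues Using the AHP Method
★ Optimizing the Intrusion Rule Base
★ Using the Analytic Hierarchy Process to Assess Key Factors in Banks' Adoption of Enterprise Application Integration (EAI) Systems
★ A Study on Applying Genetic Algorithms to Find Near-Optimal Layouts of Forced-Convection Devices in Cluster Computer Rooms
★ The Development of a CASE Tool with Knowledge Management Functions
★ A Fast Search Index Tree Based on the PAT Tree
★ A Compound-Noun-Based Approach to Building Document Concepts
★ Using User Interest Profiles to Examine the Importance of Adjective Position in Review Classification
★ Helping Users Filter Web Document Search Results through Semi-Structured Information and User Feedback
★ A Study on Classifying User Reviews with a Vector Space Model Built from Feature-Opinion Pairs
★ Applying Term Characteristics of Semi-Structured Documents from User Feedback to Document Retrieval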

Files

Full text in the repository system: never open to the public

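Abstract (Chinese)

Acquiring information and managing and applying knowledge drive an enterprise's progress and evolution, and an enterprise's appetite for "business intelligence" from inside and outside the organization is keen and never-ending. Channels for obtaining information internally are by now fairly stable; most commonly they combine information technology with information systems such as ERP, SCM, CRM, and data mining applications. Channels for obtaining information externally are less stable. Any interaction between entities, whether between people, between enterprises, or between countries, generates information that is external relative to each entity. External information is highly complex, shows little regularity, and is hard to obtain, so its source, timeliness, quality, storage, management, and application often affect its utility value. Different entities also assign different value to the same information.
Staff responsible for operational planning inside an enterprise often spend considerable working time inquiring about and collecting "business intelligence" in order to carry out planning activities. Information gathered manually is prone to omissions, delays, and poor usefulness. Individually collected information is also hard to consolidate, store, and promptly distribute to those inside the enterprise who need it, so the opportunity to apply timely information to operational strategy planning is lost.
Workers with stronger information skills target the Internet, where information flows in all directions, as their source of "business intelligence." With the rapid growth of the Internet, the volume of information circulating online keeps increasing, and the Internet has become a vast platform for exchanging messages. A large volume of data does not guarantee high-quality information, however; obtaining information of real utility value from this platform requires the ability to search exhaustively. Finding suitable information on such a vast platform is not easy. The most basic approach is to rely on search engines, but keyword-based search engines tend to return all "term-related" information mixed with a great deal of irrelevant noise, so even diligent searching and browsing yields information of limited usefulness. Such search engines are simple and convenient to use, but with poor search technique they produce huge and unpredictable result sets.
To raise the utility value of an enterprise's external information, this research changes the traditional manual mode of obtaining external information. It applies Information Retrieval, Information Extraction, and Information Mining techniques, classifies searches and parses the message categories that strings represent, and provides automatic fast searching and intelligent filtering of specific information. The system proactively gives those inside the enterprise who need information simple, fast access to timely, accurate, and useful intelligence, helping the enterprise plan its operating strategy and seize market opportunities.
The technologies applied in developing the "Business Information Retrieval System" planned in this research do not yet have a complete, clear application framework to follow within information system development models. In particular, because the system depends on data imported from outside, the sources of external data and the reliability of the information require extra attention: noise must be filtered carefully before classification and analysis, and the results the system produces need an evaluable model for monitoring. The basic requirements for successfully developing this type of information system are: (1) plan the system from the big picture, avoiding the architectural blind spot of missing the forest for the trees; (2) design the system starting from the details, reducing complexity to simplicity; (3) set clear development goals, leaving no vague or unresolved concepts; (4) develop the system step by step, verifying the correctness of each development stage; and (5) ensure that the overall success goals balance three development orientations: business application strategy, knowledge management, and technology application.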
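
The filtering idea described in the abstract (keyword matching followed by removal of noise and repeated items) can be illustrated with a minimal Python sketch. It is not the thesis system: the `Page` type, the `filter_pages` function, the sample terms, and the example URLs are all hypothetical, and plain substring matching stands in for whatever analysis the exclusion analysis and duplicate removal modules listed in the table of contents actually perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """A fetched web page (hypothetical structure for illustration)."""
    url: str
    title: str
    text: str


def normalize(text):
    """Collapse whitespace and lowercase so comparisons ignore formatting."""
    return " ".join(text.split()).lower()


def filter_pages(pages, include_terms, exclude_terms):
    """Keep pages that mention an include term, mention no exclude term,
    and do not repeat the body text of a page that was already kept."""
    include = [normalize(t) for t in include_terms]
    exclude = [normalize(t) for t in exclude_terms]
    seen = set()
    kept = []
    for page in pages:
        body = normalize(page.title + " " + page.text)
        if not any(term in body for term in include):
            continue  # no topic of interest appears in the page
        if any(term in body for term in exclude):
            continue  # a keyword matched, but in an unwanted context
        fingerprint = normalize(page.text)
        if fingerprint in seen:
            continue  # same article republished under another URL
        seen.add(fingerprint)
        kept.append(page)
    return kept


# Assumed sample data: two copies of one news item, one off-topic
# keyword match, and one page with no keyword at all.
sample = [
    Page("https://example.com/a", "Crude oil prices rise", "OPEC cuts output."),
    Page("https://example.com/b", "Crude oil prices rise", "OPEC  cuts output."),
    Page("https://example.com/c", "Olive oil recipes", "Cooking with oil."),
    Page("https://example.com/d", "Bank holiday", "Markets closed."),
]

kept = filter_pages(sample, include_terms=["oil"], exclude_terms=["recipes"])
print([p.url for p in kept])
```

With the sample data, only the first page survives: the second is a whitespace variant of the first, the third matches an excluded term, and the fourth contains no keyword. A production system would need word segmentation for Chinese text and a more tolerant duplicate test than exact text equality.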

Abstract (English)

Nowadays, enterprises are eager to obtain external information from the Internet, but it is not easy to find the needed information efficiently in such a huge body of data. In the past, information was stored in databases in a more or less well-structured form. Now, much of the information on the Internet is presented in unstructured or semi-structured formats. Retrieving and managing such large volumes of textual information from the Internet has become a challenging issue for enterprises and individuals.
Information extraction is the process of extracting relevant data from semi-structured or unstructured documents and transforming them into structured representations. Many information extraction techniques have been proposed. However, they are ineffective for retrieving business information from the Internet.
In this research, we propose a new information extraction system, the Business Information Retrieval System, to enhance existing information extraction techniques. It automatically and accurately extracts business information from the Internet, helping enterprises obtain such information easily and efficiently.
According to empirical evaluations of the Business Information Retrieval System, its performance in automatic business information retrieval is acceptable. The proposed system demonstrated its capability in terms of retrieval accuracy.
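
The abstract does not say how retrieval accuracy was measured. As an assumption for illustration only, the sketch below computes precision and recall, the two standard set-based measures in information retrieval evaluation; the function name and document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    Either value is reported as 0.0 when its denominator is empty.
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Assumed example: the system returned four documents, and a human
# judge marked three documents in the collection as relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d3", "d5"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this example two of the four retrieved documents are relevant (precision 0.50), and two of the three relevant documents were found (recall about 0.67). Filtering out noise tends to raise precision, while over-aggressive filtering lowers recall, so both are usually reported together.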

Keywords (Chinese)

★ Information Mining
★ Knowledge Management
★ Information Extraction
★ Information Retrieval
★ Search Engine

Keywords (English)

★ Information Retrieval
★ Search Engineering
★ Infor

Table of Contents

Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
Section 1 Research Background and Motivation
1. The Information Effect of the Internet
2. Enterprise Demand for Business Intelligence
Section 2 Research Objectives
Section 3 Thesis Structure
Chapter 2 Related Work
Section 1 Internet Information-Gathering Techniques
1. Information Retrieval Techniques
2. Information Extraction Techniques
Section 2 Evaluation Techniques for Information Extraction Systems
Chapter 3 System Design
Section 1 Conceptual System Architecture
Section 2 Functional System Architecture
1. Automated Search
2. Data Filtering and Format Conversion
3. Data Mining and Analysis
4. System Presentation
5. Automatic Scheduling and Control
Chapter 4 System Implementation and Evaluation
Section 1 System Implementation Architecture
Section 2 System Operation Modules
1. Web Page Search Module
2. Format Conversion Module
3. Simplified/Traditional Chinese Conversion Module
4. Noise Removal Module
5. Web Page Capture Module
6. Web Page Loading Module
7. Title Analysis Module
8. Exclusion Analysis Module
9. Ambiguity Analysis Module
10. Mining Analysis Module
11. Duplicate Removal Module
12. Automatic Learning Module
13. Information Presentation Module
14. Automatic Scheduling Module
15. Data Backup Module
Section 3 System Development Environment
1. Software Development Environment
2. Hardware Development Environment
Section 4 System Application Examples
1. Type 1: "Time-Oriented Application" Example
2. Type 2: "Event-Oriented Application" Example
Section 5 System Evaluation
1. Accuracy-Based System Performance Evaluation
2. Functionality-Based System Performance Evaluation
Chapter 5 Conclusions and Future Research Directions
Section 1 Conclusions
Section 2 Research Contributions
Section 3 Future Research Directions
References

Advisor

周世傑 (Shih-Chieh Chou)

Approval Date

2005-7-12
