Active Learning for Incremental POI Extraction and Pairing

DC field	Value	Language
DC.contributor	資訊工程學系	zh_TW
DC.creator	張弘暐	zh_TW
DC.creator	Hung-Wei Chang	en_US
dc.date.accessioned	2016-08-29T07:39:07Z
dc.date.available	2016-08-29T07:39:07Z
dc.date.issued	2016
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=103522034
dc.contributor.department	資訊工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	隨著網際網路與智慧型行動裝置的快速發展，電子地圖已經成為了我們生活中不可或缺的好幫手。若希望電子地圖能提供高品質的區域搜尋服務，則必須讓使用者能夠精確地搜尋到其所在區域內使用者感興趣的地點(Point of Interest, POI)，包含各類食衣住行育樂等不同類別的商店位置。現今公認最強大的電子地圖莫過於Google Maps，使用者習慣在Google Maps上搜尋POI，但並不是所有使用者想要的POI都能在Google Maps上找到。為此我們勢必得拓展POI的來源，並且建構一個豐富的POI資料庫，以提供使用者查詢。近年來由於社交網站的崛起，使用者常常因著社交網站能夠快速散播資訊的特性，所以在這類網路媒體上分享一些美食資訊、旅遊經驗等等諸如此類的資料。同時商家也會在上面成立官方粉絲團或者官方網頁，詳加介紹店家的產品，以快速增加產品曝光率。這些使用者及店家在網際網路上所提供的資訊，對於探勘新的POI都是很好的來源。在本篇論文中，我們提出一個基於Web資訊的系統，此系統可以大略分為以下三部分。第一部分為地址相關Google snippet的爬取，其爬取的原因為Google snippet當中可能包含豐富的POI相關資訊。第二部分為POI擷取模型，透過Conditional Random Field (CRF) 以及 Conditional Random Field Sharp (CRF Sharp)作為學習演算法，產生的中文地址名稱辨識模型以及中文組織名稱辨識模型，其目的是為找出所有在snippet當中出現過的地址以及組織名稱。第三部分為地址與組織名稱的配對模型，使用LibSVM作為學習演算法,以訓練模型，為地址與組織名稱進行配對。	zh_TW
dc.description.abstract	The rapid development of the Internet and smart mobile devices has made electronic maps an indispensable aid in daily life. For an electronic map to provide a high-quality location-based service, it must help users accurately find nearby points of interest (POIs), including stores for food, clothing, housing, and transportation. Google Maps is widely regarded as the most powerful electronic map today, and many users are used to searching for POIs with it. However, not every POI a user wants can be found on Google Maps. We therefore need to expand the sources of POIs and build a rich POI database for user queries. With the rise of social networking in recent years, users often share food information and travel experiences on these media. At the same time, businesses set up official pages to increase the visibility of their products. In this paper, we propose a web-based system that can be roughly divided into three parts. The first part crawls address-associated snippets. The second part is the POI extraction model, which uses Conditional Random Field (CRF) and Conditional Random Field Sharp (CRF Sharp) as the learning algorithms; its purpose is to find all the addresses and POI names in the snippets. The third part is the POI pair verification model, which is trained with LibSVM to pair addresses with POI names.	en_US
DC.subject	資料探勘	zh_TW
DC.subject	機器學習	zh_TW
DC.subject	Data Mining	en_US
DC.subject	Machine Learning	en_US
DC.title	Active Learning for Incremental POI Extraction and Pairing	en_US
dc.language.iso	en_US	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 103522034: full metadata record