dc.description.abstract |
Taiwan is frequently affected by natural disasters such as typhoons and earthquakes. When such a disaster reaches alert level, disaster relief units must grasp the situation quickly. Reducing losses effectively also depends on active reporting by people in the affected areas. During a disaster, this information is generally reported to the relief units by telephone. Notably, disaster reports arrive in bursts, and relief units with limited manpower struggle to handle the large volume of reports. This becomes the bottleneck in grasping disaster information.
With the development of the Internet and the growing penetration of 3C products, exchanging information online has become more convenient, and disaster information is also exchanged this way when a disaster occurs. This gives us another way to obtain disaster information: collecting it from social networks. The task involves information extraction technology, which extracts specific messages from unstructured text and stores them in a database. In this paper, we build a PTT disaster event extraction system that uses PTTWeb as its information source, crawls it regularly with a web crawler, and uses named entity recognition to identify disaster information such as "disaster name", "disaster location" and "damage description" in order to establish disaster reports.
This paper is divided into three parts. The first part is pre-processing, in which a web crawler fetches PTT posts. The second part is article classification, in which an SVM classification model is built to select disaster-related posts. The third part is named entity recognition. The training tool was proposed by the NCU WIDM lab, and conditional random fields are used as the training algorithm. We built three models, for disaster name, disaster location and damage description. In our experiments, these models achieve an F-measure above 0.7 on the exact match test and above 0.75 on the partial match test. | en_US |