Taiwan is frequently affected by natural disasters such as typhoons and earthquakes. When such a disaster reaches the alert level, disaster relief units must grasp the situation quickly. Reducing losses effectively also depends on active reporting by people in the disaster area. During a disaster, this information is generally reported to the relief units by telephone. Notably, reports arrive in bursts, and relief units that are short on manpower struggle to handle the volume. This becomes the bottleneck in gathering disaster information.
With the development of the Internet, the penetration of 3C (computer, communication, and consumer electronics) products has gradually increased, making it more convenient to exchange information online. When a disaster occurs, disaster information may also be exchanged this way. This gives us another way to obtain disaster information: social networks. The task involves information extraction, that is, extracting specific messages from unstructured text and storing them in a database. In this paper, we build a PTT disaster event extraction system that uses PTTWeb as its information source, crawls it regularly with a web crawler, and applies named entity recognition to identify disaster information such as "disaster name", "disaster location", and "damage description" in order to compile disaster reports.
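To make the extraction step concrete, the following is a minimal sketch of how labeled tokens could be assembled into a disaster report record. The BIO tagging scheme, the label names (NAME, LOC, DMG), the DisasterReport class, and the sample sentence are all assumptions made for illustration; they are not taken from the system described here.

```python
from dataclasses import dataclass, field


@dataclass
class DisasterReport:
    """Hypothetical record holding the three kinds of extracted entities."""
    disaster_names: list = field(default_factory=list)
    locations: list = field(default_factory=list)
    damage_descriptions: list = field(default_factory=list)


# Assumed label names; the actual label set of the system is not specified.
FIELD_BY_LABEL = {
    "NAME": "disaster_names",
    "LOC": "locations",
    "DMG": "damage_descriptions",
}


def build_report(tokens, tags, sep=" "):
    """Group BIO-tagged tokens into entities and file them by label."""
    report = DisasterReport()
    current_label, current_tokens = None, []

    def flush():
        # Store the entity collected so far, if any, then reset the buffer.
        nonlocal current_label, current_tokens
        if current_label is not None:
            target = getattr(report, FIELD_BY_LABEL[current_label])
            target.append(sep.join(current_tokens))
        current_label, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tags and stray "I-" tags end the current entity.
            flush()
    flush()
    return report


# Made-up example sentence, tagged by hand.
tokens = ["typhoon", "hit", "Taipei", "roads", "flooded"]
tags = ["B-NAME", "O", "B-LOC", "B-DMG", "I-DMG"]
report = build_report(tokens, tags)
```

For Chinese text tokenized by character, passing sep="" would join the characters of an entity without spaces.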
This paper is divided into three parts. The first part is pre-processing, in which a web crawler fetches PTT posts. The second part is article classification, in which a classification model built with SVM picks out disaster-related posts. The third part is named entity recognition. The training tool was proposed by the NCU WIDM lab and uses conditional random fields as the training algorithm. We built three models, one each for disaster name, disaster location, and damage description. In experiments, these models achieve an F-measure above 0.7 under exact-match evaluation and above 0.75 under partial-match evaluation.
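The difference between the two evaluation settings can be sketched as follows. This is a minimal illustration, assuming that entities are represented as (start, end) character spans with an exclusive end, that an exact match requires identical spans, and that a partial match requires any overlap; the precise matching criteria used in the experiments are not stated here, and the spans below are made up.

```python
def overlaps(a, b):
    """True if two (start, end) spans share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def f_measure(gold, predicted, partial=False):
    """F-measure over entity spans under exact or partial matching."""
    if partial:
        # A span counts as matched if it overlaps any span on the other side.
        matched_pred = sum(any(overlaps(p, g) for g in gold) for p in predicted)
        matched_gold = sum(any(overlaps(g, p) for p in predicted) for g in gold)
    else:
        # A span counts as matched only if the boundaries are identical.
        gold_set, pred_set = set(gold), set(predicted)
        matched_pred = sum(p in gold_set for p in predicted)
        matched_gold = sum(g in pred_set for g in gold)

    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up spans: one exact hit, one boundary error, one spurious prediction.
gold = [(0, 2), (5, 8), (10, 12)]
predicted = [(0, 2), (6, 8), (20, 22)]

exact_f = f_measure(gold, predicted)
partial_f = f_measure(gold, predicted, partial=True)
```

In this example the prediction (6, 8) misses the gold span (5, 8) by one character, so it counts only under partial matching. Partial-match scores are therefore never lower than exact-match scores for the same predictions.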