Springer Verlag; Switzerland: Springer International Publishing AG
Abstract: Opinion word identification (OWI) is an important task in opinion mining. OWI requires finding the exact positions of opinion word mentions, and supervised learning approaches can locate such mentions with high accuracy. To construct an OWI system for a new domain, enough data must be annotated to represent that domain's characteristics. Because annotating every new domain extensively is costly, making the best use of existing annotated data is an important challenge for mention-based OWI systems. In this work, we propose a cross-domain OWI system. The query by committee (QBC) active learning scheme is used to select a controlled amount of data in the new domain for manual annotation, and the newly annotated data complements the existing annotated data of the original domain. We compile three annotated datasets, one for each of three different domains, and conduct domain adaptation experiments on all six domain pairs. Our experiments show that by adding only 1,000 newly annotated sentences from the new domain to the existing annotated data, our system achieves nearly the same level of accuracy as a system trained on 10,000 annotated new-domain sentences. Our system with the QBC active learning scheme also outperforms the same system with a random selection scheme.
Publisher: Switzerland: Springer International Publishing AG
Publication date: 2014
Source: Technologies and Applications of Artificial Intelligence, 2014, Vol.8916, p.334-343
Resource source: Springer Books
Copyright: Springer International Publishing Switzerland 2014
Identifier: ISSN: 0302-9743
Identifier: ISBN: 331913986X
Identifier: ISBN: 9783319139869
Identifier: EISSN: 1611-3349
Identifier: EISBN: 3319139878
Identifier: EISBN: 9783319139876
Identifier: DOI: 10.1007/978-3-319-13987-6_31
Identifier: OCLC: 906028264
Identifier: LCCallNum: Q334-342TJ210.2-211.