AI Access Foundation; La Canada: American Association for Artificial Intelligence
Abstract: CiteSeerX is a digital library search engine that provides access to more than 5 million scholarly documents with nearly a million users and millions of hits per day. We present key AI technologies used in the following components: document classification and deduplication, document and citation clustering, automatic metadata extraction and indexing, and author disambiguation. These AI technologies have been developed by CiteSeerX group members over the past 5–6 years. We show the usage status, payoff, development challenges, main design concepts, and deployment and maintenance requirements. We also present the AI technologies implemented in table search and algorithm search, two special search modes in CiteSeerX. While it is challenging to rebuild a system like CiteSeerX from scratch, many of these AI technologies are transferable to other digital libraries and search engines.
Other title: AI Magazine
Publisher: La Canada: American Association for Artificial Intelligence
Publication date: 2015-09-22
Source: The AI magazine, 2015-09, Vol.36 (3), p.35-48
Resource provider: ABI/INFORM Collection
Copyright: 2015 The Authors. AI Magazine published by John Wiley & Sons Ltd on behalf of Association for the Advancement of Artificial Intelligence
Copyright: COPYRIGHT 2015 American Association for Artificial Intelligence
Copyright: Copyright Association for the Advancement of Artificial Intelligence Fall 2015
Identifier: ISSN: 0738-4602
Identifier: ISSN: 2371-9621
Identifier: EISSN: 2371-9621
Identifier: DOI: 10.1609/aimag.v36i3.2601