References
[1] Boris Chidlovskii, Jon Ragetli, and Maarten de Rijke. Automatic Wrapper Generation for Web Search Engines. In Proceedings of the 1st International Conference on Web-Age Information Management 2000 (WAIM-2000), pp. 399-410, LNCS Series, Shanghai, China, June 2000.
[2] Chia-Hui Chang and Chun-Nan Hsu. Automatic Extraction of Information Blocks Using PAT Trees. In Proceedings of 1999 National Computer Symposium (NCS-1999), Tamkang University, Tamsui, Taiwan, Dec 1999.
[3] Chia-Hui Chang, Shao-Chen Lui, and Yen-Chin Wu. Applying pattern mining to Web information extraction. In Proceedings of the 5th Pacific Asia Conference on Knowledge Discovery and Data Mining (PAKDD-2001), pp. 4-16, Hong Kong, Apr 2001.
[4] Chia-Hui Chang and Shao-Chen Lui. IEPAD: Information Extraction based on Pattern Discovery. In Proceedings of the 10th International Conference on World Wide Web (WWW10), pp. 595-609, Hong Kong, May 2001.
[5] Chun-Nan Hsu and Ming-Tzung Dung. Generating finite-state transducers for semi-structured data. Journal of Information Systems, Special Issue on Semi-structured Data, Volume 23, pp. 521-537, Aug 1998.
[6] Chun-Nan Hsu and Chien-Chi Chang. Finite-state transducers for semi-structured text mining. In Proceedings of IJCAI-99 Workshop on Text Mining: Foundations, Techniques and Applications, pp. 38-49, Stockholm, Sweden, 1999.
[7] C.T. Kwok, and D.S. Weld, Planning to gather information. In Proceedings of the 13th National Conference on Artificial Intelligence (AAAI-96), pp. 32-39, AAAI Press, Menlo Park, California, 1996.
[8] D. Gusfield, Algorithms on Strings, Trees, and Sequences, Cambridge University Press, 1997.
[9] D. Smith, and M. Lopez, Information extraction for Semi-structured documents. In Proceedings of the Workshop on Management of Semi-Structured Data, Tucson, Arizona, 1997.
[10] D.W. Embley, Y.S. Jiang, and Y.K. Ng, Record-boundary discovery in Web documents. In Proceedings of 1999 ACM SIGMOD International Conference on Management of Data (SIGMOD-99), pp. 467-478, Philadelphia, Pennsylvania, 1999.
[11] D.W. Embley, Y.S. Jiang, and Y.K. Ng, Recognizing Ontology-Applicable Multiple-Record Web Documents. Submitted.
[12] G. Gonnet, R. Baeza-Yates, and T. Snider, New Indices for Text: PAT Trees and PAT Arrays. In W.B. Frakes and R. Baeza-Yates, editors, Information Retrieval: Data Structures and Algorithms, Chapter 5, pp. 66-82, Prentice Hall, Englewood Cliffs, NJ, USA, 1992.
[13] I. Muslea, S. Minton, and C. Knoblock, STALKER: learning extraction rules for semi-structured, Web-based information sources. In Proceedings of AAAI-98 Workshop on AI and Information Integration, Technical Report WS-98-01, AAAI Press, Menlo Park, California, 1998.
[14] I. Muslea, S. Minton, and C. Knoblock, A hierarchical approach to wrapper induction. In Proceedings of the 3rd International Conference on Autonomous Agents (Agents-99), pp. 190-197, Seattle, Washington, 1999.
[15] I. Muslea, Extraction patterns for information extraction tasks: a survey. In Proceedings of AAAI-99 Workshop on Machine Learning for Information Extraction, 1999.
[16] J.R. Gruser, L. Raschid, M.E. Vidal, and L. Bright, Wrapper Generation for Web Accessible Data Sources. In Proceedings of the 3rd IFCIS International Conference on Cooperative Information Systems (CoopIS-98), pp. 14-23, 1998.
[17] Jane Hsu and Wen-Tau Yih. Template-based information mining from html documents. In Proceedings of the 14th National Conference on Artificial Intelligence (AAAI-97), pp. 256-22, AAAI Press, Menlo Park, California, 1997.
[18] Jane Hsu, Wen-Tau Yih, Ching-Hung Leu, and Euna Jeong. Information Extraction from HTML Documents: An Approximate Tree Matching Approach. Submitted to AAAI-99, 1999.
[19] L.F. Chien, PAT-tree-based keyword extraction for Chinese information retrieval. In Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-97), pp. 50-58, 1997.
[20] N. Ashish and C.A. Knoblock, Semi-automatic wrapper generation for Internet information sources. In Proceedings of the International Conference on Cooperative Information Systems (CoopIS-97), pp. 160-169, Charleston, South Carolina, 1997.
[21] N. Kushmerick, D. Weld, and R. Doorenbos, Wrapper induction for information extraction. In Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI-97), pp. 729-737, 1997.
[22] N. Kushmerick, Wrapper induction: Efficiency and expressiveness. In Proceedings of AAAI-98 Workshop on AI and Information Integration, pp. 15-68, AAAI Press, Menlo Park, California, 1998.
[23] R. Sedgewick, Algorithms in C, Addison Wesley, 1990.
[24] R.B. Doorenbos, O. Etzioni, and D.S. Weld, A scalable comparison-shopping agent for the World-Wide Web. In Proceedings of the 1st International Conference on Autonomous Agents (Agents-97), pp. 39-48, ACM Press, New York, NY, 1997.