References
[1] A. Arasu and H. Garcia-Molina. Extracting Structured Data from Web Pages. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, Pages 337–348, 2003.
[2] J. Caverlee, D. Buttler, and L. Liu. Discovering Objects in Dynamically-Generated Web Pages. Technical report, Georgia Institute of Technology, 2003.
[3] C. H. Chang and S. C. Lui. IEPAD: Information Extraction Based on Pattern Discovery. In Proceedings of the 10th International Conference on World Wide Web, Pages 681–688, 2001.
[4] V. Crescenzi, G. Mecca, and P. Merialdo. ROADRUNNER: Towards Automatic Data Extraction from Large Web Sites. In Proceedings of the 2001 International Conference on Very Large Data Bases (VLDB), Pages 109–118, 2001.
[5] H. Davulcu, S. Koduri, and S. Nagarajan. DataRover: A Taxonomy Based Crawler for Automated Data Extraction from Data-Intensive Websites. In Proceedings of the 5th ACM International Workshop on Web Information and Data Management (WIDM’03), Pages 9–14, 2003.
[6] C. N. Hsu and C. C. Chang. Finite-State Transducers for Semi-Structured Text Mining. In Proceedings of the IJCAI-99 Workshop on Text Mining: Foundations, Techniques and Applications, Pages 38–49, 1999.
[7] C. N. Hsu and M. T. Dung. Generating Finite-State Transducers for Semi-Structured Data Extraction from the Web. Information Systems, 23(8):521–538, 1998.
[8] N. Kushmerick, D. S. Weld, and R. B. Doorenbos. Wrapper Induction for Information Extraction. In International Joint Conference on Artificial Intelligence (IJCAI), Pages 729–737, 1997.
[9] A. Laender, B. Ribeiro-Neto, A. da Silva, and J. Teixeira. A Brief Survey of Web Data Extraction Tools. ACM SIGMOD Record, Volume 31, Issue 2, Pages 84–93, 2002.
[10] K. Lerman, L. Getoor, S. Minton, and C. Knoblock. Using the Structure of Web Sites for Automatic Segmentation of Tables. In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data (SIGMOD’04), Pages 119–130, 2004.
[11] B. Liu, R. Grossman, and Y. Zhai. Mining Data Records in Web Pages. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’03), Pages 24–27, 2003.
[12] Z. Liu, F. Li, and W. K. Ng. Wiccap Data Model: Mapping Physical Websites to Logical Views. In Proceedings of the 21st International Conference on Conceptual Modeling, Pages 120–134, 2002.
[13] I. Muslea, S. Minton, and C. A. Knoblock. STALKER: Learning Extraction Rules for Semistructured Web-based Information Sources. In Proceedings of the AAAI Workshop on AI and Information Integration, Pages 74–81, 1998.
[14] S. Pandya. Improving Search Engines for a Changing Web. M.Tech. dissertation, Department of Computer Science and Engineering, Indian Institute of Technology, Powai, Mumbai.
[15] S. Sarawagi. Automation in Information Extraction and Data Integration (Tutorial). In Proceedings of the 2002 International Conference on Very Large Data Bases (VLDB), 2002.
[16] H. Song, S. Giri, and F. Ma. Data Extraction and Annotation for Dynamic Web Pages. In Proceedings of the 2004 IEEE International Conference on e-Technology, e-Commerce, and e-Service (EEE’04), Pages 499–502, 2004.
[17] J. Wang and F. H. Lochovsky. Data Extraction and Label Assignment for Web Databases. In Proceedings of the Twelfth International Conference on World Wide Web, Pages 187–196, 2003.
[18] G. Yang, I. V. Ramakrishnan, and M. Kifer. On the Complexity of Schema Inference from Web Pages in the Presence of Nullable Data Attributes. In Proceedings of the Twelfth International Conference on Information and Knowledge Management, Pages 224–231, 2003.
[19] Y. Zhai and B. Liu. Web Data Extraction Based on Partial Tree Alignment. In Proceedings of the 14th International Conference on World Wide Web, Pages 76–85, 2005.
[20] Web Service Site: Amazon.com. http://www.Amazon.com
[21] Document Object Model, DOM. http://www.w3.org/DOM/
[22] EXALG: Experimental results. http://www-db.stanford.edu/~arvind/extract/