References
[1] B. Adelberg, “NoDoSE: A Tool for Semi-Automatically Extracting Structured and Semi-Structured Data from Text Documents,” ACM SIGMOD Record, vol. 27, no. 2, pp. 283-294, 1998.
[2] R. Agrawal and R. Srikant, “On Integrating Catalogs,” in Proceedings of the 10th International Conference on World Wide Web, 2001, pp. 603-612.
[3] A. Arasu and H. Garcia-Molina, “Extracting structured data from Web pages,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, San Diego, California, 2003, pp. 337-348.
[4] G. O. Arocena and A. O. Mendelzon, “WebOQL: Restructuring Documents, Databases, and Webs,” in Proceedings of the 14th IEEE International Conference on Data Engineering, Orlando, Florida, 1998, pp. 24-33.
[5] H. Bulskov, R. Knappe, and T. Andreasen, “On Measuring Similarity for Conceptual Querying,” in Proceedings of the 5th International Conference on Flexible Query Answering Systems, vol. 2522, Copenhagen, Denmark, 27-29 October, 2002, pp. 100–111.
[6] M. Califf and R. Mooney, “Relational Learning of Pattern-Match Rules for Information Extraction,” in Proceedings of AAAI Spring Symposium on Applying Machine Learning to Discourse Processing, Stanford, California, March, 1998.
[7] G. A. Carpenter and S. Grossberg, “A massively parallel architecture for a self-organizing neural pattern recognition machine,” Computer Vision, Graphics, and Image Processing, vol. 37, pp. 54-115, 1987.
[8] G. A. Carpenter and S. Grossberg, “ART 2: Self-organization of stable category recognition codes for analog input patterns,” Applied Optics, vol. 26, pp. 4919-4930, 1987.
[9] G. A. Carpenter and S. Grossberg, “The ART of adaptive pattern recognition by a self-organizing neural network,” Computer, vol. 21, no. 3, pp. 77-88, 1988.
[10] G. A. Carpenter and S. Grossberg, “ART 3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures,” Neural Networks, vol. 3, no. 2, pp. 129-152, 1990.
[11] S. Castano and V. D. Antonellis, “A schema analysis and reconciliation tool environment for heterogeneous databases,” in Proceedings of the 1999 International Symposium on Database Engineering & Applications, 1999, pp. 53-62.
[12] C.-H. Chang and S.-C. Lui, “IEPAD: Information Extraction based on Pattern Discovery,” in Proceedings of the Tenth International Conference on World Wide Web, Hong Kong, 2001, pp. 223-231.
[13] C.-H. Chang and S.-C. Kuo, “OLERA: A Semi-Supervised Approach for Web Data Extraction with Visual Support,” IEEE Intelligent Systems, vol. 19, no. 6, pp. 56-64, 2004.
[14] V. Crescenzi and G. Mecca, “Grammars Have Exceptions,” Information Systems, vol. 23, no. 8, pp. 539-565, 1998.
[15] V. Crescenzi, G. Mecca, and P. Merialdo, “RoadRunner: Towards Automatic Data Extraction from Large Web Sites,” in Proceedings of the 27th International Conference on Very Large Data Bases, Rome, Italy, 2001, pp. 109-118.
[16] C.-H. Chang, M. Kayed, M.R. Girgis, and K.F. Shaalan, “A Survey of Web Information Extraction Systems,” IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 10, pp. 1411-1428, 2006.
[17] B. Chidlovskii, “Automatic Repairing of Web Wrappers by Combining Redundant Views,” in Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence, Meylan, France, Nov. 4-6, 2002, pp. 399-406.
[18] A. Doan, P. Domingos, and A. Y. Halevy, “Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach,” in Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, 2001, pp. 509-520.
[19] A. Doan, J. Madhavan, R. Dhamankar, P. Domingos, and A.Y. Halevy, “Learning to match ontologies on the Semantic Web,” The International Journal on Very Large Data Bases, vol. 12, no. 4, pp. 303-319, 2003.
[20] D. W. Embley, Y. Jiang, and Y. K. Ng, “Record-boundary discovery in web documents,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’99), Philadelphia, PA, 1999, pp. 467-478.
[21] D. W. Embley, Y. K. Ng, and L. Xu, “Recognizing Ontology-Applicable Multiple-Record Web Documents,” in Proceedings of the 20th International Conference on Conceptual Modeling, Lecture Notes in Computer Science, vol. 2224, London, UK, 2001, pp. 555-570.
[22] D. Freitag, “Information Extraction from HTML: Application of A General Learning Approach,” in Proceedings of the Fifteenth Conference on Artificial Intelligence, 1998.
[23] J. Hammer, J. McHugh, and H. Garcia-Molina, “Semistructured Data: the TSIMMIS Experience,” in Proceedings of the 1st East-European Symposium on Advances in Databases and Information Systems, St. Petersburg, Russia, 1997, pp. 1-8.
[24] F. Hakimpour and A. Geppert, “Resolving Semantic Heterogeneity in Schema Integration: an Ontology Based Approach,” in Proceedings of the International Conference on Formal Ontology in Information Systems, vol. 2001, 2001, pp. 297-308.
[25] B. He, K.C.-C. Chang, and J. Han, “Discovering Complex Matchings across Web Query Interfaces: A Correlation Mining Approach,” in Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 148-157.
[26] M. A. Hernández, R. J. Miller, and L. M. Haas, “Clio: A Semi-Automatic Tool for Schema Mapping,” in Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, 2001, p. 607.
[27] G. Hirst and D. St-Onge, “Lexical Chains as Representations of Context for the Detection and Correction of Malapropisms,” in C. Fellbaum (ed.), WordNet: An Electronic Lexical Database, MIT Press, 1998, pp. 305–332.
[28] A. Hogue and D. Karger, “Thresher: Automating the Unwrapping of Semantic Content from the World Wide Web,” in Proceedings of the 14th International Conference on World Wide Web, Japan, 2005, pp. 86-95.
[29] C.-N. Hsu and M. Dung, “Generating Finite-State Transducers for Semi-Structured Data Extraction from the Web,” Information Systems, vol. 23, no. 8, pp. 521-538, 1998.
[30] R. Ichise, H. Takeda and S. Honiden, “Integrating Multiple Internet Directories by Instance-based Learning,” in Proceedings of the 18th International Joint Conference on Artificial Intelligence, 2003, pp. 22-28.
[31] J.J. Jiang and D.W. Conrath, “Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy,” in Proceedings of the International Conference on Research in Computational Linguistics, Taiwan, 1998.
[32] Y. Kalfoglou and M. Schorlemmer, “Ontology Mapping: The state of the Art,” The Knowledge Engineering Review, vol. 18, no. 1, pp. 1-31, 2003.
[33] R. Knappe, H. Bulskov, and T. Andreasen, “On Similarity Measures for Content-Based Querying,” in Proceedings of the 10th International Fuzzy Systems Association World Congress, Istanbul, Turkey, June-July, 2003, pp. 400–403.
[34] R. Kosala, H. Blockeel, M. Bruynooghe and J.V. d. Bussche, “Information extraction from structured documents using k-testable tree automaton inference,” Data & Knowledge Engineering, vol. 58, no. 2, pp. 129-158, 2006.
[35] N. Kushmerick, D. Weld, and R. Doorenbos, “Wrapper Induction for Information Extraction,” in Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, 1997, pp. 729-735.
[36] N. Kushmerick, “Wrapper Verification,” World Wide Web, vol. 3, no. 2, pp. 79-94, 2000.
[37] N. Kushmerick, “Regression Testing for Wrapper Maintenance,” in Proceedings of the Sixteenth National Conference on Artificial Intelligence and the Eleventh Innovative Applications of Artificial Intelligence Conference, Orlando, Florida, United States, 1999, pp. 74-79.
[38] A.H.F. Laender, B. Ribeiro-Neto, and A.S.D. Silva, “DEByE - Data Extraction by Example,” Data and Knowledge Engineering, vol. 40, no. 2, pp. 121-154, 2002.
[39] K. Lerman, L. Getoor, S. Minton, and C. A. Knoblock, “Using the Structure of Web Sites for Automatic Segmentation of Tables,” in Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, Paris, France, 2004, pp. 119-130.
[40] K. Lerman, S. Minton, and C. Knoblock, “Wrapper Maintenance: A Machine Learning Approach,” Journal of Artificial Intelligence Research, pp. 149-181, 2003.
[41] Y. Li, Z.A. Bandar, and D. McLean, “An Approach for Measuring Semantic Similarity between Words Using Multiple Information Sources,” IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 4, pp. 871-882, July-August, 2003.
[42] L. Liu, C. Pu, and W. Han, “XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources,” in Proceedings of the 16th IEEE International Conference on Data Engineering, San Diego, California, 2000, pp. 611-621.
[43] D. Lin, “Principle-Based Parsing Without Overgeneration,” in Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, Columbus, Ohio, 1993, pp. 112–120.
[44] B. Liu, R. Grossman, and Y. Zhai, “Mining data records in Web pages,” in Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 601-606, 2003.
[45] B. Liu and Y. Zhai, “NET - A System for Extracting Web Data from Flat and Nested Data Records,” in Proceedings of the Sixth International Conference on Web Information Systems Engineering, pp. 487-495, 2005.
[46] P.W. Lord, R.D. Stevens, A. Brass, and C.A. Goble, “Investigating Semantic Similarity Measures across the Gene Ontology: the Relationship between Sequence and Annotation,” Bioinformatics, vol. 19, no. 10, pp. 1275–1283, 2003.
[47] J. Madhavan, P. A. Bernstein and E. Rahm, “Generic Schema Matching with Cupid,” in Proceedings of the 27th International Conference on Very Large Data Bases, 2001, pp. 49-58.
[48] B. Magnini, L. Serafini, and M. Speranza, “Linguistic based Matching of Local Ontologies,” in Proceedings of AAAI-02 Workshop on Meaning Negotiation, 2002.
[49] S. Melnik, H. Garcia-Molina, and E. Rahm, “Similarity Flooding: A Versatile Graph Matching Algorithm and its Application to Schema Matching,” in Proceedings of the International Conference on Data Engineering, 2002, pp. 117-128.
[50] X. Meng, D. Hu and C. Li, “Schema-Guided Wrapper Maintenance for Web-Data Extraction,” ACM Fifth International Workshop on Web Information and Data Management, New Orleans, Louisiana, USA, November 7-8, 2003.
[51] X. Meng, H. Wang, D. Hu and M. Gu, “SG-WRAM: Schema Guided Wrapper Maintenance: A Demonstration,” Proceedings 19th International Conference on Data Engineering, Bangalore, India, March 5-8, 2003.
[52] G.A. Miller and W.G. Charles, “Contextual Correlates of Semantic Similarity,” Language and Cognitive Processes, pp. 1-28, 1991.
[53] I. Muslea, S. Minton, and C. Knoblock, “A hierarchical approach to wrapper induction,” Proceedings of the Third International Conference on Autonomous Agents, 1999.
[54] N.F. Noy, “Semantic Integration: A Survey of Ontology-based Approaches,” ACM SIGMOD Record, vol. 33, no. 4, December, 2004.
[55] N. Papadakis, D. N. Skoutas, K. Raftopoulos, and T. A. Varvarigou, “STAVIES: A System for Information Extraction from Unknown Web Data Sources through Automatic Web Wrapper Generation Using Clustering Techniques,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 12, December, 2005, pp. 24-30.
[56] N. Papadakis, D. N. Skoutas, K. Raftopoulos, and T. A. Varvarigou, “An Automatic Web Wrapper for Extracting Information from Web Sources, Using Clustering Techniques,” IEEE/IPSJ International Symposium on Applications and the Internet, Trento, Italy, January, 2005, pp. 24-30.
[57] D. Pinto, A. McCallum, X. Wei, and B. C. Croft, “Table Extraction Using Conditional Random Fields,” Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 235-242, 2003.
[58] R. Rada, H. Mili, E. Bicknell, and M. Blettner, “Development and Application of a Metric on Semantic Nets,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 19, no. 1, pp. 17-30, January-February, 1989.
[59] E. Rahm and P. A. Bernstein, “A Survey of Approaches to Automatic Schema Matching,” The International Journal on Very Large Data Bases, vol. 10, no. 4, pp. 334-350, 2001.
[60] J. Raposo, A. Pan, M. Álvarez, and J. Hidalgo, “Automatically Maintaining Wrappers for Web Sources,” in Proceedings of the 9th International Database Engineering & Application Symposium, 2005, pp. 105-114.
[61] J. Raposo, A. Pan, M. Álvarez, and J. Hidalgo, “Automatically Generating Labeled Examples for Web Wrapper Maintenance,” in Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence, 2005, pp. 250-256.
[62] R. Richardson, A. Smeaton, and J. Murphy, “Using WordNet as a Knowledge Base for Measuring Semantic Similarity Between Words,” Working Paper CA-1294, School of Computer Applications, Dublin City University, Dublin, Ireland, 1994.
[63] P. Resnik, “Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language,” Journal of Artificial Intelligence Research, vol. 11, pp. 95–130, 1999.
[64] B. Ribeiro-Neto, A.H.F. Laender, and A.S.D. Silva, “Extracting semi-structured data through examples,” in Proceedings of the Eighth ACM International Conference on Information and Knowledge Management, Kansas City, Missouri, 1999, pp. 94-101.
[65] M.A. Rodriguez and M.J. Egenhofer, “Determining Semantic Similarity Among Entity Classes from Different Ontologies,” IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 2, pp. 442-456, March-April 2003.
[66] H. Rubenstein and J.B. Goodenough, “Contextual Correlates of Synonymy,” Communications of the ACM, vol. 8, pp. 627-633, 1965.
[67] A. Sahuguet and F. Azavant, “Building intelligent Web applications using lightweight wrappers,” Data and Knowledge Engineering, vol. 36, no. 3, pp. 283-316, 2001.
[68] S. Sarawagi, S. Chakrabarti, and S. Godbole, “Cross-Training: Learning Probabilistic Mappings between Topics,” in Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 177-186, 2003.
[69] S. Soderland, “Learning Information Extraction Rules for Semi-Structured and Free Text,” Machine Learning, vol. 34, no. 1-3, pp. 233-272, 1999.
[70] M. C. Su, C.-K. Huang, J. Lee, and S.-P. Ma, “Webpage Information Extractor with On-Line Learning,” in Proceedings of the First Taiwan Conference on Software Engineering, Taipei, Taiwan, June 3-4, 2005, pp. 202-206.
[71] M. C. Su, J. Lee, and S. J. Wang, “Method for Wrapper Maintenance,” in Proceedings of the Second Taiwan Conference on Software Engineering, Taipei, Taiwan, June 9-10, 2006, pp. 293-298.
[72] A. Tversky, “Features of Similarity,” Psychological Review, vol. 84, no. 4, pp. 327–352, 1977.
[73] H. Wache, T. Vögele, U. Visser, H. Stuckenschmidt, G. Schuster, H. Neumann and S. Hübner, “Ontology-Based Integration of Information - A Survey of Existing Approaches,” in Proceedings of IJCAI-01 Workshop: Ontologies and Information Sharing, 2001.
[74] J. Wang and F. H. Lochovsky, “Wrapper Induction based on Nested Pattern Discovery,” Technical Report HKUST-CS-27-02, Department of Computer Science, Hong Kong University of Science & Technology, 2002.
[75] J. Wang and F. H. Lochovsky, “Data Extraction and Label Assignment for Web Databases,” in Proceedings of the Twelfth International Conference on World Wide Web, Budapest, Hungary, 2003, pp. 187-196.
[76] W. Wu, C. Yu, A. Doan, and W. Meng, “An Interactive Clustering-based Approach to Integrating Source Query Interfaces on the Deep Web,” in Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, 2004, pp. 95-106.
[77] Z. Wu and M. Palmer, “Verb Semantics and Lexical Selection,” in Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, 1994, pp. 133-138.
[78] L. Yi, B. Liu, and X. Li, “Eliminating Noisy Information in Web Pages for Data Mining,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Washington, D.C., USA, August 24 - 27, 2003.
[79] Y. Zhai and B. Liu, “Web Data Extraction Based on Partial Tree Alignment,” in Proceedings of the 14th International Conference on World Wide Web, Japan, 2005, pp. 76-85.
[80] D. Zhang and W. S. Lee, “Web Taxonomy Integration through Co-Bootstrapping,” in Proceedings of the 27th Annual International Conference on Research and Development in Information Retrieval, 2004, pp. 410-417.
[81] H. Zhao, W. Meng, Z. Wu, V. Raghavan, and C. Yu, “Fully Automatic Wrapper Generation For Search Engines,” in Proceedings of the 14th International Conference on World Wide Web, Japan, 2005, pp. 66-75.
[82] A Repository of Online Information Sources Used in Information Extraction Tasks, http://www.isi.edu/info-agents/RISE/index.html
[83] IEEE, Available: http://ieeexplore.ieee.org/
[84] Yahoo, Available: http://www.yahoo.com.tw
[85] Google, Available: http://www.google.com.tw
[86] Springerlink, Available: http://www.springerlink.com/
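[87] 呂紹誠, “Design and Implementation of a System for Extracting Semi-Structured Data from the Internet,” Master's thesis, Department of Computer Science and Information Engineering, National Central University, 2001.
[88] 郭釋謙, “Online Analysis of Extraction Rules,” Master's thesis, Department of Computer Science and Information Engineering, National Central University, 2003.
[89] 黃陳科, “A Novel Extraction Program with Learning Capability,” Master's thesis, Department of Computer Science and Information Engineering, National Central University, 2005.
[90] 黃執強, “A Study on Automating the Integration of Data from Homogeneous Web Pages,” Master's thesis, Department of Computer Science and Information Engineering, National Central University, 2005.
[91] 張斐章, 張麗秋, and 黃浩倫, Artificial Neural Networks: Theory and Practice, 東華書局, 2003.
[92] 蘇威霖, “A Study on Applying Neural Networks to Table and Field Mapping across Multiple Databases,” Master's thesis, Department of Information Management, Chaoyang University of Technology, 2002.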