Thesis/Dissertation 86345004: Detailed Record




Author 溫敏淦 (Ming-Gang Wen)   Department Computer Science and Information Engineering
Thesis title Printed Chinese character classification based on pseudo-skeletons
(以虛擬骨架為基礎之中文字分類方法)
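  1. Access to this electronic thesis is granted for immediate open release.
  2. The released electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for academic research purposes.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.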

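Abstract (Chinese) Chinese optical character recognition (OCR) technology has advanced considerably over the past two decades thanks to the efforts of many researchers, and has matured to the point of commercialization. This development demonstrates two facts: there is demand for Chinese character recognition, and it is technically feasible. Character recognition is widely applied in many fields, from early bank check processing and automatic mail sorting by postal code, to document digitization in office automation, and more recently to information retrieval in knowledge management and digital libraries, all of which rely increasingly on character recognition technology.
This dissertation has two main objectives: first, to classify multi-font Chinese characters using a single-font template database; second, to recognize rotated characters. To meet new character recognition requirements, the research adopts two constraints: lower use of system resources and faster recognition.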
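We propose a new method for generating skeleton-like shapes (pseudo-skeletons) and extract character features from these pseudo-skeletons as the basis for recognition. The dissertation discusses three different types of pseudo-skeletons and their corresponding pseudo-contours, which greatly reduce preprocessing time. Based on the pseudo-skeleton, a two-dimensional character image is represented by a one-dimensional code-string feature: in the feature extraction stage, pseudo-stroke codes are generated from the projection gradient map of the pseudo-skeleton and serve as the basis for classification. In the classification stage, we classify the feature code strings using an edit-distance algorithm with variant weights and a fuzzy edit-distance algorithm. Using 5,401 commonly used Chinese characters as test samples, we take the most widely used Ming typeface (細明體) as the template set and verify the method on standard Kai typeface (標楷體) and Ming typeface characters of different sizes. The results show that character classification performance improves substantially in both time and memory requirements.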
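As a rough illustration of classifying 1-D code strings with an edit distance whose costs vary by symbol, the sketch below implements the standard dynamic-programming edit distance with a pluggable substitution cost. The stroke codes `H`, `V`, and `S`, the function names, and all cost values are hypothetical and are not taken from the dissertation; the fuzzy variant is not shown.

```python
from typing import Callable, Sequence


def weighted_edit_distance(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Callable[[str, str], float],
    indel_cost: float = 1.0,
) -> float:
    """Edit distance between code strings a and b with a variable substitution cost."""
    m, n = len(a), len(b)
    # prev[j] holds the distance between the first i-1 symbols of a and the first j of b.
    prev = [j * indel_cost for j in range(n + 1)]
    for i in range(1, m + 1):
        curr = [i * indel_cost] + [0.0] * n
        for j in range(1, n + 1):
            curr[j] = min(
                prev[j] + indel_cost,                       # delete a[i-1]
                curr[j - 1] + indel_cost,                   # insert b[j-1]
                prev[j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitute or match
            )
        prev = curr
    return prev[n]


def stroke_sub_cost(x: str, y: str) -> float:
    """Hypothetical costs: a slant 'S' is treated as closer to 'H' or 'V' than they are to each other."""
    if x == y:
        return 0.0
    if "S" in (x, y):
        return 0.5
    return 1.0


# Codes: H = horizontal, V = vertical, S = slant (assumed alphabet for illustration).
distance = weighted_edit_distance("HVH", "HSH", stroke_sub_cost)
```

A template whose code string has the smallest distance to the input would rank first in the candidate list; how the dissertation actually sets its weights is not stated in this record.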
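In addition, for the recognition of rotated characters, the dissertation takes the pseudo-contour of a character as the basis and applies the fuzzy C-means algorithm and our proposed circular fuzzy C-means algorithm to extract feature vectors of rotated characters. The similarity between characters is then estimated with the circular Hamming distance and used for recognition. In the experiments, the characters on the fourteen Chinese chess (xiangqi) pieces were used as test samples at four different rotation angles to verify the correctness and feasibility of the proposed method.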
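One common reading of a circular Hamming distance is the minimum Hamming distance over all cyclic shifts of one feature sequence, so that a rotation of the character, which cyclically shifts a ring-ordered feature, does not change the score. The sketch below follows that reading; it is an assumption, and the dissertation's exact definition and feature encoding are not given in this record.

```python
from typing import Hashable, Sequence


def hamming(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def circular_hamming(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Minimum Hamming distance between b and every cyclic shift of a."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    items = list(a)
    target = list(b)
    if not items:
        return 0
    return min(
        hamming(items[k:] + items[:k], target) for k in range(len(items))
    )


# A feature sequence and a cyclically shifted copy of it compare as identical.
reference = [1, 2, 3, 4]
rotated = [3, 4, 1, 2]
score = circular_hamming(reference, rotated)
```

This brute-force version costs O(n²) comparisons per pair, which is acceptable for short feature vectors.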
Abstract (English) This dissertation presents a novel method for classifying machine-printed Chinese characters by matching code-string features generated from pseudo-skeletons. In our approach, the proposed pseudo-skeletons of Chinese characters are extracted in place of skeletons generated by traditional thinning algorithms. The pseudo-skeleton features of the input character and of the template character are each encoded as a code string. An edit-distance-based matching algorithm then computes the similarity of the two characters from their encoded strings.
Our work comprises three main modules: preprocessing, feature extraction, and fuzzy matching. First, the preprocessing module generates the p-skeletons of an input character and its pixel projection histograms. Three kinds of virtual strokes (called v-strokes) are defined using fuzzy membership functions. In the feature extraction module, these features are encoded and represented by three kinds of fuzzy variables. With the encoded strings, the OCR classification problem is transformed from a 2-D image matching problem into a 1-D string matching problem. At the training stage, the extracted features are stored in the reference database; at the classification stage, the fuzzy edit-distance matching algorithm measures the similarity between an unknown pattern and the patterns in the reference database. Finally, a candidate list is generated as the classification result. Experiments were conducted on 5,401 commonly used Chinese characters in various fonts and sizes, and the results demonstrate the validity and efficiency of the proposed method. The main contribution of this dissertation is the effective classification of multi-font Chinese characters using a single-font reference database.
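The pixel projection histograms mentioned for the preprocessing module can be sketched as row and column sums of a binary character image. This is only the generic definition of a projection histogram; the function name and the toy 5x5 image are illustrative assumptions, and how the dissertation derives p-skeletons from the histograms is not described in this record.

```python
from typing import List, Sequence, Tuple


def projection_histograms(
    image: Sequence[Sequence[int]],
) -> Tuple[List[int], List[int]]:
    """Return (row_sums, column_sums) of a binary image given as rows of 0/1 pixels."""
    if image and any(len(row) != len(image[0]) for row in image):
        raise ValueError("all rows must have the same width")
    row_sums = [sum(row) for row in image]          # horizontal projection
    col_sums = [sum(col) for col in zip(*image)]    # vertical projection
    return row_sums, col_sums


# Toy cross-shaped glyph (roughly the character 十), 1 = ink pixel.
glyph = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
rows, cols = projection_histograms(glyph)
```

Peaks in the row histogram suggest horizontal strokes and peaks in the column histogram suggest vertical strokes, which is why such profiles are a cheap starting point for stroke-like features.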
In addition, this dissertation proposes a new method for classifying rotated characters. As in p-skeleton generation, the pseudo-contour of a character is generated first. Using the pseudo-contour instead of the original image greatly reduces processing time. A new clustering method, the circular fuzzy C-means algorithm, is devised to obtain rotation-invariant features. At the classification stage, the Hamming distance is applied to measure the similarity of characters. Experiments show that the fuzzy ring feature is effective for rotated character classification.
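Keywords (Chinese) ★ 模糊環特徵 (fuzzy ring features)
★ 中文字分類 (Chinese character classification)
★ 虛擬輪廓 (pseudo-contour)
★ 虛擬骨架 (pseudo-skeleton)
★ 模糊編輯距離 (fuzzy edit-distance)
Keywords (English) ★ fuzzy edit-distance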
★ fuzzy ring features
★ Chinese character classification
★ pseudo-contour
★ pseudo-skeleton
Table of contents Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Survey of Related Works
1.3 System architecture of the Dissertation
1.4 Organization of the Dissertation
Chapter 2 Character pre-processing and pseudo skeleton extraction
2.1 Introduction
2.2 Noise elimination
2.3 Pseudo skeleton generation
Chapter 3 Feature extraction of Chinese characters based on p-skeletons and p-contours
3.1 Introduction
3.2 Fuzzy membership function of strokes
3.3 Feature extraction based on p-skeletons
3.4 Ring feature extraction based on pseudo contours
Chapter 4 Classification of Chinese characters based on virtual stroke features and fuzzy ring features
4.1 Introduction
4.2 Pre-filtering
4.3 Character classification using virtual stroke feature and edit distance algorithm with variant cost function
4.4 Character classification using fuzzy virtual stroke features and fuzzy edit distance
4.5 Character classification with fuzzy ring feature
4.6 Experimental results
Chapter 5 Conclusions and future works
5.1 Conclusions
5.2 Future works
References
Curriculum vitae (in Chinese)
Vita
Publication List
Advisor 范國清 (Kuo-Chin Fan)   Approval date 2003-6-23