NCU Institutional Repository - theses and dissertations, past exam papers, journal articles, and research projects available for download: Item 987654321/90044

    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/90044


    Title: 基於漸進式修正網路與拼寫錯誤修正語言模型之場景文字辨識; Progressive Rectification Network and Spelling Error Correction Language Model Based Scene Text Recognition
    Author: 彭明正; PENG, MING-CHENG
    Contributor: Department of Computer Science and Information Engineering
    Keywords: Deep Learning; Text Recognition; Language Model; Spelling Error Correction
    Date: 2022-09-23
    Upload time: 2022-10-04 12:08:55 (UTC+8)
    Publisher: National Central University
    Abstract: Scene text recognition has quickly become a popular research topic because of its wide range of applications. Unlike general text recognition, scene text often involves complex backgrounds, irregular orientations, occluded characters, and blurred images, so a scene text recognizer must cope with greater image diversity and lower image quality than a general text recognizer.
    With the development of deep learning in recent years, many methods have been proposed for scene text recognition. Humans, however, do not recognize text from visual input alone; they also draw on semantic knowledge to arrive at a more plausible reading. To bring deep learning models closer to the way humans read, a growing number of recent methods focus on making the model learn richer semantic information. Most of the existing literature, however, is based on English datasets, and applying those methods directly to Chinese datasets may not be appropriate. This thesis therefore proposes a deep learning model better suited to Chinese scene text recognition. We add a language model and use additional text data for spelling error correction pre-training, which gives the recognition architecture better semantic reasoning ability. We also propose a progressive rectification network to replace the rectification network [1] most commonly used in the existing literature; it enables the model to better handle text with irregular orientations.
    In the experiments, we show that the proposed method outperforms two classic scene text recognition architectures [1, 2], which are frequently used as baselines in the literature, as well as two more recently proposed methods [3, 4]. In the ablation study, we also examine the effectiveness of each component of the model. We believe the proposed method is better suited to Chinese scene text recognition.
    Appears in collections: [Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     102


    All items in NCUIR are protected by the original copyright.

    ::: Copyright National Central University. | National Central University Library, all rights reserved | Site established: 8-24-2009 :::
    DSpace Software Copyright © 2002-2004 MIT & Hewlett-Packard / Enhanced by NTU Library IR team