Traffic signs and shop signboards appearing in street view images convey important visual information, such as the location of a scene, the effect of advertising on billboards, and the identity of stores. This research proposes a sign detection mechanism for street view images that locates the text and graphic regions on traffic signs and signboards. The task is challenging because such objects are difficult to extract with a fixed template: street view images often contain cluttered backgrounds, such as buildings and trees, whose textures resemble text, and these objects may partially occlude the signs; weather, lighting conditions, and shooting angles further complicate detection. In addition, Chinese text can be written either vertically or horizontally, so text-lines in both orientations must be detected and distinguished, which is one of the contributions of this research. The proposed detection mechanism is divided into two parts. The first part locates the sign regions in an image by training a sign detection model with a fully convolutional network (FCN); the detected signs are treated as regions of interest (ROIs). The second part extracts the text-lines and logos within the ROIs. A text detection model based on a Region Proposal Network (RPN) detects horizontal and vertical text-lines separately, and the ROIs obtained in the first part are used to suppress false text detections by the RPN. Finally, post-processing is applied to combine horizontal and vertical text-lines, eliminate false detections, and resolve complex intersections among text-lines, judging valid regions by text-line aspect ratio, area, intersection conditions, and sign background color. Experimental results show that the proposed scheme can effectively locate signs and detect text and graphic regions in complex street scenes, and the use of two deep-learning networks with different architectures in this application is also discussed.
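To make the two-stage filtering described above more concrete, the following is a minimal sketch of how RPN text-line candidates could be constrained by the FCN-detected sign ROIs and by simple geometric rules. It is a hypothetical illustration rather than the thesis implementation: the box format, the thresholds, and the function names (`coverage`, `filter_by_roi_and_shape`, `resolve_crossings`) are all assumptions, and the background-color check mentioned in the abstract is omitted.

```python
# Hypothetical post-processing sketch (not the thesis code): keep RPN text-line
# candidates only when they lie mostly inside an FCN-detected sign ROI and pass
# simple aspect-ratio/area checks, then resolve heavily overlapping horizontal
# and vertical lines by keeping the larger box. All thresholds are illustrative.

def coverage(box, other):
    """Fraction of `box` area covered by `other`; boxes are (x1, y1, x2, y2)."""
    ix1, iy1 = max(box[0], other[0]), max(box[1], other[1])
    ix2, iy2 = min(box[2], other[2]), min(box[3], other[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = (box[2] - box[0]) * (box[3] - box[1])
    return inter / float(area) if area else 0.0

def filter_by_roi_and_shape(candidates, sign_rois,
                            min_cover=0.7, min_area=100, min_aspect=1.5):
    """candidates: list of (x1, y1, x2, y2, orient) with orient 'h' or 'v'."""
    kept = []
    for x1, y1, x2, y2, orient in candidates:
        w, h = x2 - x1, y2 - y1
        if w <= 0 or h <= 0 or w * h < min_area:
            continue
        # Horizontal text-lines should be wide, vertical ones tall.
        aspect = (w / float(h)) if orient == 'h' else (h / float(w))
        if aspect < min_aspect:
            continue
        box = (x1, y1, x2, y2)
        if any(coverage(box, roi) >= min_cover for roi in sign_rois):
            kept.append((x1, y1, x2, y2, orient))
    return kept

def resolve_crossings(lines, max_overlap=0.5):
    """Drop the smaller of two heavily overlapping text-line boxes."""
    lines = sorted(lines, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]),
                   reverse=True)
    kept = []
    for cand in lines:
        if all(coverage(cand[:4], k[:4]) < max_overlap for k in kept):
            kept.append(cand)
    return kept
```

In this sketch the ROI check stands in for the abstract's use of the first-stage detections to suppress RPN false positives, while `resolve_crossings` stands in for the handling of intersecting horizontal and vertical text-lines; the actual criteria used in the thesis may differ.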