Visual object tracking is a popular topic in computer vision and deep learning. Its goal is to locate a target object across a sequence of consecutive frames, and most recent methods rely on deep learning to improve accuracy. Within deep learning, object tracking is divided into single-object tracking and multi-object tracking: the former aims to find the location of the target object in each frame, while the latter aims to perform object association, matching the same objects across different time steps. This paper focuses on single-object tracking. Most current deep-learning-based single-object trackers adopt a Siamese network architecture and use a correlation operation to measure how strongly each position on the search-image feature map correlates with the target. This paper attempts to address some remaining problems of Siamese-based visual object tracking methods. We try adding a variance loss to strengthen the model's ability to distinguish foreground from background, and we add a graph convolutional network that improves accuracy by exploiting the relationships between the target object and the surrounding objects. An object detection model decides, for each input image independently, whether the target object is present. In a continuous sequence of frames, however, every frame differs slightly, so some objects may be missed in some frames. We therefore try using a tracking model to compensate for this instability of the detection model on continuous frames.
When the detection model detects the target object, the tracking model can track the target through the following frames. We combine the visual object tracking model with a signboard detection model to improve the stability and accuracy of detection.
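The handoff between detector and tracker described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function name `detect_with_tracking`, the callables `detect` and `track`, and the `max_tracked_frames` limit are all hypothetical and introduced only for illustration. The sketch assumes the detector returns either a bounding box or nothing for each frame, and that the tracker predicts a box from the current frame and the previous box.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Bounding box as (x, y, width, height); this layout is an assumption.
Box = Tuple[float, float, float, float]


def detect_with_tracking(
    frames: Iterable[object],
    detect: Callable[[object], Optional[Box]],
    track: Callable[[object, Box], Optional[Box]],
    max_tracked_frames: int = 5,
) -> List[Optional[Box]]:
    """Return one box (or None) per frame.

    The detector runs on every frame. When it misses, the tracker is
    asked to carry the most recent box forward, for at most
    `max_tracked_frames` consecutive frames (a hypothetical limit).
    """
    results: List[Optional[Box]] = []
    last_box: Optional[Box] = None
    tracked_streak = 0  # consecutive frames filled in by the tracker

    for frame in frames:
        box = detect(frame)
        if box is not None:
            # A fresh detection resets the tracker budget.
            tracked_streak = 0
        elif last_box is not None and tracked_streak < max_tracked_frames:
            # Missed detection: fall back to the tracker.
            box = track(frame, last_box)
            tracked_streak += 1
        last_box = box
        results.append(box)

    return results
```

Capping the number of consecutive tracker-only frames is one possible way to keep a drifting tracker from reporting a target long after it has left the scene; whether such a cap is used here, and its value, is not stated in the abstract.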