This thesis develops an eye-tracking model for head-mounted light field displays. The model is based on machine learning: a visible-light camera captures RGB images of the eye, which are fed to a neural network that outputs the corresponding gaze point.

The model consists of two networks connected in series: a feature localization model and a mapping model. The feature localization model uses a convolutional neural network (CNN) to extract feature maps from the RGB image, and then uses those feature maps to compute the coordinates of the eye in the image, X_e and Y_e. Because no existing database matches the application domain of light field displays, we designed a capture setup to generate a database of eye images. The mapping model is a fully connected network (FCN). Before each eye-tracking session, a set of calibration images is recorded and used to train the parameters of the mapping model. Once trained, the mapping model converts the eye (image) coordinates X_e and Y_e into gaze point (screen) coordinates X_g and Y_g, which completes the eye-tracking task.
The main contributions of this research are: (1) an eye-tracking database for light field displays, (2) an eye-tracking model designed for light field displays, and (3) tracking from RGB images alone, with no need for an additional light source.
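The per-session calibration of the mapping model can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`fit_mapping`, `predict_gaze`, `synthetic_eye`), the single hidden layer of 8 tanh units, the learning rate, the input standardization, and the synthetic 3x3 calibration grid are all assumptions made for illustration. The sketch only shows the general idea of fitting a small fully connected network that maps eye (image) coordinates (X_e, Y_e) to gaze (screen) coordinates (X_g, Y_g) from a handful of calibration samples.

```python
import math
import random


def fit_mapping(eye_pts, gaze_pts, hidden=8, epochs=2000, lr=0.05, seed=0):
    """Fit a 2 -> hidden -> 2 fully connected network with per-sample SGD."""
    n = len(eye_pts)
    # Standardize eye coordinates with calibration statistics (assumed step).
    mean = [sum(p[i] for p in eye_pts) / n for i in range(2)]
    std = [math.sqrt(sum((p[i] - mean[i]) ** 2 for p in eye_pts) / n)
           for i in range(2)]
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(2)]
    b2 = [0.0, 0.0]
    for _ in range(epochs):
        for eye, gaze in zip(eye_pts, gaze_pts):
            # Forward pass: tanh hidden layer, linear output layer.
            x = [(eye[i] - mean[i]) / std[i] for i in range(2)]
            h = [math.tanh(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j])
                 for j in range(hidden)]
            out = [sum(w2[k][j] * h[j] for j in range(hidden)) + b2[k]
                   for k in range(2)]
            # Gradient of the squared-error loss 0.5 * ||out - gaze||^2.
            err = [out[k] - gaze[k] for k in range(2)]
            for j in range(hidden):
                # Hidden-unit gradient uses w2 before it is updated below.
                grad_h = ((err[0] * w2[0][j] + err[1] * w2[1][j])
                          * (1.0 - h[j] ** 2))
                for k in range(2):
                    w2[k][j] -= lr * err[k] * h[j]
                for i in range(2):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            for k in range(2):
                b2[k] -= lr * err[k]
    return {"mean": mean, "std": std, "w1": w1, "b1": b1, "w2": w2, "b2": b2}


def predict_gaze(model, eye):
    """Map eye (image) coordinates to gaze (screen) coordinates."""
    x = [(eye[i] - model["mean"][i]) / model["std"][i] for i in range(2)]
    h = [math.tanh(w[0] * x[0] + w[1] * x[1] + b)
         for w, b in zip(model["w1"], model["b1"])]
    return tuple(sum(wk[j] * h[j] for j in range(len(h))) + bk
                 for wk, bk in zip(model["w2"], model["b2"]))


def synthetic_eye(gx, gy):
    """Hypothetical eye position for a gaze target (illustration only)."""
    return (0.5 + 0.2 * gx + 0.02 * gx * gy, 0.5 + 0.15 * gy)


# Synthetic 3x3 calibration grid in normalized screen coordinates.
gaze_targets = [(gx, gy) for gy in (-1.0, 0.0, 1.0) for gx in (-1.0, 0.0, 1.0)]
eye_coords = [synthetic_eye(gx, gy) for gx, gy in gaze_targets]

model = fit_mapping(eye_coords, gaze_targets)
mean_err = sum(math.dist(predict_gaze(model, e), g)
               for e, g in zip(eye_coords, gaze_targets)) / len(gaze_targets)
print(f"mean calibration error: {mean_err:.4f}")
```

In an actual session, `eye_coords` would come from running the feature localization model on the recorded calibration images, and `gaze_targets` would be the known on-screen positions shown during calibration. Retraining such a small network for every session is cheap, which fits the calibrate-then-track workflow described above.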