Consumer electronics are moving toward more convenient, human-centered interfaces. Interacting with machines through body posture, without direct touch, has become an important research trend. This thesis presents a human-computer interaction system based on stereo-vision gesture recognition. Two cameras are mounted above the computer screen, aimed at the space in front of it. Assuming that the hand controlling the computer is the object nearest the screen, the system uses a neural network to learn the mapping from stereo disparity to image depth. Fingers are then detected on the nearest object, and static gestures are defined by the number and positions of the fingers. Finally, a dynamic gesture model is built from state transitions among the static gestures together with the actions executed in each state; different dynamic gestures correspond to different human-computer interaction commands.

Experimental results show that the stereo-vision approach locates the hand effectively and accurately, without spending excessive computation on segmenting the hand from a complex background. Our dynamic-gesture recognition method achieves a recognition rate comparable to other approaches but with lower computational complexity. Moreover, adding a new gesture to the system requires only a new state-transition description, with no retraining of the dynamic gesture model. The proposed system therefore offers a highly flexible, efficient, and reliable interaction method for the ever-growing range of human-computer interaction applications.
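The dynamic-gesture model described above is essentially a finite-state machine over static gestures: observations (finger counts) drive state transitions, and entering certain states triggers commands. A minimal sketch of that idea in Python, assuming finger counts as the static-gesture labels; the class name, the transition table, and the example "open palm then fist means click" gesture are hypothetical illustrations, not taken from the thesis:

```python
# Minimal finite-state-machine sketch of a dynamic gesture model.
# A dynamic gesture is recognized when a sequence of static gestures
# (identified here by finger count) follows a registered transition path.

class GestureFSM:
    """Recognizes dynamic gestures as paths over static-gesture states."""

    def __init__(self):
        # transitions[(state, finger_count)] -> next state
        self.transitions = {}
        # command emitted when a state is entered (the "state action")
        self.actions = {}
        self.state = "idle"

    def add_transition(self, src, finger_count, dst, action=None):
        """Adding a new gesture only requires new transition entries;
        no model retraining is involved."""
        self.transitions[(src, finger_count)] = dst
        if action is not None:
            self.actions[dst] = action

    def step(self, finger_count):
        """Feed one static-gesture observation; return a command or None.
        Unmatched observations reset the machine to 'idle'."""
        self.state = self.transitions.get((self.state, finger_count), "idle")
        return self.actions.get(self.state)


fsm = GestureFSM()
# Hypothetical gesture: open palm (5 fingers) then fist (0 fingers) -> click.
fsm.add_transition("idle", 5, "open")
fsm.add_transition("open", 0, "fist", action="click")

commands = [fsm.step(n) for n in [5, 0]]
print(commands)  # -> [None, 'click']
```

This structure illustrates why extension is cheap: a new dynamic gesture is just a few more `add_transition` calls, leaving all existing gestures untouched.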