應用塑模與非塑模方式作人類行為分析; Human Behavior Analysis using Model-based and Model-free Approaches


Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/8903

Title:	應用塑模與非塑模方式作人類行為分析;Human Behavior Analysis using Model-based and Model-free Approaches
Authors:	余執彰;Chih-Chang Yu
Contributors:	Graduate Institute of Computer Science and Information Engineering (資訊工程研究所)
Keywords:	Behavior analysis;Action recognition;Human body modeling
Date:	2009-01-06
Issue Date:	2009-09-22 11:37:23 (UTC+8)
Publisher:	National Central University Library (國立中央大學圖書館)
Abstract:	(Translated from the Chinese abstract) In recent years, the spread of high-capacity storage devices and networked multimedia technology has led to enormous growth in the number of multimedia files. To classify and organize this information systematically, quickly, and efficiently, automated image analysis systems are imperative. Researchers have recently devoted much attention to human behavior analysis, for the simple reason that human behavior is often the focus of attention in video. A detailed analysis of the human behavior in a scene can therefore supply such systems with very rich information. In this dissertation we examine in depth the two main approaches to recognizing human behavior, describe in detail the problems each approach encounters, and propose solutions.
The first category is the model-based approach. It divides the human body at the joints into several parts, such as the head, body, upper arms, forearms, thighs, and lower legs. For this approach we propose a hierarchical architecture that extracts the key body parts in order: head, torso, then limbs. For limb extraction we further propose two methods: an intuitive line-based matching method and a patch-based probabilistic matching model. The first is faster but cannot handle self-occlusion of the limbs. To solve the self-occlusion problem, we propose a probabilistic matching method that finds the most likely limb pose. This method resolves the partial limb occlusion that frequently occurs during human body modeling.
The second category is the model-free approach. Because it does not extract limb features, it mainly analyzes features of the whole foreground object. In this dissertation we analyze only the binarized foreground. In general, the L1-norm is a very common way to compare the similarity of two patterns. However, the matching efficiency of the L1-norm decreases as the feature dimension of the patterns increases. To address this, we recast action recognition as a histogram matching problem, so that many properties of histograms can be used to overcome the poor matching efficiency. We propose a novel histogram construction that partitions a histogram into multi-resolution histograms. Using the relationships between histograms at different resolutions, we propose a method that accelerates matching and, while maintaining the highest recognition rate, reduces matching time to 9% of that of the conventional L1-norm matching method. We also propose a mechanism that automatically determines the importance of image content, so that each image can build a distinctive multi-resolution histogram around the important parts of its content, which improves the recognition rate.
To demonstrate the practicality and stability of the proposed methods, we run experiments on several common human actions captured from a single viewpoint, and we conclude with a summary and directions for future improvement.

(English abstract) Video archives have grown rapidly in recent years, driven by the advancement and popularization of multimedia internetworking technologies and high-capacity data storage devices. To summarize this multimedia content efficiently, an automated video understanding system is needed. In video understanding and summarization, researchers are most interested in analyzing human behavior because of the high demand from various applications. A detailed description of human actions can therefore provide rich information for these applications. In this dissertation, we make a broad study of human behavior analysis and comprehensively examine the two main categories of approaches to human action recognition. Problems that may occur in both categories are addressed and solutions are proposed.
The first category is the model-based approach. In this type of approach, several body parts, including the head, torso, arms, and legs, are extracted to build a human body model. A hierarchical system is designed that starts with head extraction, continues with torso extraction, and ends with limb extraction. For limb extraction, two methods are proposed: a line-based method and a patch-based method. The line-based method is simpler and faster, but it cannot deal with partial occlusion. We therefore propose the patch-based method, which adopts a probabilistic framework to find the best configuration of limbs. The patch-based method handles the partial occlusion that commonly occurs on the limbs.
The second category is the model-free approach. This type of approach tries to recognize human actions from the video object as a whole. In this dissertation, we propose a novel approach based on human silhouettes. The L1-norm is a popular way to estimate the similarity between two patterns. However, its computational efficiency decreases as the feature dimension grows, because the cost of the L1-norm measurement depends on that dimension. In our work, we convert the human action recognition problem into a histogram matching problem, so that many characteristics of histogram matching can be employed to improve recognition efficiency and accuracy. Moreover, a novel histogram matching method is proposed that creates multi-resolution histograms in which bins are partitioned unevenly across resolution levels. With this multi-resolution structure, the computation time depends only on the partitioned histogram bins, and the recognition time can be reduced to 9% of that of the original L1-norm measurement. Because of the reduced computational complexity, the proposed approach allows a real-time recognition system to be realized.
To demonstrate the feasibility and validity of the proposed approaches, several generic human actions, such as walking, running, jumping, waving hands, and falling, were performed in front of a monocular camera. Given the experimental results, we believe that this framework can eventually be applied to a wide range of human-centric event detection and behavior understanding systems.
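The multi-resolution matching idea in the abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the abstract does not give the histogram construction, so this sketch assumes uniform merging of adjacent bins (the dissertation describes uneven partitioning), and the names `coarsen`, `build_pyramid`, and `nearest_template` are hypothetical. It relies only on a general property of the L1-norm: merging bins can never increase the L1 distance, so the distance at a coarse level is a lower bound on the distance at full resolution. A template whose coarse distance already reaches the best full-resolution distance found so far can be skipped without changing the nearest-neighbour result.

```python
def coarsen(hist, factor=2):
    """Merge every `factor` adjacent bins (uniform merging is an assumption)."""
    padded = list(hist) + [0] * ((-len(hist)) % factor)
    return [sum(padded[i:i + factor]) for i in range(0, len(padded), factor)]


def build_pyramid(hist, levels=3):
    """Return the histogram at `levels` resolutions, coarsest first."""
    pyramid = [list(hist)]
    for _ in range(levels - 1):
        pyramid.append(coarsen(pyramid[-1]))
    return pyramid[::-1]


def l1(a, b):
    """L1 distance between two histograms of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_template(query, templates, levels=3):
    """Exact L1 nearest neighbour with coarse-to-fine pruning.

    Returns (best_index, best_distance, full_resolution_comparisons).
    """
    q_pyr = build_pyramid(query, levels)
    t_pyrs = [build_pyramid(t, levels) for t in templates]
    best_idx, best_dist, full = -1, float("inf"), 0
    for idx, t_pyr in enumerate(t_pyrs):
        # A coarse distance is a lower bound on the full-resolution distance,
        # so a template is skipped as soon as one coarse level reaches the
        # best distance found so far.
        if any(l1(q, t) >= best_dist for q, t in zip(q_pyr[:-1], t_pyr[:-1])):
            continue
        full += 1
        dist = l1(q_pyr[-1], t_pyr[-1])
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, best_dist, full
```

Calling `nearest_template(query, templates)` returns the same index and distance as a brute-force L1 search over all templates. The third return value counts how many full-resolution comparisons were actually made, which is where any saving would show up. How much is saved depends on the data and the template order, and nothing here reproduces the 9% figure reported in the abstract.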
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation
