Graduate Thesis 92542012: Detailed Record




Name: 莊啟宏 (Chi-hung Chuang)    Department: Computer Science and Information Engineering
Thesis Title: 肢體分析與異常事件分析在視訊中之應用
(Posture Analysis and Suspicious Event Analysis from Videos)
Related Theses
★ Real-time online identity recognition using visemes and voice biometric features
★ An image-based SMD carrier-tape alignment system
★ Detection of content forgery and recovery of deleted data on handheld mobile devices
★ License plate verification based on the SIFT algorithm
★ Local pattern features based on dynamic linear decision functions for face recognition
★ A GPU-based SAR database simulator: a parallelized architecture for SAR echo signal and image databases (PASSED)
★ Personal identity verification using palmprints
★ Video indexing using color statistics and camera motion
★ Form document classification using field clustering features and four-direction adjacency trees
★ Stroke features for offline Chinese character recognition
★ Motion vector estimation using adjustable block matching combined with multi-image information
★ Color image analysis and its applications to color-quantized image retrieval and face detection
★ Extraction and recognition of logos on Chinese and English business cards
★ Chinese signature verification using virtual-stroke information features
★ Face detection, facial angle classification, and face recognition based on triangle geometry and color features
★ A complementary skin-color-based face detection strategy
  1. This electronic thesis has been approved for immediate open access.
  2. Open-access electronic full texts are licensed to users only for personal, non-profit retrieval, reading, and printing for the purpose of academic research.
  3. Please observe the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.

Abstract (Chinese) In the field of computer vision, behavior recognition and suspicious event analysis are two fundamental and important problems, with applications in many areas such as video surveillance, security systems, and crime detection. The main goal is to detect abnormal events and abnormal behaviors in any open environment, no matter how the scene changes.
This dissertation proposes a novel method that detects objects carrying items in a video and uses the detected carried object for suspicious event analysis. It also proposes a novel segmentation algorithm based on deformable triangulation to segment a human body into its body parts. The two methods are discussed separately.

For suspicious event analysis, a background subtraction method using a minimum filter is first applied to detect foreground objects. A novel tracking method then describes each moving object and further obtains its motion trajectory. Using this trajectory, when two objects cross each other, a new ratio-histogram scheme is proposed to analyze the carried object and to determine which object it finally ends up with. Next, through color re-projection, the colors of the carried object can be roughly projected onto the region where the foreground object lies, and a Gaussian mixture model (GMM) is then used to segment the carried object accurately. Once the bag (carried object) has been found, we propose a finite-state-machine event analyzer to recognize the various suspicious events that bag transfers can produce in a video. Because the shape and color of a carried object are unknown in advance, no existing automatic system can analyze such suspicious events (for example, robbery) from the way a carried object changes hands. With the proposed color ratio histogram, however, different carried objects can be correctly detected, and with GMM-based segmentation the whole bag can be completely extracted for event analysis.

For behavior recognition, to make human postures easier to analyze, the system first triangulates the human silhouette into a set of triangles and then unfolds these triangles into a spanning tree using a depth-first search. Two hybrid schemes, a skeleton-based one and a model-driven one, are proposed to segment the body parts from the silhouettes of various postures. To analyze these body movements, a novel clustering method is proposed to train and classify the key postures, and the resulting model space is used for posture classification and segmentation. After clustering, if an input posture does not fall into any category in the database, the skeleton-based scheme is used to divide the posture into its parts, and Gaussian mixture models are used to separate the body parts. However, if the contours of two postures are very similar, the wrong model may be selected. This dissertation therefore uses tracking and the relationship between consecutive frames to alleviate this problem and find the best template, and a GMM-based segmentation technique is used to extract each body part completely and stably. Experimental results show that the proposed methods are robust, accurate, and powerful in detecting carried objects, analyzing suspicious events, and stably segmenting body parts across a variety of behaviors.
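The abstract does not spell out exactly how the minimum filter enters the background-subtraction step, so the following is only a minimal sketch of one plausible reading: a spatial minimum filter is applied to the frame-background difference image to suppress speckle noise before thresholding. The function name, window size, and threshold below are illustrative assumptions, not values from the thesis.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def detect_foreground(frame_gray, background_gray, win=3, thresh=30):
    """Rough foreground mask via background subtraction (hypothetical sketch).

    A spatial minimum filter (grayscale erosion) suppresses isolated bright
    pixels in the difference image before thresholding; `win` and `thresh`
    are illustrative, not tuned values from the thesis.
    """
    diff = np.abs(frame_gray.astype(np.int16) - background_gray.astype(np.int16))
    diff = minimum_filter(diff, size=win)    # keep only difference blobs wider than the window
    return (diff > thresh).astype(np.uint8)  # 1 = foreground, 0 = background
```

The minimum filter could equally be a temporal minimum over recent frames used to maintain the background model; the sketch above shows only the spatial variant.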
Abstract (English) Behavior recognition and suspicious event analysis are fundamental and important problems in computer vision, with applications in video surveillance, navigation, content-based image retrieval, and so on. Their goal is to detect abnormal events and abnormal behaviors in an environment no matter how the environmental conditions change.
This thesis proposes a novel method to detect carried objects from videos and applies it to suspicious event analysis, and it presents a novel segmentation algorithm that divides a body posture into different body parts using deformable triangulation. The two methods are discussed separately.

For suspicious event analysis, a background subtraction method using a minimum filter is first proposed for detecting foreground objects. A novel kernel-based tracking method is then described for tracking each moving object and obtaining its trajectory. With the trajectory, a novel ratio histogram is proposed for analyzing the interactions between a carried object and its owner. After color re-projection, different carried objects can be accurately segmented from the background by taking advantage of Gaussian mixture models (GMMs). After bag detection, we propose an event analyzer that recognizes various suspicious events using finite state machines. Even though there is no prior knowledge about the bag (such as its shape or color), the proposed method still performs well in detecting suspicious events from videos. Because of these uncertainties in bag shape and color, no existing automatic system can analyze the suspicious events caused by bags (such as robbery) without manual effort. By taking advantage of the proposed ratio histogram, however, different carried bags can be well segmented from videos and used for event analysis.

For behavior recognition, to better analyze each posture, we first triangulate it into triangular meshes, from which a spanning tree can be found using a depth-first search. Two hybrid methods, a skeleton-based one and a model-driven one, are then proposed for segmenting the posture into different body parts according to its self-occlusion conditions. To analyze the self-occlusion condition, a novel clustering scheme is proposed for grouping the training samples into a set of key postures; a model space can then be formed and used for posture classification and segmentation. After clustering, if the input posture belongs to the non-self-occlusion category, the skeleton-based scheme is used to divide it into different body parts, which are then refined with a set of GMMs. For the self-occlusion case, we propose a model-driven technique that selects a good reference model to guide body part segmentation. However, if two postures have similar contours, ambiguity can arise and lead to failures in model selection. This thesis therefore builds a tree structure through a tracking technique to tackle this problem, so that the best model can be selected not only from the current frame but also from its previous frame. A suitable GMM-based segmentation scheme can then be applied to finely segment a body posture into different body parts.

Experimental results prove that the proposed methods are robust, accurate, and effective in carried object detection, suspicious event analysis, and body part segmentation.
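The abstract does not list the actual states and transitions of the finite-state-machine event analyzer, so the sketch below is only a hypothetical illustration of how FSM-based analysis of bag-transfer events could look. The state names, per-frame inputs, and the `abandon_limit` parameter are assumptions, not the thesis's definitions.

```python
from enum import Enum, auto

class BagState(Enum):
    WITH_OWNER = auto()       # bag moves together with its owner
    LEFT_ALONE = auto()       # bag is separated from its owner
    ABANDONED = auto()        # bag has stayed alone too long
    TAKEN_BY_OTHER = auto()   # another person picks the bag up (possible theft/robbery)

def step(state, near_owner, near_other, frames_alone, abandon_limit=150):
    """One FSM transition per frame, driven by simple spatial observations."""
    if state is BagState.WITH_OWNER and not near_owner:
        return BagState.LEFT_ALONE
    if state is BagState.LEFT_ALONE:
        if near_owner:
            return BagState.WITH_OWNER
        if near_other:
            return BagState.TAKEN_BY_OTHER   # raise a suspicious-event alarm
        if frames_alone > abandon_limit:
            return BagState.ABANDONED        # raise an abandoned-object alarm
    return state
```

In a full pipeline, the per-frame inputs would come from the tracking and ratio-histogram stages, which decide whether the detected bag region currently overlaps its original owner or a different tracked person.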
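For the posture-analysis side, the step "triangulate the silhouette and unfold the triangles into a spanning tree with depth-first search" can be illustrated as follows. The sketch assumes the silhouette has already been triangulated (for example, by constrained Delaunay triangulation) into vertex-index triples; treating two triangles as adjacent when they share an edge is a plausible reading, not necessarily the thesis's exact construction.

```python
from collections import defaultdict

def triangle_spanning_tree(triangles, root=0):
    """DFS spanning tree over triangles that share an edge.

    `triangles` is a list of vertex-index triples, e.g. from a constrained
    Delaunay triangulation of the body silhouette.  Returns a dict mapping
    each reachable triangle index to its parent in the spanning tree.
    """
    # map each undirected edge to the triangles containing it
    edge_to_tris = defaultdict(list)
    for ti, (a, b, c) in enumerate(triangles):
        for e in ((a, b), (b, c), (c, a)):
            edge_to_tris[frozenset(e)].append(ti)

    # two triangles are neighbours if they share an edge
    adj = defaultdict(set)
    for tris in edge_to_tris.values():
        if len(tris) == 2:
            adj[tris[0]].add(tris[1])
            adj[tris[1]].add(tris[0])

    # iterative depth-first search from the root triangle
    parent, stack = {root: None}, [root]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                stack.append(nxt)
    return parent

# Example: a tiny strip of three triangles gives the chain 0 -> 1 -> 2.
print(triangle_spanning_tree([(0, 1, 2), (1, 2, 3), (2, 3, 4)]))  # {0: None, 1: 0, 2: 1}
```

Walking the resulting parent map from the leaf triangles toward the root gives the triangle chains from which a skeleton and candidate limb segments could then be traced.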
Keywords (Chinese) ★ Posture analysis and suspicious event analysis    Keywords (English) ★ Posture Analysis and Suspicious Event Analysis
Table of Contents
Abstract (Chinese) V
Abstract (English) VII
Acknowledgements IX
Chapter 1: Introduction 1
1.1 Motivation 1
1.2 Review of Related Works 4
1.2.1 Previous Methods for Behavior Analysis 4
1.2.2 Previous Methods for Object Detection 5
1.3 Organization of the Dissertation 6
Chapter 2: System Overview 7
2.1 System Overview of Body Part Segmentation 7
2.2 System Overview of Object Detection 8
Chapter 3: The Body Part Segmentation System 10
3.1 Deformable Triangulations 10
3.2 Skeleton Extraction and Body Part Segmentation 12
3.2.1 Triangulation-based Skeleton Extraction 12
3.2.2 Body Part Segmentation Using Skeletons 13
3.2.3 Body Part Segmentation Using Blobs 14
3.3 Model-driven Body Part Segmentation 17
3.3.1 Posture Classification Using Centroid Contexts 17
3.3.2 Key Model Selection through Clustering 20
3.3.3 Tree Structure for Reference Model Selection 22
3.3.4 Model-driven Segmentation Scheme 25
Chapter 4: Carried Object Detection and Suspicious Event Analysis 28
4.1 Moving Object Extraction and Tracking 28
4.1.1 Background Subtraction Using a Minimum Filter 28
4.1.2 Kernel-based Object Tracking Using Multiple Frames 29
4.2 Missed Color Detection Using Ratio Histogram 33
4.3 Carried Object Detection Using Gaussian Mixture Models 37
4.3.1 Segmentation Using Single Gaussian Model 37
4.3.2 Segmentation Using Gaussian Mixture Models 39
4.4 Suspicious Event Analysis Using Finite State Machines 41
Chapter 5: Experimental Results 44
5.1 Body Part Segmentation Performance 44
5.1.1 The Accuracy of Clustering Scheme 44
5.1.2 The Accuracy of Posture Segmentation from Different Angles 46
5.1.3 The Accuracy of Posture Segmentation from Different Lighting and Complicated Backgrounds 49
5.1.4 The Accuracy of Posture Segmentation in Different Clothes and Trousers Types 50
5.1.5 The Accuracy of Posture Segmentation from Different Angles in Videos 52
5.1.6 The Accuracy of Posture Segmentation in Similar Contours 54
5.1.7 The Accuracy of Posture Segmentation in False Foreground 56
5.1.8 The Accuracy of Posture Segmentation in HumanEva Database 57
5.2 Carried Object Detection Performance 61
5.2.1 The Performance of Our Tracking Algorithm 61
5.2.2 The Accuracy of Carried Object Detection with Different Methods 62
5.2.3 The Accuracy of Carried Object Detection from Videos 67
5.2.4 The Accuracy of Analysis from Different Events 73
Chapter 6: Conclusions and Future Work 76
6.1 Conclusions 76
6.2 Future Work 77
References 79
Advisor: 范國清 (Kuo-Chin Fan)    Date of Approval: 2009-7-13
