Design of Object Detection and Object Compression for Intelligent Surveillance System (智慧型監控系統的物件偵測與物件壓縮技術開發及設計)


Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/53178

Title:	Design of Object Detection and Object Compression for Intelligent Surveillance System (智慧型監控系統的物件偵測與物件壓縮技術開發及設計)
Author:	Chung-Yuan Lin (林崇元)
Contributor:	Graduate Institute of Electrical Engineering (電機工程研究所)
Keywords:	foreground detection; video compression; surveillance system
Date:	2012-01-10
Upload time:	2012-06-15 20:23:26 (UTC+8)
Abstract:	Emerging intelligent video surveillance systems use vision-based analysis tasks to understand and predict the actions in the field of view, enabling automated wide-area surveillance. Among these tasks, detecting visual foreground objects is an early and crucial one. Foreground detection separates the visual objects of interest from the background. By analyzing the detected visual objects in a scene, a surveillance system can automatically understand the actions taking place.
As advances in hardware technology scaling have made high-resolution sensors commonplace, applying foreground detection in a surveillance system often leads to a high computational load and raises the cost of the entire system when a mass deployment of end cameras is needed. This thesis explores foreground detection from the perspectives of detection robustness and computational complexity, and contributes three key techniques to the surveillance scenario. First, a DSP-based foreground detection solution for small-scale multi-camera surveillance systems is presented. The algorithm incorporates a temporal data correlation predictor that estimates the correlation between data and uses that correlation to reduce computation. Building on this DSP-oriented foreground detection, an adaptive frame rate control is developed as a low-cost solution for such systems. The adaptive frame rate control automatically detects the computational load of foreground detection on multiple video sources and adaptively tunes the temporal data correlation predictor to meet the real-time specification. Therefore, no additional hardware cost is required when the number of deployed cameras increases. The presented approach has been validated on a demonstration platform. A single DSP chip achieves 30 CIF frames per second of processing for a 16-camera surveillance system. Second, a low-cost foreground detection solution for distributed surveillance systems is presented. The primary issue is to tolerate background motion while detecting foreground motion in dynamic scenes. This thesis presents an object-level human-machine interaction scheme. The scheme can vary the conditions under which a moving object is regarded as a foreground object. The conditions depend on the scene and are derived from information obtained through human-machine interaction.
With such a scheme, a simple algorithm can achieve good foreground detection even with significant background motion. A processor based on a system-on-chip design is also presented for foreground detection based on object-level human-machine interaction. The detection capability of the processor reaches HD720 at 30 Hz, and the maximum throughput is up to 32.707 MPixels/s. Third, an auxiliary mode is presented to further enhance foreground detection quality in complex environments. A study of the spatiotemporal probability density functions of the background and the foreground in a complex scene supports the assertion that a discernible probability density function exists in a particular spatiotemporal region, and that this function can be effectively learned using a simple background model. An enhanced algorithm that is based on a regional likelihood ratio test and exploits discernible probability density functions is proposed. The thresholds used in the test are automatically estimated and adapted to the context of the video sequence. Quantitative evaluation and comparison with state-of-the-art approaches show that the presented algorithm provides more accurate detection. Finally, an object-based video coding scheme is presented to efficiently transmit visual object data over the network to various devices such as storage devices, servers, and remote clients. Contextual redundancy associated with background and foreground objects in a scene is exploited. With a mixture-of-Gaussians background model, a method is presented to classify macroblocks (MBs) according to the type of contextual redundancy. Motion search is performed only on MBs whose context actually involves motion of interest. To facilitate encoding by macroblock context, an improved object-based coding architecture, namely a dual-closed-loop encoder, is derived.
The presented coding framework can achieve higher coding efficiency than MPEG-4 and related object-based coding approaches, while significantly reducing coding complexity.
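The two processor figures quoted in the abstract (HD720 at 30 Hz, and a maximum throughput of 32.707 MPixels/s) can be related with a short calculation. The sketch below is illustrative only and is not taken from the thesis: it assumes that "HD720" means a 1280x720 frame and that 1 MPixel is 10^6 pixels, and the function and variable names are hypothetical.

```python
# Illustrative arithmetic only; not code from the thesis.
# Assumption: "HD720" is a 1280x720 frame, and 1 MPixel = 1e6 pixels.
HD720_WIDTH = 1280
HD720_HEIGHT = 720
FRAME_RATE_HZ = 30
MAX_THROUGHPUT_PIXELS_PER_S = 32.707e6  # figure quoted in the abstract


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels per second needed to process every frame at the given rate."""
    return width * height * fps


# Pixel rate required for real-time HD720 at 30 Hz.
required = pixel_rate(HD720_WIDTH, HD720_HEIGHT, FRAME_RATE_HZ)

# Ratio of the quoted maximum throughput to the real-time requirement.
headroom = MAX_THROUGHPUT_PIXELS_PER_S / required

# Frame rate the quoted maximum throughput would sustain at HD720.
max_fps = MAX_THROUGHPUT_PIXELS_PER_S / (HD720_WIDTH * HD720_HEIGHT)

print(f"required: {required / 1e6:.3f} MPixels/s")
print(f"headroom: {headroom:.3f}x")
print(f"max HD720 frame rate: {max_fps:.2f} fps")
```

Under these assumptions, real-time HD720 at 30 Hz needs 27.648 MPixels/s, so the quoted maximum throughput leaves roughly 18% headroom above the real-time requirement.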
Appears in Collections:	[Graduate Institute of Electrical Engineering] Master's and Doctoral Theses

Files in This Item:

File	Description	Size	Format	Views
index.html		0 KB	HTML	591

All items in NCUIR are protected by original copyright.
