Internet Pornography Filtering With Combination of Image-Based and Text-Based Classification

DC Field	Value	Language
DC.contributor	資訊工程學系	zh_TW
DC.creator	邱建明	zh_TW
DC.creator	Jen-Min Chiu	en_US
dc.date.accessioned	2004-7-22T07:39:07Z
dc.date.available	2004-7-22T07:39:07Z
dc.date.issued	2004
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=90522049
dc.contributor.department	資訊工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
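dc.description.abstract	The rapid growth of the Internet lets information and knowledge circulate more widely and more efficiently. But easily obtained information also means that inappropriate information spreads more freely across the network. As computer education becomes increasingly widespread, more and more people have access to the Internet, and negative material spread through it, such as pornography, violence, drug use, and racial hatred, becomes more pervasive than through physical distribution channels because the access environment is unguarded. It is therefore necessary, within limits that do not impede freedom of speech, to apply some degree of filtering to the website content and access behavior available in network environments that mainly serve elementary and junior high school education. In research on website filtering, the blacklist is one popular technique, and how the list is obtained varies by method; in general, the approaches can be divided into manual inspection, keyword analysis, automated crawling by programs, and so on. This thesis develops a combined analysis method based on the image and text characteristics of pornographic websites. For pornographic pictures, it uses image processing and pattern analysis techniques such as color analysis, texture analysis, medial axis extraction, and Shape From Shading to determine whether an image contains skin-toned regions and whether those regions can represent a nude human body. For text, it applies information retrieval and document classification techniques to measure the number and frequency of pornography-related keywords. Finally, it weighs the feature vectors extracted from both sources, computes the similarity between them, and performs cluster analysis on the list to further separate pornographic from non-pornographic URLs and improve the overall precision of the list.	zh_TW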
dc.description.abstract	With the explosive growth of the Internet, information and knowledge can spread widely and efficiently. As computer education has become widely available in recent years, more and more people can access a variety of material on the Internet, but this also implies a flood of inappropriate content. In an unprotected environment, objectionable topics such as pornography, violence, and hate messages can reach those who should not access such websites. It is therefore necessary to apply a filtering scheme to offensive content without harming free speech. Blacklists are a popular approach in current web filtering research, and there are a variety of methods for collecting them, e.g. keyword analysis and human inspection, but some false positives always remain. In this thesis we develop a combined method, based on the image and text characteristics of pornographic sites, to refine the blacklist. For erotic images, we use image processing techniques: color segmentation, coarse detection, medial axis extraction, and shape from shading. For text in web documents, we use techniques from information retrieval and document classification to measure the number and frequency of erotic keywords. After extracting the two forms of feature vector, we measure the similarity of two documents by the angle between their feature vectors. Finally, the refinement task is cast as a graph partitioning problem that divides the blacklist into two groups: pornographic sites and non-pornographic sites.	en_US
DC.subject	Website filtering	zh_TW
DC.subject	Pornographic image detection	zh_TW
DC.subject	Document classification	zh_TW
DC.subject	Pornographic Image Analysis	en_US
DC.subject	Document Classification	en_US
DC.subject	Web filtering	en_US
DC.title	結合影像與文字辨識的網路色情過濾	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Internet Pornography Filtering With Combination of Image-Based and Text-Based Classification	en_US
DC.type	Master's and doctoral theses	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 90522049: full metadata record
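
The abstract describes measuring the frequency of pornography-related keywords in a page and comparing two documents by the angle between their feature vectors. The following is a minimal sketch of that idea only, not the thesis's implementation: the keyword list, the whitespace tokenizer, and the function names keyword_vector and angle_between are all hypothetical, and the image-based features are left out.

```python
import math
from collections import Counter

# Hypothetical keyword list; the vocabulary used in the thesis is not given in this record.
KEYWORDS = ["adult", "xxx", "nude"]


def keyword_vector(text, keywords=KEYWORDS):
    """Return the relative frequency of each keyword among the tokens of text."""
    tokens = text.lower().split()  # assumed tokenizer: lowercase, split on whitespace
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty text
    return [counts[k] / total for k in keywords]


def angle_between(u, v):
    """Angle in radians between two feature vectors.

    Returns pi/2 (treated as maximally dissimilar) if either vector is all zeros.
    """
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return math.pi / 2
    cos = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    return math.acos(max(-1.0, min(1.0, cos)))


if __name__ == "__main__":
    page_a = keyword_vector("xxx adult nude pics")
    page_b = keyword_vector("adult xxx nude")
    page_c = keyword_vector("weather report today")
    print(angle_between(page_a, page_b))  # small angle: similar keyword profiles
    print(angle_between(page_a, page_c))  # pi/2: no keywords in page_c
```

A small angle means two pages have similar keyword profiles. Pairwise angles like these could serve as edge weights in the graph that the abstract says is partitioned into pornographic and non-pornographic sites, though the record does not state how the thesis builds that graph.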