NCU Institutional Repository (中大機構典藏): downloads of theses and dissertations, past exam papers, journal articles, research projects, and more: Item 987654321/78739


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/78739


    Title: Past, Present, and Future for Missing Value Imputation (遺漏值填補 – 過去、現在與未來)
    Authors: 蔡志豐
    Contributors: Department of Information Management, National Central University (國立中央大學資訊管理學系)
    Keywords: missing value imputation; data pre-processing; data mining; supervised learning algorithms
    Date: 2018-12-19
    Issue Date: 2018-12-20 13:46:28 (UTC+8)
    Publisher: Ministry of Science and Technology (科技部)
    Abstract: Missing values are a common cause of incomplete datasets: some attribute values of the data samples are absent. Values can go missing for human reasons, such as manual data entry errors, concealment, or differences in background, or for equipment reasons, such as incorrect measurements, storage failures, hardware faults, or damage that loses the data for a specific period. Incomplete datasets of this kind often degrade data mining performance. Two approaches address the problem: case deletion and missing value imputation. In this three-year project, the first year reviews and surveys related work on missing value imputation published from 2000 to 2015 (more than one hundred papers) to identify the limitations of current imputation methods. It also examines when case deletion is applicable, that is, for which types of data (categorical, numerical, and mixed) and at which missing rates. The second year compares statistical and supervised learning techniques for missing value imputation, covering six different algorithms. The third year aims to propose a hybrid learning-based imputation method that improves the quality of missing value imputation.
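    Relation: Science and Technology Policy Research and Information Center, National Applied Research Laboratories (財團法人國家實驗研究院科技政策研究與資訊中心)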
    Appears in Collections:[Department of Information Management] Research Project



    All items in NCUIR are protected by copyright, with all rights reserved.
