NCU Institutional Repository (中大機構典藏): theses and dissertations, past exam papers, journal articles, and research projects available for download: Item 987654321/25877

    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/25877


    Title: Two-Staged Clustering Algorithm for Two-Attributes-Set Problem (針對雙屬性集合問題的兩階段分群演算法)
    Authors: Ya-Chun Hsiao (蕭雅君)
    Contributors: Graduate Institute of Information Management
    Keywords: Clustering; Cluster analysis; Data mining
    Date: 2009-11-11
    Issue Date: 2010-06-11 16:58:33 (UTC+8)
    Publisher: National Central University Library
    Abstract: Clustering is widely studied and applied in many fields and is an important area of data mining. It groups similar objects together so that a small number of clusters can stand in for a large data set. Existing clustering algorithms, however, share a limitation in practice: they use a single set of attributes both to measure similarity between objects and to partition the data space and describe the resulting clusters. Some practical situations call for two different attribute sets. For example, a bank may want to cluster its credit card customers to learn how consumption behavior differs across customer backgrounds. The customers should then be grouped so that consumption behavior within each cluster is similar, while the clusters themselves are separated and characterized by personal information such as age and income. Two attribute sets are therefore required: similarity-measuring attributes (here, consumption behavior), used to compute similarity between objects, and dataset-partitioning attributes (here, personal information), used to partition the data set and to describe the resulting clusters. Because traditional algorithms do not distinguish the two sets, they produce low-quality clusterings for this two-attributes-set problem. We propose a Two-Staged Clustering Algorithm for the two-attributes-set problem. It generates clusters that can be segmented and described by the dataset-partitioning attributes, while objects within each cluster remain similar in the similarity-measuring attributes.
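
    The two-attributes-set problem can be illustrated with a small sketch. This is not the thesis's Two-Staged Clustering Algorithm, whose details the abstract does not give. It is a toy example under stated assumptions: one dataset-partitioning attribute (age), one similarity-measuring attribute (monthly spend), invented customer values, and a simple search for the age cut that leaves spending most similar within each side.

```python
from statistics import mean

# Hypothetical toy customers: (age, monthly_spend).
# Assumption: age is the dataset-partitioning attribute and
# monthly_spend is the similarity-measuring attribute.
customers = [(22, 310.0), (25, 290.0), (31, 300.0),
             (44, 905.0), (52, 880.0), (58, 915.0)]

def sse(values):
    """Sum of squared deviations from the mean (within-cluster scatter)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_age_split(rows):
    """Return (threshold, total_scatter) for the age cut that minimises
    the spending scatter summed over the two resulting clusters."""
    ages = sorted({age for age, _ in rows})
    best = None
    for lo, hi in zip(ages, ages[1:]):
        threshold = (lo + hi) / 2  # candidate cut between adjacent ages
        left = [spend for age, spend in rows if age <= threshold]
        right = [spend for age, spend in rows if age > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best

threshold, total = best_age_split(customers)
print(f"cluster A: age <= {threshold}; cluster B: age > {threshold}")
print(f"within-cluster spending scatter: {total:.1f}")
```

    The clusters here are described purely by age, yet the cut is chosen by how similar spending is inside each cluster, which is the separation of roles the abstract argues traditional algorithms lack.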
    Appears in Collections:[Graduate Institute of Information Management] Electronic Thesis & Dissertation
