Mining typical transactions from transaction databases


Name

Ya-Chun Yang (楊雅純)

Department

Department of Information Management

Thesis title

Mining typical transactions from transaction databases




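Abstract (translated from the Chinese)

With the arrival of the digital era, the surge in data volume has led to information overload. For busy senior executives, digesting large amounts of data in a short time and delivering the right marketing plan at the right time is difficult. To address information overload, this study proposes an algorithm for summarizing a transaction database: it finds the most representative transactions among a large body of data so as to reduce reading time. The aim is to support fast decision-making by letting senior executives use a small number of highly readable representative records to survey an entire online retail transaction database and quickly grasp the overall sales picture.
This study uses K-medoids, Balanced K-means, and a Genetic Algorithm to find the transaction records that best represent an online retail transaction database, and compares the total cost of the three. The total cost consists of the representative cost and the representative imbalance cost. Finally, the study expects the Genetic Algorithm to improve on the representation problem that arises with K-medoids, lowering the representative cost while also raising representativeness.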

Abstract (English)

With the arrival of the digital era, data volumes have grown explosively, causing information overload. For a senior manager, it is hard to digest so much data and make the right marketing decision at the right time. To address information overload, this research proposes an algorithm for transaction data reduction. It reduces the time spent searching for information by discovering the most representative records in a large data set. We expect it to help senior managers make decisions more efficiently. Using these representative records, they can survey the whole online retail transaction database and grasp the basic facts of overall sales in a short time.
This research adopts K-medoids, Balanced K-means, and a Genetic Algorithm to discover the most representative transactions in an online retail transaction database. We also compare the total cost of the three algorithms, where the total cost is composed of the representative cost and the representative imbalance cost. We propose that the Genetic Algorithm can alleviate the representation problem: it reduces the representative cost while also improving the representativeness of the selected data.
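The abstract names a total cost built from a representative cost and a representative imbalance cost, but this record does not give the formulas. The sketch below is only one plausible reading, under stated assumptions: transactions are item sets compared by Jaccard distance, the representative cost is the mean distance from each transaction to its nearest representative, the imbalance cost is the normalized deviation of group sizes from an even split, and the two are mixed by a hypothetical `weight` parameter. The function names are illustrative, and the exhaustive search stands in for K-medoids, Balanced K-means, and the Genetic Algorithm only to show what such a cost would reward.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty item sets count as identical."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def rcost_and_sizes(transactions, reps):
    """Assign each transaction to its nearest representative.

    Returns the summed distance to the chosen representatives and the
    number of transactions each representative ends up covering.
    """
    sizes = {r: 0 for r in reps}
    rcost = 0.0
    for t in transactions:
        nearest = min(reps, key=lambda r: jaccard_distance(t, transactions[r]))
        rcost += jaccard_distance(t, transactions[nearest])
        sizes[nearest] += 1
    return rcost, sizes


def final_cost(transactions, reps, weight=0.5):
    """Assumed total cost: weighted mix of coverage error and size imbalance."""
    n, k = len(transactions), len(reps)
    rcost, sizes = rcost_and_sizes(transactions, reps)
    # Imbalance: how far group sizes stray from the even share n / k.
    scost = sum(abs(size - n / k) for size in sizes.values())
    return weight * rcost / n + (1 - weight) * scost / n


def best_representatives(transactions, k, weight=0.5):
    """Exhaustively try every k-subset of indices; only viable for tiny inputs."""
    return min(
        combinations(range(len(transactions)), k),
        key=lambda reps: final_cost(transactions, reps, weight),
    )


# Toy data (invented for illustration): two shopping patterns, three baskets each.
transactions = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "bread", "butter"}),
    frozenset({"milk", "bread", "eggs"}),
    frozenset({"beer", "chips"}),
    frozenset({"beer", "chips", "salsa"}),
    frozenset({"beer", "nuts"}),
]

best = best_representatives(transactions, k=2)
print(best, [sorted(transactions[i]) for i in best])
```

On this toy data the lowest-cost pair takes one basket from each shopping pattern, and each representative covers three transactions. Picking both representatives from the same pattern is penalized twice: the uncovered pattern sits at distance 1 from both, and one representative absorbs most of the data.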

Keywords

★ K-medoids
★ Balanced K-means
★ Genetic Algorithm
★ Transaction Database

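Table of contents

Abstract (Chinese)
Abstract
Acknowledgments
List of figures
List of tables
1. Introduction
1.1 Research background
1.2 Research objectives
1.3 Research motivation: an example scenario
1.4 Research method: an illustrative example
2. Related literature
2.1 Summarization and sampling methods
2.1.1 Statistical methods
2.1.2 Generalization methods
2.2 Cluster analysis
2.2.1 Introduction to clustering methods
2.2.2 Applications of clustering methods
2.2.3 Extensions of clustering methods
2.3 Genetic algorithms
3. Research design
3.1 Research framework
3.2 Problem definition
3.2.1 Total representative cost (TRcost)
3.2.2 Total representative imbalance cost (TScost)
3.2.3 Total cost (FinalCost)
4. Algorithm design
4.1 K-medoids
4.2 Balanced K-means
4.3 Genetic Algorithm
4.4 Summary
5. Experiments
5.1 Data source
5.2 Evaluation metrics
5.3 Data processing
5.4 Experimental evaluation
5.4.1 Evaluation of the genetic algorithm experiment method
5.4.2 Time cost
5.4.3 Representative cost (Rcost)
5.4.4 Representative imbalance cost (Scost)
5.4.5 Total cost (FinalCost)
5.4.6 Total cost and execution time
6. Conclusions and future research
6.1 Conclusions
6.2 Research contributions
6.3 Research limitations and future research
7. References
8. Appendix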


Advisor

Yen-Liang Chen (陳彥良)

Approval date

2017-8-18
