

    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/80925


    Title: Subdata Selection: A- and I-optimalities
    Authors: 吳姿蓉;Wu, Zih-Rong
    Contributors: Graduate Institute of Statistics
    Keywords: Linear regression; Optimal design; Informative subdata selection; Particle swarm optimization
    Date: 2019-07-02
    Issue Date: 2019-09-03 15:16:56 (UTC+8)
    Publisher: National Central University (國立中央大學)
    Abstract: With advances in technology, the sizes of datasets are growing exponentially. Although computing power keeps improving, it is dwarfed by the phenomenal increase in data volume. Efficient statistical and computational tools for analyzing huge datasets are therefore urgently needed, so that one can extract the important information in the data at limited cost. Consider linear regression with n responses and p covariates. For n ≫ p, existing approaches draw random subsamples from the full data; however, even under the linear regression model, these methods remain computationally demanding. Wang et al. (2018) proposed an alternative called the information-based optimal subdata selection (IBOSS) method, whose idea is to select subdata of a small size that preserves most of the information in the full data. In this thesis, we adopt the A-optimality criterion, which seeks to minimize the average variance of the estimators of the regression coefficients, and the I-optimality criterion, which seeks to minimize the average prediction variance over the design space.
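    The two criteria in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the thesis's algorithm: it uses the extreme-value selection heuristic of the original IBOSS method (Wang et al., 2018) to pick subdata, then evaluates the A-criterion (trace of (X′X)⁻¹, proportional to the summed variances of the least-squares coefficient estimators) and the I-criterion (average prediction variance over sampled design points). All sizes, variable names, and the uniform design region are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical full data with n >> p (sizes chosen for illustration only).
    n, p, k = 10_000, 3, 100          # full-data size, covariates, subdata budget
    X_full = rng.normal(size=(n, p))

    # IBOSS-style heuristic (Wang et al., 2018): for each covariate, keep the
    # rows with the most extreme (smallest and largest) values of that covariate.
    r = k // (2 * p)                  # rows taken from each tail of each covariate
    chosen = set()
    for j in range(p):
        order = np.argsort(X_full[:, j])
        chosen.update(order[:r])      # r smallest values of covariate j
        chosen.update(order[-r:])     # r largest values of covariate j
    idx = np.fromiter(chosen, dtype=int)
    X_sub = X_full[idx]

    def a_optimality(X):
        """A-criterion: trace of (X'X)^{-1}, i.e. the sum of the variances
        of the least-squares coefficient estimators (up to sigma^2)."""
        return np.trace(np.linalg.inv(X.T @ X))

    def i_optimality(X, region):
        """I-criterion: average prediction variance x'(X'X)^{-1}x over a
        set of points `region` representing the design space."""
        M_inv = np.linalg.inv(X.T @ X)
        return np.mean(np.einsum("ij,jk,ik->i", region, M_inv, region))

    # Assumed design region: points sampled uniformly from [-1, 1]^p.
    region = rng.uniform(-1, 1, size=(1_000, p))
    rand_idx = rng.choice(n, size=len(idx), replace=False)

    print("A-value, extreme-value subdata:", a_optimality(X_sub))
    print("A-value, random subdata:      ", a_optimality(X_full[rand_idx]))
    print("I-value, extreme-value subdata:", i_optimality(X_sub, region))
    print("I-value, random subdata:      ", i_optimality(X_full[rand_idx], region))
    ```

    Smaller criterion values are better; because the extreme rows inflate the diagonal of X′X, the heuristic subdata typically attains lower A- and I-values than a random subsample of the same size. The thesis itself selects subdata directly under the A- and I-criteria (using particle swarm optimization, per the keywords), which this sketch does not reproduce.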
    Appears in Collections:[Graduate Institute of Statistics] Electronic Thesis & Dissertation

    Files in This Item:

    index.html (0 Kb, HTML)


    All items in NCUIR are protected by copyright, with all rights reserved.
