With advances in technology, the sizes of datasets are growing exponentially. Although computing power has also increased, it is dwarfed by the phenomenal growth in data volume. Efficient statistical and computational tools for analyzing massive datasets are therefore urgently needed, so that important information can be extracted from data at limited cost. Consider linear regression with n responses and p covariates. For the case n ≫ p, most existing investigations take random subsamples from the full data; however, for the linear regression model, these subsampling methods can still be computationally expensive. Wang et al. (2018) proposed an alternative approach called the information-based optimal subdata selection (IBOSS) method, whose idea is to select a subdata set of small size that preserves most of the information in the full data. In this thesis, we adopt the A-optimality criterion, which seeks to minimize the average variance of the estimators of the regression coefficients, and the I-optimality criterion, which seeks to minimize the average prediction variance over the design space.
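As a minimal illustrative sketch (not Wang et al.'s IBOSS algorithm itself), the two criteria mentioned above can be evaluated for any candidate subdata set: the A-criterion as the trace of the inverse information matrix, and the I-criterion as the prediction variance averaged over sample points from an assumed design region. All data, sizes, and the design region below are hypothetical choices for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 100_000, 5, 1_000          # full-data size, covariates, subdata size

# Hypothetical full data: n observations of p covariates
X_full = rng.normal(size=(n, p))

def a_criterion(X):
    """A-optimality objective: average variance of the coefficient
    estimators, proportional to trace((X'X)^{-1})."""
    return np.trace(np.linalg.inv(X.T @ X)) / X.shape[1]

def i_criterion(X, region):
    """I-optimality objective: average prediction variance
    x'(X'X)^{-1}x over points x in an assumed design region."""
    M_inv = np.linalg.inv(X.T @ X)
    return np.mean(np.einsum("ij,jk,ik->i", region, M_inv, region))

# Assumed design space: uniform points on [-1, 1]^p
region = rng.uniform(-1, 1, size=(10_000, p))

# Baseline comparison: a uniform random subsample of size k
idx = rng.choice(n, size=k, replace=False)
X_sub = X_full[idx]

# Smaller criterion values mean more retained information; any subdata
# set necessarily scores no better than the full data on either criterion.
print(a_criterion(X_sub), a_criterion(X_full))
print(i_criterion(X_sub, region), i_criterion(X_full, region))
```

An optimal subdata selection method then amounts to choosing the k rows of X_full that minimize one of these objectives, rather than sampling them uniformly as in the baseline above.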