Master's/Doctoral Thesis 107225021: Detailed Record




Author: 徐瑋辰 (Wei-Chern Hsu)   Department: Graduate Institute of Statistics
Thesis title: A Survival Tree based on Stabilized Univariate Score Tests with High Dimensional Covariates
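Access rights
  1. The electronic full text of this thesis is authorized for immediate open access.
  2. Electronic full texts that have reached open-access status are licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.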

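Abstract (translated from Chinese): In medical research, prognostic factors and their corresponding
prediction models have been widely used. Survival trees and survival forests are currently very
popular nonparametric methods for developing prediction models for survival data. They are highly
flexible and can reasonably detect interactions among certain variables without many model
assumptions. Moreover, through recursive binary splitting, a single survival tree can yield
multiple prognostic factors and divide the sample into multiple groups. In this thesis, we identify
why survival trees are difficult to implement with high-dimensional covariates and how to address
these difficulties. We also point out that the traditional logrank test used to detect the nodes of
a tree has fatal drawbacks. To solve these problems, we propose stabilized univariate score
statistics for finding the nodes of a tree. Furthermore, screening of high-dimensional covariates
and the associated decisions can be carried out without any iterative optimization, and certain
special computations improve efficiency. We also show that when the logrank test fails to provide
an adequate statistical decision, the proposed method can properly resolve the problem and produce
a survival tree with better predictive ability.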
Abstract (English): Prognostic factors and prediction models have been studied extensively in
medical research. Survival trees and forests are popular nonparametric tools for developing
prognostic models for survival data. They offer great flexibility and can automatically detect
certain types of interactions without the need to specify them beforehand. Moreover, a single tree
can naturally classify subjects into different groups according to their survival prognosis based on
their covariates. In this thesis, we point out the difficulty of fitting tree-based models with
high-dimensional covariates. We also point out that the traditional logrank tests for
detecting the nodes of a tree have fatal drawbacks. To overcome these difficulties, we
propose stabilized univariate score statistics to find the nodes of a tree. We show that the
high-dimensional score tests can be performed without any iteration or optimization, leading to
computationally efficient test procedures. We also show that the proposed method can resolve the
drawbacks of the logrank tests, leading to a highly precise tree. Simulation studies are performed
to compare the performance of the proposed method with that of the existing method. The lung cancer
dataset is analyzed for illustration.
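The abstract notes that univariate score tests need no iteration or optimization. The following minimal Python sketch illustrates why: under a Cox model, the score statistic evaluated at beta = 0 is a closed-form sum over risk sets, so one sweep over the sorted data per covariate is enough. This is the standard, unstabilized Cox score test (for a binary covariate without tied times it coincides with the logrank test), not the stabilized statistic proposed in the thesis, whose definition this record does not give. The function name `univariate_score_tests` and the toy data are illustrative assumptions.

```python
import math


def univariate_score_tests(times, events, X):
    """Return one Cox score Z-statistic (evaluated at beta = 0) per covariate.

    times  : list of observed times
    events : list of 0/1 indicators (1 = event, 0 = right-censored)
    X      : list of rows, one row of covariate values per subject
    Hypothetical helper for illustration; not the thesis's R implementation.
    """
    n = len(times)
    p = len(X[0])
    # Sweep from the largest time down so the risk set only grows.
    order = sorted(range(n), key=lambda i: -times[i])
    z_stats = []
    for j in range(p):
        s0 = s1 = s2 = 0.0      # risk-set count, sum of x, sum of x^2
        score = info = 0.0      # score U_j and its variance V_j at beta = 0
        k = 0
        while k < n:
            t = times[order[k]]
            m = k
            # Subjects sharing the same time enter the risk set together.
            while m < n and times[order[m]] == t:
                x = X[order[m]][j]
                s0 += 1.0
                s1 += x
                s2 += x * x
                m += 1
            mean = s1 / s0
            var = s2 / s0 - mean * mean
            # Each event contributes (x - risk-set mean) and the risk-set variance.
            for i in order[k:m]:
                if events[i]:
                    score += X[i][j] - mean
                    info += var
            k = m
        z_stats.append(score / math.sqrt(info) if info > 0 else 0.0)
    return z_stats


if __name__ == "__main__":
    # Toy data: the two subjects with x = 1 fail first.
    toy_times = [1.0, 2.0, 3.0, 4.0]
    toy_events = [1, 1, 1, 1]
    toy_X = [[1.0], [1.0], [0.0], [0.0]]
    print(univariate_score_tests(toy_times, toy_events, toy_X))  # 7/sqrt(17), about 1.698
```

In a tree-building context, a statistic of this kind would be computed for every covariate within a node and the covariate with the largest absolute value considered for the split; how the thesis stabilizes and thresholds the statistics is not specified in this record.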
Keywords (Chinese): ★ Right censoring
★ Tree
★ High-dimensional covariates
★ Gene sequence
Keywords (English): ★ Right censoring
★ Tree
★ High dimensional covariate
★ Gene selection
Table of contents
Abstract (Chinese)
Abstract
Acknowledgements
1. Introduction
2. Background
2.1 Problem Setup
2.2 Classification and Regression Tree
2.2.1 Introduction of Tree Algorithm
2.2.2 Splitting criterion
2.2.3 Stopping criterion
2.2.4 Logrank test
2.2.5 Score test
3. Proposed method
3.1 Univariate Score Test
3.2 Matrix-based computation
3.3 Survival tree algorithm
3.4 Prognostic Prediction
4. R package
4.1 uni.logrank
4.2 KM.split
4.3 uni.tree
4.4 feature.selected
4.5 risk.classification
5. Simulations
5.1 Simulation designs
5.2 Simulation result
6. Data analysis
6.1 The Lung Cancer data
6.2 Binary splitting
6.3 Survival tree
6.3.1 Logrank tree by uni.logrank() and uni.tree()
6.3.2 Modified score tree by uni.score() and uni.tree()
6.3.3 Conditional inference tree by ctree()
6.4 Analytic results
7. Conclusions
Reference
Appendix
Appendix A: Performance Evaluation
A1. Tree model and notation settings
A2. Evaluation index
A3. c-index
A4. Likelihood ratio test
Appendix B: Code for data analysis
Appendix C: Searching optimal threshold and constant d_0 to build an univariate tree for lung cancer data
Appendix D: Optimal the adjust P-value for ctree() for lung cancer data
Advisor: 江村剛志 (Takeshi Emura)   Approval date: 2020-7-30