Measurement of gene expression using microarrays has been an extremely important research tool in biology and medicine. However, poor reproducibility of array-based results remains a long-standing issue. Although the cause of the problem has not been firmly identified, platform design and test site were ruled out in a large-scale study by the MicroArray Quality Control project. In such measurements, two sources of error are coupled and difficult to quantify separately: prehybridization error (biological variance, or BV), introduced during sample processing (e.g. culture and treatment) and platform-specific sample preparation, and the inherent random error of the technology (technical variance, or TV). Increasing evidence points to BV as the primary cause, but the lack of a method for assessing BV keeps the experimentalist in constant doubt about data reliability. Here, we developed a procedure, Measuring Improper Sample Handling (MISH), as a solution to the problem and produced a computer package for its implementation. MISH is a novel, all-statistics procedure and does not require normalization. For demonstration, we applied MISH to study the BV in 350 public data sets. Part of the result may be taken as a characterization of the BV of the Affymetrix GeneChip Human Genome U133 Plus 2.0 Array platform. We found that BV was the dominant error in the data sets studied and that, for data sets from biological replicates, sample processing introduced the most error. Our analysis showed that a large number of public cohort data sets had low sensitivity on contrasts, which may well explain why studies of the same diseases yielded highly dissimilar lists of differentially expressed genes (DEGs). This suggests that the reproducibility issue will remain a concern for measurements based on next-generation sequencing, and for any future technology that does not focus on improving sample processing.