Protein Repeats Finder (蛋白質重複序列分析工具)

Author

Shr-Shiang Deng (鄧仕祥)

Department

Department of Computer Science and Information Engineering

Thesis title

Protein Repeats Finder
(蛋白質重複序列分析工具)

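This electronic thesis is authorized for immediate open access.
The open-access full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.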

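Abstract (Chinese)

Understanding the repeated sequences of a protein helps in analyzing its spatial arrangement, function, and structure, as well as its interactions with other proteins. This study develops an analysis tool for protein sequences. The tool finds tandem (consecutive) repeats and periodically repeated amino acids and, most importantly, analyzes similar repeats that are dispersed along a protein sequence. The tool, Protein Repeats Finder (PRF), can be used directly from a web page. We also analyzed 120,000 protein sequences from SWISS-PROT and built the results into a database that users can query online. For details on PRF, see URL: http://140.115.155.94.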

Abstract (English)

Repeated sequence patterns in proteins may be a mechanism that provides regular arrays of spatial and functional groups, useful for structural packing or for one-to-one interactions with target molecules. Many large proteins have evolved by internal duplication, and many internal sequence repeats correspond to functional and structural units.
In this study, we propose a protein repeat analysis tool that finds several kinds of protein repeats: tandem repeats, periodically conserved single amino acid repeats, and approximate repeats. We also provide a variety of statistics on the repeats we found. Protein Repeats Finder (PRF) finds these kinds of repeats in a given protein sequence, and users can run it online to query the repeats they need. The web site is available at URL: http://140.115.155.94.
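To make two of the repeat types named in the abstract concrete, the following is a minimal Python sketch of exact tandem repeats and periodically conserved single amino acid repeats. It is not the PRF algorithm, which this record does not describe in detail. The function names, parameters, and example sequences are hypothetical and chosen only for illustration, and approximate repeats are not covered.

```python
from typing import List, Tuple


def find_tandem_repeats(seq: str, min_unit: int = 1, max_unit: int = 10,
                        min_copies: int = 2) -> List[Tuple[int, str, int]]:
    """Return (start, unit, copies) for exact tandem repeats in seq.

    Illustrative brute-force scan (assumption: not the method used by PRF).
    A run such as "AAAA" is reported once per unit length that fits it.
    """
    hits = []
    n = len(seq)
    for unit_len in range(min_unit, max_unit + 1):
        i = 0
        while i + unit_len * min_copies <= n:
            unit = seq[i:i + unit_len]
            copies = 1
            # Count how many times the unit repeats back to back.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                hits.append((i, unit, copies))
                i += copies * unit_len  # jump past the reported run
            else:
                i += 1
    return hits


def find_periodic_residues(seq: str, period: int,
                           min_count: int = 3) -> List[Tuple[int, str, int]]:
    """Return (start, residue, count) where one residue recurs every `period` positions."""
    hits = []
    n = len(seq)
    for start in range(n):
        # Skip positions that merely continue a run starting earlier.
        if start >= period and seq[start - period] == seq[start]:
            continue
        count = 1
        pos = start + period
        while pos < n and seq[pos] == seq[start]:
            count += 1
            pos += period
        if count >= min_count:
            hits.append((start, seq[start], count))
    return hits


if __name__ == "__main__":
    # Made-up sequences, not taken from SWISS-PROT.
    print(find_tandem_repeats("MKPQAPQAPQAT", min_unit=3, max_unit=3))
    print(find_periodic_residues("LAAAAAALBBBBBBLCCCCCCL", period=7))
```

The first call reports the unit `PQA` repeated three times starting at index 2. The second reports a leucine recurring every seven positions, the kind of spacing seen in heptad patterns. A practical tool would also need to tolerate substitutions and gaps, which is where the approximate repeats mentioned in the abstract come in.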

Keywords (Chinese)

★ protein
★ repeated sequences

Keywords (English)

★ repeats
★ protein

Table of Contents

Contents
Chapter 1 Introduction
1.1 Motivation
1.2 Goal
Chapter 2 Related Work
2.1 PAM (Percent/Point Accepted Mutation)
2.2 Swiss-Prot
2.3 CLUSTAL W
2.4 Suffix Array
2.5 TRIPS
2.6 Radar
2.7 Dotlet
Chapter 3 Materials and Methods
3.1 Data Sets
3.2 Development Environment
3.3 Main Ideas
3.4 Approach
(i) Exact repeats
(ii) Maximal repeat
(iii) Seeds
(iv) L-extend
(v) R-extend
(vi) LR-extend
(vii) Relation Matrix
3.5 Algorithm
3.6 Database Schema
Chapter 4 Results
4.1 Statistics of Different Protein Repeats
A. Protein Length Distribution
B. Approximate Repeats Distribution
C. Approximate Repeats Length Distribution
D. Ratio of (Repeats/Protein Length)
E. Ratio of (Length of Repeats/Protein Length)
4.2 A Database of Repetitive Elements in Proteins
4.3 Query Interface
Chapter 5 Discussion
5.1 Relations of Structure
5.2 A Comparison of Different Tools
5.3 Summary
5.4 Future Work
References

References

Adebiyi, E. F., T. Jiang, et al. (2001). "An efficient algorithm for finding short approximate non-tandem repeats." Bioinformatics 17 Suppl 1: S5-S12.
Andrade, M. A., C. P. Ponting, et al. (2000). "Homology-based method for identification of protein repeats using statistical significance estimates." J Mol Biol 298(3): 521-37.
Batchelor, A. H., D. E. Piper, et al. (1998). "The structure of GABPalpha/beta: an ETS domain- ankyrin repeat heterodimer bound to DNA." Science 279(5353): 1037-41.
Delcher, A. L., S. Kasif, et al. (1999). "Alignment of whole genomes." Nucleic Acids Res 27(11): 2369-76.
Gusfield, D. (1997). Algorithms on Strings, Trees and Sequences.
Heger, A. and L. Holm (2000). "Rapid automatic detection and alignment of repeats in protein sequences." Proteins 41(2): 224-37.
Henikoff, S. and J. G. Henikoff (1992). "Amino acid substitution matrices from protein blocks." Proc Natl Acad Sci U S A 89(22): 10915-9.
Junier, T. and M. Pagni (2000). "Dotlet: diagonal plots in a web browser." Bioinformatics 16(2): 178-9.
Kajava, A. V. (1998). "Structural diversity of leucine-rich repeat proteins." J Mol Biol 277(3): 519-27.
Kajava, A. V. (2001). "Review: proteins with repeated sequence--structural prediction and modeling." J Struct Biol 134(2-3): 132-44.
Katti, M. V., R. Sami-Subbu, et al. (2000). "Amino acid repeat patterns in protein sequences: their diversity and structural-functional implications." Protein Sci 9(6): 1203-9.
Kobe, B. and J. Deisenhofer (1995). "A structural basis of the interactions between leucine-rich repeats and protein ligands." Nature 374(6518): 183-6.
Kobe, B. and A. V. Kajava (2001). "The leucine-rich repeat as a protein recognition motif." Curr Opin Struct Biol 11(6): 725-32.
Kohl, A., H. K. Binz, et al. (2003). "Designed to be stable: crystal structure of a consensus ankyrin repeat protein." Proc Natl Acad Sci U S A 100(4): 1700-5.
Kurtz, S., E. Ohlebusch, et al. (2000). "Computation and visualization of degenerate repeats in complete genomes." Proc Int Conf Intell Syst Mol Biol 8: 228-38.
Kurtz, S. and C. Schleiermacher (1999). "REPuter: fast computation of maximal repeats in complete genomes." Bioinformatics 15(5): 426-7.
Lux, S. E., K. M. John, et al. (1990). "Analysis of cDNA for human erythrocyte ankyrin indicates a repeated structure with homology to tissue-differentiation and cell-cycle control proteins." Nature 344(6261): 36-42.
Myers, E. (1994). "A sub-linear algorithm for approximate keyword matching." Algorithmica 12(4-5): 345-347.
Notredame, C. (2001). "Mocca: semi-automatic method for domain hunting." Bioinformatics 17(4): 373-4.
Pellegrini, M., E. M. Marcotte, et al. (1999). "A fast algorithm for genome-wide analysis of proteins with repeated sequences." Proteins 35(4): 440-6.
Sagot, M.-F. (1998). "Spelling approximate repeated or common motifs using a suffix tree." LNCS 1380: 111-127.
Sedgwick, S. G. and S. J. Smerdon (1999). "The ankyrin repeat: a diversity of interactions on a common structural framework." Trends Biochem Sci 24(8): 311-6.
Thompson, J. D., D. G. Higgins, et al. (1994). "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice." Nucleic Acids Res 22(22): 4673-80.
TIGER (1999). "Repeat-finder." [http://www.tigre.org/tdb/rice/repeatinfo-MUMmer.shtml].
Ukkonen, E. (1985). "Algorithms for approximate string matching." Information and Control 64: 100-118.
Walker, R. G., A. T. Willingham, et al. (2000). "A Drosophila mechanosensory transduction channel." Science 287(5461): 2229-34.

Advisor

Jorng-Tzong Horng (洪炯宗)

Approval date

2003-7-11
