Wednesday, January 19, 2011

Bioinformatics · A Concept-Based Introduction






Contents
1 Introduction to Biological Systems
Claude-Henry Volmar, Nikunj Patel, Amita N. Quadros, Daniel Paris, Venkatarajan S. Mathura and Michael Mullan
1. Molecules of Life
2. Nucleic Acids: DNA Versus RNA
3. Understanding Proteins: Sequence–Structure–Function
4. Biological Systems, Signals, and Pathways
5. Technological Advances and Their Benefits to Biology
6. The Role of Bioinformatics in Big Picture
7. Exercises
References
2 Computer Programming Fundamentals and Concepts
Deepak N. Kolippakkam, Pankaj Gupta and Venkatarajan S. Mathura
1. Purpose
2. Learning Objective
3. Perl Programming
3.1 Variables
3.2 Operators
3.3 Control Structures
3.4 Regular Expressions
3.5 File Handling
3.6 Subroutines and Functions
4. PHP Programming
4.1 Language Syntax and Data Types
4.2 Creating Web Interfaces
5. Basic RDBMS and SQL
5.1 Data Definition Language (DDL)
5.2 Data Manipulation Language (DML)
5.3 Data Control Language (DCL)
6. Web-Pointers
3 Introduction to Algorithms
Senthilkumar Radhakrishnan, Deepak Kolippakkam and Venkatarajan S. Mathura
1. Introduction
1.1 Classification
1.2 Hypothesis Testing
1.3 Decision Tree
1.4 Clustering
1.5 Principal Component Analysis
1.6 Multidimensional Scaling
1.7 Regression Analysis
1.8 Linear Discriminant Analysis
1.9 Fuzzy Logic
1.10 Pattern Recognition
1.11 Bayesian Statistics
1.12 Neural Networks
1.13 Hidden Markov Model
1.14 Support Vector Machines
2. Exercises
3. Useful Web-Pointers
References
4 Biological Sequence Databases
Meena Sakharkar, Pandjassarame Kangueane and Venkatarajan S. Mathura
1. Purpose
2. Learning Objective
3. Introduction
3.1 Genomic Sequence Databases – GenBank, EMBL, DDBJ
3.2 Protein Sequence Databases
3.3 Secondary Databases on Molecular Evolution
References
5 Biological Sequence Search and Analysis
Venkatarajan S. Mathura
1. Purpose
2. Learning Objectives
3. Introduction
3.1 Similarity Matrices and Alignment
3.2 Sequence Search and Pair-Wise Alignment
3.3 Global Alignment Using Needleman-Wunsch Algorithm
3.4 Sequence Search Tools
3.5 Pair-Wise and Multiple-Sequence Alignment Tools
3.6 Sequence Motifs
References
6 Protein Structure Prediction
Hongyi Zhou, Yaoqi Zhou and Venkatarajan S. Mathura
1. Introduction
2. Secondary Structure Prediction
3. Comparative Modeling
3.1 Steps Involved in Comparative Modeling
3.2 Homologous Sequence Search Using Sequence Comparison Tools
3.3 Identifying Remote Templates Using Fold-Recognition Methods
3.4 Selection of the Alignment
3.5 Construction of 3D Models Using Modeling Programs
3.6 Protein Modeling Package – MPACK
3.7 SP3 – A Web-Based Structure-Prediction Tool Using Known Protein Structures as Templates
3.8 Modeling Servers
3.9 Critical Assessment of Structure Prediction
3.10 Objective Testing of Modeling Tools in CASP
References
7 Protein-Protein Interaction and Macromolecular Visualization
Arun Ramani, Venkatarajan S. Mathura, Cui Zhanhua and Pandjassarame Kangueane
1. Introduction
2. Experimental Methods
2.1 Yeast Two-Hybrid
2.2 Affinity Tagging
2.3 Computational Methods
2.4 Co-evolution
2.5 Structure Based Methods
3. Protein Structure Visualization
4. Databases
References
8 Genes, Genomics, Microarray Methods and Analysis
Ghania Ait-Ghezala and Venkatarajan S. Mathura
1. Introduction
2. Gene Identification and Characterization
2.1 Identifying Human Genes and Cloning
3. Microarray Experiments
3.1 Microarray Databases
3.2 Gene Annotations, Ontology, and Pathway Databases
References
9 Introduction to Proteomics
Fai Poon and Venkatarajan S. Mathura
1. Introduction
2. Sample Preparation
3. Two-Dimensional (2D) Gel Electrophoresis
3.1 Image Analysis and Statistical Analysis
3.2 In-Gel Digestion and Mass Spectrometry
4. Mass Spectrometry
4.1 Mass Spectrometry in Proteomics
5. Bioinformatics Applications for Identification
6. Conclusion
References
10 Biomedical Literature Mining
Chaolin Zhang and Michael Q. Zhang
1. Introduction
2. Literature Sources for Mining
3. Recognition of Biological Terms
3.1 Gene/Protein Name Recognition
3.2 Removing Gene/Protein Name Ambiguities
3.3 Collecting Other Keywords
4. Mining Biological Relationships
4.1 Detecting Gene Interactions by Co-occurrence
4.2 Inferring Implicit Relationships
4.3 Identifying Sub-networks of Communities
4.4 Evaluating Functional Coherence of Gene Group
5. Acknowledgments
References
11 Computational Immunology: HLA-Peptide Binding Prediction
Pandjassarame Kangueane, Bing Zhao and Meena K. Sakharkar
1. Background
2. HLA Molecules
3. HLA Binding Peptide Based Methods
3.1 Sequence Based Prediction Models
3.2 Molecular Structure Based Predictions
4. Conclusion
References
12 Bioinformatics Application: Eukaryotic Gene Count and Evolution
Meena K. Sakharkar and Pandjassarame Kangueane
1. Introduction
2. Methodology
2.1 Identification of SEG
2.2 Identification of MEG
2.3 Pseudogenes
2.4 Caveats
2.5 Total Genes
3. Results and Discussion
3.1 Utility of SEG and MEG Sequences to the Study of Evolution
3.2 Selection of SEG and MEG in Different Eukaryotic Genomes
3.3 Mechanism of SEG Origin
4. Conclusion
References
13 Bioinformatics Application: Predicting Protein Subcellular Localization by Applying Machine Learning
Pingzhao Hu, Clement Chung, Hui Jiang and Andrew Emili
1. Introduction
2. Methods
2.1 Data Sets and Preprocessing
2.2 Learning Algorithm
2.3 Evaluating Performance of the Learning Algorithm
2.4 Strategy for Multi-class/Multi-label Classification
2.5 Optimal Sampling Methods for Imbalanced Data Sets
2.6 Algorithm of Asymmetric Bagging Strategy
3. Results
4. Discussion
References
14 Bioinformatics Analysis: Gene Fusion
Meena Kishore Sakharkar, Yiting Yu and Pandjassarame Kangueane
1. Introduction
2. Identification of Fusion Proteins
2.1 Human Fusion Proteins Mimicking Bacterial Operons
2.2 Human Fusion Proteins Simulating Bacterial Subunit Interfaces
2.3 Fusion Proteins Exhibiting Multiple Functions
2.4 Fusion Proteins Showing Alternative Splicing
3. Remarks on Fusion Proteins
References
Index


