BHARATHIAR UNIVERSITY-COIMBATORE-641046
M.Phil./Ph.D. - BIOINFORMATICS
PART I - SYLLABUS
(For candidates admitted from the academic year 2018-19 onwards)
PAPER I - RESEARCH METHODOLOGY
UNIT I:
Research Methodology: Introduction; meaning of research; objectives of research; types of research; research approaches; significance of research; research methods versus methodology; research and scientific method; importance of knowing how research is done; research process; criteria of good research; problems encountered by researchers in India. Defining the research problem: what is a research problem; selecting the problem; techniques involved in defining a problem. Research design: need for research design; features of a good design; important concepts relating to research design; different research designs; basic principles of experimental designs.
UNIT II:
Hypothesis testing: What is a hypothesis? Basic concepts concerning testing of hypotheses; procedure for hypothesis testing; probability; Markov models and hidden Markov models; probability distributions: binomial, Poisson, and normal; multiple testing methods; ANOVA; tests of significance: t-test and F-test.
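As an illustration of the tests of significance listed in this unit, the following is a minimal sketch of the two-sample Student's t statistic, assuming equal variances in both groups. The function name `pooled_t_statistic` and the sample values are hypothetical, chosen only for this example; they do not come from the prescribed texts.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for a two-sample Student's t-test.

    Assumes both samples come from populations with equal variance.
    """
    na, nb = len(a), len(b)
    # Pooled estimate of the common variance (sample variances use n - 1).
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    # Difference in means divided by its standard error.
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical measurements for two groups.
t_value, dof = pooled_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_value, dof)
```

The resulting t value would then be compared against the t distribution with the returned degrees of freedom to decide whether to reject the null hypothesis of equal means.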
UNIT III:
Interpretation and report writing: meaning of interpretation; techniques of interpretation; precautions in interpretation; significance of report writing; different steps in report writing; layout of the research project; types of reports; oral presentation; mechanics of writing a research project; precautions for writing research reports; conclusions.
UNIT IV:
Elements of C programming: features of C; variables; constants; keywords; data types; operators; statements; loops, with simple programs using loops; arrays: integer arrays, character arrays, simple programs using arrays; introduction to functions, with simple programs using functions; introduction to pointers and structures; string manipulation using pointers and arrays; files: defining and opening a file, closing a file, input/output operations on files. Perl: basic syntax; I/O; variables, strings, and arrays; control structures; regular expressions; simple programs.
UNIT V:
Algorithms in computer science inspired by biology: genetic algorithms, neural networks, and path optimization.
References:
1. C. R. Kothari (2004). Research Methodology: Methods and Techniques. New Age
International (P) Ltd.
2. E. Balagurusamy. Programming in ANSI C. Tata McGraw-Hill.
3. Randal L. Schwartz and Tom Phoenix. Learning Perl, Third Edition.
PAPER-II – ADVANCES IN BIOINFORMATICS
UNIT I:
High-throughput genome sequencing and genome assembly; gene finding algorithms; DNA microarrays and large gene expression data sets; clustering algorithms.
UNIT II:
Protein and nucleic acid sequence alignments; sequence databases; use of the BLAST algorithm; multiple sequence alignments.
UNIT III:
Protein Structure Analysis; Protein structure databases; Protein Structure comparison; Fold Recognition; 3D-1D Profiles; Threading; Comparative Structure Modeling
UNIT IV:
Phylogeny (evolutionary trees); biological networks; pathway analysis
UNIT V:
Emerging new ideas on treating biological systems; Pharmacogenomics and its applications; SNPs and their applications
References:
1. Andreas D. Baxevanis and B. F. Francis Ouellette (2001). Bioinformatics: A Practical Guide to
the Analysis of Genes and Proteins. John Wiley & Sons, Inc.
2. David W. Mount (2003). Bioinformatics: Sequence and Genome Analysis. CBS Publishers.
3. Ian Korf, Mark Yandell and Joseph Bedell (2003). BLAST. O'Reilly, SPD Pvt. Ltd.
PAPER III – 1. BIOLOGICAL DATABASES, DATA MINING, AND GENOMIC DATA ANALYSIS
UNIT I:
Biological database - Database browsers and search engines; Sequence databases; Microarray databases; Other specialized databases: Interaction databases-KEGG and STRING; Expression Databases - SRA and GEO.
UNIT II:
Data mining definition – Classification and clustering of data – Association rules – Data visualization.
UNIT III:
Introduction to Microarrays - Oligonucleotide and Spotted cDNA arrays - Design considerations for microarray experiments – Goals of a microarray experiment. Use of array analysis programs – SAM - TIGR programs – MEV.
UNIT IV:
Introduction to NGS and its methodology, Data analysis workflows - Reference-based and de novo assembly, NGS Platforms, Types of NGS - DNA-Seq, RNA-Seq, ChIP-Seq.
UNIT V:
Basic research applications with microarrays, Microarrays and Cancer, Applications of NGS in crop improvement and development.
References:
1. Analysis of DNA Microarray Data by Steen Knudsen.
2. Discovering Genomics, Proteomics, and Bioinformatics by A.M. Campbell and L.J.
Heyer.
3. Next-generation genome sequencing: Towards Personalized Medicine by Michal Janitz,
Wiley-VCH, 2008
PAPER III - 2. COMPUTATIONAL BIOLOGY METHODS AND TOOLS
UNIT I: Data Mining and Sequence Analysis
Biological background for sequence analysis.
Searching databases for sequences similar to a new sequence.
Identification of protein primary sequence from DNA sequence.
Calculation of sequence alignments for evolutionary inferences and to aid in
structural and functional analysis.
UNIT II: Similarity Searches & Construction of Phylogenetic Guide Tree
Distance and similarity.
The evolutionary basis for sequence alignment.
Substitution scores and gap penalties.
Optimal alignment method.
Database similarity searching.
FASTA and BLAST.
Conclusion and internet software availability.
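To make the topics of substitution scores, gap penalties, and optimal alignment concrete, the sketch below computes a global alignment score by dynamic programming in the style of the Needleman-Wunsch method. It is a simplified illustration under stated assumptions: the scoring values (match +1, mismatch -1, linear gap penalty -2) and the function name `global_alignment_score` are arbitrary choices for this example, not values taken from the prescribed texts or from any particular tool.

```python
def global_alignment_score(s, t, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences s and t.

    Uses a simple match/mismatch substitution score and a linear gap
    penalty (assumed values; real tools use matrices such as BLOSUM).
    """
    rows, cols = len(s) + 1, len(t) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and row: aligning a prefix against only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: substitution, gap in t, gap in s.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))
```

Heuristic database search programs such as FASTA and BLAST avoid filling the full table for every database sequence, which is why they are fast but not guaranteed to return the optimal alignment.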
UNIT III: Practical Aspect of Multiple Sequence Alignment
Introduction.
MultAlin.
BLOCKS.
MOST.
Probe.
MacBoxshade.
UNIT IV: Phylogenetic Analysis
Introduction.
Phylogenetic tree building methods.
Multiple tree alignment procedures.
Searching for trees.
Evaluating trees and data.
Phylogenetic software.
Internet resources.
UNIT V: Predictive Methods Using Protein and Nucleic Acid Sequences
Introduction.
Detecting functional sites in DNA.
Internet tools for identification of protein coding genes.
Internet resources for repeat analysis.
Predictive methods using protein sequences.
AACompIdent and AACompSim.
Secondary structures and folding classes.
nnPredict, PredictProtein, SSPRED, SOPMA.
Tertiary structures.
References:
1. Computer methods for macromolecular sequence analysis. Doolittle R.F (Ed.).
Academic Press, San Diego (1996).
2. Introduction to Bioinformatics. Teresa K. Attwood and David J. Parry-Smith.
3. Bioinformatics-concepts, skills, applications. S.C. Rastogi, Namita Mendiratta,
Parag Rastogi.
4. Bioinformatics – A practical approach 2004. K. Mani and N. Vijayaraj. Aparna
publications.
5. Handbook of computational Molecular Biology. Edited by Srinivas Aluru.
Chapman and Hall 2006.
6. Computational Methods in Molecular Biology Edited by S. Salzberg, D. Searls,
and S. Kasif. Elsevier Science, 1998.
7. Sequence and Genome Analysis. By David W. Mount Published 2004 CSHL
Press Science.
8. Trends in Bioinformatics. By Dr. P. Shanmughavel. 2006 Pointer publishers,
Jaipur, India.
9. Principles of Bioinformatics. By Dr. P. Shanmughavel. 2005 Pointer publishers, Jaipur,
India.
PAPER III – 3. STRUCTURAL BIOLOGY AND BIOPHYSICS
UNIT I: Basics
Fundamentals of proteins, carbohydrates and Nucleic Acids, Classification of proteins, Helix, Sheet, Strand, Loop and Coil, Active site, Class and Domains, Fold, motif, Profile, Protein stability, protein folding.
UNIT II: Structural classification of proteins:
Understanding various structures of proteins: globular, fibrous, and membrane proteins. Functional classification of proteins: cell surface receptors, GPCRs, kinases, channel proteins, ubiquitin.
UNIT III: Structure Prediction
Protein sequencing; Secondary structure prediction tools and methods, tertiary structure prediction tools and methods; Structure alignment, validation, refinement, prediction; protein-protein interactions.
UNIT IV: Scope and methods of Biophysics
Basics of X-rays, crystals and symmetry; X-ray crystallography, nuclear magnetic resonance, UV spectrophotometry, electron microscopy, cryo-electron microscopy, atomic force microscopy, MALDI-TOF, mass spectrometry, synchrotron radiation and its uses, protein and DNA microarrays.
UNIT V: Databases
Protein sequence databases; Structure databases (CATH, SCOP, FSSP, MMDB, PDB, MPDB, TMPDB, SARF); Docking, QSAR, Drug Discovery, Intellectual Property Rights.
References:
1. Outline of Crystallography for Biologists- David Blow
2. Principles of Proteomics - R.M.Twyman
3. Structural Biology of Membrane Proteins – Reinhard Grisshammer and Susan K
Buchanan
4. Proteins Structures and Molecular Properties - Thomas. E. Creighton
5. Biophysical Chemistry Part II Techniques for the study of biological structure and
function - Cantor and Schimmel
6. Foundations of Structural Biology - Leonard Banaszak
7. Structural Bioinformatics - Philip E. Bourne
8. Textbook of Biochemistry - Thomas M Devlin
PAPER III – 4. COMPREHENSIVE ANALYSIS IN BIOCHEMATICS
UNIT I: Genome analysis
Isolation of genomic and organelle DNA from prokaryotes and eukaryotes. Mapping and sequencing genes, electrophoretic karyotyping, construction and screening of genomic DNA libraries. Functional genomics: sequence-based and microarray-based approaches, in silico vector construction.
UNIT II: Techniques for Isolation and Purification of Protein & Bio-active compounds
Extraction (Soxhlet and cold percolation), isolation of alkaloids and flavonoids, protein extraction from microorganisms, plants and animals. Purification: hanging drop, native gel, chromatographic methods (column, preparative TLC, HPLC, HPTLC, ion exchange, gel filtration, affinity), crystallization.
UNIT III: Structure elucidation of Protein and Bioactive compounds
Crystal studies, IR, NMR, mass spectrometry, CHN analysis, X-ray diffraction, 2-D electrophoresis, protein microarray. Tools used for protein structure prediction: BLAST, PDB, SWISS-MODEL, MODELLER, PSIPRED, JPRED; Structure validation: SAVS; Motif databases: BLOCKS, PROSITE, PFAM, COG.
UNIT IV: Metabolomics and Evolutionary Biology
Analyzing databases for metabolic pathways (WIT, KEGG, PathDB, PathCase); Reconstruction of metabolic pathways (BioCyc, ASGARD); Metabolic and cellular simulation: Gepasi, Virtual Cell; Tools for phylogenetic analysis: CLUSTALW, PHYLIP, MEGA.
UNIT V: Molecular Interaction and Docking
Determination of active sites and hot spots, receptor-ligand interactions, pharmacophore identification (Catalyst, DISCO, GASP), de novo drug design (GroupBuild, GenStar). Tools used for docking (AutoDock, FlexX, Glide).
References:
1. Sujata V. Bhat, Bhimsen A. Nagasampagi and Meenakshi Sivakumar. Chemistry
of Natural Products, Narosa Publishing House.
2. Daniel M. Bollag, Michael D. Rozycki and Stuart J. Edelstein. Protein Methods.
Wiley-Liss. A John Wiley & Sons, INC, Publications.
3. Mount, David W. Sequence and Genome Analysis. Cold Spring Harbor
Laboratory Press Publications.
4. S. B. Primrose and R. M. Twyman. Principles of Gene Manipulation & Genomics.
Blackwell Publishing.
5. Cynthia Gibas and Per Jambeck. Developing Bioinformatics Computer Skills.
O'Reilly and Associates.
6. Jin Xiong. Essential Bioinformatics. Cambridge University Press.
7. Thomas Lengauer (Ed). Bioinformatics - From Genomes to Drugs, Volumes I and II.
Wiley-VCH, Germany.
PAPER III – 5. ESSENTIALS OF BIOPROGRAMMING, BIOPHYSICS AND CADD
UNIT I: UNIX/LINUX Operating System
UNIX – Introduction - Text processing - UNIX file system and related Commands - types
of files - Commands and Operation of UNIX - UNIX filenames and file protections - UNIX
commands for working with directories - Repeating functions loops and IF statements - Different
File Editors - Mastering the special features of the UNIX system - Advanced Unix commands -
Configuring services in Unix - Introduction to Linux - System Processes - User Management -
Types of users, Creating users- Granting Rights - File Quota, File-System Management and
Layout - Login Process - Linux shells (bash and tcsh) - Shell Programming - Networking on Linux
- Printing and print sharing - ftp service, http service.
UNIT II: Perl for Bioinformatics
Variables and Operators in Perl - Scalar Variables - Array Operations and Functions -
Hash Functions - Perl Subroutines - File handling functions - Perl Regular Expressions – Pattern
Matching and String Manipulations - Common Gateway Interface (CGI) - Perl and the Web –
Perl development with Eclipse - General BioPerl Classes - Sequences - Features and Location
Classes (Extracting CDS) - Alignments (AlignIO) - Analysis (Blast, Genscan) - Databases
(Database Classes, Accessing a local database) – References and Complex data structures
UNIT III: R Programming and Matlab
How R works - Data with R - Objects – File operations - Operators - The data editor –
Useful R functions - Graphics with R – Packages in R - Statistical analyses with R - A simple
example of analysis of variance - Packages - Programming with R in practice – Bioconductor -
Loops and vectorization - Writing a program in R - Writing own functions - The Matlab interface
- Writing in script files - Importing data – Plotting - Using in-built functions - Creating your own
functions - Basic programming in Matlab (including for loops) – Case study with biological
examples
UNIT IV: Basic Concepts of Molecular Mechanics and Simulation
Empirical force field - Energy minimization: steepest descent, conjugate gradient - Derivative and non-derivative minimization methods - Simulation methods: Newton's equation of motion, equilibrium point, radial distribution function, pair correlation functions, MD methodology, periodic box, solvent access, equilibration, cutoffs, algorithm for time dependence - Uses in drug designing, ligand-protein interactions - Various methods of MD, Monte Carlo, systematic and random search methods - Differences between MD and MC: energy, pressure, temperature, temperature dynamics - Simulation software.
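As a small illustration of energy minimization by steepest descent, the sketch below minimizes a one-dimensional harmonic bond-stretch energy, E(r) = 0.5 * k * (r - r0)^2, which is one typical term of an empirical force field. The force constant, equilibrium length, step size, and the function name `steepest_descent` are assumed values chosen for this example only; production simulation software works on full multi-atom force fields and uses adaptive step sizes.

```python
def steepest_descent(r, k=100.0, r0=1.5, step=0.005, tol=1e-8, max_iter=10000):
    """Minimize E(r) = 0.5 * k * (r - r0)**2 by steepest descent.

    Repeatedly moves r a small step against the energy gradient until
    the gradient magnitude falls below tol. All parameters are
    illustrative assumptions, not values from a real force field.
    """
    for _ in range(max_iter):
        gradient = k * (r - r0)   # dE/dr for the harmonic term
        if abs(gradient) < tol:   # converged: force is effectively zero
            break
        r -= step * gradient      # step downhill along the negative gradient
    return r


# Start from a stretched bond length and relax toward r0.
print(steepest_descent(2.0))
```

Conjugate gradient methods use the same gradient information but choose search directions that avoid the repeated zig-zag steps steepest descent takes in narrow energy valleys.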
UNIT V: Computer-aided drug discovery (CADD) concepts
Discovery and design of new drugs, computer representation of molecules, 3D database searching, conformation searches, deriving and using the 3D pharmacophore: keys, constrained systematic search, clique detection techniques, maximum likelihood method, molecular docking, scoring functions, structure-based de novo ligand design, quantitative structure-activity relationships (QSAR), QSPR methodology, various descriptors (including quantum chemical), use of genetic algorithms, neural networks and principal component analysis in QSAR equations, combinatorial libraries, design of "drug-like" libraries.
References:
1. Matthew, N., & Stones, R. (2004). Beginning Linux Programming (Third Edition).
2. Thomas, R. (1985). A User Guide to the Unix system. Osborne McGraw-Hill.
3. Tisdall, J. (2001). Beginning Perl for Bioinformatics. O'Reilly Media, Inc.
4. Schwartz, R. L., Phoenix, T., & Foy, B. D. (2008). Learning Perl (5th ed.).
5. Wall, L., Christiansen, T., & Orwant, J. (2000). Programming Perl (3rd ed.). ISBN
978-0596000271.
6. Paradis, E. (2002). R for Beginners.
7. Coghlan, A. (2011). Little book of R for Bioinformatics.
8. Singh, G. B. (2015). Fundamentals of Bioinformatics and Computational Biology.
Springer International Publishing.
9. Leach, A. R. (2001). Molecular Modelling: Principles and Applications (2nd ed.). Pearson Education Limited.
10. Schlick, T. (2010). Molecular Modeling and Simulation: An Interdisciplinary Guide:
(Vol. 21). Springer Science & Business Media.
11. Lednicer, D. (2009). Strategies for Organic Drug Synthesis and Design. John Wiley &
Sons.
12. Gordon, E. M., & Kerwin, J. F. (Eds.). (1998). Combinatorial Chemistry and Molecular
Diversity in Drug Discovery. Wiley-Liss.
PAPER III – 6. GENOMIC DATA AND NEXT-GENERATION SEQUENCING
UNIT I: Foundations of Big Data Systems
Introduction to Big Data and its Applications, Data Abstraction, Linear data structures
(Hashtables, Hashmaps & Bloom Filters), Non-linear data structures (Binary Search Trees & KD
Trees), Distributed Algorithm Design, and Algorithm Design using MapReduce.
UNIT II: Small RNA/MicroRNA
Introduction to miRNA, Regulation of miRNA, Importance of miRNA, Discovery of
miRNA, Discovery of RNA interference, miRNA biogenesis, miRNA targets, miRNA evolution,
Application of RNA interference, miRNA sequencing, miRBase, miRDB and miRFinder.
UNIT III: Essential Wet Lab Techniques for NGS
Preparation of Nucleic Acid Templates - Illumina MiSeq, DNA quantification, DNA
quality assessment, Tagmentation & Tagmentation reaction cleanup, Library quality assessment,
Examining qPCR results, Pooling DNA libraries, Load MiSeq machine.
UNIT IV: Essential Computer Lab Techniques for NGS
Unix/Linux - Command line navigation, Directory creation, File movement, Text editing,
Piping, Sequence data file formats and Manipulation, Assembly, Comparison and alignment.
Mapping MiSeq Reads, Variant calling; Genome browsers and analysing MiSeq Runs, BLAST
from the command line, Gene finding programs (Maker, AUGUSTUS and SNAP).
UNIT V: Advances and Applications of NGS
Complete genome resequencing, Reduced representation sequencing, Targeted genomic
resequencing, Paired end sequencing, Metagenomics sequencing, Transcriptome sequencing,
Small RNA sequencing, Sequencing of bisulfite-treated DNA, Chromatin immunoprecipitation-
sequencing (ChIP-Seq), Nuclease fragmentation and sequencing, Molecular barcoding.
References:
1. Parag Kulkarni, Big Data Analytics (Kindle Edition).
2. Borko Furht and Flavio Villanustre, Big Data Technologies and Applications, Springer.
3. Huang, J. et al. Bioinformatics in MicroRNA Research, Springer Protocols, ISBN 978-1-
4939-7046-9.
4. Jianping Xu (Ed.), Next-generation Sequencing: Current Technologies and Applications,
Caister Academic Press, ISBN: 978-1-908230-33-1.
5. Xinkun Wang, Next Generation Sequencing Data Analysis, CRC Press, (ISBN13:
9781482217889).
PAPER III – 7. MOLECULAR GENOMICS
UNIT I: Genomics
Genes and Genomes – organization and features, gene expression, cDNA library,
expressed sequence tags (EST). Tools and Approach for gene identification; codon-bias
detection, detecting functional sites in the DNA. Microarrays - Tools for microarray analysis;
soft-finder, xCluster, MADAM, SAGE, GEO database.
UNIT II: Molecular Dynamics
Biophysical aspects of proteins and nucleic acids - RNA folding, RNA loops,
conformational study, ribose ring conformations and puckering, protein-protein interactions,
protein-ligand interactions, DNA- and RNA-binding proteins, Ramachandran plot, 3D structures
of membrane proteins, 3₁₀ helix and loops, protein structure and functional site prediction,
Protein folding problem and folding classes.
UNIT III: Proteomics
Tools and techniques in proteomics; gel electrophoresis, gel filtration, PAGE, isoelectric
focusing, affinity chromatography, HPLC, ICAT, fixing and spot visualization, Mass
spectrometry for protein analysis, MALDI-TOF, Electrospray ionization (ESI), Tandem mass
spectrometry (MS/MS) analysis; tryptic digestion and peptide mass fingerprinting (PMF). PPI
Networks- databases and software: DIP, PPI Server, BIND, PIM and GRID.
UNIT IV: Metabolomics
Metabolic pathway regulation - mechanisms and strategies - pathway structure and
methods of regulation - metabolite profiling; types of metabolic pathways and typical reactions-
metabolic fingerprinting - Gene Ontology and KEGG Pathway analysis. Basics of metabolomics
analysis - LC-MS analytical platform - metabolite identification methods.
UNIT V: Cancer Biology
Multistep process of carcinogenesis, Hallmarks of cancer, Clonal expansion, tumor suppressor
genes and oncogenes, signal transduction pathway, Apoptosis and cancer, cell cycle regulation in
cancer, Molecular mediators of angiogenesis, Invasion and metastasis.
References:
1. Mount D. (2004). Bioinformatics: Sequence and Genome Analysis; Cold Spring Harbor
Laboratory Press, New York.
2. Christoph W. Sensen. (2002). Essentials of genomics and Proteomics. Wiley-VCH
3. Bourne, P. E. & Weissig, H. (2003) “Structural bioinformatics”; Wiley‐Liss, 2003.
4. Reece, R. J. (2003) “Analysis of Genes and Genomes”; Wiley Publications.
5. T. Palzkill. (2002). Proteomics, Kluwer Academic Publishers, New York, USA
6. T.A. Brown. (2006) Genomes 3 (3rd edition), Garland Science.
7. S.G. Villas-Boas. (2007). Metabolome Analysis: An Introduction, Wiley-Blackwell.
8. Robert A. Weinberg. (2012) The Biology of Cancer, Garland Science.
9. Lauren Pecorino. (2008) Molecular Biology of cancer: Mechanisms, targets and
therapeutics, Oxford University Press.