An Introduction to Statistics and Data Analysis for Bioinformatics using R
Sorin Draghici
|
From the very basics to linear models, this book provides a complete introduction to statistics, data analysis, and R for bioinformatics research and applications. It covers linear models, ANOVA, cluster analysis, visualization tools, and machine learning techniques. Suitable for self-study and courses in computational biology, bioinformatics, statistics, and the life sciences, the text also presents examples of microarrays and bioinformatics applications. R code illustrates all of the essential concepts and is available on an accompanying CD-ROM.
|
Contents
Introduction
Bioinformatics — an emerging discipline
Introduction to R
Introduction to R
The basic concepts
Data structures and functions
Other capabilities
The R environment
Installing Bioconductor
Graphics
Control structures in R
Programming in R vs C/C++/Java
Bioconductor: Principles and Illustrations
Overview
The portal
Some explorations and analyses
Elements of Statistics
Introduction
Some basic concepts
Elementary statistics
Degrees of freedom
Probabilities
Bayes’ theorem
Testing for (or predicting) a disease
Probability Distributions
Probability distributions
Central limit theorem
Are replicates useful?
Basic Statistics in R
Introduction
Descriptive statistics in R
Probabilities and distributions in R
Central limit theorem
Statistical Hypothesis Testing
Introduction
The framework
Hypothesis testing and significance
"I do not believe God does not exist"
An algorithm for hypothesis testing
Errors in hypothesis testing
Classical Approaches to Data Analysis
Introduction
Tests involving a single sample
Tests involving two samples
Analysis of Variance (ANOVA)
Introduction
One-way ANOVA
Two-way ANOVA
Quality control
Linear Models in R
Introduction and model formulation
Fitting linear models in R
Extracting information from a fitted model: testing hypotheses and making predictions
Some limitations of the linear models
Dealing with multiple predictors and interactions in the linear models, and interpreting model coefficients
Experiment Design
The concept of experiment design
Comparing varieties
Improving the production process
Principles of experimental design
Guidelines for experimental design
A short synthesis of statistical experiment designs
Some microarray specific experiment designs
Multiple Comparisons
Introduction
The problem of multiple comparisons
A more precise argument
Corrections for multiple comparisons
Corrections for multiple comparisons in R
Analysis and Visualization Tools
Introduction
Box plots
Gene pies
Scatter plots
Volcano plots
Histograms
Time series
Time series plots in R
Principal component analysis (PCA)
Independent component analysis (ICA)
Cluster Analysis
Introduction
Distance metric
Clustering algorithms
Partitioning around medoids (PAM)
Biclustering
Clustering in R
Machine Learning Techniques
Introduction
Main concepts and definitions
Supervised learning
Practicalities using R
The Road Ahead