Statistical Genomics

We develop statistical techniques for analysing the large quantities of genomic data generated at MRC Harwell Institute, helping to distinguish true results from normal variation and to highlight patterns in the data.

The group develops mathematical techniques for analysing experimental data arising from the mouse genome, with the aim of improving the understanding of human disease. It specialises in modern, computationally intensive statistical methods, including statistical pattern recognition tools, machine learning techniques and nonlinear stochastic process models of biological systems. These methods bypass the traditional, often unrealistic, assumptions of linear additive Normal (Gaussian) models, allowing the data to ‘speak for themselves’ rather than imposing parametric forms on the functional relationships between quantities.

Our work, where possible, is underpinned by Bayesian probability theory. Particular projects of interest include:

  • Analysis of time-course microarray data (e.g. clustering genes exhibiting similar dynamical transcription behaviour)
  • Use of nonparametric, distribution-free methods to detect differential expression
  • Use of nonlinear models for detecting and quantifying gene-gene interaction
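
As an illustration of the distribution-free idea in the list above, the following is a minimal sketch of a two-sample permutation test for differential expression of a single gene. It is not the group's own method: the function name `permutation_p_value`, its parameters and the expression values are hypothetical, chosen only to show how a p-value can be obtained without assuming Normally distributed data.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Hypothetical helper for illustration. No distributional form is
    assumed: the null distribution is built by reshuffling group labels.
    """
    rng = random.Random(seed)
    a, b = list(group_a), list(group_b)
    n_a = len(a)

    # Observed test statistic: absolute difference in mean expression.
    observed = abs(mean(a) - mean(b))

    pooled = a + b
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled measurements to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up expression values for one gene in two groups of five samples.
control = [1.0, 1.2, 0.9, 1.1, 1.3]
treated = [5.1, 5.3, 5.0, 5.2, 5.4]
p_value = permutation_p_value(treated, control, n_permutations=2000)
print(f"permutation p-value: {p_value:.4f}")
```

In a genome-wide setting this test would be repeated for every gene, so the resulting p-values would also need a multiple-testing adjustment, which this sketch leaves out.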