An Introduction

Tax3 is a mathematical framework uniquely suited to analyzing large datasets in drug discovery and development, particularly for complex and heterogeneous disorders. It is multivariate, powerful, and probes heterogeneity.

Background

New technologies have dramatically increased the number of variables that can be investigated in complex and heterogeneous disorders. It is now common to analyze datasets composed of several million variables (e.g. SNPs) and several thousand observations (e.g. patients). In this new environment, classical methods such as association studies using univariate tests are overwhelmed and sub-optimal (ref 1).

Tax3 is uniquely suited to cope with these challenges

Multivariate analysis of high dimensional datasets

Tax3 can be applied to high dimensional datasets where several groups are investigated (e.g. whole genome scan on cases and controls), handling the large datasets generated by the latest genotyping technologies.

Probe heterogeneity in cohorts

Tax3 is designed to address complex disorders and human heterogeneity. It investigates correlation patterns, large or small, between subjects and variables. Tax3 therefore reveals variables of interest for specific sub-groups of subjects: it delivers not only the variables that matter for the case/control distinction as a whole, but also the variables that distinguish some cases from some controls.

Powerful signal detection with small number of patients

Tax3 can detect significant differences between cohorts composed of few subjects, even if a very large number of variables are investigated. This is because, unlike univariate tests, Tax3 is a multivariate method, so it does not require any correction for multiple testing.
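To illustrate the multiple-testing burden that univariate tests carry, here is a minimal sketch of the standard Bonferroni correction. It is not part of Tax3; the function name and the family-wise significance level of 0.05 are assumptions chosen for illustration only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction.

    Hypothetical helper for illustration; it shows the cost of running
    one univariate test per variable, not how Tax3 works.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed family-wise alpha of 0.05, with one univariate test per SNP.
for n_snps in (1_000, 100_000, 1_000_000):
    threshold = bonferroni_threshold(0.05, n_snps)
    print(f"{n_snps:>9,} tests -> per-test threshold {threshold:.1e}")
```

With one million tests, each individual test must reach a p-value of about 5e-8 to be declared significant, which is hard to achieve in a cohort of few subjects.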

Genome wide inference, predictive model

Tax3 can build powerful predictive models using all available variables. For example, in a case/control study, genome wide inference of case/control status is easy to implement.

Co-analysis of several data types: SNPs, -omics, clinical, epidemiological, etc.

In a typical Tax3 analysis, variables of several types are analyzed together. All variables are treated equally, and can be used to detect signal, probe heterogeneity and build predictive models.

Large scale analysis of variable interactions and genetic epistasis

Tax3 can investigate large scale variable interactions and genetic epistasis. This can lead to massive improvements in signal detection, and to the discovery of new pathways.
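To give a sense of scale, the sketch below counts how many distinct variable combinations exist for a given number of variables. It is plain combinatorics, not a description of how Tax3 searches for interactions; the function name is hypothetical, and the figure of one million SNPs is taken from the dataset sizes mentioned elsewhere on this page.

```python
from math import comb


def n_interactions(n_variables: int, order: int = 2) -> int:
    """Number of distinct unordered combinations of `order` variables.

    Hypothetical helper for illustration only.
    """
    return comb(n_variables, order)


# Pairwise (order 2) interactions among one million SNPs.
pairs = n_interactions(1_000_000)
print(f"{pairs:,} candidate pairwise interactions")
```

For one million SNPs this gives 499,999,500,000 candidate pairs, which is why testing each pair separately quickly becomes impractical.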

Discovery of new patient sub-groups and new sub-phenotypes

By probing heterogeneity, Tax3 can compartmentalize complex disorders and classify patients into relevant sub-groups that can be distinguished with specific variables. This gives you the means to discover new sub-phenotypes, critically important for disease understanding and for patient selection in clinical studies.

Discovery of new targets, development of personalized medicines

Patient sub-groups discovered by Tax3 are identified by specific variables. These variables are potential drug targets specific to these patients and their sub-phenotypes, paving the way for personalized medicines.

Patient selection in clinical R&D

In clinical trials, Tax3 analysis helps select patients more relevant to the molecular pathway being targeted. This leads to smaller, more powerful phase II proof-of-concept studies. In addition, powerful predictive Tax3 models can help exclude patients at risk of drug-induced adverse events.

Leading centre for expertise and development

Tax3 is an important component of PGXIS's business plans. PGXIS possesses world-class expertise and know-how for the development and implementation of Tax3 methods.

Highly efficient proprietary software for analysis of high dimensional datasets

In 2009, PGXIS developed efficient software that allows the rapid analysis of large datasets. For example, a multivariate analysis of 300 subjects and 1 million SNPs can be completed in a few hours.

Service provision or technological transfer

PGXIS can provide Tax3 analysis capabilities and transfer know-how to other organizations.

Publications and application

Publications and demo software are available at http://taxonomy.delrieu.org

Tax3 is a published and peer reviewed method

The Tax3 framework was discovered in 2005. Every year since then, theoretical developments have been presented and published at the Leeds Annual Statistical Research (LASR) workshop, sponsored and organised by the Department of Statistics, University of Leeds, UK (refs 5 to 9).

References

1. Drinking from the fire hose--statistical issues in genome wide association studies. Hunter DJ, Kraft P. N Engl J Med. 2007 Aug 2;357(5):436-9. Epub 2007 Jul 18
2. TNF, LTA, HSPA1L and HLA-DR gene polymorphisms in HIV-positive patients with hypersensitivity to cotrimoxazole. Alfirevic A, Vilar FJ, Alsbou M, Jawaid A, Thomson W, Ollier WE, Bowman CE, Delrieu O, Park BK, Pirmohamed M. Pharmacogenomics. 2009 Apr;10(4):531-40.
3. Investigation into the multidimensional genetic basis of drug-induced Stevens–Johnson syndrome and toxic epidermal necrolysis. Pirmohamed M, Arbuckle J, Bowman C, Brunner M, Burns D, Delrieu O, Dix L, Twomey J and Stern R. Pharmacogenomics 2007 Dec; 8(12), 1661–1691.
4. Visualizing gene determinants of disease in drug discovery. Delrieu O and Bowman C. Pharmacogenomics. 2006 Apr;7(3):311-29.
5. LASR 2009: Correlation laplacians, haplotype networks and residual pharmacogenetics. Bowman C. & Delrieu O.
6. LASR 2008: Whole genome scan algebra and smoothing. Charalambous C, Delrieu O and Bowman C.
7. LASR 2007: On using the correlations of divergences. Delrieu O and Bowman C.
8. LASR 2006: Filtering pharmacogenetic signals. Bowman C, Delrieu O and Roger J.
9. LASR 2005: Visualisation of gene and pathway determinants of disease. Delrieu O and Bowman C.
10. Visualisation of gene by gene interactions in pharmacogenetics (poster). Delrieu O and Bowman C. International Congress of Human Genetics, Brisbane, Australia, 6-11 August 2006.