Computational Diagnostics Group compdiag MPI for Molecular Genetics


Documentation of Diagnostic Signatures

Dennis Kostka

Background: Microarray-based gene expression signatures have been shown to be a powerful tool for patient stratification, diagnosis of disease, prognosis of survival, assessment of risk group and selection of treatment. However, documentation standards in current publications do not allow a signature to be applied unambiguously to study-external patients. This hinders independent evaluation and effectively delays the use of signatures in clinical practice. Documenting a signature is conceptually different from reporting a list of genes, since a gene list alone does not determine how a study-external patient should be diagnosed.

Our approach: One prominent source of ambiguity in gene expression signatures is the data preprocessing common to microarray studies: the estimated expression values for a given microarray change when additional arrays are added to the study and preprocessing is repeated. This is a severe problem when applying a signature to study-external data, because the original data must be included in the normalization of the external arrays. Re-normalizing the complete data set changes the original expression values, which in turn affects the signature and the molecular diagnosis of patients in the original study.
To address this problem, we have investigated two popular preprocessing schemes for Affymetrix microarray data. For each, we provide an "add-on" mode that is consistent with the original procedure. This mode allows a core data set to be preprocessed once and further arrays to be added successively without changing the normalized core data. We show that this greatly reduces the ambiguity of diagnosis in several publicly available clinical microarray studies.
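
The add-on idea can be illustrated with a toy example. The sketch below is not the group's implementation: it assumes quantile normalization as the preprocessing step (this page does not name the two schemes), and all function names and intensity values are hypothetical. It contrasts re-normalizing the complete data set with mapping an external array onto a reference that was frozen on the core data.

```python
# Illustrative sketch only. Assumption: quantile normalization stands in for
# the preprocessing step; function names and data are hypothetical.

def reference_quantiles(arrays):
    """Mean of the sorted intensities across arrays, one value per rank."""
    sorted_arrays = [sorted(a) for a in arrays]
    n = len(arrays)
    return [sum(column) / n for column in zip(*sorted_arrays)]

def map_to_reference(array, reference):
    """Replace each intensity by the reference value at its rank.

    Ties are broken by position, which is enough for this toy example.
    """
    order = sorted(range(len(array)), key=lambda i: array[i])
    normalized = [0.0] * len(array)
    for rank, index in enumerate(order):
        normalized[index] = reference[rank]
    return normalized

def quantile_normalize(arrays):
    """Standard mode: the reference is estimated from all arrays given."""
    reference = reference_quantiles(arrays)
    return [map_to_reference(a, reference) for a in arrays]

def addon_normalize(new_array, frozen_reference):
    """Add-on mode: the reference estimated on the core data stays fixed."""
    return map_to_reference(new_array, frozen_reference)

# Toy data: two core arrays and one study-external array, three probes each.
core = [[2.0, 4.0, 6.0], [3.0, 5.0, 10.0]]
external = [20.0, 1.0, 9.0]

core_reference = reference_quantiles(core)
core_normalized = quantile_normalize(core)

# Re-normalizing everything together shifts the core values ...
renormalized = quantile_normalize(core + [external])
print(core_normalized[0], "vs", renormalized[0])

# ... whereas add-on mode leaves the normalized core data untouched.
external_addon = addon_normalize(external, core_reference)
print(external_addon)
```

In this sketch the only state that has to be stored alongside a signature is the frozen reference distribution, so an external array can be normalized without access to the original study data.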


In preparation


Software will be made available in the form of an R package upon publication.
