Computational Diagnostics Group compdiag MPI for Molecular Genetics


Core Group Extension

Stefan Bentink, Dennis Kostka

Background: Given microarray data and a small core group of highly similar samples, our objective is to find a set of signature genes that distinguishes the core group from the majority of other cases. At the same time, we want to identify additional cases whose expression levels are coherent with the core group across the signature genes, which are not known a priori. This problem is related, but not identical, to the well-studied problem of supervised classification.

Our approach: In a supervised classification setting, classification rules (signatures) are derived from labeled data. The procedure involves three steps: training, model selection and model evaluation. To approach our objective, we propose modifications to this procedure. Because we are unsure about the labels of the cases outside the core group, misclassifying one of them is not treated the same way as misclassifying a core group member.
Training and model selection are performed with an unusual objective: the number of misclassifications is not used as the performance measure. Given the original label configuration, we derive a signature that yields, for each sample, an estimate of the probability that it belongs to the core group. This step includes gene selection, and new core group members can be assigned based on a cutoff on this probability. Model selection is then based on three criteria: high sensitivity (all core group members should be correctly identified), generalization ability (the high sensitivity should also be achieved on independent test sets), and clear separation of core group members from non-members by the posterior probabilities. Evaluation is performed on an independent test set by using the bootstrap to assess the robustness of the signature's core group / non-core group calls.
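
The workflow described above can be illustrated with a minimal Python sketch. It is not the method used by the authors: the gene-ranking score, the logistic mapping to a probability, the `steepness` parameter, the default cutoff of 0.5 and all function names are assumptions chosen for illustration. The bootstrap step resamples the training samples and refits the signature, which is only one possible reading of the evaluation described above.

```python
import math
import random
from statistics import mean, pstdev


def fit_signature(expr, core, n_genes):
    """Pick the n_genes genes that best separate the core group from the rest.

    expr: list of samples, each a list of expression values (one per gene).
    core: set of sample indices forming the core group.
    Returns a list of (gene, midpoint, scale, sign) tuples.
    The ranking score is an illustrative assumption, not the authors' choice.
    """
    rest = [i for i in range(len(expr)) if i not in core]
    scored = []
    for g in range(len(expr[0])):
        in_core = [expr[i][g] for i in core]
        in_rest = [expr[i][g] for i in rest]
        diff = mean(in_core) - mean(in_rest)
        scale = pstdev(in_core + in_rest) or 1.0  # guard against zero spread
        midpoint = (mean(in_core) + mean(in_rest)) / 2
        sign = 1 if diff >= 0 else -1
        scored.append((abs(diff) / scale, g, midpoint, scale, sign))
    scored.sort(reverse=True)
    return [(g, mid, scale, sign) for _, g, mid, scale, sign in scored[:n_genes]]


def core_probability(sample, signature, steepness=4.0):
    """Logistic score in (0, 1): how core-like a sample looks on the signature."""
    z = mean(sign * (sample[g] - mid) / scale for g, mid, scale, sign in signature)
    return 1.0 / (1.0 + math.exp(-steepness * z))


def extend_core(expr, core, n_genes=2, cutoff=0.5):
    """Return the indices of all samples called core by the fitted signature."""
    signature = fit_signature(expr, core, n_genes)
    return {i for i, s in enumerate(expr) if core_probability(s, signature) >= cutoff}


def bootstrap_call_frequency(train, core, test, n_genes=2, cutoff=0.5,
                             n_boot=200, seed=0):
    """Fraction of bootstrap refits in which each test sample is called core."""
    rng = random.Random(seed)
    core_list = sorted(core)
    rest_list = [i for i in range(len(train)) if i not in core]
    counts = [0] * len(test)
    for _ in range(n_boot):
        # Stratified resampling keeps both groups non-empty in every replicate.
        boot_core = [rng.choice(core_list) for _ in core_list]
        boot_rest = [rng.choice(rest_list) for _ in rest_list]
        boot_expr = [train[i] for i in boot_core + boot_rest]
        boot_labels = set(range(len(boot_core)))
        signature = fit_signature(boot_expr, boot_labels, n_genes)
        for j, sample in enumerate(test):
            counts[j] += int(core_probability(sample, signature) >= cutoff)
    return [c / n_boot for c in counts]
```

Calling `extend_core(expr, core)` returns the original core group together with any additional samples whose estimated probability reaches the cutoff. `bootstrap_call_frequency(train, core, test)` returns one value per test sample; values near 0 or 1 indicate a stable call, while intermediate values indicate a call that depends on which training samples were drawn.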


  • A biologic definition of Burkitt's lymphoma from transcriptional and genomic profiling
    Hummel, Bentink*, Berger*, Klapper*, Wessendorf*, Barth, Bernd, Cogliatti, Dierlamm, Feller, Hansmann, Haralambieva, Harder, Hasenclever, Kühn, Lenze, Lichter, Martin-Subero, Möller, Müller-Hermelink, Ott, Parwaresch, Pott, Rosenwald, Rosolowski, Schwaenen, Stürzenhofecker, Szczepanowski, Trautmann, Wacker, Spang, Löffler, Trümper, Stein, Siebert, for the Molecular Mechanisms in Malignant Lymphomas Network Project of the Deutsche Krebshilfe
    New England Journal of Medicine 2006;354:2419-2430. (*These authors contributed equally.)
