GfKl 2004 contest: Correspondence clustering of Dortmund city districts

Stefanie Scheid, MPIMG, CMB, CompDiag

We combine correspondence analysis (CA) and $K$-means clustering to divide Dortmund's districts into groups that are associated with particular variables, so that each group represents a social cluster. CA visualizes associations between the rows and columns of a frequency matrix and can be used for dimension reduction. Using the first three dimensions of the CA mapping, we find a stable partition into five clusters. We further identify variables that are highly associated with the cluster centers and thus characterize each cluster's social condition.
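
The pipeline described above can be illustrated with a minimal sketch. This is not the code used for the contest entry. It assumes the standard SVD formulation of CA on the standardized residuals and a plain Lloyd-style $K$-means. The function names `correspondence_analysis` and `kmeans`, the toy matrix `counts`, and all parameter values are hypothetical and stand in for the real district data.

```python
import numpy as np


def correspondence_analysis(N, n_dims=3):
    """CA of a frequency matrix N via SVD of the standardized residuals.

    Returns row and column principal coordinates (first n_dims axes)
    and the principal inertias (squared singular values).
    """
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # independence model
    S = (P - expected) / np.sqrt(expected)
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U * sing) / np.sqrt(r)[:, None]
    cols = (Vt.T * sing) / np.sqrt(c)[:, None]
    return rows[:, :n_dims], cols[:, :n_dims], sing ** 2


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iteration; initial centers are k distinct data points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance of every point to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # An empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Hypothetical toy data: 30 "districts" (rows) by 6 "variables" (columns).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=rng.uniform(5, 50, size=(30, 6))) + 1

row_coords, col_coords, inertia = correspondence_analysis(counts, n_dims=3)
labels, centers = kmeans(row_coords, k=5)
```

Here `row_coords` holds the districts in the reduced three-dimensional CA space, and `labels` assigns each district to one of five clusters. Because the result of $K$-means depends on the initial centers, rerunning `kmeans` with several values of `seed` and comparing the partitions is one simple way to check stability.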