University of Limerick Institutional Repository

Cluster analysis stopping rules in Stata

dc.contributor.author Halpin, Brendan
dc.date.accessioned 2017-02-03T09:00:20Z
dc.date.available 2017-02-03T09:00:20Z
dc.date.issued 2016
dc.identifier.uri http://hdl.handle.net/10344/5492
dc.description non-peer-reviewed en_US
dc.description.abstract Analysts doing cluster analysis sometimes want the data to tell them the optimum number of clusters. Common "stopping rules" use the Calinski-Harabasz pseudo-F statistic and Duda-Hart indices, which are based on squared Euclidean distances between cases. Cluster analysis operates on a pairwise matrix of distances between the objects being clustered, which are usually created from the observed variables. However, approaches such as expert judgement or algorithmic pattern-recognition (as used for instance in sequence analysis) often output matrices of pairwise similarity or difference whose relationship to the observed variables is much less direct. Built-in Stata utilities allow calculation of the CH and DH indices when cluster analysis starts from variables, but not with cluster analysis that starts from a pairwise distance matrix (unless the distances are squared Euclidean distances defined on variables which are still available). In this note I present two small Stata utilities that will calculate the CH and DH statistics from the distance matrix, if the distances are squared Euclidean. If the distances have another metric, these utilities can be seen as calculating a pseudo-CH pseudo-F or pseudo-DH statistic, potentially extending their use to new applications. en_US
dc.language.iso eng en_US
dc.publisher Department of Sociology; University of Limerick
dc.relation.ispartofseries University of Limerick Department of Sociology Working Paper Series;WP2016-01
dc.subject Stata en_US
dc.subject stopping rules en_US
dc.title Cluster analysis stopping rules in Stata en_US
dc.type info:eu-repo/semantics/workingPaper en_US
dc.type.supercollection all_ul_research en_US
dc.rights.accessrights info:eu-repo/semantics/openAccess en_US
dc.internal.authorcontactother Brendan.Halpin@ul.ie
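
The abstract's central claim, that the Calinski-Harabasz pseudo-F can be obtained from a pairwise matrix of squared Euclidean distances without the original variables, rests on a standard identity: the sum of squared deviations of m points from their centroid equals 1/m times the sum of their pairwise squared distances. The following is a minimal Python sketch of that idea, not the author's Stata utilities; the function name `calinski_harabasz` and its arguments are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def calinski_harabasz(d2, labels):
    """Pseudo-F from a symmetric matrix of squared Euclidean distances.

    d2     : list of lists, d2[i][j] = squared distance between cases i and j
    labels : cluster label for each case, in the same order as d2
    (Both names are illustrative assumptions, not part of any Stata command.)
    """
    n = len(labels)
    groups = defaultdict(list)
    for i, lab in enumerate(labels):
        groups[lab].append(i)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than cases")

    def scatter(idx):
        # Sum of squared deviations from the centroid, computed as
        # (1/m) * sum over pairs i<j of the squared distance.
        m = len(idx)
        pair_sum = sum(
            d2[idx[a]][idx[b]] for a in range(m) for b in range(a + 1, m)
        )
        return pair_sum / m

    total_ss = scatter(list(range(n)))
    within_ss = sum(scatter(idx) for idx in groups.values())
    between_ss = total_ss - within_ss
    # Raises ZeroDivisionError if every cluster has zero internal distance.
    return (between_ss / (k - 1)) / (within_ss / (n - k))


# Four cases on a line at 0, 1, 10 and 11, split into two obvious clusters.
points = [0, 1, 10, 11]
d2 = [[(a - b) ** 2 for b in points] for a in points]
print(calinski_harabasz(d2, [1, 1, 2, 2]))  # prints 200.0
```

If the matrix holds distances in some other metric, the same arithmetic still runs, but the result is then only a pseudo-CH value in the sense the abstract describes.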

