Title: | Methods for Detection of Clusters in Hierarchical Clustering Dendrograms |
---|---|
Description: | Contains methods for detection of clusters in hierarchical clustering dendrograms. |
Authors: | Peter Langfelder <[email protected]> and Bin Zhang <[email protected]>, with contributions from Steve Horvath <[email protected]> |
Maintainer: | Peter Langfelder <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.63-1 |
Built: | 2024-12-31 08:05:23 UTC |
Source: | CRAN |
Contains methods for detection of clusters in hierarchical clustering dendrograms.
Package: | dynamicTreeCut |
Version: | 1.63-1 |
Date: | 2016-03-10 |
Depends: | R, stats |
ZipData: | no |
License: | GPL version 2 or newer |
URL: | http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/ |
Index:
cutreeDynamic | Adaptive branch pruning of hierarchical clustering dendrograms. |
cutreeDynamicTree | Dynamic dendrogram pruning based on dendrogram only |
cutreeHybrid | Hybrid adaptive tree cut for hierarchical clustering dendrograms. |
indentSpaces | Spaces for indented output. |
merge2Clusters | Merge two clusters |
printFlush | Print arguments and flush the console. |
treecut-package | Methods for detection of clusters in hierarchical clustering dendrograms. |
This wrapper provides a common access point for two methods of adaptive branch pruning of hierarchical clustering dendrograms.
cutreeDynamic(
    dendro, cutHeight = NULL, minClusterSize = 20,
    # Basic tree cut options
    method = "hybrid", distM = NULL,
    deepSplit = (ifelse(method=="hybrid", 1, FALSE)),
    # Advanced options
    maxCoreScatter = NULL, minGap = NULL,
    maxAbsCoreScatter = NULL, minAbsGap = NULL,
    minSplitHeight = NULL, minAbsSplitHeight = NULL,
    # External (user-supplied) measure of branch split
    externalBranchSplitFnc = NULL, minExternalSplit = NULL,
    externalSplitOptions = list(),
    externalSplitFncNeedsDistance = NULL,
    assumeSimpleExternalSpecification = TRUE,
    # PAM stage options
    pamStage = TRUE, pamRespectsDendro = TRUE,
    useMedoids = FALSE, maxDistToLabel = NULL,
    maxPamDist = cutHeight,
    respectSmallClusters = TRUE,
    # Various options
    verbose = 2, indent = 0)
dendro |
A hierarchical clustering dendrogram such as one returned by |
cutHeight |
Maximum joining heights that will be considered. For |
minClusterSize |
Minimum cluster size. |
method |
Chooses the method to use. Recognized values are "hybrid" and "tree". |
distM |
Only used for method "hybrid". The distance matrix used as input to |
deepSplit |
For method "hybrid", can be either logical or integer in the range 0 to 4. For method
"tree", must be logical. In both cases, provides a rough control over sensitivity to cluster splitting.
The higher the value (or if |
maxCoreScatter |
Only used for method "hybrid".
Maximum scatter of the core for a branch to be a cluster, given as the fraction of |
minGap |
Only used for method "hybrid".
Minimum cluster gap given as the fraction of the difference between |
maxAbsCoreScatter |
Only used for method "hybrid".
Maximum scatter of the core for a branch to be a cluster given as absolute heights. If given, overrides
|
minAbsGap |
Only used for method "hybrid".
Minimum cluster gap given as absolute height difference. If given, overrides |
minSplitHeight |
Minimum split height given as the fraction of the difference between
|
minAbsSplitHeight |
Minimum split height given as an absolute height.
Branches merging below this height will automatically be merged. If not given (default), will be determined
from |
externalBranchSplitFnc |
Optional function to evaluate split (dissimilarity) between two branches.
Either a single function or a list in which each component is a function (see
|
minExternalSplit |
Thresholds to decide whether two branches should be merged.
It should be a numeric vector of the same length as the number of functions in
|
externalSplitOptions |
Further arguments to function |
externalSplitFncNeedsDistance |
Optional specification of whether the external branch split
functions need the distance matrix as one of their arguments. Either |
assumeSimpleExternalSpecification |
Logical: when |
pamStage |
Only used for method "hybrid". If TRUE, the second (PAM-like) stage will be performed. |
pamRespectsDendro |
Logical, only used for method "hybrid". If |
useMedoids |
Only used for method "hybrid" and only if |
maxDistToLabel |
Deprecated, use |
maxPamDist |
Only used for method "hybrid" and only if |
respectSmallClusters |
Only used for method "hybrid" and only if |
verbose |
Controls the verbosity of the output. 0 will make the function completely quiet, values up to 4 gradually increase verbosity. |
indent |
Controls indentation of printed messages (see |
This is a wrapper for two related but different methods for cluster detection in hierarchical clustering dendrograms.
In order to make the shape parameters maxCoreScatter and minGap more universal, their values are interpreted relative to cutHeight and the 5th percentile of the merging heights (we arbitrarily chose the 5th percentile rather than the minimum for reasons of stability). Thus, the absolute maximum allowable core scatter is calculated as maxCoreScatter * (cutHeight - refHeight) + refHeight and the absolute minimum allowable gap as minGap * (cutHeight - refHeight), where refHeight is the 5th percentile of the merging heights.
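The conversion from relative shape parameters to absolute thresholds can be sketched in base R. This is only an illustration of the two formulas above: the toy data, the hand-picked cutHeight, and the values of maxCoreScatter and minGap are assumptions, and quantile() is used as a stand-in for however the package itself picks the 5th-percentile merging height.

```r
# Sketch only: base R, dynamicTreeCut is not needed here.
set.seed(1)
x <- matrix(rnorm(200), nrow = 50)            # 50 toy objects, 4 features
dendro <- hclust(dist(x), method = "average")

cutHeight <- 0.9 * max(dendro$height)         # illustrative value chosen by hand
maxCoreScatter <- 0.64                        # illustrative shape parameters,
minGap <- 0.36                                # not claimed to be package defaults

# Reference height: 5th percentile of the merging heights (approximation)
refHeight <- unname(quantile(dendro$height, probs = 0.05))

# Formulas from the text above
maxAbsCoreScatter <- maxCoreScatter * (cutHeight - refHeight) + refHeight
minAbsGap <- minGap * (cutHeight - refHeight)

print(c(refHeight = refHeight,
        maxAbsCoreScatter = maxAbsCoreScatter,
        minAbsGap = minAbsGap))
```

Because both thresholds scale with cutHeight - refHeight, the same relative settings can be reused on dendrograms whose merging heights span different ranges.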
A vector of numerical labels giving assignment of objects to modules. Unassigned objects are labeled 0, the largest module has label 1, next largest 2 etc.
Peter Langfelder, [email protected]
Langfelder P, Zhang B, Horvath S, 2007. http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting
hclust, cutreeHybrid, cutreeDynamicTree.
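A minimal usage sketch for the wrapper, assuming the dynamicTreeCut package is installed (the call is skipped otherwise). The toy data, the "average" linkage, and the chosen minClusterSize are illustrative assumptions; only arguments listed in the usage above are passed.

```r
set.seed(42)
# Toy data: two well-separated groups of 30 objects each
x <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
           matrix(rnorm(60, mean = 8), ncol = 2))
d <- dist(x)
dendro <- hclust(d, method = "average")

labels <- NULL
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  labels <- dynamicTreeCut::cutreeDynamic(
    dendro = dendro,
    distM = as.matrix(d),      # the hybrid method needs the distance matrix
    method = "hybrid",
    minClusterSize = 10,
    deepSplit = 1,
    verbose = 0)
  print(table(labels))         # label 0, if present, marks unassigned objects
}
```

Switching to method = "tree" would make distM unnecessary, since that method uses only the dendrogram.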
Detect clusters in a hierarchical dendrogram using a variable cut height approach. Only the information in the dendrogram itself is used (which may give incorrect assignment for outlying objects).
cutreeDynamicTree(dendro, maxTreeHeight = 1, deepSplit = TRUE, minModuleSize = 50)
dendro |
Hierarchical clustering dendrogram such as one produced by |
maxTreeHeight |
Maximum joining height of objects to be considered part of clusters. |
deepSplit |
If |
minModuleSize |
Minimum module size. Branches containing fewer than |
A variable height branch pruning technique for dendrograms produced by hierarchical clustering.
Initially, branches are cut off at the height maxTreeHeight; the resulting clusters are then examined for substructure and if subclusters are detected, they are assigned separate labels. Subclusters are detected by structure and are required to have a minimum of minModuleSize objects on them to be assigned a separate label. A rough degree of control over what it means to be a subcluster is implemented by the parameter deepSplit.
A vector of numerical labels giving assignment of objects to modules. Unassigned objects are labeled 0, the largest module has label 1, next largest 2 etc.
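The labeling convention of the returned vector (0 for unassigned objects, 1 for the largest module, 2 for the next largest, and so on) can be illustrated with a small base R sketch. The helper relabelBySize is hypothetical and not part of the package; it only shows what the convention means for arbitrary raw cluster ids.

```r
# Hypothetical helper, not part of dynamicTreeCut: renumber raw cluster ids
# so that 0 stays "unassigned" and assigned clusters are ranked by size.
relabelBySize <- function(rawLabels) {
  assigned <- rawLabels != 0
  sizes <- sort(table(rawLabels[assigned]), decreasing = TRUE)
  newLabels <- integer(length(rawLabels))    # initialized to 0 (unassigned)
  newLabels[assigned] <- match(as.character(rawLabels[assigned]), names(sizes))
  newLabels
}

raw <- c(7, 7, 3, 3, 3, 0, 9, 3, 7, 0)
relabelBySize(raw)
# 2 2 1 1 1 0 3 1 2 0
```

Here raw id 3 has four members and becomes label 1, id 7 has three and becomes label 2, and the singleton id 9 becomes label 3.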
Bin Zhang, [email protected], with contributions by Peter Langfelder, [email protected].
http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting
Detect clusters in a dendrogram produced by the function hclust.
cutreeHybrid(
    # Input data: basic tree cutting
    dendro, distM,
    # Branch cut criteria and options
    cutHeight = NULL, minClusterSize = 20, deepSplit = 1,
    # Advanced options
    maxCoreScatter = NULL, minGap = NULL,
    maxAbsCoreScatter = NULL, minAbsGap = NULL,
    minSplitHeight = NULL, minAbsSplitHeight = NULL,
    # External (user-supplied) measure of branch split
    externalBranchSplitFnc = NULL, minExternalSplit = NULL,
    externalSplitOptions = list(),
    externalSplitFncNeedsDistance = NULL,
    assumeSimpleExternalSpecification = TRUE,
    # PAM stage options
    pamStage = TRUE, pamRespectsDendro = TRUE,
    useMedoids = FALSE,
    maxPamDist = cutHeight,
    respectSmallClusters = TRUE,
    # Various options
    verbose = 2, indent = 0)
dendro |
A hierarchical clustering dendrogram such as one returned by |
distM |
Distance matrix that was used as input to |
cutHeight |
Maximum joining heights that will be considered. It defaults to 99% of the range between the 5th percentile and the maximum of the joining heights on the dendrogram. |
minClusterSize |
Minimum cluster size. |
deepSplit |
Either logical or integer in the range 0 to 4. Provides a rough control over
sensitivity to cluster splitting. The higher the value, the more and smaller clusters will be produced.
A finer control can be achieved via |
maxCoreScatter |
Maximum scatter of the core for a branch to be a cluster, given as the fraction
of |
minGap |
Minimum cluster gap given as the fraction of the difference between |
maxAbsCoreScatter |
Maximum scatter of the core for a branch to be a cluster given as absolute
heights. If given, overrides |
minAbsGap |
Minimum cluster gap given as absolute height difference. If given, overrides
|
minSplitHeight |
Minimum split height given as the fraction of the difference between
|
minAbsSplitHeight |
Minimum split height given as an absolute height.
Branches merging below this height will automatically be merged. If not given (default), will be determined
from |
externalBranchSplitFnc |
Optional function to evaluate split (dissimilarity) between two branches.
Either a single function or a list in which each component is a function (see
|
minExternalSplit |
Thresholds to decide whether two branches should be merged.
It should be a numeric vector of the same length as the number of functions in
|
externalSplitOptions |
Further arguments to function |
externalSplitFncNeedsDistance |
Optional specification of whether the external branch split
functions need the distance matrix as one of their arguments. Either |
assumeSimpleExternalSpecification |
Logical: when |
pamStage |
Logical, only used for method "hybrid". If |
pamRespectsDendro |
Logical, only used for method "hybrid".
If |
useMedoids |
If TRUE, the second stage will use object to medoid distance; if FALSE, it will use average object to cluster distance. The default (FALSE) is recommended. |
maxPamDist |
Maximum object distance to closest cluster that will result in the object
assigned to that cluster. Defaults to |
respectSmallClusters |
If TRUE, branches that failed to be clusters in stage 1 only because of insufficient size will be assigned together in stage 2. If FALSE, all objects will be assigned individually. |
verbose |
Controls the verbosity of the output. 0 will make the function completely quiet, values up to 4 gradually increase verbosity. |
indent |
Controls indentation of printed messages (see |
The function detects clusters in a hierarchical dendrogram based on the shape of branches on the dendrogram. For details on the method, see http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting.
In order to make the shape parameters maxCoreScatter and minGap more universal, their values are interpreted relative to cutHeight and the 5th percentile of the merging heights (we arbitrarily chose the 5th percentile rather than the minimum for reasons of stability). Thus, the absolute maximum allowable core scatter is calculated as maxCoreScatter * (cutHeight - refHeight) + refHeight and the absolute minimum allowable gap as minGap * (cutHeight - refHeight), where refHeight is the 5th percentile of the merging heights.
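The default for cutHeight is described in the argument list as 99% of the range between the 5th percentile and the maximum of the joining heights. A base R sketch of that default follows; the toy data are an assumption, and quantile() only approximates whatever rule the package uses to locate the 5th-percentile height.

```r
# Sketch only: base R, dynamicTreeCut is not needed here.
set.seed(7)
dendro <- hclust(dist(matrix(rnorm(120), nrow = 40)), method = "average")

refHeight <- unname(quantile(dendro$height, probs = 0.05))  # approximation
maxHeight <- max(dendro$height)

# 99% of the way from the reference height up to the highest merge
defaultCutHeight <- refHeight + 0.99 * (maxHeight - refHeight)

# Number of merges at or below the cut; the top merge always lies above it
nMergesBelowCut <- sum(dendro$height <= defaultCutHeight)
print(c(refHeight = refHeight, maxHeight = maxHeight,
        defaultCutHeight = defaultCutHeight))
```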
A list containing the following elements:
labels |
Numerical labels of clusters, with 0 meaning unassigned, label 1 labeling the largest cluster etc. |
cores |
Numerical labels indicating cores of found clusters. |
smallLabels |
Numerical labels for branches that failed to be recognized clusters only because of insufficient number of objects. |
mergeDiagnostics |
A data.frame with one row per merge in the input dendrogram. The columns give the values of the various merging criteria used by the algorithm. Missing data indicate that at least one of the "branches" merged was actually a singleton (single node) and hence the branch merging was automatic. |
mergeCriteria |
Values of the merging thresholds. Either a copy of the corresponding input thresholds
or values determined by |
branches |
A list detailing the detected branch structure. |
Peter Langfelder, [email protected]
Langfelder P, Zhang B, Horvath S, 2007. http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting
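A short usage sketch for cutreeHybrid that reads two of the returned list elements, assuming the dynamicTreeCut package is installed (the call is skipped otherwise). The toy data, linkage method, and parameter values are illustrative assumptions.

```r
set.seed(3)
# Toy data: two well-separated groups of 40 objects each
x <- rbind(matrix(rnorm(80, mean = 0), ncol = 2),
           matrix(rnorm(80, mean = 10), ncol = 2))
d <- dist(x)
dendro <- hclust(d, method = "average")

res <- NULL
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  res <- dynamicTreeCut::cutreeHybrid(
    dendro = dendro,
    distM = as.matrix(d),      # same distances that were given to hclust
    minClusterSize = 10,
    deepSplit = 1,
    pamStage = TRUE,
    verbose = 0)
  print(table(res$labels))     # 0 = unassigned, 1 = largest cluster
  str(res$mergeCriteria)       # merging thresholds that were actually used
}
```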
Returns a character string containing two times indent spaces.
indentSpaces(indent = 0)
indent |
Desired level of indentation. The number of returned spaces will be twice this argument. |
A character string containing spaces, of length twice indent.
Peter Langfelder, [email protected]
spaces = indentSpaces(0);
print(paste(spaces, "This output is not indented..."));
spaces = indentSpaces(1);
print(paste(spaces, "...while this one is."))
Merge 2 clusters into 1.
merge2Clusters(labels, mainClusterLabel, minorClusterLabel)
labels |
a vector or factor giving the cluster labels |
mainClusterLabel |
label of the first merged cluster. The merged cluster will have this label. |
minorClusterLabel |
label of the second merged cluster. |
A vector or factor of the merged labels.
Bin Zhang and Peter Langfelder
options(stringsAsFactors = FALSE);
# Works with character labels:
labels = c(rep("grey", 5), rep("blue", 2), rep("red", 3))
merge2Clusters(labels, "blue", "red")
# Works with factor labels:
labelsF = factor(labels)
merge2Clusters(labelsF, "blue", "red")
# Works also with numeric labels:
labelsN = as.numeric(factor(labels))
labelsN
merge2Clusters(labelsN, 1, 3)
Passes all its arguments unchanged to the standard print function; after the execution of print it flushes the console, if possible.
printFlush(...)
... |
Arguments to be passed to the standard |
Returns the value of the print function.
Peter Langfelder, [email protected]