Algorithms implementing populations of agents that interact with one another and sense their environment may exhibit emergent behavior such as selforganization and swarm intelligence. Here, a swarm system called Databionic swarm (DBS) is introduced which was published in Thrun, M.C., Ultsch A.: "Swarm Intelligence for SelfOrganized Clustering" (2020), Artificial Intelligence, <DOI:10.1016/j.artint.2020.103237>. DBS is able to adapt itself to structures of highdimensional data such as natural clusters characterized by distance and/or density based structures in the data space. The first module is the parameterfree projection method called Pswarm (Pswarm()), which exploits the concepts of selforganization and emergence, game theory, swarm intelligence and symmetry considerations. The second module is the parameterfree highdimensional data visualization technique, which generates projected points on the topographic map with hypsometric tints defined by the generalized Umatrix (GeneratePswarmVisualization()). The third module is the clustering method itself with noncritical parameters (DBSclustering()). Clustering can be verified by the visualization and vice versa. The term DBS refers to the method as a whole. It enables even a nonprofessional in the field of data mining to apply its algorithms for visualization and/or clustering to data sets with completely different structures drawn from diverse research fields. The comparison to common projection methods can be found in the book of Thrun, M.C.: "Projection Based Clustering through SelfOrganization and Swarm Intelligence" (2018) <DOI:10.1007/9783658205409>.
For a brief introduction to DatabionicSwarm please see the vignette Short Intro to the Databionic Swarm (DBS). The license is CC BYNCSA 4.0.
Index of help topics:
DBSclustering Databonic swarm clustering (DBS) DatabionicSwarmpackage Swarm Intelligence for SelfOrganized Clustering DefaultColorSequence Default color sequence for plots Delaunay4Points Adjacency matrix of the delaunay graph for BestMatches of Points Delta3DWeightsC Intern function DijkstraSSSP Internal function: Dijkstra SSSP GeneratePswarmVisualization Generates the Umatrix for Pswarm algorithm Hepta Hepta is part of the Fundamental Clustering Problem Suit (FCPS) [Thrun/Ultsch, 2020]. Lsun3D Lsun3D is part of the Fundamental Clustering Problem Suit (FCPS) [Thrun/Ultsch, 2020]. ProjectedPoints2Grid Transforms ProjectedPoints to a grid Pswarm A Swarm of Databots based on polar coordinates (Polar Swarm). PswarmCurrentRadiusC2botsPositive intern function, do not use yourself RelativeDifference Relative Difference ShortestGraphPathsC Shortest GraphPaths = geodesic distances UniquePoints Unique Points findPossiblePositionsCsingle Intern function, do not use yourself getCartesianCoordinates Intern function: Transformation of Databot indizes to coordinates getUmatrix4Projection depricated! see GeneralizedUmatrix() Generalisierte UMatrix fuer Projektionsverfahren plotSwarm Intern function for plotting during the Pswarm annealing process rDistanceToroidCsingle Intern function for 'Pswarm' sESOM4BMUs Intern function: Simplified Emergent SelfOrganizing Map setGridSize Sets the grid size for the Pswarm algorithm setPolarGrid Intern function: Sets the polar grid setRmin Intern function: Estimates the minimal radius for the Databot scent setdiffMatrix setdiffMatrix shortens Matrix2Curt by those rows that are in both matrices. trainstepC Internal function for sESOM
Further information is available in the following vignettes:
DatabionicSwarm 
Short Intro to the Databionic Swarm (DBS) (source, pdf) 
For interactive Island Generation of a generalized Umatrix
see interactiveGeneralizedUmatrixIsland
function in the package ProjectionBasedClustering.
If you want to verifiy your clustering result externally, you can use Heatmap
or SilhouettePlot
of the CRAN package DataVisualizations.
Michal Thrun
Maintainer: Michael Thrun <m.thrun@gmx.net>
[Thrun/Ultsch, 2021] Thrun, M. C., and Ultsch, A.: Swarm Intelligence for SelfOrganized Clustering, Artificial Intelligence, Vol. 290, pp. 103237, doi:10.1016/j.artint.2020.103237, 2021.
[Thrun/Ultsch, 2021] Thrun, M. C., & Ultsch, A.: Swarm Intelligence for SelfOrganized Clustering (Extended Abstract), in Bessiere, C. (Ed.), 29th International Joint Conference on Artificial Intelligence (IJCAI), Vol. IJCAI20, pp. 5125–5129, doi:10.24963/ijcai.2020/720, Yokohama, Japan, Jan., 2021.
[Thrun/Ultsch, 2020] Thrun, M. C., & Ultsch, A.: Uncovering HighDimensional Structures of Projections from Dimensionality Reduction Methods, MethodsX, Vol. 7, pp. 101093, DOI doi:10.1016/j.mex.2020.101093, 2020.
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
[Ultsch/Thrun, 2017] Ultsch, A., & Thrun, M. C.: Credible Visualizations for Planar Projections, in Cottrell, M. (Ed.), 12th International Workshop on SelfOrganizing Maps and Learning Vector Quantization, Clustering and Data Visualization (WSOM), IEEE Xplore, France, 2017.
[Thrun et al., 2016] Thrun, M. C., Lerch, F., Loetsch, J., & Ultsch, A.: Visualization and 3D Printing of Multivariate Data of Biomarkers, in Skala, V. (Ed.), International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision (WSCG), Vol. 24, Plzen, http://wscg.zcu.cz/wscg2016/short/A43full.pdf, 2016.
Successfully used in
[Thrun et al., 2018] Thrun, M. C., Breuer, L., & Ultsch, A. : Knowledge discovery from lowfrequency stream nitrate concentrations: hydrology and biology contributions, Proc. European Conference on Data Analysis (ECDA), pp. 4647, Paderborn, Germany, 2018.
[WeyerMenkhoff et al., 2018] WeyerMenkhoff, I., Thrun, M. C., & Loetsch, J.: Machinelearned analysis of quantitative sensory testing responses to noxious cold stimulation in healthy subjects, European Journal of Pain, Vol. 22(5), pp. 862874, DOI doi:10.1002/ejp.1173, 2018.
[Kringel et al., 2018] Kringel, D., Geisslinger, G., Resch, E., Oertel, B. G., Thrun, M. C., Heinemann, S., & Loetsch, J. : Machinelearned analysis of the association of nextgeneration sequencing based human TRPV1 and TRPA1 genotypes with the sensitivity to heat stimuli and topically applied capsaicin, Pain, Vol. 159 (7 ), pp. 13661381, DOI doi:10.1097/j.pain.0000000000001222, 2018
[Thrun, 2019] Thrun, M. C.: : Cluster Analysis of Per Capita Gross Domestic Products, Entrepreneurial Business and Economics Review (EBER), Vol. 7(1), pp. 217231, DOI: doi:10.15678/EBER.2019.070113, 2019.
[LopezGarcia et al., 2020] LopezGarcia, P., Argote, D. L., & Thrun, M. C.: Projectionbased Classification of Chemical Groups and Provenance Analysis of Archaeological Materials, IEEE Access, Vol. 8, pp. 152439152451, DOI doi:10.1109/ACCESS.2020.3016244, 2020.
data('Lsun3D')
##2d projection, without instant visualization of steps
#Alternative I:
#DistanceMatrix hast to be defined by the user.
InputDistances=as.matrix(dist(Lsun3D$Data))
projection=Pswarm(InputDistances)
#2d projection, with instant visualization
## Not run:
#Alternative II: DataMatrix, Distance is Euclidean per default
projection=Pswarm(Lsun3D$Data,Cls=Lsun3D$Cls,PlotIt=T)
## End(Not run)
#
##Computation of Generalized Umatrix
# If Non Euclidean Distances are used, Please Use \code{MDS}
# from the ProjectionBasedClustering package with the correct OutputDimension
# to generate a new DataMatrix from the distances (see SheppardDiagram
# or KruskalStress)
genUmatrixList=GeneratePswarmVisualization(Data = Lsun3D$Data,
projection$ProjectedPoints,projection$LC)
## Visualizuation of GenerelizedUmatrix,
# Estimation of the Number of Clusters=Number of valleys
library(GeneralizedUmatrix)#install if not installed
GeneralizedUmatrix::plotTopographicMap(genUmatrixList$Umatrix,genUmatrixList$Bestmatches)
## Automatic Clustering
# number of Cluster from dendrogram (PlotIt=TRUE) or visualization
Cls=DBSclustering(k=3, Lsun3D$Data,
genUmatrixList$Bestmatches, genUmatrixList$LC,PlotIt=FALSE)
# Verification, often its better to mark Outliers manually
GeneralizedUmatrix::plotTopographicMap(genUmatrixList$Umatrix,genUmatrixList$Bestmatches,Cls)
## Not run:
# To generate the 3D landscape in the shape of an island
# from the toroidal topograpic map visualization
# you may cut your island interactivly around high mountain ranges
Imx = ProjectionBasedClustering::interactiveGeneralizedUmatrixIsland(genUmatrixList$Umatrix,
genUmatrixList$Bestmatches,Cls)
GeneralizedUmatrix::plotTopographicMap(genUmatrixList$Umatrix,
genUmatrixList$Bestmatches, Cls=Cls,Imx = Imx)
## End(Not run)
## Not run:
library(ProjectionBasedClustering)#install if not installed
Cls2=ProjectionBasedClustering::interactiveClustering(genUmatrixList$Umatrix,
genUmatrixList$Bestmatches, Cls)
## End(Not run)
DBS is a flexible and robust clustering framework that consists
of three independent modules. The first module is the parameterfree
projection method Pswarm Pswarm
, which exploits the concepts of
selforganization and emergence, game theory, swarm intelligence and symmetry
considerations [Thrun/Ultsch, 2021]. The second module is a parameterfree
highdimensional data visualization technique, which generates projected points
on a topographic map with hypsometric colors
GeneratePswarmVisualization
, called the generalized Umatrix.
The third module is a clustering method with no sensitive parameters
DBSclustering
(see [Thrun, 2018, p. 104 ff]). The clustering can
be verified by the visualization and vice versa. The term DBS refers to the
method as a whole.
The DBSclustering
function applies the automated Clustering
approach of the Databonic swarm using abstract U distances, which are the
geodesic distances based on highdimensional distances combined with low
dimensional graph paths by using ShortestGraphPathsC
.
DBSclustering(k, DataOrDistance, BestMatches, LC, StructureType = TRUE,
PlotIt = FALSE, ylab,main, method = "euclidean",...)
k 
number of clusters, how many to you see in the topographic map (3D landscape)? 
DataOrDistance 
Either [1:n,1:d] Matrix of Data (n cases, d dimensions) that will be used. One DataPoint per row or symmetric Distance matrix [1:n,1:n] 
BestMatches 
[1:n,1:2] Matrix with positions of Bestmatches or ProjectedPoints, one matrix line per data point 
LC 
grid size c(Lines,Columns), please see details 
StructureType 
Optional, bool; = TRUE: compact structure of clusters assumed, =FALSE: connected structure of clusters assumed. For the two options for Clusters, see [Thrun, 2018] or Handl et al. 2006 
PlotIt 
Optional, bool, Plots Dendrogramm 
ylab 
Optional, character vector, ylabel of dendrogramm 
main 
Optional, character vctor, title of dendrogramm 
method 
Optional, one of 39 distance methods of 
... 
Further arguments passed on to the 
The input of the LC
parameter depends on the choice of Bestmatches
input argument. Usually as the name of the argument states, the Bestmatches of
the GeneratePswarmVisualization
function are used which is define
in the notation of selforganizing map. In this case please see example one.
However, as written above, clustering and visualization can be applied
independently of each other. In this case the places of Lines L and Columns C
are switched because Lines is a value slightly above the maximum of the xcoordinates and Columns is a value slightly above the maximum of the ycoordinates of ProjectedPoint.
Hence, one should give DBSclustering
the argument
LC[2,1]
as shown in example 2.
Often it is better to mark the outliers manually after the prozess of
clustering and sometimes a clustering can be improved through human interaction
[Thrun/Ultsch,2017] <DOI:10.13140/RG.2.2.13124.53124>; use in this case the
visualization plotTopographicMap
of the
package GeneralizedUmatrix. If you would like to mark the outliers interactivly
in the visualization use the ProjectionBasedClustering package with the
function interactiveClustering()
, or for full interactive clustering
IPBC()
. The package is available on CRAN. An example is shown in case
of interactiveClustering()
function in the third example.
[1:n] numerical vector of numbers defining the classification as the main output
of this cluster analysis for the n cases of data corresponding to the n
bestmatches. It has k unique numbers representing the arbitrary labels of the
clustering. You can use plotTopographicMap(Umatrix,Bestmatches,Cls)
for
verification.
If you want to verifiy your clustering result externally, you can use
Heatmap
or SilhouettePlot
of the package DataVisualizations
available on CRAN.
Michael Thrun
[Thrun/Ultsch, 2021] Thrun, M. C., and Ultsch, A.: Swarm Intelligence for SelfOrganized Clustering, Artificial Intelligence, Vol. 290, pp. 103237, doi:10.1016/j.artint.2020.103237, 2021.
data("Lsun3D")
Data=Lsun3D$Data
InputDistances=as.matrix(dist(Data))
projection=Pswarm(InputDistances)
## Example One
genUmatrixList=GeneratePswarmVisualization(Data,
projection$ProjectedPoints,projection$LC)
Cls=DBSclustering(k=3, Data,
genUmatrixList$Bestmatches, genUmatrixList$LC,PlotIt=TRUE)
## Example Two
#automatic Clustering without GeneralizedUmatrix visualization
Cls=DBSclustering(k=3, Data,
projection$ProjectedPoints, projection$LC[c(2,1)],PlotIt=TRUE)
## Not run:
## Example Three
## Sometimes an automatic Clustering can be improved
## thorugh an interactive approach,
## e.g. if Outliers exist (see [Thrun/Ultsch, 2017])
library(ProjectionBasedClustering)
Cls2=ProjectionBasedClustering::interactiveClustering(genUmatrixList$Umatrix,
genUmatrixList$Bestmatches, Cls)
## End(Not run)
Defines the default color sequence for plots made within the Projections package.
data("DefaultColorSequence")
A vector with 562 different strings describing colors for plots.
Calculates the adjacency matrix of the delaunay graph for BestMatches (BMs) in tiled form if BestMatches are located on a toroid grid.
Delaunay4Points(Points, IsToroid = TRUE,LC,PlotIt=FALSE,
Gabriel=FALSE)
Points 
[1:n,1:3] matrix containing the BMKey, X and Y coordinates of the n, BestMatches NEED NOT to be UNIQUE, however, there is an edge in the Deaunay between duplicate points! 
IsToroid 
Optional, logical, indicating if BM's are on a toroid grid. Default is True 
LC 
Optional, A vector of length 2, containing the number of lines and columns of the Grid. Lines is a value slightly above the maximum of the xcoordinates and Columns is a value slightly above the maximum of the ycoordinates of Points. 
PlotIt 
Optional, bool, Plots the graph 
Gabriel 
Optional, bool, default: FALSE, If TRUE: calculates the gabriel graph instead of the delaunay graph 
Delaunay[1:n,1:n] adjacency matrix of the DelaunayGraph
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
Implementation of the main equation for SOM, ESOM or the sESOM algorithms
Delta3DWeightsC(vx,Datasample)
vx 
array of weights [1:Lines,1:Columns,1:Weights] 
Datasample 
NumericVector of one Datapoint[1:n] 
intern function in case of ComputeInR==FALSE
in GeneratePswarmVisualization
,
see chapter 5.3 of [Thrun, 2018] for generalized Umatrix and especially the sESOM4BMUs
algorithm.
modified array of weights [1:Lines,1:Columns,1:]
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
Dijkstra's SSSP (Single source shortest path) algorithm:
gets the shortest path (geodesic distance) from source vertice(point) to all other vertices(points) defined by the edges of the adjasency matrix
DijkstraSSSP(Adj, Costs, source)
Adj 
[1:n,1:n] 0/1 adjascency matrix, e.g. from delaunay graph or gabriel graph 
Costs 
[1:n,1:n] matrix, distances between n points (normally euclidean) 
source 
integer vertice(point) from which to calculate the geodesic distance to all other points 
Preallocating space for DataStructures accordingly to the maximum possible number of vertices which is fixed set at
the number 10001.
This is an internal function of ShortestGraphPathsC
, no errors or misusage is caught here.
ShortestPaths[1:n] vector, shortest paths (geodesic) to all other vertices including the source vertice itself
runs in O(E*Log(V))
Michael Thrun
uses a changed code which is inspired by Shreyans Sheth 28.05.2015, see https://ideone.com/qkmt31
Finds all possible jumping position regarding a grid anda Radius for DataBots
findPossiblePositionsCsingle(RadiusPositionsschablone,
jumplength, alpha, Lines)
RadiusPositionsschablone 
NumericMatrix, see 
jumplength 
double radius of databots regarding neighborhood, they can jump to 
alpha 
double, zu streichen 
Lines 
double, jumpinglength has to smaller than Lines/2 and Lines/2 has to yield to a integer number. 
Algorithm is described in [Thrun, 2018, p. 95, Listing 8.1].
OpenPositions 
NumericMatrix, indizes of open positions 
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
DBS is a flexible and robust clustering framework that consists of three
independent modules. The first module is the parameterfree projection method
Pswarm Pswarm
, which exploits the concepts of selforganization
and emergence, game theory, swarm intelligence and symmetry considerations. The
second module is a parameterfree highdimensional data visualization technique,
which generates projected points on a topographic map with hypsometric colors
GeneratePswarmVisualization
, called the generalized Umatrix. The
third module is a clustering method with no sensitive parameters
DBSclustering
. The clustering can be verified by the visualization
and vice versa. The term DBS refers to the method as a whole.
The GeneratePswarmVisualization
function generates the special
case (please see [Thrun, 2018]) of the generalized Umatrix with the help of an
unsupervised neural network (simplified emergent selforganizing map published
in [Thrun/Ultsch, 2020]). From the generalized Umatrix a topographic map with
hypsometric tints can be visualized. To see this visualization use
plotTopographicMap
of the package
GeneralizedUmatrix.
GeneratePswarmVisualization(Data,ProjectedPoints,LC,PlotIt=FALSE,
ComputeInR=FALSE,Parallel=TRUE)
Data 
[1:n,1:d] array of data: n cases in rows, d variables in columns 
ProjectedPoints 
matrix, ProjectedPoints[1:n,1:2] n by 2 matrix
containing coordinates of the Projection: A matrix of the fitted configuration.
See output of 
LC 
size of the grid c(Lines,Columns), number of Lines and Columns
automatic calculated by Sometimes is better to choose a different grid size, e.g. to to reduce computional effort contrary to SOM, here the grid size defined only the resolution of the visualizations. The real grid size is predefined by Pswarm, but you may choose a factor x*res$LC if you so desire. Therefore, The resulting grid size is given back in the Output. 
PlotIt 
Optional, default(FALSE), If TRUE than uses

ComputeInR 
Optional, =TRUE: Rcode, =FALSE C++ implementation 
Parallel 
Optional, =TRUE: Parallel C++ implementation, =FALSE C++ implementation 
Tiled: The topographic map is visualized 4 times because the projection is toroidal. The reason is that there are no border in the visualizations and clusters (if they exist) are not disrupted by borders of the plot.
If you used Pswarm
with distance matrix instead of a data matrix
(in the sense that you do not have any data matrix available), you may transform
your distances into data by using MDS
of the
ProjectionBasedClustering package in order to use the
GeneratePswarmVisualization
function. The correct dimension can be
found through the Sheppard diagram or kruskals stress.
list of
Bestmatches 
Numeric matrix [1:n,1:2], BestMatches of the Umatrix, contrary to ESOM they are always fixed, because predefined by GridPoints. 
Umatrix 
Numeric matrix [1:Lines,1:Columns], 
WeightsOfNeurons 
Numeric 3D array [1:Lines,1:Columns,1:d], d is the dimension of the weights, the same as in the ESOM algorithm 
GridPoints 
Integer matrix [1:n,1:2], quantized projected points: projected points now lie on a predefined grid. 
LC 
c(Lines,Columns), normally equal to grid size of Pswarm, sometimes it a better or a lower resolution for the visualization is better. Therefore here the grid size of the neurons is given back. 
PlotlyHandle 
If PlotIt=FALSE: NULL, otherwise plotly object for ploting topview of topographic map. 
If you used pswarm with distance matrix instead of a data matrix you can mds
transform your distances into data (see the MDS
function of the
ProjectionBasedClustering package.). The correct dimension can be found through
the Sheppard diagram or kruskals stress.
The extraction of an island out of the generalized Umatrix can be performed
using the interactiveGeneralizedUmatrixIsland
function in the package
ProjectionBasedClustering.
The main code of both functions GeneralizedUmatrix
and
GeneratePswarmVisualization
is the same C++ function
sESOM4BMUs
which is described in [Thrun/Ultsch, 2020].
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
[Thrun/Ultsch, 2020] Thrun, M. C., & Ultsch, A.: Uncovering HighDimensional Structures of Projections from Dimensionality Reduction Methods, MethodsX, Vol. 7, pp. 101093, doi:10.1016/j.mex.2020.101093, 2020.
Pswarm
and
plotTopographicMap
and
GeneralizedUmatrix
of the package
GeneralizedUmatrix
data("Lsun3D")
Data=Lsun3D$Data
Cls=Lsun3D$Cls
InputDistances=as.matrix(dist(Data))
projList=Pswarm(InputDistances)
genUmatrixList=GeneratePswarmVisualization(Data,
projList$ProjectedPoints,projList$LC,
Parallel=FALSE)#CRAN guidelines do not allow =TRUE for testing
library(GeneralizedUmatrix)
plotTopographicMap(genUmatrixList$Umatrix,genUmatrixList$Bestmatches,Cls)
Transforms Databot indizes to exact cartesian coordinates on an toroid two dimensional grid.
DataBotsPos 
[1:N] complex vector Two Indizes per Databot describing its positions in an two dimensional grid 
GridRadius 
[Lines,Columns] Radii Matrix of all possible Positions of
DataBots in Grid, see also documentation of 
GridAngle 
[Lines,Columns] Angle Matrix of all possible Positions of
DataBots in Grid, see also documentation of 
Lines 
Defines Size of planar toroid two dimensional grid 
Columns 
Defines Size of planar toroid two dimensional grid 
QuadOrHexa 
Optional, FALSE=If DataPos on hexadiagonal grid, round to 2 decimals after value, Default=TRUE 
Transformation is described in [Thrun, 2018, p. 93].
BestMatchingUnits[1:N,2] coordinates on an two dimensional grid for each
databot excluding unique key, such that by using
GeneratePswarmVisualization
a visualization of the Pswarm
projection is possible
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
depricated! see GeneralizedUmatrix()
getUmatrix4Projection(Data,ProjectedPoints,
PlotIt=TRUE,Cls=NULL,toroid=T,Tiled=F,ComputeInR=F)
Data 
[1:n,1:d] Numeric matrix: n cases in rows, d variables in columns 
ProjectedPoints 
[1:n,2]n by 2 matrix containing coordinates of the Projection: A matrix of the fitted configuration. 
PlotIt 
Optional,bool, defaut=FALSE, if =TRUE: UMarix of every current Position of Databots will be shown 
Cls 
Optional, For plotting, see 
toroid 
Optional, Default=FALSE, ==FALSE planar computation ==TRUE: toroid borderless computation, set so only if projection method is also toroidal 
Tiled 
Optional,For plotting see 
ComputeInR 
Optional, =T: Rcode, =F Cpp Code 
List with
Umatrix 
[1:Lines,1:Columns] (see 
EsomNeurons 
[Lines,Columns,weights] 3dimensional numeric array (wide format), not wts (long format) 
Bestmatches 
[1:n,OutputDimension] GridConverted Projected Points information converted by convertProjectionProjectedPoints() to predefined Grid by Lines and Columns 
gplotres 
Ausgabe von ggplot 
unbesetztePositionen 
Umatrix[unbesetztePositionen] = NA 
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, ISBN: 9783658205393, Heidelberg, 2018.
data("Lsun3D")
Data=Lsun3D$Data
Cls=Lsun3D$Cls
InputDistances=as.matrix(dist(Data))
res=cmdscale(d=InputDistances, k = 2, eig = TRUE, add = FALSE, x.ret = FALSE)
ProjectedPoints=as.matrix(res$points)
# Stress = KruskalStress(InputDistances, as.matrix(dist(ProjectedPoints)))
#resUmatrix=GeneralizedUmatrix(Data,ProjectedPoints)
#plotTopographicMap(resUmatrix$Umatrix,resUmatrix$Bestmatches,Cls)
clearly defined clusters, different variances
data("Hepta")
Size 212, Dimensions 3, stored in Hepta$Data
Classes 7, stored in Hepta$Cls
[Thrun/Ultsch, 2020] Thrun, M. C., & Ultsch, A.: Clustering Benchmark Datasets Exploiting the Fundamental Clustering Problems, Data in Brief,Vol. 30(C), pp. 105501, DOI 10.1016/j.dib.2020.105501 , 2020.
data(Hepta)
str(Hepta)
clearly defined clusters, different variances
data("Lsun3D")
Size 404, Dimensions 3
Dataset defined discontinuites, where the clusters have different variances. Three main Clusters, and four Outliers (in Cluster 4). See for a more detailed description in [Thrun, 2018].
[Thrun/Ultsch, 2020] Thrun, M. C., & Ultsch, A.: Clustering Benchmark Datasets Exploiting the Fundamental Clustering Problems, Data in Brief,Vol. 30(C), pp. 105501, DOI 10.1016/j.dib.2020.105501 , 2020.
data(Lsun3D)
str(Lsun3D)
Cls=Lsun3D$Cls
Data=Lsun3D$Data
Intern function, generates a scatter plot of the progess of the Pswarm algorithm
after every nash equlibirum. Every point symbolizes a Databot. If a prior
classification is given (Cls
) then the Databots have the colors defined
by the class labels.
plotSwarm(Points,Cls,xlab,ylab,main)
Points 
ProjectedPoints or DataBot positions in cartesian coordinates 
Cls 
optional, Classification as a numeric vector, if given 
xlab 
='X', optional, string 
ylab 
='Y', optional, string 
main 
="DataBots", optional, string 
Michael Thrun
Pswarm
with PlotIt
=TRUE
quantized xy cartesianncoordinates of ProjectedPoints
ProjectedPoints2Grid(ProjectedPoints, Lines, Columns,PlotIt=FALSE, Cls)
ProjectedPoints 
[1:n,1:2] numeric matrix of cartesian xy coordinates 
Lines 
double, length of small side of the rectangular grid 
Columns 
double, length of big side of the rectangular grid 
PlotIt 
optional, bool, shows the result if TRUE 
Cls 
[1:n] numeric vector of classes for each projected point 
intern function, described in [Thrun, 2018, p.47]
BestMatches[1:n,1:3] columns in order: Key,Lines,Columns
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
This projetion method is a part of the databionic swarm which uses the nash equlibrium [Thrun/Ultsch, 2021]. Using polar coordinates for agents (here Databots) in two dimensions has many advantages, for further details see [Thrun, 2018] and [Thrun/Ultsch, 2021].
Pswarm(DataOrDistance,PlotIt=FALSE,Cls=NULL,Silent=TRUE,
Debug=FALSE,LC=c(NULL,NULL),method= "euclidean",Parallel=FALSE,...)
DataOrDistance 
Numeric matrix [1:n,1:n]: symmetric matrix of dissimilarities, if variable unsymmetric (Numeric matrix [1:d,1:n]) it is assumed as a dataset and the euclidean distances are calculated of d variables and n cases. 
PlotIt 
Optional, bool, default=FALSE, If =TRUE, Plots the projection during the computation prozess after every nash equlibirum. 
Cls 
Optional, numeric vector [1:n], given Classification in numbers, only for plotting if PlotIt=TRUE, irrelevant for computations. 
Silent 
Optional, bool, default=FALSE, If =TRUE results in various console messages 
Debug 
Optional, Debug, default=FALSE, =TRUE results in various console messages, depricated for CRAN, because cout is not allowed. 
LC 
Optional, grid size c(Lines, Columns), sometimes it is better to call

method 
Optional, one of 39 distance methods of 
Parallel 
Optional, =TRUE: Parallel C++ implementation, =FALSE C++ implementation 
... 
Further arguments passed on to the 
DBS is a flexible and robust clustering framework that consists of three
independent modules. The first module is the parameterfree projection method
Pswarm Pswarm
, which exploits the concepts of selforganization
and emergence, game theory, swarm intelligence and symmetry considerations. The
second module is a parameterfree highdimensional data visualization technique,
which generates projected points on a topographic map with hypsometric colors
GeneratePswarmVisualization
, called the generalized Umatrix. The
third module is a clustering method with no sensitive parameters
DBSclustering
. The clustering can be verified by the visualization
and vice versa. The term DBS refers to the method as a whole.
List with
ProjectedPoints 
[1:n,1:2] xy cartesian coordinates of projection 
LC 
number of Lines and Columns in c(Lines,Columns). Lines is a value slightly above the maximum of the xcoordinates and Columns is a value slightly above the maximum of the ycoordinates of ProjectedPoints 
Control 
List, only for intern debugging 
LC is now automatically estimated; LC is the size of the grid c(Lines,Columns), number of Lines and Columns, default c(NULL,NULL) and automatic calculation by setGridSize
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
[Thrun/Ultsch, 2021] Thrun, M. C., and Ultsch, A.: Swarm Intelligence for SelfOrganized Clustering, Artificial Intelligence, Vol. 290, pp. 103237, doi:10.1016/j.artint.2020.103237, 2021.
data("Lsun3D")
Data=Lsun3D$Data
Cls=Lsun3D$Cls
InputDistances=as.matrix(dist(Data))
#If not called separately setGridSize() is called in Pswarm
LC=setGridSize(InputDistances)
res=Pswarm(InputDistances,LC=LC,Cls=Cls,PlotIt=TRUE)
Finds the weak Nash equilibirium for DataBots in one epoch(Radius), requires the
setting of constants, grid, and so on in Pswarm
PswarmCurrentRadiusC2botsPositive( AllDataBotsPosOld, Radius, DataDists,
IndPossibleDBPosR, RadiusPositionsschablone, pp, Nullpunkt, Lines, Columns,
nBots, limit, steigungsverlaufind, StressConstAditiv, debug)
AllDataBotsPosOld 
ComplexVector [1:n,1], DataBots position in the last NashEqulibriuum 
Radius 
double, Radius of payoff function, neighborhood, where other DatsBots can be smelled 
DataDists 
NumericMatrix, Inputdistances[1:n,1:n] 
IndPossibleDBPosR 
ComplexVector, see output of

RadiusPositionsschablone 
NumericMatrix, see 
pp 
NumericVector, number of jumping simultaneously DataBots of one epoch (per nashequilibirum), this vector is linearly monotonically decreasing 
Nullpunkt 
NumericVector, equals

Lines 
double, small edge length of rectangulare grid 
Columns 
double, big edge length of rectangulare grid 
nBots 
double, intern constant, equals 
limit 
int, intern constant, equals 
steigungsverlaufind 
int, intern constant 
StressConstAditiv 
double, intern constant, sum of payoff of all databots in random condition before the algorithm starts 
debug 
optional, bool: If TRUE prints status every 100 iterations 
Algorithm is described in [Thrun, 2018, p. 95, Listing 8.1].
list of
AllDataBotsPos 
ComplexVector, indizes of DataBot Positions after a weak Nash equlibrium is found 
stressverlauf 
NumericVector, intern result, for debugging only 
fokussiertlaufind 
NumericVector, intern result, for debugging only 
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
Pswarm
toroid distance calculation
rDistanceToroidCsingle(AllDataBotsPosX, AllDataBotsPosY, AllallowedDBPosR0,
Lines, Columns, Nullpunkt)
AllDataBotsPosX 
NumericVector [1:n,1], positions of on grid 
AllDataBotsPosY 
NumericVector [1:n,1], positions of on grid 
AllallowedDBPosR0 
NumericMatrix 
Lines 
double 
Columns 
double 
Nullpunkt 
NumericVector 
Part of the algorithm described in [Thrun, 2018, p. 95, Listing 8.1].
numeric matrix of toroid Distances[1:n,1:n]
do not use yourself
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
Calculates the difference between positive x and y values
RelativeDifference(X, Y, epsilon = 10^10,na.rm=FALSE)
X 
either a value or numerical vector of [1:n] 
Y 
either a value or numerical vector of [1:n] 
epsilon 
Optional, If both x and y are approximatly zero the output is also zero 
na.rm 
Optional, function does not work with non finite values. If these cases should be automatically removed, set parameter TRUE 
Contrary to other approaches in this cases the range of values lies between
[2,2]. The approach is only valid for positive values ofX
and Y
.
The realtive difference R
is defined with
$R=\frac{YX}{0.5*(X+Y)}$
Negative value indicate that X
is higher than Y
and positive
values that X
is lower than Y
.
R
It can be combined with the GabrielClassificationError
if a clear baseline is defined.
Michael Thrun
Ultsch, A.: Is Log Ratio a Good Value for Measuring Return in Stock Investments? GfKl 2008, pp, 505511, 2008.
x=c(1:5)
y=runif(5,min=1,max=10)
RelativeDifference(x,y)
Intern function for the simplified ESOM (sESOM) algorithm for fixed BestMatchingUnits.
sESOM4BMUs(BMUs,Data, esom, toroid, CurrentRadius,
ComputeInR=FALSE,Parallel=TRUE)
BMUs 
[1:Lines,1:Columns], BestMAtchingUnits generated by ProjectedPoints2Grid() 
Data 
[1:n,1:d] array of data: n cases in rows, d variables in columns 
esom 
[1:Lines,1:Columns,1:weights] array of NeuronWeights, see ListAsEsomNeurons() 
toroid 
TRUE/FALSE  topology of points 
CurrentRadius 
number betweeen 1 to x 
ComputeInR 
=TRUE: Rcode, =FALSE: Cpp 
Parallel 
Optional, =TRUE: Parallel C++ implementation, =FALSE C++ implementation 
Algorithm is described in [Thrun, 2018, p. 48, Listing 5.1].
esom 
numeric array [1:Lines,1:Columns,1:d], d is the dimension of the weights, the same as in the ESOM algorithm. modified esomneuros regarding a predefined neighborhood defined by a radius 
Usually not for seperated usage!
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
setdiffMatrix shortens Matrix2Curt by those rows that are in both matrices.
Matrix2Curt 
[n,k] matrix, which will be shortened by x rows 
Matrix2compare 
[m,k] matrix whose rows will be compared to those of Matrix2Curt x rows in Matrix2compare equal rows of Matrix2Curt (order of rows is irrelevant). Has the same number of columns as Matrix2Curt. 
V$CurtedMatrix[nx,k] Shortened Matrix2Curt
CL,MT 12/2014
Automatically sets the size of the grid, formula see [Thrun, 2018, p. 9394].
setGridSize(InputDistances,minp=0.01,maxp=0.99,alpha=4)
InputDistances 
[1:n,1:n] symmetric matrix of input distances 
minp 
default value: 0.01,see 
maxp 
default value: 0.99, see 
alpha 
Do not change! Intern parameter, Only if Java Version of Pswarm instead of C++ version is used. 
grid is set such that minimum and maximum distances can be shown on the grid
LC=c(Lines,Columns) size of the grid for Pswarm
Michael Thrun, Florian Lerch
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
automatic choice of LC for Pswarm
data("Lsun3D")
Data=Lsun3D$Data
Cls=Lsun3D$Cls
InputDistances=as.matrix(dist(Data))
#If not called separately setGridSize() is called in Pswarm
LC=setGridSize(InputDistances)
Sets a polar grid for a swarm in an rectangular shape
setPolarGrid(Lines,Columns,QuadOrHexa,PlotIt,global)
Lines 
Integer, hast to be able to be divided by 2 
Columns 
Integer, with Columns>=Lines 
QuadOrHexa 
bool, default(TRUE) If False Hexagonal grid, default quad grid 
PlotIt 
bool, default(FALSE) 
global 
bool, default(TRUE), intern parameter, how shall the radii be calculated? 
Part of the Algorithm described in [Thrun, 2018, p. 95, Listing 8.1].
list of
GridRadii 
matrix [1:Lines,1:Columns], Radii Matrix of all possible Positions of DataBots in Grid 
GridAngle 
matrix [1:Lines,1:Columns], Angle Matrix of all possible Positions of DataBots in Grid 
AllallowedDBPosR0 
matrix [1:Lines+1,1:Columns+1], Matrix of radii in polar coordinates respecting origin (0,0) of all allowed DataBots Positions in one jump 
AllallowedDBPosPhi0 
matrix [1:Lines+1,1:Columns+1], # V$AllallowedDBPosPhi0[Lines+1,Lines+1] Matrix of angle in polar coordinates respecting origin (0,0) of all allowed DataBots Positions in one jump 
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
estimates the minimal radius on apolar grid in the automated annealing process of Pswarm, details of how can be read in [Thrun, 2018, p. 97]
Lines 
xvalue determining the size of the map, i.e. how many open places for DataBots will be available on the 2dimensional grid BEWARE: has to be able to be divided by 2 
Columns 
yvalue determining the size of the map, i.e. how many open places for DataBots will be available on the 2dimensional grid Columns>Lines 
AllallowedDBPosR0 
[1:Lines+1,1:Lines+1]Matrix of radii in polar coordinates respecting origin (0,0) of all allowed DataBots Positions in one jump 
p 
percent of gitterpositions, which should be considered 
Rmin Minimum Radius
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
Dijkstra's SSSP (Single source shortest path) algorithm, from all points to all points
ShortestGraphPathsC(Adj, Cost)
Adj 
[1:n,1:n] 0/1 adjascency matrix, e.g. from delaunay graph or gabriel graph. 
Cost 
[1:n,1:n] matrix, distances between n points (normally euclidean) 
Vertices are the points, edges have the costs defined by weights (normally a distance). The algorithm runs in runs in O(n*E*Log(V)), see also [Jungnickel, 2013, p. 87]. Further details can be foubd in [Jungnickel, 2013, p. 8387] and [Thrun, 2018, p. 12].
ShortestPaths[1:n,1:n] vector, shortest paths (geodesic) to all other vertices including the source vertice itself from al vertices to all vertices, stored as a matrix
require C++11 standard (set flag in Compiler, if not set automatically)
Michael Thrun
[Dijkstra,1959] Dijkstra, E. W.: A note on two problems in connexion with graphs, Numerische mathematik, Vol. 1(1), pp. 269271. 1959.
[Jungnickel, 2013] Jungnickel, D.: Graphs, networks and algorithms, (4th ed ed. Vol. 5), Berlin, Heidelberg, Germany, Springer, ISBN: 9783642322785, 2013.
[Thrun/Ultsch, 2017] Thrun, M.C., Ultsch, A.: Projection based Clustering, Conf. Int. Federation of Classification Societies (IFCS),DOI:10.13140/RG.2.2.13124.53124, Tokyo, 2017.
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
Does the training for fixed bestmatches in one epoch of the sESOM algorithm (see [Thrun, 2018] for details).
trainstepC(vx,vy, DataSampled,BMUsampled,Lines,Columns, Radius, toroid)
vx 
array [1:Lines,1:Columns,1:Weights], WeightVectors that will be trained, internally transformed von NumericVector to cube 
vy 
array [1:Lines,1:Columns,1:2], meshgrid for output distance computation 
DataSampled 
NumericMatrix, n cases shuffled Dataset[1:n,1:d] by

BMUsampled 
NumericMatrix, n cases shuffled BestMatches[1:n,1:2] by

Lines 
double, Height of the grid 
Columns 
double, Width of the grid 
Radius 
double, The current Radius that should be used to define neighbours to the bm 
toroid 
bool, Should the grid be considered with cyclically connected borders? 
Algorithm is described in [Thrun, 2018, p. 48, Listing 5.1].
WeightVectors, array[1:Lines,1:Columns,1:weights] with the adjusted Weights
Usually not for seperated usage!
Michael Thrun
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through SelfOrganization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 9783658205393, doi:10.1007/9783658205409, 2018.
return only the unique points in Datapoints
UniquePoints(Datapoints, Cls, Eps=1e10)
Datapoints 
[1:n,1:d] numeric matrix of Datapoints points of dimension d, the points are in the rows 
Cls 
[1:n] numeric vector of classes for each datapoint. 
Eps 
Optional,scalar above zero that defines minimum nonidentical euclidean distance between two points 
Euclidean distance is computed and used within. Setting Eps
to a very small number results in the identification of unique data points. Setting epsilon to a higher number results in the definition of mesh points within an ddimensional Rball graph.
List with
Unique 
[1:k,1:d] Datapoints points without duplicate points 
UniqueInd 
[1:k] index vector such that Unique == Datapoints[UniqueInd,], it has k nonconsecutive numbers or labels, each label defines a row number within Datapoints[1:n,1:d] of a unique data point 
Uniq2DatapointsInd 
[1:n] index vector. It has k unique index numbers
representing the arbitrary labels. Each labels is mapped uniquely to a point in

NewUniqueInd 
[1:k] index vector stating the index of the newly defined datastructure Unique. 
NewUniq2DataIdx 
[1:k] index vector such that Unique[NewUniq2DataIdx,] == Datapoints[Uniq2DatapointsInd,], it has n nonconsecutive numbers or labels, each label defines a row number within Unique[1:k,1:d] of a unique data point 
IsDuplicate 
[1:n,1:n] matrix,for i!=j IsDuplicate[i,j]== 1 if Datapoints[i,] == Datapoints[j,] IsDuplicate[i,i]==0 
Eps 
Numeric stating the neighborhood radius around unique points. 
Michael Thrun
Datapoints = rbind(c(0,0), c(1,1), c(2,2))
Datapoints2 = rbind(Datapoints, Datapoints+0.001)
Datapoints3 = rbind(Datapoints2, c(1,1)0.001)
Datapoints = rbind(c(0,0), c(0,0.015), c(0,0.01), c(0,0.015))
V1 = UniquePoints(Datapoints = Datapoints, Eps = 0.01)
V2 = UniquePoints(Datapoints = Datapoints2, Eps = 0.01)
V3 = UniquePoints(Datapoints = Datapoints3, Eps = 0.01)