Title: | Longitudinal Data |
---|---|
Description: | Tools for longitudinal data and joint longitudinal data (used by packages kml and kml3d). |
Authors: | Christophe Genolini [cre, aut], Bruno Falissard [ctb], Dai Fang [ctb], Patrice Kiener [ctb], Luke Tierney [ctb] |
Maintainer: | Christophe Genolini <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.4.7 |
Built: | 2024-11-13 06:42:45 UTC |
Source: | CRAN |
longitudinalData
package provide some tools to deal with the clusterization
of longitudinal data.
Package: | longitudinalData |
Type: | Package |
Version: | 2.4.1 |
Date: | 2016-02-02 |
License: | GPL (>= 2) |
LazyData: | yes |
Depends: | methods,clv,rgl,misc3d |
URL: | http://www.r-project.org |
longitudinalData
provide some tools to deal with the clustering of longitudinal data, mainly:
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSM, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] Christophe M. Genolini and Bruno Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] Christophe M. Genolini and Bruno Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
Classes: LongData
, Partition
Methods: longData
, partition
, ordered
Plot: plotTrajMeans
, plotTrajMeans3d
Imputation: imputation
Criterion: qualityCriterion
### Generation of artificial longData data(artificialJointLongData) myData <- longData3d(artificialJointLongData,timeInData=list(var1=2:12,var2=13:23,var3=24:34)) part <- partition(rep(1:3,each=50)) plotTrajMeans3d(myData,part) ### Quality criterion qualityCriterion(myData,part)
### Generation of artificial longData data(artificialJointLongData) myData <- longData3d(artificialJointLongData,timeInData=list(var1=2:12,var2=13:23,var3=24:34)) part <- partition(rep(1:3,each=50)) plotTrajMeans3d(myData,part) ### Quality criterion qualityCriterion(myData,part)
Some artificial joint longitudinal data.
data(artificialJointLongData)
data(artificialJointLongData)
Some joint longitudinal data in wide format. It includes 90 trajectories divided in 3 groups.
id
unique idenfier for each patient.
v0
Measurment of variable 'V' at time t0
v1
Measurment of variable 'V' at time t1
...
...
v10
Measurment of variable 'V' at time t0
w0
Measurment of variable 'W' at time t0
w1
Measurment of variable 'W' at time t1
...
...
w10
Measurment of variable 'W' at time t0
x0
Measurment of variable 'X' at time t0
x1
Measurment of variable 'X' at time t1
...
...
x10
Measurment of variable 'X' at time t0
Some joint longitudinal data in wide format. It includes 90 trajectories divided in 3 groups.
Christophe Genolini
data(artificialJointLongData) str(artificialJointLongData)
data(artificialJointLongData) str(artificialJointLongData)
Some artificial longitudinal data.
data(artificialLongData)
data(artificialLongData)
Some longitudinal data in wide format. It includes 120 trajectories divided in 4 groups.
id
unique idenfier for each patient.
t0
Measurment at time t0
t1
Measurment at time t1
...
...
t10
Measurment at time t10
Some artificial longitudinal data in wide format. It includes 120 trajectories divided in 4 groups.
Christophe Genolini
data(artificialLongData) str(artificialLongData)
data(artificialLongData) str(artificialLongData)
Constants define in the package ~
MAX_CLUSTERS CRITERION_NAMES DISTANCE_METHODS CHOICE_STYLE
MAX_CLUSTERS CRITERION_NAMES DISTANCE_METHODS CHOICE_STYLE
MAX_CLUSTERS = 26
CLUSTER_NAMES = paste("c",2:MAX_CLUSTERS,sep="")
CRITERION_NAMES <- c(
"Calinski.Harabatz","Kryszczuk.Calinski","Genolini.Calinski","Ray.Turi","Davies.Bouldin",
"BIC","BIC2","AIC","AICc","AICc2","postProbaGlobal","random"
)
DISTANCE_METHODS = c("manhattan", "euclidean", "minkowski", "maximum", "canberra", "binary")
CHOICE_STYLE = list(
typeTraj=c("l","l","n"),
colTraj=c("clusters","black","black"),
typeMean=c("b","b","b","b","l","l","n"),
colMean=c("clusters","black","clusters","black","clusters","black","black"),
pchMean=c("letters","letters","symbols","symbols","letters","letters","letters")
)
### Maximum number of clusters that kml can deal with MAX_CLUSTERS ### Names of the field that save clusters in object 'ClusterLongData' cat(CLUSTER_NAMES,"\n") ### List of the available criterion CRITERION_NAMES ### Distance available DISTANCE_METHODS[2] ### Define the style use by choice CHOICE_STYLE[['typeTraj']][2]
### Maximum number of clusters that kml can deal with MAX_CLUSTERS ### Names of the field that save clusters in object 'ClusterLongData' cat(CLUSTER_NAMES,"\n") ### List of the available criterion CRITERION_NAMES ### Distance available DISTANCE_METHODS[2] ### Define the style use by choice CHOICE_STYLE[['typeTraj']][2]
Compute Frechet distance between two trajectories.
distFrechet(Px,Py,Qx, Qy, timeScale=0.1, FrechetSumOrMax = "max")
distFrechet(Px,Py,Qx, Qy, timeScale=0.1, FrechetSumOrMax = "max")
Px |
[vector(numeric)] Times (abscisse) of the first trajectories. |
Py |
[vector(numeric)] Values of the first trajectories. |
Qx |
[vector(numeric)] Times of the second trajectories. |
Qy |
[vector(numeric)] Values of the second trajectories. |
timeScale |
[ |
FrechetSumOrMax |
[ |
Given two curve P and Q, Frechet distance between P and Q is define as
inf_{a,b} max_{t} d(P(a(t)),Q(b(t)))
. It's computation is a
NP-complex problem. When P and Q are trajectories (discrete curve), the
problem is polynomial.
The Frechet distance can also be define using a sum instead of a max:
inf_{a,b} sum_{t} d(P(a(t)),Q(b(t)))
The function distFrechet
is C compiled,
the function distFrechetR
is in R,
the function distFrechetRec
is in recursive (the slowest) in R.
A numeric value.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSM, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] Thomas Eiter & Heikki Mannila:
"Computing Discrete Fr´echet Distance"
[2] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[3] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
distTraj
Px <- 1:20 Py <- dnorm(1:20,12,2) Qx <- 1:20 Qy <- dnorm(1:20,8,2) distFrechet(Px,Py,Qx,Qy) ### Frechet using sum instead of max. distFrechet(Px,Py,Qx,Qy,FrechetSumOrMax="sum")
Px <- 1:20 Py <- dnorm(1:20,12,2) Qx <- 1:20 Qy <- dnorm(1:20,8,2) distFrechet(Px,Py,Qx,Qy) ### Frechet using sum instead of max. distFrechet(Px,Py,Qx,Qy,FrechetSumOrMax="sum")
Prepare the values of an object
ParLongData
to make them being usable by a
plotting function.
expandParLongData(xParLongData, y)
expandParLongData(xParLongData, y)
xParLongData |
|
y |
|
ParLongData
object can hold values that are easy
to specify (like col="clusters"
or pch="symbol"
) but that can not
be directly used by graphical functions plotTrajMeans and
plotTrajMeans3d. This function modify theses values to
make them fit with plotTrajMeans
and plotTrajMeans3d
expectations.
The field col
and pch
are the ones concern by this
function.
If y
is a Partition
, col
and pch
are
extanded to fit with the number of individual. If y
is a number of
clusters, col
and pch
are extanded to fit with the
number of clusters.
If col='clusters'
, a color is affected to each clusters. Then
the field col
receive a vector of color such that each
individual (if y
is a Partition
) or each clusters (if
y
is a number of clusters) get its corresponding color.
If pch='letters'
, a letters is affected to each clusters. Then
the field pch
receive a vector of letters such that each
individual (if y
is a Partition
) or each clusters (if
y
is a number of clusters) get its corresponding letters.
Same if pch='symbols'
.
An object of class ParLongData
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
################### ### Some parameters for trajectories (paramTraj <- parTRAJ(col="clusters")) ### Expand to a small partition with 3 clusters part <- partition(LETTERS[rep(1:3,4)]) expandParLongData(paramTraj,part) ################### ### Some parameters for the mean trajectories paramMean <- parMEAN() ### If there is 3 clusters : expandParLongData(paramMean,3) ### If there is 5 clusters : expandParLongData(paramMean,5)
################### ### Some parameters for trajectories (paramTraj <- parTRAJ(col="clusters")) ### Expand to a small partition with 3 clusters part <- partition(LETTERS[rep(1:3,4)]) expandParLongData(paramTraj,part) ################### ### Some parameters for the mean trajectories paramMean <- parMEAN() ### If there is 3 clusters : expandParLongData(paramMean,3) ### If there is 5 clusters : expandParLongData(paramMean,5)
imputation
is a function that offer different methods to impute
missing value of a LongData
(or a matrix).
imputation(traj,method="copyMean",lowerBound="globalMin",upperBound="globalMax")
imputation(traj,method="copyMean",lowerBound="globalMin",upperBound="globalMax")
traj |
|
method |
|
lowerBound |
|
upperBound |
|
imputation
is a function that impute
missing value of a LongData
or a matrix
.
Several imputation methods are available. A brief description
follows. For a fully detailled description, see [3].
Illustrating examples showing strenghs and weakness of methods are presented section "examples".
For each method, the imputation has to deal with monotone missing value (at start and at end of the trajectories) and intermitant (in the middle). Here is a brief description of each methods.
values imediatly surounding the missing are join by a line.
imputed by 'locf' or 'nocb'.
values imediatly surounding the missing are join by a line.
the line joining the first and last non-missing value is considered (this line is the everage progression of the actual individual trajectoire). Missing-value at start and at end are chosen on this line.
values imediatly surounding the missing are join by a line.
the line joining the first and second non-missing value is considered. Missing-value at start are chose on this line.
the line joining the last and penultimate non-missing value is considered. Missing-value at end are chosen on this line.
values imediatly surounding the missing are join by a line.
linearInterpol.global is not sensitive to local variation, linearInterpol.local might be too much sensitive to abnormal value. linearInterpol.bisector offer a medium solution by considering the bissectrice of Global and Local solution. Point are chosen on the bissectrices.
this method impute in two stages. First, it use 'linearInterpol.locf'. Then it add to each imputed value a variation that make the imputed value follow the shape of the average trajectory. For more details, see [3] and examples' section.
this method impute in two stages. First, it use 'linearInterpol.global'. Then it add to each imputed value a variation that make the imputed value follow the shape of the average trajectory. For more details, see [3] and examples' section.
this method impute in two stages. First, it use 'linearInterpol.local'. Then it add to each imputed value a variation that make the imputed value follow the shape of the average trajectory. For more details, see [3] and examples' section.
this method impute in two stages. First, it use 'linearInterpol.bisector'. Then it add to each imputed value a variation that make the imputed value follow the shape of the average trajectory. For more details, see [3] and examples' section.
THIS METHOD HAS BEEN PROUVEN TO NOT BE EFFICIANT SEVERAL TIME BY VARIOUS AUTHOR, we strongly recommand to not use it !
the previous non-missing value is dipplicated forward.
the first non-missing value is dupplicated backward (nocb).
THIS METHOD HAS BEEN PROUVEN TO NOT BE EFFICIANT SEVERAL TIME BY VARIOUS AUTHOR, we strongly recommand to not use it !
the next non-missing value is dipplicated backward.
the last non-missing value is dupplicated forward (locf).
missing are imputed by the mean of the trajectory.
missing are imputed by the median of the trajectory.
each missing is imputed by one non-missing (randomly choosen) value of the trajectory.
missing value at time t are imputed by the mean of all value present at time t.
missing value at time t are imputed by the median of all value present at time t.
each missing value at time t is imputed by one non-missing (randomly choosen) value present at time t.
A LongData
or a matrix
with no missing values.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121,
2011
[3] Christophe Genolini, René Écochard and Hélène Jacqmin-Gadda
"Copy Mean: A New Method to Impute Intermittent Missing Values in Longitudinal Studies"
Open Journal of Statistics, vol 3(26),2013
LongData
, Partition
, qualityCriterion
################## ### Preparation of the data par(ask=TRUE) timeV <- 1:14 matMissing <- matrix( c(NA ,NA ,NA ,18 ,22 ,NA ,NA ,NA ,NA , 24 , 22 , NA , NA , NA, 24 ,21 ,24 ,26 ,27 ,32 ,30 ,22 ,26 , 26 , 28 , 24 , 23 , 21, 14 ,13 , 10 , 8 , 7 ,18 ,16 , 8 ,12 , 6 , 10 , 10 , 9 , 7, 3 ,1 , 1 , 1 , 3,9 , 7 , -1 , 3 , 2 , 4 , 1 , 0 , -2 ),4,byrow=TRUE ) matplot(t(matMissing),col=c(2,1,1,1),lty=1,type="l",lwd=c(3,1,1,1),pch=16, xlab="Black=trajectories; Green=mean trajectory\nRed=trajectory to impute", ylab="",main="Four trajectories") moy <- apply(matMissing,2,mean,na.rm=TRUE) lines(moy,col=3,lwd=3) # # # # # # # # # # # # # # # # # # # # # # # # # # # Illustration of the different imputing method # # The best are at end !!! # # # # # # # # # # # # # # # # # # # # # # # # # # ################## ### Methods using cross sectionnal information (cross-methods) par(mfrow=c(1,3)) mat2 <- matrix(c( NA, 9, 8, 8, 7, 6,NA, 7, 6,NA,NA,NA, 4,5, 3, 4, 3,NA,NA, 2,3, NA,NA, 1,NA,NA, 1,1),4,7,byrow=TRUE) ### crossMean matplot(t(imputation(mat2,"crossMean")),type="l",ylim=c(0,10), lty=1,col=1,main="crossMean") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### crossMedian matplot(t(imputation(mat2,"crossMedian")),type="l",ylim=c(0,10), lty=1,col=1,main="crossMedian") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### crossHotDeck matplot(t(imputation(mat2,"crossHotDeck")),type="l",ylim=c(0,10), lty=1,col=1,main="crossHotDeck") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ################## ### Methods using trajectory information (traj-methods) par(mfrow=c(2,3)) mat1 <- matrix(c(NA,NA,3,8,NA,NA,2,2,1,NA,NA),1,11) ### locf matplot(t(imputation(mat1,"locf")),type="l",ylim=c(0,10), main="locf\n DO NOT USE, BAD METHOD !!!") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### nocb matplot(t(imputation(mat1,"nocb")),type="l",ylim=c(0,10), main="nocb\n DO NOT USE, BAD METHOD !!!") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### trajMean matplot(t(imputation(mat1,"trajMean")),type="l",ylim=c(0,10), main="trajMean") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### trajMedian matplot(t(imputation(mat1,"trajMedian")),type="l",ylim=c(0,10), main="trajMedian") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### trajHotDeck matplot(t(imputation(mat1,"trajHotDeck")),type="l",ylim=c(0,10), main="trajHotDeck 1") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### spline matplot(t(imputation(mat1,"spline",lowerBound=NA,upperBound=NA)), type="l",ylim=c(-10,10),main="spline") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ################## ### Different linear interpolation par(mfrow=c(2,2)) ### linearInterpol.locf matplot(t(imputation(mat1,"linearInterpol.locf",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="linearInterpol.locf") matlines(t(mat1),type="o",col=2,lwd=3,pch=16,lty=1) ### linearInterpol.global matplot(t(imputation(mat1,"linearInterpol.global",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="linearInterpol.global") matlines(t(mat1),type="o",col=2,lwd=3,pch=16,lty=1) ### linearInterpol.local matplot(t(imputation(mat1,"linearInterpol.local",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="linearInterpol.local") matlines(t(mat1),type="o",col=2,lwd=3,pch=16,lty=1) ### linearInterpol.bisector matplot(t(imputation(mat1,"linearInterpol.bisector",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="linearInterpol.bisector") matlines(t(mat1),type="o",col=2,lwd=3,pch=16,lty=1) ################## ### Copy mean mat3 <- matrix(c( NA, 9, 8, 8, 7, 6,NA, 7, 6,NA,NA,NA, 4,5, 3, 4, 3,NA,NA, 2,3, NA,NA, 1,NA,NA, 1,1),4,7,byrow=TRUE) par(mfrow=c(2,2)) ### copyMean.locf matplot(t(imputation(mat2,"copyMean.locf",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="copyMean.locf") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### copyMean.global matplot(t(imputation(mat2,"copyMean.global",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="copyMean.global") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### copyMean.local matplot(t(imputation(mat2,"copyMean.local",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="copyMean.local") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### copyMean.bisector matplot(t(imputation(mat2,"copyMean.bisector",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="copyMean.bisector") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### crossMean matImp <- imputation(matMissing,method="crossMean") matplot(t(matImp),col=c(2,1,1,1),lty=c(2,1,1,1),type="l",lwd=c(2,1,1,1),pch=16, xlab="Dotted red=imputed trajectory\nFull red=trajectory to impute", ylab="",main="Method 'crossMean'") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### crossMedian matImp <- imputation(matMissing,method="crossMedian") matplot(t(matImp),col=c(2,1,1,1),lty=c(2,1,1,1),type="l",lwd=c(2,1,1,1),pch=16, xlab="Dotted red=imputed trajectory\nFull red=trajectory to impute",ylab="", main="Method 'crossMedian'") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### crossHotDeck matImp <- imputation(matMissing,method="crossHotDeck") matplot(t(matImp),col=c(2,1,1,1),lty=c(2,1,1,1),type="l",lwd=c(2,1,1,1),pch=16, xlab="Dotted red=imputed trajectory\nFull red=trajectory to impute",ylab="", main="Method 'crossHotDeck'") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ################## ### Method using trajectory par(mfrow=c(2,3)) ### trajMean matImp <- imputation(matMissing,method="trajMean") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="nocb") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### trajMedian matImp <- imputation(matMissing,method="trajMedian") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="nocb") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### trajHotDeck matImp <- imputation(matMissing,method="trajHotDeck") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="nocb") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### locf matImp <- imputation(matMissing,method="locf") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="locf") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### nocb matImp <- imputation(matMissing,method="nocb") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="nocb") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) par(mfrow=c(2,2)) ### linearInterpol.locf matImp <- imputation(matMissing,method="linearInterpol.locf") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### linearInterpol.local matImp <- imputation(matMissing,method="linearInterpol.local") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### linearInterpol.global matImp <- imputation(matMissing,method="linearInterpol.global") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### linearInterpol.bisector matImp <- imputation(matMissing,method="linearInterpol.bisector") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) par(mfrow=c(2,2)) ### copyMean.locf matImp <- imputation(matMissing,method="copyMean.locf") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) lines(timeV,moy,col=3,type="o",lwd=3) ### copyMean.local matImp <- imputation(matMissing,method="copyMean.local") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) lines(timeV,moy,col=3,type="o",lwd=3) ### copyMean.global matImp <- imputation(matMissing,method="copyMean.global") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) lines(timeV,moy,col=3,type="o",lwd=3) ### copyMean.bisector matImp <- imputation(matMissing,method="copyMean.bisector") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) lines(timeV,moy,col=3,type="o",lwd=3) par(ask=FALSE)
################## ### Preparation of the data par(ask=TRUE) timeV <- 1:14 matMissing <- matrix( c(NA ,NA ,NA ,18 ,22 ,NA ,NA ,NA ,NA , 24 , 22 , NA , NA , NA, 24 ,21 ,24 ,26 ,27 ,32 ,30 ,22 ,26 , 26 , 28 , 24 , 23 , 21, 14 ,13 , 10 , 8 , 7 ,18 ,16 , 8 ,12 , 6 , 10 , 10 , 9 , 7, 3 ,1 , 1 , 1 , 3,9 , 7 , -1 , 3 , 2 , 4 , 1 , 0 , -2 ),4,byrow=TRUE ) matplot(t(matMissing),col=c(2,1,1,1),lty=1,type="l",lwd=c(3,1,1,1),pch=16, xlab="Black=trajectories; Green=mean trajectory\nRed=trajectory to impute", ylab="",main="Four trajectories") moy <- apply(matMissing,2,mean,na.rm=TRUE) lines(moy,col=3,lwd=3) # # # # # # # # # # # # # # # # # # # # # # # # # # # Illustration of the different imputing method # # The best are at end !!! # # # # # # # # # # # # # # # # # # # # # # # # # # ################## ### Methods using cross sectionnal information (cross-methods) par(mfrow=c(1,3)) mat2 <- matrix(c( NA, 9, 8, 8, 7, 6,NA, 7, 6,NA,NA,NA, 4,5, 3, 4, 3,NA,NA, 2,3, NA,NA, 1,NA,NA, 1,1),4,7,byrow=TRUE) ### crossMean matplot(t(imputation(mat2,"crossMean")),type="l",ylim=c(0,10), lty=1,col=1,main="crossMean") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### crossMedian matplot(t(imputation(mat2,"crossMedian")),type="l",ylim=c(0,10), lty=1,col=1,main="crossMedian") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### crossHotDeck matplot(t(imputation(mat2,"crossHotDeck")),type="l",ylim=c(0,10), lty=1,col=1,main="crossHotDeck") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ################## ### Methods using trajectory information (traj-methods) par(mfrow=c(2,3)) mat1 <- matrix(c(NA,NA,3,8,NA,NA,2,2,1,NA,NA),1,11) ### locf matplot(t(imputation(mat1,"locf")),type="l",ylim=c(0,10), main="locf\n DO NOT USE, BAD METHOD !!!") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### nocb matplot(t(imputation(mat1,"nocb")),type="l",ylim=c(0,10), main="nocb\n DO NOT USE, BAD METHOD !!!") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### trajMean matplot(t(imputation(mat1,"trajMean")),type="l",ylim=c(0,10), main="trajMean") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### trajMedian matplot(t(imputation(mat1,"trajMedian")),type="l",ylim=c(0,10), main="trajMedian") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### trajHotDeck matplot(t(imputation(mat1,"trajHotDeck")),type="l",ylim=c(0,10), main="trajHotDeck 1") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ### spline matplot(t(imputation(mat1,"spline",lowerBound=NA,upperBound=NA)), type="l",ylim=c(-10,10),main="spline") matlines(t(mat1),type="o",col=2,lwd=3,pch=16) ################## ### Different linear interpolation par(mfrow=c(2,2)) ### linearInterpol.locf matplot(t(imputation(mat1,"linearInterpol.locf",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="linearInterpol.locf") matlines(t(mat1),type="o",col=2,lwd=3,pch=16,lty=1) ### linearInterpol.global matplot(t(imputation(mat1,"linearInterpol.global",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="linearInterpol.global") matlines(t(mat1),type="o",col=2,lwd=3,pch=16,lty=1) ### linearInterpol.local matplot(t(imputation(mat1,"linearInterpol.local",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="linearInterpol.local") matlines(t(mat1),type="o",col=2,lwd=3,pch=16,lty=1) ### linearInterpol.bisector matplot(t(imputation(mat1,"linearInterpol.bisector",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="linearInterpol.bisector") matlines(t(mat1),type="o",col=2,lwd=3,pch=16,lty=1) ################## ### Copy mean mat3 <- matrix(c( NA, 9, 8, 8, 7, 6,NA, 7, 6,NA,NA,NA, 4,5, 3, 4, 3,NA,NA, 2,3, NA,NA, 1,NA,NA, 1,1),4,7,byrow=TRUE) par(mfrow=c(2,2)) ### copyMean.locf matplot(t(imputation(mat2,"copyMean.locf",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="copyMean.locf") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### copyMean.global matplot(t(imputation(mat2,"copyMean.global",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="copyMean.global") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### copyMean.local matplot(t(imputation(mat2,"copyMean.local",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="copyMean.local") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### copyMean.bisector matplot(t(imputation(mat2,"copyMean.bisector",NA,NA)),type="l", ylim=c(-5,10),lty=1,col=1,main="copyMean.bisector") matlines(t(mat2),type="o",col=2,lwd=3,pch=16,lty=1) ### crossMean matImp <- imputation(matMissing,method="crossMean") matplot(t(matImp),col=c(2,1,1,1),lty=c(2,1,1,1),type="l",lwd=c(2,1,1,1),pch=16, xlab="Dotted red=imputed trajectory\nFull red=trajectory to impute", ylab="",main="Method 'crossMean'") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### crossMedian matImp <- imputation(matMissing,method="crossMedian") matplot(t(matImp),col=c(2,1,1,1),lty=c(2,1,1,1),type="l",lwd=c(2,1,1,1),pch=16, xlab="Dotted red=imputed trajectory\nFull red=trajectory to impute",ylab="", main="Method 'crossMedian'") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### crossHotDeck matImp <- imputation(matMissing,method="crossHotDeck") matplot(t(matImp),col=c(2,1,1,1),lty=c(2,1,1,1),type="l",lwd=c(2,1,1,1),pch=16, xlab="Dotted red=imputed trajectory\nFull red=trajectory to impute",ylab="", main="Method 'crossHotDeck'") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ################## ### Method using trajectory par(mfrow=c(2,3)) ### trajMean matImp <- imputation(matMissing,method="trajMean") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="nocb") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### trajMedian matImp <- imputation(matMissing,method="trajMedian") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="nocb") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### trajHotDeck matImp <- imputation(matMissing,method="trajHotDeck") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="nocb") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### locf matImp <- imputation(matMissing,method="locf") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="locf") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### nocb matImp <- imputation(matMissing,method="nocb") plot(timeV,matImp[1,],type="l",lwd=2,ylim=c(10,30),ylab="",xlab="nocb") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) par(mfrow=c(2,2)) ### linearInterpol.locf matImp <- imputation(matMissing,method="linearInterpol.locf") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### linearInterpol.local matImp <- imputation(matMissing,method="linearInterpol.local") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### linearInterpol.global matImp <- imputation(matMissing,method="linearInterpol.global") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) ### linearInterpol.bisector matImp <- imputation(matMissing,method="linearInterpol.bisector") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) par(mfrow=c(2,2)) ### copyMean.locf matImp <- imputation(matMissing,method="copyMean.locf") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) lines(timeV,moy,col=3,type="o",lwd=3) ### copyMean.local matImp <- imputation(matMissing,method="copyMean.local") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) lines(timeV,moy,col=3,type="o",lwd=3) ### copyMean.global matImp <- imputation(matMissing,method="copyMean.global") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) lines(timeV,moy,col=3,type="o",lwd=3) ### copyMean.bisector matImp <- imputation(matMissing,method="copyMean.bisector") plot(timeV,matImp[1,],type="o",ylim=c(0,30),ylab="",xlab="LI-Global") lines(timeV,matMissing[1,],col=2,type="o",lwd=3) lines(timeV,moy,col=3,type="o",lwd=3) par(ask=FALSE)
This function provide different way of setting the initial partition for an EM algoritm.
initializePartition(nbClusters, lengthPart, method = "kmeans++", data)
initializePartition(nbClusters, lengthPart, method = "kmeans++", data)
nbClusters |
[numeric]: number of clusters of that the initial partition should have. |
lengthPart |
[numeric]: number of individual in the partition. |
method |
[character]: one off "randomAll", "randomK", "maxDist", "kmeans++", "kmeans+", "kmeans–" or "kmeans-". |
data |
[matrix]: |
Before alternating the phase Esperance and Maximisation, the EM algorithm needs to initialize a starting configuration. This initial partition has been proven to have an important impact on the final result and the convergence time.
This function provides different ways of setting the initial partition.
randomAll: all the individual are randomly assigned to a cluster with at least one individual in each clusters.
randomK: K individuals are randomly assigned to a cluster, all the other are not assigned (each cluster has only one individual).
maxDist: K indivuals are chosen. The two formers are the individual separated by the highest distance. The latter are added one by one, they are the "farthest" individual among those that are already been selected. "farthest" is the individual with the highest distance (min) to the selected individuals (if "t" are the individual already selected, the next selected individual is "i" such that max_i(min_t(dist(IND_i,IND_t))) ). This method is efficient but time consuming.
kmeans++: see [3]
kmeans+, kmeans–, kmeans-: experimental methods derived from [3].
vecteur of numeric.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
[3] D. Arthur and S. Vassilvitskii
"k-means++: the advantages of careful seeding"
Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete
algorithms. pp. 1027-1035, 2007.
par(ask=TRUE) ################### ### Constrution of some longitudinal data data(artificialLongData) dn <- longData(artificialLongData) plotTrajMeans(dn) ################### ### partition using randamAll pa1a <- initializePartition(3,lengthPart=200,method="randomAll") plotTrajMeans(dn,partition(pa1a),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) pa1b <- initializePartition(3,lengthPart=200,method="randomAll") plotTrajMeans(dn,partition(pa1b),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) ################### ### partition using randamK pa2a <- initializePartition(3,lengthPart=200,method="randomK") plotTrajMeans(dn,partition(pa2a),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) pa2b <- initializePartition(3,lengthPart=200,method="randomK") plotTrajMeans(dn,partition(pa2b),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) ################### ### partition using maxDist pa3 <- initializePartition(3,lengthPart=200,method="maxDist",data=dn["traj"]) plotTrajMeans(dn,partition(pa3),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) ### maxDist is deterministic, so no need for a second example ################### ### Example to illustrate "maxDist" method on classical clusters point <- matrix(c(0,0, 0,1, -1,0, 0,-1, 1,0),5,byrow=TRUE) points <- rbind(point,t(t(point)+c(10,0)),t(t(point)+c(5,6))) points <- rbind(points,t(t(points)+c(30,0)),t(t(points)+c(15,20)),t(-t(point)+c(20,10))) plot(points,main="Some points") paInit <- initializePartition(2,nrow(points),method="maxDist",points) plot(points,main="Two farest points") lines(points[!is.na(paInit),],col=2,type="p",pch=16) paInit <- initializePartition(3,nrow(points),method="maxDist",points) plot(points,main="Three farest points") lines(points[!is.na(paInit),],col=2,type="p",pch=16) paInit <- initializePartition(4,nrow(points),method="maxDist",points) plot(points, main="Four farest points") lines(points[!is.na(paInit),],col=2,type="p",pch=16) par(ask=FALSE)
par(ask=TRUE) ################### ### Constrution of some longitudinal data data(artificialLongData) dn <- longData(artificialLongData) plotTrajMeans(dn) ################### ### partition using randamAll pa1a <- initializePartition(3,lengthPart=200,method="randomAll") plotTrajMeans(dn,partition(pa1a),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) pa1b <- initializePartition(3,lengthPart=200,method="randomAll") plotTrajMeans(dn,partition(pa1b),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) ################### ### partition using randamK pa2a <- initializePartition(3,lengthPart=200,method="randomK") plotTrajMeans(dn,partition(pa2a),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) pa2b <- initializePartition(3,lengthPart=200,method="randomK") plotTrajMeans(dn,partition(pa2b),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) ################### ### partition using maxDist pa3 <- initializePartition(3,lengthPart=200,method="maxDist",data=dn["traj"]) plotTrajMeans(dn,partition(pa3),parMean=parMEAN(type="n"),parTraj=parTRAJ(col="clusters")) ### maxDist is deterministic, so no need for a second example ################### ### Example to illustrate "maxDist" method on classical clusters point <- matrix(c(0,0, 0,1, -1,0, 0,-1, 1,0),5,byrow=TRUE) points <- rbind(point,t(t(point)+c(10,0)),t(t(point)+c(5,6))) points <- rbind(points,t(t(points)+c(30,0)),t(t(points)+c(15,20)),t(-t(point)+c(20,10))) plot(points,main="Some points") paInit <- initializePartition(2,nrow(points),method="maxDist",points) plot(points,main="Two farest points") lines(points[!is.na(paInit),],col=2,type="p",pch=16) paInit <- initializePartition(3,nrow(points),method="maxDist",points) plot(points,main="Three farest points") lines(points[!is.na(paInit),],col=2,type="p",pch=16) paInit <- initializePartition(4,nrow(points),method="maxDist",points) plot(points, main="Four farest points") lines(points[!is.na(paInit),],col=2,type="p",pch=16) par(ask=FALSE)
An object of class ListPartition
contain several liste of
Partition
sorted by cluster numbers.
Objects are mainly design to store the numerous Partition
found
by kml
or kml3d
.
criterionActif
[character]: Store the criterion name that will be used by fonctions that need a single criterion (like plotCriterion or ordered).
initializationMethod
[vector(chararcter)]: list all
the initialization method that has allready been used to find some
Partition
(usefull to not run several time a deterministic method).
sorted
[logical]
: are the Partition
curently hold in the object sorted in decreasing (or increasing, according to
criterionActif
) order ?
c1
[list(Partition)]: list of Partition
with 1 clusters.
c2
[list(Partition)]: list of Partition
with 2 clusters.
c3
[list(Partition)]: list of Partition
with 3 clusters.
c4
[list(Partition)]: list of Partition
with 4 clusters.
c5
[list(Partition)]: list of Partition
with 5 clusters.
c6
[list(Partition)]: list of Partition
with 6 clusters.
c7
[list(Partition)]: list of Partition
with 7 clusters.
c8
[list(Partition)]: list of Partition
with 8 clusters.
c9
[list(Partition)]: list of Partition
with 9 clusters.
c10
[list(Partition)]: list of Partition
with 10 clusters.
c11
[list(Partition)]: list of Partition
with 11 clusters.
c12
[list(Partition)]: list of Partition
with 12 clusters.
c13
[list(Partition)]: list of Partition
with 13 clusters.
c14
[list(Partition)]: list of Partition
with 14 clusters.
c15
[list(Partition)]: list of Partition
with 15 clusters.
c16
[list(Partition)]: list of Partition
with 16 clusters.
c17
[list(Partition)]: list of Partition
with 17 clusters.
c18
[list(Partition)]: list of Partition
with 18 clusters.
c19
[list(Partition)]: list of Partition
with 19 clusters.
c20
[list(Partition)]: list of Partition
with 20 clusters.
c21
[list(Partition)]: list of Partition
with 21 clusters.
c22
[list(Partition)]: list of Partition
with 22 clusters.
c23
[list(Partition)]: list of Partition
with 23 clusters.
c24
[list(Partition)]: list of Partition
with 24 clusters.
c25
[list(Partition)]: list of Partition
with 25 clusters.
c26
[list(Partition)]: list of Partition
with 26 clusters.
Class ListPartition
objects are mainly constructed by
kml
.
Neverdeless, it is also possible to construct them from
scratch using the fonction listPartition
that does
create an empty object.
object['xxx']
If 'xxx' is 'cX',
'initializationMethod', 'sorted'
or 'criterionActif', get the value of the field
xxx
.
object['criterionValues',j]
Give the values of the criterion 'j' for all the Partitions. The result is return as a list. If 'j' is missing, the criterion actif is used.
object['criterionValuesAsMatrix',j]
Give the values of the criterion 'j' for all the Partitions. The result is return as a matrix. If 'j' is missing, the criterion actif is used.
object['xxx']
If 'xxx' is a criterion, this is equivalent to object['criterionValuesAsMatrix','xxx']
object['initializationMethod']<-value
Set the field to
value
object['criterionActif']<-value
If 'value' is one of CRITERION_NAMES, it sets the field to the criterion 'value'.
object['add']<-value
If 'value' is an object of class 'Partition', then value is added to the Partition already hold in the field 'cX'. Note that a Partition with 'X' clusters is automatiquely added to the correct list 'cX' according to its number of clusters.
object['clear']<-'cX'
Clear the list 'cX'.
listPartition
Constructor. Build an empty object.
ordered
Order the Partition according to the criterion actif.
regroup
Order then merge identical Partition (usefull
to reduce the size of the ListPartition
)
Christophe Genolini^{1,2}
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSM, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] Christophe M. Genolini and Bruno Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] Christophe M. Genolini and Bruno Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
Classes: LongData
Methods: Partition
############## ### Preparing data data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) ### Some clustering part2 <- partition(rep(c("A","B"),time=100),traj) part3 <- partition(rep(c("A","B","C","A"),time=50),traj) part3b <- partition(rep(c("A","B","C","B","C"),time=40),traj) part4 <- partition(rep(c("A","B","A","C","D"),time=40),traj) ################ ### ListPartition listPart <- listPartition() plotCriterion(listPart) listPart["add"] <- part2 listPart["add"] <- part3 listPart["add"] <- part3b listPart["add"] <- part4 listPart["add"] <- part4 listPart["add"] <- part3 listPart["add"] <- part3b plotCriterion(listPart) ordered(listPart) plotCriterion(listPart) regroup(listPart) plotCriterion(listPart) plotAllCriterion(listPart)
############## ### Preparing data data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) ### Some clustering part2 <- partition(rep(c("A","B"),time=100),traj) part3 <- partition(rep(c("A","B","C","A"),time=50),traj) part3b <- partition(rep(c("A","B","C","B","C"),time=40),traj) part4 <- partition(rep(c("A","B","A","C","D"),time=40),traj) ################ ### ListPartition listPart <- listPartition() plotCriterion(listPart) listPart["add"] <- part2 listPart["add"] <- part3 listPart["add"] <- part3b listPart["add"] <- part4 listPart["add"] <- part4 listPart["add"] <- part3 listPart["add"] <- part3b plotCriterion(listPart) ordered(listPart) plotCriterion(listPart) regroup(listPart) plotCriterion(listPart) plotAllCriterion(listPart)
longData
is a constructor for the class LongData
.
It create object LongData
containing a single variable-trajectory.
For creating joint variable-trajectories, see longData3d
.
longData(traj, idAll, time, timeInData,varNames,maxNA)
longData(traj, idAll, time, timeInData,varNames,maxNA)
traj |
|
idAll |
|
time |
|
timeInData |
|
varNames |
|
maxNA |
|
longData
construct a object of class LongData
.
Two cases can be distinguised:
traj
is an array
:lines are individual. Column are time of measurment.
If idAll
is missing, the individuals are labelled i1
,
i2
, i3
,...
If timeInData
is missing, all the column
are used (timeInData=1:ncol(traj)
).
traj
is a data.frame
:lines are individual. Column are time of measurement.
If idAll
is missing, then the first column of the
data.frame
is used for idAll
If timeInData
is missing and idAll
is missing, then
all the columns but the first are used for timeInData
(the
first is omited since it is already used for idAll
): idAll=traj[,1],timeInData=2:ncol(traj)
.
If timeInData
is missing but idAll
is not missing,
then all the column including the first are used for timeInData
: timeInData=1:ncol(traj)
.
An object of class LongData
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
##################### ### From matrix ### Small data mat <- matrix(c(1,NA,3,2,3,6,1,8,10),3,3,dimnames=list(c(101,102,104),c("T2","T4","T8"))) longData(mat) (ld1 <- longData(traj=mat,idAll=as.character(c(101,102,104)),time=c(2,4,8),varNames="V")) plotTrajMeans(ld1) ### Big data mat <- matrix(runif(1051*325),1051,325) (ld2 <- longData(traj=mat,idAll=paste("I-",1:1051,sep=""),time=(1:325)+0.5,varNames="Random")) #################### ### From data.frame dn <- data.frame(id=1:3,v1=c(NA,2,1),v2=c(NA,1,0),v3=c(3,2,2),v4=c(4,2,NA)) ### Basic longData(dn) ### Selecting some times (ld3 <- longData(dn,timeInData=c(1,2,4),varNames=c("Hyp"))) ### Excluding trajectories with more than 1 NA (ld3 <- longData(dn,maxNA=1))
##################### ### From matrix ### Small data mat <- matrix(c(1,NA,3,2,3,6,1,8,10),3,3,dimnames=list(c(101,102,104),c("T2","T4","T8"))) longData(mat) (ld1 <- longData(traj=mat,idAll=as.character(c(101,102,104)),time=c(2,4,8),varNames="V")) plotTrajMeans(ld1) ### Big data mat <- matrix(runif(1051*325),1051,325) (ld2 <- longData(traj=mat,idAll=paste("I-",1:1051,sep=""),time=(1:325)+0.5,varNames="Random")) #################### ### From data.frame dn <- data.frame(id=1:3,v1=c(NA,2,1),v2=c(NA,1,0),v3=c(3,2,2),v4=c(4,2,NA)) ### Basic longData(dn) ### Selecting some times (ld3 <- longData(dn,timeInData=c(1,2,4),varNames=c("Hyp"))) ### Excluding trajectories with more than 1 NA (ld3 <- longData(dn,maxNA=1))
LongData
is an objet containing the longitudinal
data (the individual trajectories) and some associate value (like time, individual
identifiant,...). It can be used either for a single
variable-trajectory or for joint variable-trajectories.
Object LongData
for single variable-trajectory can be created using
the fonction longData
on a data.frame
or on a matrix
.
LongData
for joint trajectories can be created by calling
the fonction longData3d
on a data.frame
or on an array
.
idAll
[vector(character)]
: Single identifier
for each of the longData (each individual). Usefull to export clusters.
idFewNA
[vector(character)]
: Restriction of
idAll
to the trajectories that does not have 'too many' missing
value. See maxNA
for 'too many' definition.
time
[numeric]
: Time at which measures are made.
varNames
[character]
: Name of the variable measured.
traj
[matrix(numeric)]
: Contains
the longitudianl data. Each lines is the trajectories of an
individual. Each column is the time at which measures
are made.
dimTraj
[vector3(numeric)]
: size of the matrix
traj
(ie dimTraj=c(length(idFewNA),length(time))
).
maxNA
[numeric]
or [vector(numeric)]
:
Individual whose trajectories contain 'too many' missing value
are exclude from traj
and will no be use in
the analysis. Their identifier is preserved in idAll
but
not in idFewNA
. 'too many' is define by maxNA
: a
trajectory with more missing than maxNA
is exclude.
reverse
[matrix(numeric)]
: if the trajectories
are scale using the function scale
, the 'scaling
parameters' (probably mean and standard deviation) are saved in
reverse
. This is usefull to restore the original data after a
scaling operation.
Object LongData
for single variable-trajectory can be created by calling
the fonction longData
on a data.frame
or on a matrix
.
LongData
for joint trajectories can be created by calling
the fonction longData3d
on a data.frame
or on an array
.
[vecteur(character)]: Gets the full list of individual
identifiant (the value of the slot idAll
)
[vecteur(character)]: Gets the list of individual
identifiant with not too many missing values (the value of the slot idFewNA
)
[character]: Gets the name(s) of the variable (the value of the slot varNames
)
[vecteur(numeric)]: Gets the times (the value of the slot time
)
[array(numeric)]: Gets all the longData' values (the value of the slot traj
)
[vector3(numeric)]: Gets the dimension of traj
.
[numeric]: Gets the first dimension of
traj
(ie the number of individual include in the analysis).
[numeric]: Gets the second dimension of
traj
(ie the number of time measurement).
[numeric]: Gets the third dimension of
traj
(ie the number of variables).
[vecteur(numeric)]: Gets maxNA.
[matrix(numeric)]: Gets the matrix of the scaling parameters.
scale
scale the trajectories. Usefull to normalize variable trajectories measured with different units.
restoreRealData
restore original data that have been modified after a scaling operation.
longDataFrom3d
Extract a variable trajectory form a dataset of joint trajectories.
plotTrajMeans
plot all the variables of the LongData
, optionnaly according to a Partition
.
plotTrajMeans3d
plot two variables of the LongData
in 3 dimensions, optionnaly according to a Partition
.
plot3dPdf
create 'Triangle objects' representing in
3D the cluster's center according to a
Partition
. 'Triangle object' can latter be
include in a LaTeX file to get a dynamique (rotationg) pdf
figure.
imputation
Impute the missing values of the trajectories.
qualityCriterion
Compute some quality criterion that
can be use to compare the quality of differents Partition
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
Overview: longitudinalData-package
Methods: longData
, longData3d
, imputation
, qualityCriterion
Plot: plotTrajMeans
,
plotTrajMeans3d
, plot3dPdf
################# ### building trajectory (longData) mat <- matrix(c(NA,2,3,4,1,6,2,5,1,3,8,10),4) ld <- longData(mat,idAll=c("I1","I2","I3","I4"),time=c(2,4,8),varNames="Age") ### '[' and '[<-' ld["idAll"] ld["idFewNA"] ld["varNames"] ld["traj"] (ld) ### Plot plotTrajMeans(ld,parMean=parMEAN(type="n"))
################# ### building trajectory (longData) mat <- matrix(c(NA,2,3,4,1,6,2,5,1,3,8,10),4) ld <- longData(mat,idAll=c("I1","I2","I3","I4"),time=c(2,4,8),varNames="Age") ### '[' and '[<-' ld["idAll"] ld["idFewNA"] ld["varNames"] ld["traj"] (ld) ### Plot plotTrajMeans(ld,parMean=parMEAN(type="n"))
longData3d
is a constructor of the class LongData
.
It create object LongData
containing several joint trajectory (two
or more variable-trajectories). For creating a single
variable-trajectory, see longData
.
longData3d(traj, idAll, time, timeInData,varNames,maxNA)
longData3d(traj, idAll, time, timeInData,varNames,maxNA)
traj |
|
idAll |
|
time |
|
timeInData |
|
varNames |
|
maxNA |
|
longData3d
construct a object of class
LongData
. Two cases can be distinguised:
traj
is an array
: the first dimension (line) are
individual. The second dimension (column) are time at which the
measurement are made. The third dimension are the differents
variable-trajectories. For example, traj[,,2]
is the second variable-trajectory.
If idAll
is missing, the individuals are labelled i1
,
i2
, i3
,...
If timeInData
is missing, all the column
are used (1:ncol(traj)
).
traj
is a data.frame
: lines are individual. Time of
measurement and variables should be provide through
timeInData
. timeInData
is a list.
The label of the list are the
variable-trajectories names. Elements of the list are the column
containning the trajectories. For example, if
timeInData=list(V=c(2,3,4),W=c(6,8,12))
, then the first
variable-trajectory is 'V', its measurement are in column 2,3 and
4. The second variable-trajectory is 'W', its measurement are in column
6,8 and 12.
If idAll
is missing, the first column of the data.frame
is used.
An object of class LongData
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
################# ### From array mat <- array(c(1,NA,3,2,3,6,1,8,10,1,NA,1,2,NA,3,2,3,2),dim=c(3,3,2)) longData3d(mat) (ld1 <- longData3d(mat,varNames=c("Hyp","Col"),idAll=c("i101","i104","i105"))) plotTrajMeans3d(ld1) ################# ### From data.frame dn <- data.frame(id=1:3,v1=c(2,2,1),t1=c(20,21,22),v1=c(3,2,2),t2=c(23,20,28),t3=c(25,24,29)) longData3d(dn,timeInData=list(c(2,4),c(3,5)),varNames=c("V","T")) (ld3 <- longData3d(dn,timeInData=list(V=c(2,4,NA),T=c(3,5,6)))) plotTrajMeans3d(ld3)
################# ### From array mat <- array(c(1,NA,3,2,3,6,1,8,10,1,NA,1,2,NA,3,2,3,2),dim=c(3,3,2)) longData3d(mat) (ld1 <- longData3d(mat,varNames=c("Hyp","Col"),idAll=c("i101","i104","i105"))) plotTrajMeans3d(ld1) ################# ### From data.frame dn <- data.frame(id=1:3,v1=c(2,2,1),t1=c(20,21,22),v1=c(3,2,2),t2=c(23,20,28),t3=c(25,24,29)) longData3d(dn,timeInData=list(c(2,4),c(3,5)),varNames=c("V","T")) (ld3 <- longData3d(dn,timeInData=list(V=c(2,4,NA),T=c(3,5,6)))) plotTrajMeans3d(ld3)
LongData3d
is an objet containing joint longitudinal
data and some associate value (like time, individual
identifiant,...).
Object LongData3d
can be created using
the fonction longData3d
on a data.frame
or on an array
.
idAll
[vector(character)]
: Single identifier
for each of the longData3d (each individual). Usefull to export clusters.
idFewNA
[vector(character)]
: Restriction of
idAll
to the trajectories that does not have 'too many' missing
value. See maxNA
for 'too many' definition.
time
[numeric]
: Time at which measures are made.
varNames
[vector(character)]
: Names of the variable measured.
traj
[array(numeric)]
: Contains
the joint variable-trajectories. Each horizontal plan (first
dimension) corresponds to the joint-trajectories of an
individual. Vertical plans (second dimension) refer to the time at which measures
are made. Transversal plans (the third dimension) are for variables.
dimTraj
[vector3(numeric)]
: size of the array
traj
(ie dimTraj=c(length(idFewNA),length(time),length(varNames))
).
maxNA
[numeric]
or [vector(numeric)]
:
Individual whose trajectories contain 'too many' missing value
are exclude from traj
and will no be use in
the analysis. Their identifier is preserved in idAll
but
not in idFewNA
. 'too many' is define by maxNA
: a
trajectory with more missing than maxNA
is exclude.
When maxNA
is a single number, it is
recycled for all the variables.
reverse
[matrix(numeric)]
: if the trajectories
are scale using the function scale
, the 'scaling
parameters' (probably mean and standard deviation) are saved in
reverse
. This is usefull to restore the original data after a
scaling operation.
LongData3d
can be created by calling
the fonction longData3d
on a data.frame
or on an array
.
[vecteur(character)]: Gets the full list of individual
identifiant (the value of the slot idAll
)
[vecteur(character)]: Gets the list of individual
identifiant with not too many missing values (the value of the slot idFewNA
)
[character]: Gets the name(s) of the variable (the value of the slot varNames
)
[vecteur(numeric)]: Gets the times (the value of the slot time
)
[array(numeric)]: Gets all the joint trajectories (the value of the slot traj
)
[vector3(numeric)]: Gets the dimension of traj
.
[numeric]: Gets the first dimension of
traj
(ie the number of individual include in the analysis).
[numeric]: Gets the second dimension of
traj
(ie the number of time measurement).
[numeric]: Gets the third dimension of
traj
(ie the number of variables).
[vecteur(numeric)]: Gets maxNA.
[matrix(numeric)]: Gets the matrix of the scaling parameters.
scale
scale the trajectories. Usefull to normalize variable trajectories measured with different units.
restoreRealData
restore original data that have been modified after a scaling operation.
longDataFrom3d
Create a
LongData
by extracting a single variable trajectory
form a dataset of joint variable-trajectories.
plotTrajMeans
plot all the variable of the LongData3d
, optionnaly according to a Partition
.
plotTrajMeans3d
plot two variables of the LongData3d
in
a 3 dimensions graph, optionnaly according to a Partition
.
plot3dPdf
create 'Triangle objects' representing in
3D the cluster's center according to a
Partition
. 'Triangle object' can latter be
include in a LaTeX file to get a dynamique (rotationg) pdf
figure.
imputation
Impute the missing values of the trajectories.
qualityCriterion
Compute some quality criterion that
can be use to compare the quality of differents Partition
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
Overview: longitudinalData-package
Methods: LongData
, longData3d
, imputation
, qualityCriterion
Plot: plotTrajMeans
,
plotTrajMeans3d
, plot3dPdf
################# ### building joint trajectories dn <- data.frame(id=1:3,v1=c(11,14,16),t1=c(1,5,7),v2=c(12,10,13),t2=c(2,5,0),t3=c(3,6,8)) (ld <- longData3d(dn,timeInData=list(Vir=c(2,4,NA),Tes=c(3,5,6)))) ### Scaling scale(ld) (ld) ### Plotting plotTrajMeans3d(ld) restoreRealData(ld)
################# ### building joint trajectories dn <- data.frame(id=1:3,v1=c(11,14,16),t1=c(1,5,7),v2=c(12,10,13),t2=c(2,5,0),t3=c(3,6,8)) (ld <- longData3d(dn,timeInData=list(Vir=c(2,4,NA),Tes=c(3,5,6)))) ### Scaling scale(ld) (ld) ### Plotting plotTrajMeans3d(ld) restoreRealData(ld)
Extract a single variable-trajectory from an object
LongData
that contain some joint-trajectories.
longDataFrom3d(xLongData3d,variable)
longDataFrom3d(xLongData3d,variable)
xLongData3d |
|
variable |
|
Extract a single variable-trajectory from an object
LongData3d
that contain some join-trajectories.
An object of class LongData
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
### Creation of joint-trajectories mat <- array(c(1,NA,3,2,3,6,1,8,10,1,NA,1,2,NA,3,2,3,2),dim=c(3,3,2)) (ldJoint <- longData3d(mat,varNames=c("Hyp","Som"))) ### Extraction of the first variable-trajectory (ldHyp <- longDataFrom3d(ldJoint,variable="Hyp")) ### Extraction of the second variable-trajectory (ldSom <- longDataFrom3d(ldJoint,variable="Som")) ### Extraction of the second variable-trajectory, using number (ldSom <- longDataFrom3d(ldJoint,variable=2))
### Creation of joint-trajectories mat <- array(c(1,NA,3,2,3,6,1,8,10,1,NA,1,2,NA,3,2,3,2),dim=c(3,3,2)) (ldJoint <- longData3d(mat,varNames=c("Hyp","Som"))) ### Extraction of the first variable-trajectory (ldHyp <- longDataFrom3d(ldJoint,variable="Hyp")) ### Extraction of the second variable-trajectory (ldSom <- longDataFrom3d(ldJoint,variable="Som")) ### Extraction of the second variable-trajectory, using number (ldSom <- longDataFrom3d(ldJoint,variable=2))
Build a object LongData3d
from an object
LongData
. The resulting object has a single
variable-trajectory stored in a array.
longDataTo3d(xLongData)
longDataTo3d(xLongData)
xLongData |
|
Build a object LongData3d
from an object
LongData
. The resulting object has a single
variable-trajectory stored in a array.
An object of class LongData3d
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
### Creation of single variable-trajectory mat <- matrix(c(1,NA,3,2,3,6,1,8,10,1,NA,1,2,NA,3,2,3,2),6,3) (ldSingle <- longData(mat)) ### Extension to joint trajectories (ldHyp <- longDataTo3d(ldSingle))
### Creation of single variable-trajectory mat <- matrix(c(1,NA,3,2,3,6,1,8,10,1,NA,1,2,NA,3,2,3,2),6,3) (ldSingle <- longData(mat)) ### Extension to joint trajectories (ldHyp <- longDataTo3d(ldSingle))
Create a LaTeX document that inclusde 3D objects into PDF documents.
makeLatexFile(filename = "main.tex", asyToInclude = "scene+0.prc")
makeLatexFile(filename = "main.tex", asyToInclude = "scene+0.prc")
filename |
Name of the LaTeX file |
asyToInclude |
Name of the file holding the 3D graph to include. |
Create a LaTeX document that inclusde 3D objects into PDF documents with PDF-1.5/1.6 compatibility.
A LaTeX file, in the current directory.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
makeTriangles,plot3dPdf
, saveTrianglesAsASY.
### Move to tempdir wd <- getwd() setwd(tempdir()); getwd() ### Generating the data data(artificialJointLongData) myLd <- longData3d(artificialJointLongData,timeInData=list(var1=2:12,var2=13:23)) part <- partition(rep(1:3,each=50)) plotTrajMeans3d(myLd,part) ### Creation of the scene scene <- plot3dPdf(myLd,part) drawScene.rgl(scene) ### Export in '.asy' file saveTrianglesAsASY(scene) ### Creation of a '.prc' file # Open a console, then run: # 'asy -inlineimage -tex pdflatex scene.asy' ### Creation of the LaTeX main document makeLatexFile() ### Creation of the '.pdf' # Open a console window, then run # pdfLatex main.tex ### Go back to current dir setwd(wd)
### Move to tempdir wd <- getwd() setwd(tempdir()); getwd() ### Generating the data data(artificialJointLongData) myLd <- longData3d(artificialJointLongData,timeInData=list(var1=2:12,var2=13:23)) part <- partition(rep(1:3,each=50)) plotTrajMeans3d(myLd,part) ### Creation of the scene scene <- plot3dPdf(myLd,part) drawScene.rgl(scene) ### Export in '.asy' file saveTrianglesAsASY(scene) ### Creation of a '.prc' file # Open a console, then run: # 'asy -inlineimage -tex pdflatex scene.asy' ### Creation of the LaTeX main document makeLatexFile() ### Creation of the '.pdf' # Open a console window, then run # pdfLatex main.tex ### Go back to current dir setwd(wd)
Sort the Partition
of a
ListPartition
according to a quality criterion.
ordered(x,...)
ordered(x,...)
x |
[ListPartition]: Object whose |
... |
Note used, for S4 compatibility only. |
Sort the Partition
of a ListPartition
for each
list (sort the 'c2' list, the 'c3' list,...) according to a quality criterion.
The criterion used to sort is the one in the field
criterionActif
.
This function change internaly the order of the fields c2
,
c3
, ... c26
of an object. In addition, it return the
permutation matrix (the matrix use to re-ordered the ci
).
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] Christophe M. Genolini and Bruno Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] Christophe M. Genolini and Bruno Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
############## ### Preparing data data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) ### Some clustering part2 <- partition(rep(c("A","B"),time=100),traj) part3 <- partition(rep(c("A","B","C","A"),time=50),traj) part3b <- partition(rep(c("A","B","C","B"),time=50),traj) part4 <- partition(rep(c("A","B","C","D"),time=50),traj) ################ ### ListPartition listPart <- listPartition() listPart['criterionActif'] <-"Davies.Bouldin" plotCriterion(listPart) listPart["add"] <- part2 listPart["add"] <- part3 listPart["add"] <- part3b listPart["add"] <- part4 listPart["add"] <- part4 listPart["add"] <- part3 listPart["add"] <- part3b plotCriterion(listPart) ordered(listPart) plotCriterion(listPart) listPart['criterionActif'] <-"Calinski.Harabatz" plotCriterion(listPart) ordered(listPart) plotCriterion(listPart)
############## ### Preparing data data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) ### Some clustering part2 <- partition(rep(c("A","B"),time=100),traj) part3 <- partition(rep(c("A","B","C","A"),time=50),traj) part3b <- partition(rep(c("A","B","C","B"),time=50),traj) part4 <- partition(rep(c("A","B","C","D"),time=50),traj) ################ ### ListPartition listPart <- listPartition() listPart['criterionActif'] <-"Davies.Bouldin" plotCriterion(listPart) listPart["add"] <- part2 listPart["add"] <- part3 listPart["add"] <- part3b listPart["add"] <- part4 listPart["add"] <- part4 listPart["add"] <- part3 listPart["add"] <- part3b plotCriterion(listPart) ordered(listPart) plotCriterion(listPart) listPart['criterionActif'] <-"Calinski.Harabatz" plotCriterion(listPart) ordered(listPart) plotCriterion(listPart)
parLongData
, parTraj
and parMean
are constructors for the class
ParLongData
.
parLongData(type, col, pch, pchPeriod, cex, xlab, ylab) parTRAJ(type = "l", col = "black", pch = "1", pchPeriod = 0, cex = 1, xlab = "Time", ylab = "") parMEAN(type = "b", col = "clusters", pch = "letters", pchPeriod = 1, cex = 1.2, xlab = "Time", ylab = "")
parLongData(type, col, pch, pchPeriod, cex, xlab, ylab) parTRAJ(type = "l", col = "black", pch = "1", pchPeriod = 0, cex = 1, xlab = "Time", ylab = "") parMEAN(type = "b", col = "clusters", pch = "letters", pchPeriod = 1, cex = 1.2, xlab = "Time", ylab = "")
type |
|
col |
|
pch |
|
pchPeriod |
|
cex |
|
xlab |
|
ylab |
|
parLongData
is the basic constructor of the class
ParLongData
.
parTRAJ
create an object with default values for ploting
individual trajectories ;
parMEAN
create an object with default values for ploting mean trajectories.
If col='clusters', pch='letters' or pch='symbol', the object can not be use directly, it should first be prepared using the function expandParLongData.
An object of class ParLongData
Christophe Genolini
PSIGIAM: Paris Sud Innovation Group in Adolescent Mental Health
INSERM U669 / Maison de Solenn / Paris
Contact author : <[email protected]>
Raphaël Ricaud
Laboratoire "Sport & Culture" / "Sports & Culture" Laboratory
University of Paris 10 / Nanterre
################## ### Construction of LongData time=c(1,2,3,4,8,12,16,20) id2=1:120 f <- function(id,t)((id-1)%%3-1) * t g <- function(id,t)(id%%2+1)*t ld2 <- longData3d( array(cbind(outer(id2,time,f),outer(id2,time,g))+rnorm(120*8*2,0,3), dim=c(120,8,2))) ### Example with default value plotTrajMeans3d(ld2) plotTrajMeans3d(ld2,parTraj=parTRAJ()) ### Example with default value except for the color plotTrajMeans3d(ld2,parTraj=parTRAJ(col="blue"))
################## ### Construction of LongData time=c(1,2,3,4,8,12,16,20) id2=1:120 f <- function(id,t)((id-1)%%3-1) * t g <- function(id,t)(id%%2+1)*t ld2 <- longData3d( array(cbind(outer(id2,time,f),outer(id2,time,g))+rnorm(120*8*2,0,3), dim=c(120,8,2))) ### Example with default value plotTrajMeans3d(ld2) plotTrajMeans3d(ld2,parTraj=parTRAJ()) ### Example with default value except for the color plotTrajMeans3d(ld2,parTraj=parTRAJ(col="blue"))
ParLongData
is an objet containing some graphical
parameter used to plot LongData
object and / or mean trajectories. They work as
define in par
.
type
[character]
: Type of the plot that should be
drawn ('p' for point, 'l' for line, 'b' for both, 'c' line appart, 'o'
for overplot, 'h' for histogram, 's' and 'S' for steps, 'n' for no
ploting)
col
[character]
: A specification for the
default plotting color. Can be either a single value or a vector.
pch
[numeric]
or [character]
: Either an integer
specifying a symbol or a single character to be used as the default in plotting
points. See example in points for possible values and their
interpretation.
pchPeriod
[numeric]
: Fix the number of point that should be plot. Usefull to
plot points on trajectories with a lot of mesurement (see examples
in plotTrajMeans
for LongData
for details).
cex
[numeric]
: A numerical value giving the
amount by which plotting text and symbols should be magnified relative
to the default.
xlab
[character]
: A title for the x axis.
ylab
[character]
: A title for the y axis.
Object ParLongData
can be created by three functions:
parLongData
create an object from scratch ;
parTraj
create an object containing default value to plot
individutal trajectories;
parMean
create an object containing default value to
plot mean trajectories.
object['xxx']
Get the value of the field xxx
.
object['xxx']<-value
Set the field xxx
to value
.
Christophe Genolini
PSIGIAM: Paris Sud Innovation Group in Adolescent Mental Health
INSERM U669 / Maison de Solenn / Paris
Contact author : <[email protected]>
Raphaël Ricaud
Laboratoire "Sport & Culture" / "Sports & Culture" Laboratory
University of Paris 10 / Nanterre
### Building ParLongData parMyData <- parLongData(type="n",col=3,pch="1",pchPeriod=20,cex=1,xlab="Time",ylab="Size") ### Get parMyData['col'] ### Set parMyData['cex'] <- 3 (parMyData)
### Building ParLongData parMyData <- parLongData(type="n",col=3,pch="1",pchPeriod=20,cex=1,xlab="Time",ylab="Size") ### Get parMyData['col'] ### Set parMyData['cex'] <- 3 (parMyData)
partition
is the constructor of the class
Partition
. It can be build either alone or
relatively to a object LongData
.
partition(clusters, traj, details=character())
partition(clusters, traj, details=character())
clusters |
|
traj |
|
details |
|
partition
construct a object of class
Partition
. It does not provide any default
values. yLongData
and details
are optional.
An object of class Partition
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSM, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] Christophe M. Genolini and Bruno Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] Christophe M. Genolini and Bruno Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
### Empty partition partition() ### Small partition partition(clusters=c("A","B","A","C","C")) ### Random partition partition(clusters=LETTERS[floor(runif(100,1,5))]) ### Partition that clusters correctly some data ### Quality criterion are high data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) partition(clusters=rep(1:4,each=50),traj) ### Partition that does not cluster correctly the data ### Quality criterion are low partition(clusters=rep(1:4,50),traj)
### Empty partition partition() ### Small partition partition(clusters=c("A","B","A","C","C")) ### Random partition partition(clusters=LETTERS[floor(runif(100,1,5))]) ### Partition that clusters correctly some data ### Quality criterion are high data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) partition(clusters=rep(1:4,each=50),traj) ### Partition that does not cluster correctly the data ### Quality criterion are low partition(clusters=rep(1:4,50),traj)
An object of class Partition
is a partition of a population
into subgroups. The object also contains some information like the
percentage of trajectories in each group or some qualities criterion.
Objects are mainly intend to be created by some clustering methods (like k-means, fuzzy k-means, mixture modeling, latent class analysis,...)
nbClusters
[numeric]: number of groups, between 1 and 26
clusters
[vector(factor)]: vector containing the groups of each individual. Groups are in upper-case letters.
percentEachCluster
[vector(numeric)]: percentage of trajectories contained in each group.
postProba
[matrix(numeric)]: assuming that in each clusters C and for each time T, variable follow a normal law (mean and standard deviation of the variable at time T restricted to clusters C), then it is possible to compute the postterior probabilities of each individual (that is the probabilities that an individual has to belong to each clusters). These probabilities are hold in postProba.
postProbaEachCluster
[vector(numeric)]: for each clusters C, mean of the post probabilities to belong to C of the individual that effectively belong to C. A high percent means that the individual that are in this cluter realy meant to be here.
criterionValues
[vector(numeric)]: Value of the quality
criterions used to evaluate the quality of the
Clustering
. See qualityCriterion
for details.
details
[vector(character)]: hold different optionnal informations like
the algorithm (if any) used to find the partition, the convergence
time, the imputation methods, the starting condition.
Examples:
details=c(algorithm="kmeans",convergenceTime="3")
.
A class Partition
object must follow some rules to be valid:
Slots should be either all empty, or all non empty.
nbClusters
has to be lower or equal to 26.
clusters
is a factor in LETTERS[1:nbCluster]
.
Class Partition
objects are mainly constructed by some clustering methods
(like k-means, fuzzy k-means, mixture modeling, latent class
analysis,...). Neverdeless, it is also possible to construct them from
scratch using the fonction partition
.
[numeric]: Gets the number of clusters
(the value of the slot nbClusters
)
[vector(factor)]: Gets the cluster of each
individual (the value of the slot clusters
)
[vector(integer)]
: Gets the
cluster of each individual and turn them into integer
[vector(numeric)]
: Get the
percent of individual
in each clusters (the value of the slot nbClusters
)
[vector(numeric)]
: Get the
post probabilities for each clusters.
[matrix(numeric)]
: Get the
post probabilities for each individual and each clusters.
[vector(numeric)]
: gives the values
of all the
criterion values (the value of the slot criterionValues
)
[vector(character)]
: Get the values
of the slot details
.
[numeric]
: Get the value of the criterion
XcriterionX
. It can be one of Calinski.Harabatz
,
Krzysztof.Calinski
, Genolini.Calinski
, Ray.Turi
,
Davies.Bouldin
, BIC
, AIC
, AICc
or random
.
[character]
: Get the value named
XspecialX
in
the slot details
(probably one of multiplicity
,
convergenceTime
, imputationMethod
or
algorithm
.)
[numeric]: In the slot
details
, sets the values names multiplicity
to value
.
[numeric]: In the slot
details
, sets the values names convergenceTime
to value
.
The others slot can not be change after the object creation.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
Overview: longitudinalData-package
Classes: LongData
Methods: partition
############ ### Building Partition ### number part <- partition(rep(c(1,2,1,3),time=3)) ### LETTERS part <- partition(rep(c("A","B","D"),time=4),details=c(convergenceTime="3",multiplicity="1")) ### Others don't work try(partition(rep(c("A","Bb","C"),time=3))) ############# ### Setteur and Getteur ### '[' part["clusters"] part["clustersAsInteger"] part["nbClusters"] ### '[<-' part["multiplicity"] <- 2 (part)
############ ### Building Partition ### number part <- partition(rep(c(1,2,1,3),time=3)) ### LETTERS part <- partition(rep(c("A","B","D"),time=4),details=c(convergenceTime="3",multiplicity="1")) ### Others don't work try(partition(rep(c("A","Bb","C"),time=3))) ############# ### Setteur and Getteur ### '[' part["clusters"] part["clustersAsInteger"] part["nbClusters"] ### '[<-' part["multiplicity"] <- 2 (part)
parWindows
is the constructor of object ParWindows
.
parWindows(nbRow, nbCol, addLegend,closeScreen)
parWindows(nbRow, nbCol, addLegend,closeScreen)
nbRow |
|
nbCol |
|
addLegend |
|
closeScreen |
|
parWindows
is the constructor of object ParWindows
.
Given a number of rows and colonnes, it computes the screenMatrix
that
is use by split.screen for plot object
LongData
. If addLegend
is true, an
extra space is added on the top of the graphes to print the legend.
An object of class ParWindows
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
### Building ParWindows (paramWin <- parWindows(3,2,FALSE,TRUE)) ### Get figsScreen <- paramWin['screenMatrix'] ### Usage listScreen <- split.screen(figsScreen) screen(listScreen[1]) plot(-5:5/10,2.5-(-5:5)^2/20,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20,ylim=c(0,6),type="l",lwd=3) screen(listScreen[3]) plot(-5:5/10,2.5-(-5:5)^2/20,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20,ylim=c(0,6),type="l",lwd=3) screen(listScreen[5]) plot(-5:5/10,(-5:5)^2/10,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20+1.25,ylim=c(0,6),type="l",lwd=3) close.screen(all.screens=TRUE) ### :-)
### Building ParWindows (paramWin <- parWindows(3,2,FALSE,TRUE)) ### Get figsScreen <- paramWin['screenMatrix'] ### Usage listScreen <- split.screen(figsScreen) screen(listScreen[1]) plot(-5:5/10,2.5-(-5:5)^2/20,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20,ylim=c(0,6),type="l",lwd=3) screen(listScreen[3]) plot(-5:5/10,2.5-(-5:5)^2/20,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20,ylim=c(0,6),type="l",lwd=3) screen(listScreen[5]) plot(-5:5/10,(-5:5)^2/10,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20+1.25,ylim=c(0,6),type="l",lwd=3) close.screen(all.screens=TRUE) ### :-)
ParWindows
is an objet containing graphical
parameter used to set the screen display.
nbCol
[numeric]
: Number of column of the screen
matrix.
nbRow
[numeric]
: Number of row of the screen
matrix.
addLegend
[logical]
: Shall a legend be added on
the graph?
closeScreen
[logical]
: On exit, high level plot
function can either close the screen that they open and return
nothing ; or not close it and return the list of the screen number.
screenMatrix
[matrix(numeric)]
: Matrix with 4
column defining the screen region, like the figs
argument
of the function screen. The screenMatrix
can be specified
by the user (bad idea) or can be compute automaticaly according to
nbCol
, nbRow
and addLegend
. For that, use
windowsCut.
Object ParWindows
can be created by the constructor
parWindows
or by the function windowsCut
.
object['xxx']
Get the value of the field xxx
.
object['xxx']<-value
Set the field xxx
to value.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
### Building ParWindows (paramWin <- parWindows(3,2,FALSE,TRUE)) ### Get figsScreen <- paramWin['screenMatrix'] ### Usage listScreen <- split.screen(figsScreen) screen(listScreen[1]) plot(-5:5/10,2.5-(-5:5)^2/20,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20,ylim=c(0,6),type="l",lwd=3) screen(listScreen[3]) plot(-5:5/10,2.5-(-5:5)^2/20,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20,ylim=c(0,6),type="l",lwd=3) screen(listScreen[5]) plot(-5:5/10,(-5:5)^2/10,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20+1.25,ylim=c(0,6),type="l",lwd=3) close.screen(all.screens=TRUE) ### Sorry for that...
### Building ParWindows (paramWin <- parWindows(3,2,FALSE,TRUE)) ### Get figsScreen <- paramWin['screenMatrix'] ### Usage listScreen <- split.screen(figsScreen) screen(listScreen[1]) plot(-5:5/10,2.5-(-5:5)^2/20,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20,ylim=c(0,6),type="l",lwd=3) screen(listScreen[3]) plot(-5:5/10,2.5-(-5:5)^2/20,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20,ylim=c(0,6),type="l",lwd=3) screen(listScreen[5]) plot(-5:5/10,(-5:5)^2/10,ylim=c(0,6),axes=FALSE,xlab="",ylab="",type="l",lwd=3) lines(-5:5/10,(-5:5)^2/20+1.25,ylim=c(0,6),type="l",lwd=3) close.screen(all.screens=TRUE) ### Sorry for that...
Given a LongData
and a Partition
, this
function create 'Triangle objects' representing the 3D plot the
clusters centers. Triangle object can latter be used to include
dynamic rotating graph in a pdf file.
## S4 method for signature 'LongData3d,missing' plot3dPdf(x,y,varY=1,varZ=2) ## S4 method for signature 'LongData3d,numeric' plot3dPdf(x,y,varY=1,varZ=2)
## S4 method for signature 'LongData3d,missing' plot3dPdf(x,y,varY=1,varZ=2) ## S4 method for signature 'LongData3d,numeric' plot3dPdf(x,y,varY=1,varZ=2)
x |
|
y |
|
varY |
|
varZ |
|
Create Triangle objects representing the 3D plot of the main
trajectories of a LongData
.
The three functions plot3dPdf
, saveTrianglesAsASY
and makeLatexFile
are design to export a 3D graph to a Pdf file. The process is the following:
plot3dPdf
: Create a scene, that is a collection of Triangle object that
represent a 3D images.
saveTrianglesAsASY
: Export the scene in an '.asy' file.
'.asy' can not be include in LaTeX file. LaTeX can read only
'.pre' file. So the next step is to use the software
asymptote
to convert '.asy' to '.pre'. This is done by the command asy -inlineimage -tex pdflatex
scene.asy
(not in R, in a console).
The previous step did produce a file scene+0.prc
that can be include in a LaTeX file.
makeLatexFile
create a LaTeX file that is directly compilable (using pdfLatex
).
It produce a pdf file that contain the 3D object.
A Triangle object.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
saveTrianglesAsASY
,makeLatexFile
,makeTriangles
### Move to tempdir wd <- getwd() setwd(tempdir()); getwd() ### Generating the data data(artificialJointLongData) myLd <- longData3d(artificialJointLongData,timeInData=list(var1=2:12,var2=13:23)) part <- partition(rep(1:3,each=50)) plotTrajMeans3d(myLd,part) ### Creation of the scene scene <- plot3dPdf(myLd,part) drawScene.rgl(scene) ### Export in '.asy' file saveTrianglesAsASY(scene) ### Creation of a '.prc' file # Open a console, then run: # 'asy -inlineimage -tex pdflatex scene.asy' ### Creation of the LaTeX main document makeLatexFile() ### Creation of the '.pdf' # Open a console window, then run # pdfLatex main.tex ### Go back to current dir setwd(wd)
### Move to tempdir wd <- getwd() setwd(tempdir()); getwd() ### Generating the data data(artificialJointLongData) myLd <- longData3d(artificialJointLongData,timeInData=list(var1=2:12,var2=13:23)) part <- partition(rep(1:3,each=50)) plotTrajMeans3d(myLd,part) ### Creation of the scene scene <- plot3dPdf(myLd,part) drawScene.rgl(scene) ### Export in '.asy' file saveTrianglesAsASY(scene) ### Creation of a '.prc' file # Open a console, then run: # 'asy -inlineimage -tex pdflatex scene.asy' ### Creation of the LaTeX main document makeLatexFile() ### Creation of the '.pdf' # Open a console window, then run # pdfLatex main.tex ### Go back to current dir setwd(wd)
This function graphically displays the quality criterion of all the
Partition
of a ListPartition
object.
plotAllCriterion(x, criterion=CRITERION_NAMES[1:5],standardized = TRUE)
plotAllCriterion(x, criterion=CRITERION_NAMES[1:5],standardized = TRUE)
x |
[ClusterLongData]: object whose quality criterion should be displayed. |
criterion |
[character]: name of the criterion(s) to plot. It can either display all the value for a single specific criterion or display several criterion, only the best value for each clusters number and for each criterion. |
standardized |
[logical]: If |
This function display graphically several quality criterion, probably to decide the best clusters' number.
No value are return. A graph is printed.
############### ### Data generation data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) ### Some clustering listPart <- listPartition() listPart["add"] <- partition(rep(c("A","B"),time=100),traj) listPart["add"] <- partition(rep(c("A","B","B","B"),time=50),traj) listPart["add"] <- partition(rep(c("A","B","C","A"),time=50),traj) listPart["add"] <- partition(rep(c("A","B","C","D"),time=50),traj) ordered(listPart) ################ ### graphical display plotAllCriterion(listPart) plotAllCriterion(listPart,criterion=CRITERION_NAMES[1:5],TRUE)
############### ### Data generation data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) ### Some clustering listPart <- listPartition() listPart["add"] <- partition(rep(c("A","B"),time=100),traj) listPart["add"] <- partition(rep(c("A","B","B","B"),time=50),traj) listPart["add"] <- partition(rep(c("A","B","C","A"),time=50),traj) listPart["add"] <- partition(rep(c("A","B","C","D"),time=50),traj) ordered(listPart) ################ ### graphical display plotAllCriterion(listPart) plotAllCriterion(listPart,criterion=CRITERION_NAMES[1:5],TRUE)
This function graphically displays the quality criterion of all the
Partition
of a ListPartition
object.
plotCriterion(x, criterion=x["criterionActif"],nbCriterion=100)
plotCriterion(x, criterion=x["criterionActif"],nbCriterion=100)
x |
[ClusterLongData]: object whose quality criterion should be displayed. |
criterion |
[character]: name of the criterion(s) to plot. It can either display all the value for a single specific criterion or display several criterion, only the best value for each clusters number and for each criterion. |
nbCriterion |
[numeric]: if there is a big number of
|
This function display graphically the quality criterion (probably to decide the best clusters' number). It can either display all the criterion ; this is useful to see the consistency of the result : is the best clusterization obtain several time or only one ? It can also display only the best result for each clusters number : this helps to find the local maximum, which is classically used to chose the "correct" clusters' number.
No value are return. A graph is printed.
############### ### Data generation data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) ### Some clustering listPart <- listPartition() listPart["add"] <- partition(rep(c("A","B"),time=100),traj) listPart["add"] <- partition(rep(c("A","B","B","B"),time=50),traj) listPart["add"] <- partition(rep(c("A","B","C","A"),time=50),traj) listPart["add"] <- partition(rep(c("A","B","C","D"),time=50),traj) ordered(listPart) ################ ### graphical display plotCriterion(listPart) plotAllCriterion(listPart,criterion=CRITERION_NAMES[1:5],TRUE)
############### ### Data generation data(artificialLongData) traj <- as.matrix(artificialLongData[,-1]) ### Some clustering listPart <- listPartition() listPart["add"] <- partition(rep(c("A","B"),time=100),traj) listPart["add"] <- partition(rep(c("A","B","B","B"),time=50),traj) listPart["add"] <- partition(rep(c("A","B","C","A"),time=50),traj) listPart["add"] <- partition(rep(c("A","B","C","D"),time=50),traj) ordered(listPart) ################ ### graphical display plotCriterion(listPart) plotAllCriterion(listPart,criterion=CRITERION_NAMES[1:5],TRUE)
Plot the LongData
or LongData3d
optionnaly relatively
to a Partition
. For joint trajectories, one
graphe for each variable trajectory is displayed.
plotTrajMeans(x, y, parTraj=parTRAJ(), parMean=parMEAN(),...)
plotTrajMeans(x, y, parTraj=parTRAJ(), parMean=parMEAN(),...)
x |
|
y |
|
parTraj |
|
parMean |
|
... |
Arguments to be passed to methods, such as graphical parameters. |
Plot either a LongData
,
or each variable of a LongData3d
optionnaly according to the Partition
define by y
.
Graphical option concerning the individual trajectory (col, type, pch
and xlab) can be change using parTraj
.
Graphical option concerning the cluster mean trajectory (col, type, pch,
pchPeriod and cex) can be change using parMean
. For more
detail on parTraj
and parMean
, see object of
class ParLongData
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSM, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
LongData
, LongData3d
, plotTrajMeans3d
.
################## ### Construction of the data data(artificialLongData) ld <- longData(artificialJointLongData) part <- partition(rep(1:3,each=50)) ### Basic plotting plotTrajMeans(ld) plotTrajMeans(ld,part,xlab="Time") ################## ### Changing graphical parameters 'par' ### No letters on the mean trajectories plotTrajMeans(ld,part,parMean=parMEAN(type="l")) ### Only one letter on the mean trajectories plotTrajMeans(ld,part,parMean=parMEAN(pchPeriod=Inf)) ### Color individual according to its clusters (col="clusters") plotTrajMeans(ld,part,parTraj=parTRAJ(col="clusters")) ### Mean without individual plotTrajMeans(ld,part,parTraj=parTRAJ(type="n")) ### No mean trajectories (type="n") ### Color individual according to its clusters (col="clusters") plotTrajMeans(ld,part,parTraj=parTRAJ(col="clusters"),parMean=parMEAN(type="n")) ### Only few trajectories plotTrajMeans(ld,part,nbSample=10,parTraj=parTRAJ(col='clusters'),parMean=parMEAN(type="n")) ################## ### single variable trajectory data(artificialLongData) ld2 <- longData(artificialLongData) part2 <- partition(rep(1:4,each=50)) plotTrajMeans(ld2) plotTrajMeans(ld2,part2)
################## ### Construction of the data data(artificialLongData) ld <- longData(artificialJointLongData) part <- partition(rep(1:3,each=50)) ### Basic plotting plotTrajMeans(ld) plotTrajMeans(ld,part,xlab="Time") ################## ### Changing graphical parameters 'par' ### No letters on the mean trajectories plotTrajMeans(ld,part,parMean=parMEAN(type="l")) ### Only one letter on the mean trajectories plotTrajMeans(ld,part,parMean=parMEAN(pchPeriod=Inf)) ### Color individual according to its clusters (col="clusters") plotTrajMeans(ld,part,parTraj=parTRAJ(col="clusters")) ### Mean without individual plotTrajMeans(ld,part,parTraj=parTRAJ(type="n")) ### No mean trajectories (type="n") ### Color individual according to its clusters (col="clusters") plotTrajMeans(ld,part,parTraj=parTRAJ(col="clusters"),parMean=parMEAN(type="n")) ### Only few trajectories plotTrajMeans(ld,part,nbSample=10,parTraj=parTRAJ(col='clusters'),parMean=parMEAN(type="n")) ################## ### single variable trajectory data(artificialLongData) ld2 <- longData(artificialLongData) part2 <- partition(rep(1:4,each=50)) plotTrajMeans(ld2) plotTrajMeans(ld2,part2)
Plot two variables of a LongData3d
object in 3D, optionnaly
relatively to a Partition
.
plotTrajMeans3d(x,y,varY=1,varZ=2, parTraj=parTRAJ(),parMean=parMEAN(type="n"),...)
plotTrajMeans3d(x,y,varY=1,varZ=2, parTraj=parTRAJ(),parMean=parMEAN(type="n"),...)
x |
|
y |
|
varY |
|
varZ |
|
parTraj |
|
parMean |
|
... |
Arguments to be passed to methods, such as graphical parameters. |
Plot two variables of a LongData3d
object in 3D. It
use the rgl
library. The user can make the
graphical representation turn using the mouse.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
################## ### Construction of the data time=c(1,2,3,4,8,12,16,20) id2=1:120 f <- function(id,t)((id-1)%%3-1) * t g <- function(id,t)(id%%2+1)*t h <- function(id,t)(id%%4-0.5)*(20-t) ld <- longData3d(array(cbind(outer(id2,time,f),outer(id2,time,g),outer(id2,time,h))+ rnorm(120*8*3,0,3),dim=c(120,8,3))) part <- partition(rep(1:6,20)) ### Basic plotting plotTrajMeans3d(ld) plotTrajMeans3d(ld,part) ### Variable 1 and 3, then 2 and 3 plotTrajMeans3d(ld,part) plotTrajMeans3d(ld,part,varY=3,varZ=2) plotTrajMeans3d(ld,part,varY=1,varZ=3) ################## ### Changing graphical parameters 'par' ### Color individual according to its clusters (col="clusters") plotTrajMeans3d(ld,part,parTraj=parTRAJ(col="clusters")) plotTrajMeans3d(ld,part,parTraj=parTRAJ(col="clusters"),varY=1,varZ=3) ### No mean trajectories (type="n"), only few trajectories ### Color individual according to its clusters (col="clusters") plotTrajMeans3d(ld,part,parTraj=parTRAJ(col="clusters"),parMean=parMEAN(type="n"),nbSample=10)
################## ### Construction of the data time=c(1,2,3,4,8,12,16,20) id2=1:120 f <- function(id,t)((id-1)%%3-1) * t g <- function(id,t)(id%%2+1)*t h <- function(id,t)(id%%4-0.5)*(20-t) ld <- longData3d(array(cbind(outer(id2,time,f),outer(id2,time,g),outer(id2,time,h))+ rnorm(120*8*3,0,3),dim=c(120,8,3))) part <- partition(rep(1:6,20)) ### Basic plotting plotTrajMeans3d(ld) plotTrajMeans3d(ld,part) ### Variable 1 and 3, then 2 and 3 plotTrajMeans3d(ld,part) plotTrajMeans3d(ld,part,varY=3,varZ=2) plotTrajMeans3d(ld,part,varY=1,varZ=3) ################## ### Changing graphical parameters 'par' ### Color individual according to its clusters (col="clusters") plotTrajMeans3d(ld,part,parTraj=parTRAJ(col="clusters")) plotTrajMeans3d(ld,part,parTraj=parTRAJ(col="clusters"),varY=1,varZ=3) ### No mean trajectories (type="n"), only few trajectories ### Color individual according to its clusters (col="clusters") plotTrajMeans3d(ld,part,parTraj=parTRAJ(col="clusters"),parMean=parMEAN(type="n"),nbSample=10)
Given a LongData
and a
Partition
, the fonction qualityCriterion
calculate
some qualities criterion.
qualityCriterion(traj,clusters,imputationMethod="copyMean")
qualityCriterion(traj,clusters,imputationMethod="copyMean")
traj |
|
clusters |
|
imputationMethod |
|
Given a LongData
and a
Partition
(or a matrix
and a vector of
integer
), the fonction qualityCriterion
calculate several
quality criterion and return then as a list (see 'value' below).
If some individual have no clusters (ie if Partition
has some
missing values), the corresponding trajectories are exclude from the
calculation.
Note that if there is an empty cluster or an empty trajectory, most of the criterions are anavailable.
Basicaly, 6 non-parametrics criterions are computed.
In addition, ASSUMING THAT in each clusters C and for each time T,
the variable follow a NORMAL LAW (mean and standard deviation of the variable at time T restricted
to clusters C), it is possible to compute the the posterior
probabilities of the individual trajectories and the
likelihood. From there, we can also compute the BIC, the AIC and
the global posterior probability. The function qualityCriterion
also compute these criterion. But the user should alway keep in mind
that these criterion are
valid ONLY under the hypothesis of normality. If this
hypothèsis is not respected, algorithm like k-means will converge but the BIC and AIC
will have no meaning.
IMPORTANT NOTE: Some criterion should be maximized, some other should be
minimized. This might be confusing for the non expert. In order to
simplify the comparison of the criterion, qualityCriterion
compute the OPPOSITE of the criterion that should be minimized (Ray & Bouldin, Davies & Turi, BIC and AIC). Thus,
all the criterion computed by this function should be maximized.
A list with three fields: the first is the list of the criterions. the second is the clusters post probabilities; the third is the matrix of the individual post probabilities.
Notations: k=number of clusters; n=number of individual; B=Between variance ; W=Within variance The criterion are:
[numeric]
: Calinski and Harabatz
criterion: c(k)=Trace(B)/Trace(W)*(n-k)/(k-1)
.
[numeric]
: Calinski and Harabatz
criterion modified by Krysczuk: c(k)=Trace(B)/Trace(W)*(n-1)/(n-k)
.
[numeric]
: Calinski and Harabatz
criterion modified by Genolini:
g(k)=Trace(B)/Trace(W)*(n-k)/sqrt(k-1)
.
[numeric]
: Ray and Turi criterion: r(k)=-Vintra/Vinter
with
Vintra=Sum(dist(x,center(x)))
and
Vinter=min(dist(center_i,center_j)^2)
. (The "true" index of
Ray and Turi is Vintra/Vinter
and should me minimized. See IMPORTANT NOTE above.)
[numeric]
: Davies and Bouldin criterion:
d(k)=-mean(Proximite(cluster_i,cluster_j))
with
Proximite(i,j)=(DistInterne(i)+DistInterne(j))/(DistExterne(i,j))
. (The "true" index of
Davies and Bouldin is mean(Proximite())
and should me
minimized. See IMPORTANT NOTE above.)
[numeric]
: random value following the normal law N(0,1).
All the parametric indices should be minimized. So the function
qualityCriterion
compute their opposite (see IMPORTANT NOTE above.)
Notation: L=likelihood; h=number of parameters; n=number of trajectories; t=number of time measurement; N=total number of measurement (N=t.n).
SECOND IMPORTANT NOTE: the formula of parametrics criterion ofen
include the size of the population. In the specific case on
longitudinal data, the definition of the "size of the population" is
not obvious. It can be either the number of individual n
, or the number of
measurement N=n.t
. So, the function qualityCriterion
gives
two version of all the non parametrics criterion, the first using n
,
the second using N
.
[numeric]
: Bayesian Information Criterion: BIC=2*log(L)-h*log(n). See IMPORTANT NOTE above.
[numeric]
: Bayesian Information Criterion: BIC=2*log(L)-h*log(N). See IMPORTANT NOTE above.
[numeric]
: Akaike Information Criterion, bis: AIC=2*log(L)-2*h. See IMPORTANT NOTE above.
[numeric]
: Akaike Information Criterion with correction: AIC=AIC+(2h(h+1))/(n-h-1). See IMPORTANT NOTE above.
[numeric]
: Akaike Information Criterion with correction, bis: AIC=AIC+(2h(h+1))/(n-h-1). See IMPORTANT NOTE above.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSM, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
LongData
, Partition
,
imputation
.
################## ### Preparation of some artificial data par(ask=TRUE) data(artificialLongData) ld <- longData(artificialLongData) ### Correct partition part1 <- partition(rep(1:4,each=50)) plotTrajMeans(ld,part1) (cr1 <- qualityCriterion(ld,part1)) ### Random partition part2 <- partition(floor(runif(200,1,5))) plotTrajMeans(ld,part2) (cr2 <- qualityCriterion(ld,part2)) ### Partition with 3 clusters instead of 4 part3 <- partition(rep(c(1,2,3,3),each=50)) plotTrajMeans(ld,part3) (cr3 <- qualityCriterion(ld,part3)) ### Comparisons of the Partition plot(c(cr1[[1]],cr2[[1]],cr3[[1]]),main="The highest give the best partition (according to Calinski & Harabatz criterion)") par(ask=FALSE)
################## ### Preparation of some artificial data par(ask=TRUE) data(artificialLongData) ld <- longData(artificialLongData) ### Correct partition part1 <- partition(rep(1:4,each=50)) plotTrajMeans(ld,part1) (cr1 <- qualityCriterion(ld,part1)) ### Random partition part2 <- partition(floor(runif(200,1,5))) plotTrajMeans(ld,part2) (cr2 <- qualityCriterion(ld,part2)) ### Partition with 3 clusters instead of 4 part3 <- partition(rep(c(1,2,3,3),each=50)) plotTrajMeans(ld,part3) (cr3 <- qualityCriterion(ld,part3)) ### Comparisons of the Partition plot(c(cr1[[1]],cr2[[1]],cr3[[1]]),main="The highest give the best partition (according to Calinski & Harabatz criterion)") par(ask=FALSE)
Remove duplicate Partition
present in a
ListPartition
(or, by inheritance, in
ClusterLongData
and ClusterLongData3d
objects.
regroup(object)
regroup(object)
object |
|
A clusterizing algorithm can find a Partition
several time. It
is store several time in object ListPartition
(or in
ClusterLongData
or in ClusterLongData3d
), encombering
the memory. regroup
remove the duplicate
Partition
. Note that if the ListPartition
is not ordered, then
regroup
sort it unless toOrder=FALSE
.
None (this function change internaly the field of an object, it does not return any values.)
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] Christophe M. Genolini and Bruno Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] Christophe M. Genolini and Bruno Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
### Some data data(artificialLongData) myLd <- as.matrix(artificialLongData[,-1]) ### Some clustering part2 <- partition(rep(c("A","B","A","C"),time=50),myLd) part3 <- partition(rep(c("A","B","C","D"),time=50),myLd) ################ ### ListPartition listPart <- listPartition() listPart["add"] <- part2 listPart["add"] <- part3 listPart["add"] <- part2 listPart["add"] <- part3 ### Some clustering has been found several time ### regroup will suppress the duplicate one regroup(listPart) plotCriterion(listPart)
### Some data data(artificialLongData) myLd <- as.matrix(artificialLongData[,-1]) ### Some clustering part2 <- partition(rep(c("A","B","A","C"),time=50),myLd) part3 <- partition(rep(c("A","B","C","D"),time=50),myLd) ################ ### ListPartition listPart <- listPartition() listPart["add"] <- part2 listPart["add"] <- part3 listPart["add"] <- part2 listPart["add"] <- part3 ### Some clustering has been found several time ### regroup will suppress the duplicate one regroup(listPart) plotCriterion(listPart)
This function reshapes a data frame in 'long' format (repeated measurements in the same column) into a data frame in 'wide' format (repeated measurements in separate columns). It also correct a bug of reshape.
longToWide(trajLong) reshapeLongToWide(trajLong)
longToWide(trajLong) reshapeLongToWide(trajLong)
trajLong |
[ |
This function reshapes a data frame in 'long' format (repeated measurements in the same column) into a data frame in 'wide' format (repeated measurements in separate columns).
A data frame in 'wide' format (repeated measurements in separate columns).
longToWide
is just a 'friendly overlay' of the function
reshape
. It also corrects a reshape
bug
(modification of the order of some trajectories value when some times
are missing).
Christophe Genolini
summary(Indometh) longToWide(Indometh) df2 <- data.frame(id = rep(LETTERS[1:4], rep(2,4)), visit = I(rep(c("3","6"), 4)), x = rnorm(4), y = runif(4), sex=rep(c("H","F","H"),time=c(4,2,2)))[1:7,] longToWide(df2[,1:3]) longToWide(df2[,c(1,2,4)])
summary(Indometh) longToWide(Indometh) df2 <- data.frame(id = rep(LETTERS[1:4], rep(2,4)), visit = I(rep(c("3","6"), 4)), x = rnorm(4), y = runif(4), sex=rep(c("H","F","H"),time=c(4,2,2)))[1:7,] longToWide(df2[,1:3]) longToWide(df2[,c(1,2,4)])
This function reshapes a data frame in 'wide' format (repeated measurements in separate column) into a data frame in 'long' format (repeated measurements in the same columns).
wideToLong(trajWide,times=1:(ncol(trajWide)-1)) reshapeWideToLong(trajWide,times=1:(ncol(trajWide)-1))
wideToLong(trajWide,times=1:(ncol(trajWide)-1)) reshapeWideToLong(trajWide,times=1:(ncol(trajWide)-1))
trajWide |
|
times |
|
This function reshapes a data frame in 'wide' format (repeated measurements in separe column) into a data frame in 'long' format (repeated measurements in the same columns). The first column has to be the individual indentier. All the other column should be the trajectories. The missing values are removed in long format.
A data frame in 'long' format.
Christophe Genolini
df3 <- data.frame(id = LETTERS[rep(1:4)], sex=c("H","F","H","F"), v1=rnorm(4),v2=rnorm(4),w1=rnorm(4),w2=rnorm(4)) wideToLong(df3[,c(1,3,4)]) wideToLong(df3[,c(1,5,6)]) wideToLong(df3[,c(1,3:6)]) wideToLong(df3[,c(1,3:6)],times=c(1,2,4,8))
df3 <- data.frame(id = LETTERS[rep(1:4)], sex=c("H","F","H","F"), v1=rnorm(4),v2=rnorm(4),w1=rnorm(4),w2=rnorm(4)) wideToLong(df3[,c(1,3,4)]) wideToLong(df3[,c(1,5,6)]) wideToLong(df3[,c(1,3:6)]) wideToLong(df3[,c(1,3:6)],times=c(1,2,4,8))
This function revert the effect of scale
by restauring
the initial values of trajectories.
restoreRealData(object)
restoreRealData(object)
object |
|
This function revert the effect of scale
by restauring
the initial values of trajectories.
None: this function change internaly the field of an object, it does not return any values.)
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSME, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
################## ### Building LongData time=c(1,2,3,4,8,12,16,20) id2=1:12 f <- function(id,t)((id-1)%%3-1) * t g <- function(id,t)(id%%2+1)*t ld1 <- longData3d(array(cbind(outer(id2,time,f),outer(id2,time,g))+rnorm(12*8*2,0,1),dim=c(12,8,2))) plotTrajMeans3d(ld1) ################## ### Scaling by 'mean' and 'standard deviation' scale(ld1,scale=c(-1,-1)) plotTrajMeans3d(ld1) ################## ### Back to the first version of the data restoreRealData(ld1) plotTrajMeans3d(ld1)
################## ### Building LongData time=c(1,2,3,4,8,12,16,20) id2=1:12 f <- function(id,t)((id-1)%%3-1) * t g <- function(id,t)(id%%2+1)*t ld1 <- longData3d(array(cbind(outer(id2,time,f),outer(id2,time,g))+rnorm(12*8*2,0,1),dim=c(12,8,2))) plotTrajMeans3d(ld1) ################## ### Scaling by 'mean' and 'standard deviation' scale(ld1,scale=c(-1,-1)) plotTrajMeans3d(ld1) ################## ### Back to the first version of the data restoreRealData(ld1) plotTrajMeans3d(ld1)
Export a Triangle
object to an '.asy' file.
saveTrianglesAsASY(scene, filename = "scene.asy")
saveTrianglesAsASY(scene, filename = "scene.asy")
scene |
|
filename |
|
Export a Triangle
object to an '.asy' file. See
plot3dPdf
for a summary of the overall procedure.
An '.asy' file, in the current directory.
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa
https://homepage.divms.uiowa.edu/~luke/R/misc3d/misc3d-pdf/misc3d-pdf.pdf
plot3dPdf
,makeLatexFile
,makeTriangles
### Move to tempdir wd <- getwd() setwd(tempdir()); getwd() ### Generating the data data(artificialJointLongData) myLd <- longData3d(artificialJointLongData,timeInData=list(var1=2:12,var2=13:23)) part <- partition(rep(1:3,each=50)) plotTrajMeans3d(myLd,part) ### Creation of the scene scene <- plot3dPdf(myLd,part) drawScene.rgl(scene) ### Export in '.asy' file saveTrianglesAsASY(scene) ### Creation of a '.prc' file # Open a console, then run: # 'asy -inlineimage -tex pdflatex scene.asy' ### Creation of the LaTeX main document makeLatexFile() ### Creation of the '.pdf' # Open a console window, then run # pdfLatex main.tex ### Go back to current dir setwd(wd)
### Move to tempdir wd <- getwd() setwd(tempdir()); getwd() ### Generating the data data(artificialJointLongData) myLd <- longData3d(artificialJointLongData,timeInData=list(var1=2:12,var2=13:23)) part <- partition(rep(1:3,each=50)) plotTrajMeans3d(myLd,part) ### Creation of the scene scene <- plot3dPdf(myLd,part) drawScene.rgl(scene) ### Export in '.asy' file saveTrianglesAsASY(scene) ### Creation of a '.prc' file # Open a console, then run: # 'asy -inlineimage -tex pdflatex scene.asy' ### Creation of the LaTeX main document makeLatexFile() ### Creation of the '.pdf' # Open a console window, then run # pdfLatex main.tex ### Go back to current dir setwd(wd)
scale
the trajectories of the different variable of a
LongData
object.
scale(x, center = TRUE, scale = TRUE)
scale(x, center = TRUE, scale = TRUE)
x |
|
center |
|
scale |
|
When variable with different unit are used jointly, it might be necessary to
change their scale them in order to change their individual influance.
This is what scale
do.
More precisely, all the value x[i,j,k] of the variable k will be scale
according to the classic formula (x[i,j,k]- m_k)/s_k
where
m_k and s_k are respectively the k-ieme value of the argument
center
and scale
.
Note that center=TRUE
is a special value that set m_k=mean(x[,,k],na.rm=TRUE)
.
Similarly, scale=TRUE
is a special value that set s_k=sd(x[,,k],na.rm=TRUE)
.
scale
directly
modify the internal value of the LongData
. No value is return.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSM, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
################## ### Building LongData time=c(1,2,3,4,8,12,16,20) id2=1:12 f <- function(id,t)((id-1)%%3-1) * t g <- function(id,t)(id%%2+1)*t ld1 <- longData3d(array(cbind(outer(id2,time,f),outer(id2,time,g))+rnorm(12*8*2,0,1),dim=c(12,8,2))) plotTrajMeans3d(ld1) ################## ### Scaling by 'mean' and 'standard deviation' plotTrajMeans3d(ld1) scale(ld1) plotTrajMeans3d(ld1) ### Scaling by some parameters scale(ld1,center=c(10,100),scale=c(3,-1)) plotTrajMeans3d(ld1) ################## ### To restore the data restoreRealData(ld1)
################## ### Building LongData time=c(1,2,3,4,8,12,16,20) id2=1:12 f <- function(id,t)((id-1)%%3-1) * t g <- function(id,t)(id%%2+1)*t ld1 <- longData3d(array(cbind(outer(id2,time,f),outer(id2,time,g))+rnorm(12*8*2,0,1),dim=c(12,8,2))) plotTrajMeans3d(ld1) ################## ### Scaling by 'mean' and 'standard deviation' plotTrajMeans3d(ld1) scale(ld1) plotTrajMeans3d(ld1) ### Scaling by some parameters scale(ld1,center=c(10,100),scale=c(3,-1)) plotTrajMeans3d(ld1) ################## ### To restore the data restoreRealData(ld1)
windowsCut
prepare an object ParWindows
according to its arguments.
windowsCut(x, addLegend = TRUE,closeScreen=TRUE)
windowsCut(x, addLegend = TRUE,closeScreen=TRUE)
x |
|
addLegend |
|
closeScreen |
|
If x
is a number of variable, the column and row number are
estimate according to the formula
nbCol <- ceiling(sqrt(x))
and nbRow <- ceiling(x/nbCol)
.
An object of class ParWindows
.
Christophe Genolini
1. UMR U1027, INSERM, Université Paul Sabatier / Toulouse III / France
2. CeRSM, EA 2931, UFR STAPS, Université de Paris Ouest-Nanterre-La Défense / Nanterre / France
[1] C. Genolini and B. Falissard
"KmL: k-means for longitudinal data"
Computational Statistics, vol 25(2), pp 317-328, 2010
[2] C. Genolini and B. Falissard
"KmL: A package to cluster longitudinal data"
Computer Methods and Programs in Biomedicine, 104, pp e112-121, 2011
### Simple cut with no space for legent windowsCut(3,FALSE) windowsCut(4,FALSE) windowsCut(5,FALSE) ### Simple cut with legend windowsCut(5)
### Simple cut with no space for legent windowsCut(3,FALSE) windowsCut(4,FALSE) windowsCut(5,FALSE) ### Simple cut with legend windowsCut(5)