--- title: "GPoM : 3 Modelling" author: "Sylvain Mangiarotti & Mireille Huc" date: "`r Sys.Date()`" output: pdf_document: number_sections: no latex_engine: xelatex vignette: > %\VignetteIndexEntry{GPoM : 3 Modelling} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} load package: "`r library(GPoM)`" --- ## Global Modelling This document gives an illustration on how to apply the global modelling technique in various contexts. The aim of the `GPoM` package is to obtain models of Ordinary Differential Equations (ODEs) of polynomial form from time series (either single or multiple). Such type of modelling requires a careful time series preprocessing. Such data preprocessing was sketched in the vignette `2 PreProcessing`. In the present vignette, it is assumed that the studied time series has already been carefully preprocessed, so that the modelling tools can now be applied without anymore preprocessing. ## Single time series modelling The chaotic system introduced by Rössler in 1976^[O. Rössler, An Equation for Continuous Chaos, *Physics Letters*, **57A**(5), 1976, 397-398.] is well adapted to test and illustrate the performances of the approach to model nonlinear dynamics. Simulations of this chaotic system are available in the GPoM package and can be directly loaded as follows: ```{r, eval = TRUE} # load data data("Ross76") ``` (note that the GPoM package can easily be used to produce such chaotic data set, see vignette `1 Conventions`). The loaded variable `Ross76` is a matrix. The time vector is in the first column and the time series $x(t)$, $y(t)$ and $z(t)$ of the three variables are in the next columns two to four. The global modelling technique from one single time series is introduced in the present section. The variable $y$ will be considered here: ```{r, eval = TRUE, fig.align='center'} # time vector tin <- Ross76[,1] # single time series data <- Ross76[,3] # plot plot(tin, data, type='l', xlab = 't', ylab = 'y(t)', main = 'Original time series') ``` Starting from single time series, the following canonical form of equations can be expected^[G. Gouesbet & C. Letellier, 1994. Global vector-field reconstruction by using a multivariate polynomial L2 approximation on nets. *Physical Review E*, **49**(6), 4955-4972.]: $dX_1/dt = X_2$ $dX_2/dt = X_3$ $...$ $dX_n/dt = P(X_1, X_2, ..., X_n)$ where the modeled variable $X_1$ will correspond to the observed variable $y$, and $X_2$ to $X_n$ are its successive derivatives $dy/dt$ and $dy^2/dt^2$. To apply the global modelling technique to the time series presented above, it is necessary to choose a trial model dimension `nVar` (corresponding to $n$ in the previous equation) and a maximal polynomial degree `dMax`. To restrict the model search to a smaller ensemble of models, the minimum and maximum numbers of parameters for the model can be chosen manually using `nPmin` and `nPmax`. `IstepMax` corresponds to the maximum number of iterations to be used for the numerical integration. ```{r, echo = TRUE} # model dimension nVar = 3 # maximum polynomial degree dMax = 2 ``` ```{r, eval = FALSE, echo = TRUE} # generalized Polynomial modelling out1 <- gPoMo(data, tin = tin, dMax = dMax, nS = nVar, show = 0, nPmin = 9, nPmax = 11, verbose = FALSE, IstepMax = 3000) ``` Note that, to follow the search process in real time, parameter `show` should be set on (`show = 1`). In practice, the model size (that is, its number of parameters which is bounded by `nPmin` and `nPmax`) is generally unknwown. It may thus be necessary to extend its range (be default `nPmin = 1` and `nPmax = 14`). However, in practice, `nPmax` cannot be chosen too large in order to keep the computation time reasonable. With the parameterization chosen here, two models are obtained: ```{r, eval = FALSE} which(out1$okMod == 1) ``` ```{r, eval = TRUE, echo = FALSE} which(data_vignetteIII$out1okMod == 1) ``` The model #3 shows a quite good performance to reproduce the original phase portrait: ```{r, eval = FALSE, fig.align='center', fig.width=4, fig.height=4} # obtained model #3 plot(out1$stockoutreg$model3[,1], out1$stockoutreg$model3[,2], type='l', col = 'red', xlab = 'y(t)', ylab = 'dy(t)/dt', main = 'Phase portrait') # original phase portrait lines(out1$filtdata[,1], out1$filtdata[,2]) legend(3,2,c("model", "original"), col=c('red', 'black'), lty=c(1,1), cex=0.6) ``` ```{r, eval = TRUE, echo = FALSE, fig.align='center', fig.width=4, fig.height=4} # obtained model #3 library(float) plot(dbl(data_vignetteIII$out1_stockoutreg_m3[,1]), dbl(data_vignetteIII$out1_stockoutreg_m3[,2]), type='l', col = 'red', xlab = 'y(t)', ylab = 'dy(t)/dt', main = 'Phase portrait') # original phase portrait lines(dbl(data_vignetteIII$out1_filtdata[,1]), dbl(data_vignetteIII$out1_filtdata[,2])) legend(3,2,c("model", "original"), col=c('red', 'black'), lty=c(1,1), cex=0.6) ``` The equations of the obtained model can be edited as follows (more precision in the digits can be obtained choosing a larger value of `approx`): ```{r, eval = FALSE} # obtained model #3 visuEq(out1$models$model3, nVar, dMax, approx = 1) ``` ```{r, eval = TRUE, echo = FALSE} # obtained model #3 visuEq(data_vignetteIII$out1_m3, nVar, dMax, approx = 1) ``` Various methods may be used to validate such a model. In practice, note that validation may depend on the objective of the model (characterization, forecasting, etc.). One way to validate the model can be done by considering the forecasting performances of the model. A code dedicated to such a type of validation is presented in vignette `5 Predictability`. ## Detection of causal couplings and retro-modelling One interest of the global modelling technique is (1) to enable the detection of nonlinear dynamical couplings between observed variables, and potentially even (2) to obtain an analytical formulation of this couplings in order (3) to help for the inference of the causal relations. Here again, the Rössler-1976 chaotic system is used to illustrate the purpose. The time series of the three variables $x$, $y$ and $z$ are used this time. ```{r, eval = TRUE} # multiple (three) time series data <- Ross76[,2:4] ``` The three time series are as follows: ```{r, eval = TRUE, fig.align='center'} # x(t) plot(tin, data[,1], ylim = c(-6.5,12), type='l', col = 'blue', xlab = 't', ylab = '', main = 'Original time series') # y(t) lines(tin, data[,2], type = 'l', col = 'orange') # z(t) lines(tin, data[,3], type = 'l', col = 'brown') legend(35,12,c("x(t)", "y(t)", "z(t)"), col=c('blue', 'orange', 'brown'), lty=1, cex=0.8) ``` A set of three equations is expected, one for each variable: $dx/dt = P_x(x,y,z)$ $dy/dt = P_y(x,y,z)$ $dz/dt = P_z(x,y,z)$ To define such a structure, the number `nS` of equations for each series must be defined as a vector such as: `nS = c(1,1,1)`. (note that to discard information about the current number of models under test, the optional parameter `verbose` can be set to `verbose = FALSE`). ```{r, eval = FALSE} # generalized Polynomial modelling out2 <- gPoMo(data, tin = tin, dMax = 2, nS = c(1,1,1), show = 0, IstepMin = 10, IstepMax = 3000, nPmin = 7, nPmax = 8) ``` The simplest obtained model able to reproduce the observed dynamics is the model #5: ```{r, eval = FALSE, fig.align='center', fig.width=4, fig.height=4} # obtained model #5 plot(out2$stockoutreg$model5[,1], out2$stockoutreg$model5[,2], type='l', col = 'red', xlab = 'x(t)', ylab = 'y(t)', main = 'Phase portrait') # original phase portrait lines(out2$filtdata[,1], out2$filtdata[,2]) legend(3,2,c("original", "model"), col=c('black', 'red'), lty=1, cex=0.5) ``` ```{r, eval = TRUE, echo = FALSE, fig.align='center', fig.width=4, fig.height=4} # obtained model #5 plot(dbl(data_vignetteIII$out2_stockoutreg_m5[,1]), dbl(data_vignetteIII$out2_stockoutreg_m5[,2]), type='l', col = 'red', xlab = 'x(t)', ylab = 'y(t)', main = 'Phase portrait') # original phase portrait lines(dbl(data_vignetteIII$out2_filtdata[,1]), dbl(data_vignetteIII$out2_filtdata[,2])) legend(3,2,c("original", "model"), col=c('black', 'red'), lty=1, cex=0.5) ``` The equations of the obtained model are given by: ```{r, eval = FALSE} visuEq(out2$models$model5, 3, 2, approx = 4) ``` ```{r, eval = TRUE, echo = FALSE} visuEq(data_vignetteIII$out2_m5, 3, 2, approx = 4) ``` The original Rössler system is thus retrieved. The ability of the approach to retrieve original equations was also tested to numerous other dynamical systems (see vignette `7 Retro-modelling`). A practical application of the global modelling technique under real world conditions was done for modelling the plague epidemic of Bombay by the end of 19th century and its coupling to the epizootics among black and brown rats^[S. Mangiarotti, 2015. Low dimensional chaotic models for the plague epidemic in Bombay (1896–1911). *Chaos, Solitons & Fractals*, **81A**, 184–196.]. This analysis enabled not only the detection of the couplings but also to obtain an algebraic formulation of this coupling. An interpretation could be proposed for all the terms of the obtained model. ## Generalized global modelling and polynomial a priori structure The GPoM package enables a generalized formulation of the global modelling technique combining the canonical formulation -- one single observed variables and its derivatives -- and the multivariate formulation -- several observed variables considered together. Examples are given in the two previous sections. The two variables $x$ and $y$ of the Rössler-1976 chaotic system are considered here to illustrate the purpose. ```{r, eval = TRUE} # time vector tin <- Ross76[,1] # multiple (two) time series data <- Ross76[,2:3] ``` The two time series are as follows: ```{r, eval = TRUE, fig.align='center'} # x(t) plot(tin, data[,1], ylim = c(-6.5,6.5), type='l', col = 'blue', xlab = 't', ylab = '', main = 'Original time series') # y(t) lines(tin, data[,2], type = 'l', col = 'brown') legend(30,6,c("x(t)", "y(t)"), col=c('blue', 'brown'), lty=1, cex=0.8) ``` Obviously, the correlation between the two time series is not very high since close to 0.5 in magnitude (Note that alternative examples with very low levels of correlations will also be considered in vignette `7 Retro-modelling`): ```{r, eval = TRUE, fig.align='center'} # correlation between Rössler-x and Rössler-y cor(data[,1], data[,2]) ``` Though, their dynamics are linked in a causal way since dynamically coupled as expressed by the original (fully deterministic) system: $dx/dt = - y - z$ $dy/dt = x + ay$ $dz/dt = b + z(x - c)$. Because $x$ and $y$ are coupled to the third variable $z$ (assumed not to be observed in the present example), and because of the presence of a nonlinear term ($xz$ in the last equation), the relation between $x$ and $y$ can only be poorly detected using a simple statistical technique such as the correlation. This difficulty directly results from the complex coevolution between these two variables as illustrated by the following scatter plot (x,y): ```{r, eval = TRUE, fig.align='center', fig.width=4, fig.height=4} plot(data[,1], data[,2], type='p', cex = 0.2, col = 'blue', xlab = 'x', ylab = 'y', main = 'Scatter plot (x,y)') ``` In its principle, the global modelling technique is well adapted to tackle the detection of nonlinear couplings. As the Rössler-1976 system is three-dimensional, a three-dimensional system can be expected, that is, such as `sum(nS) = 3` (in practice, this dimension is generally unknown and a trial dimension can be used). Two formulations may thus be expected, either based on variables ($X_1$, $X_2$, $Y_1$): $dX_1/dt = X_2$ $dX_2/dt = P_X(X_1, X_2, Y_1)$ $dY_1/dt = P_Y(X_1, X_2, Y_1)$, which general structure will be defined by `nS = c(2,1)` (two equations of canonical form will be generated from the observed time series $x(t)$, a single one from $y(t)$); or based on variables ($X_1$, $Y_1$, $Y_2$): $dX_1/dt = P_X(X_1, Y_1, Y_2)$ $dY_1/dt = Y_2$ $dY_2/dt = P_Y(X_1, Y_1, Y_2)$ which general structure will be defined by `nS = c(1,2)` (one equation generated from $x(t)$, two ones of canonical form from $y(t)$). For a tutorial purpose, we will assume here that some *a priori* information is available about the model sructure telling us that the former formulation based on variables $X_1$, $X_2$ and $Y_1$ is more suited and that specific polynomial terms should be excluded (e.g. $Y_1^2$ and $X_2Y_1$ should be excluded from polynomial $P_X$, and only the terms $Y_1$, $X_1$ and $X_1Y_1$ should be accepted in polynomial $P_Y$). It is then possible to provide a template for the model formulation allowing some terms and forbidding others. This template structure can be provided as an input of the `gPoMo` function through the optional parameter `EqS`. In the present case, the following structure will be used as a template: ```{r, eval = TRUE} # model template: EqS <- matrix(1, ncol = 3, nrow = 10) EqS[,1] <- c(0,0,0,1,0,0,0,0,0,0) EqS[,2] <- c(1,1,0,1,0,1,1,1,1,1) EqS[,3] <- c(0,1,0,0,0,0,1,1,0,0) visuEq(EqS, 3, 2, substit = c('X1','X2','Y1')) ``` Each term of this template can be used by the global model if set to 1 and cannot if set to 0. In other words, terms that are not included in this template cannot be added, but terms that are included may be removed. The search for a Generalized Global Model will be launched as follows: ```{r, eval = FALSE} # generalized Polynomial modelling out3 <- gPoMo(data, tin = tin, dMax = 2, nS=c(2,1), EqS = EqS, show = 0, verbose = FALSE, IstepMin = 10, IstepMax = 2000, nPmin = 9, nPmax = 11) ``` The simplest obtained model able to reproduce the observed dynamics is found to be the model #2: ```{r, eval = FALSE, fig.align='center', fig.width=4, fig.height=4} # obtained model #2 plot(out3$stockoutreg$model2[,1], out3$stockoutreg$model2[,2], type='l', col = 'red', xlab = 'X1', ylab = 'X2', main = 'Phase portrait') # original phase portrait lines(out3$filtdata[,1], out3$filtdata[,2]) legend(-3,-5,c("original", "model"), col=c('black', 'red'), lty=1, cex=0.5) ``` ```{r, echo = FALSE, eval = TRUE, fig.align='center', fig.width=4, fig.height=4} # obtained model #2 plot(dbl(data_vignetteIII$out3_stockoutreg_m2[,1]), dbl(data_vignetteIII$out3_stockoutreg_m2[,2]), type='l', col = 'red', xlab = 'y(t)', ylab = 'dy(t)/dt', main = 'Phase portrait') # original phase portrait lines(dbl(data_vignetteIII$out3_filtdata[,1]), dbl(data_vignetteIII$out3_filtdata[,2])) legend(-3,-5,c("original", "model"), col=c('black', 'red'), lty=1, cex=0.5) ``` The equations of the obtained model are the following : ```{r, eval = FALSE} visuEq(out3$models$model2, 3, 2, approx = 4, substit = c('X1','X2','Y1')) ``` ```{r, eval = TRUE, echo = FALSE} visuEq(data_vignetteIII$out3_m2, 3, 2, approx = 4, substit = c('X1','X2','Y1')) ``` When no information is available about the polynomial structure, the model search can be launched without using the option `EqS`. In this situation, all the terms are considered as possible terms (all the terms are set to 1 in `EqS`). When there is no reason to choose one formulation rather than the other, then all the formulations may be tested one by one. In the present case, models can actually be obtained with both sets of variables ($X_1$,$Y_1$,$Y_2$) and ($X_1$,$X_2$,$Y_1$)). A practical application of the generalized global modelling technique could be performed from observational data under real world conditions for the modelling of the West-Africa epidemic of Ebola Virus Desease^[S. Mangiarotti, M. Peyre & M. Huc, 2016. A chaotic model for the epidemic of Ebola virus disease in West Africa (2013–2016). *Chaos*, **26**, 113112.]. ## Blind separation and modelling of two independant sets of equations Finally, since the approach enables -- in its principle -- to detect nonlinear couplings, it may also be used to detect separated sets of equations. Two separated systems are considered here: the Rössler-1976 system already presented upper and the Sprott-K system^[J.C. Sprott, 1994. Some simple chaotic flows. *Physical Review E*, **50**(2), 647-650]. ```{r, eval = TRUE} # load data data("TSallMod_nVar3_dMax2") # multiple (six) time series tin <- TSallMod_nVar3_dMax2$SprK$reconstr[1:400,1] TSRo76 <- TSallMod_nVar3_dMax2$R76$reconstr[,2:4] TSSprK <- TSallMod_nVar3_dMax2$SprK$reconstr[,2:4] data <- cbind(TSRo76,TSSprK)[1:400,] ``` Six time series are thus considered in this illustration, three for each system: ```{r, eval = TRUE, fig.align='center'} # For the Rössler-1976 system # x(t) plot(tin, data[,1], ylim = c(-6.5,8.5), type='l', col = 'red', xlab = 't', ylab = '', main = 'Original time series') # y(t) lines(tin, data[,2], type = 'l', col = 'brown') # z(t) lines(tin, data[,3], type = 'l', col = 'orange') # For the Sprott-K system # u(t) lines(tin, data[,4], type = 'l', col = 'green') # v(t) lines(tin, data[,5], type = 'l', col = 'darkgreen') # w(t) lines(tin, data[,6], type = 'l', col = 'lightgreen') legend(3, 8,title = 'Rössler', c("x(t)", "y(t)", "z(t)"), col=c('red', 'brown', 'orange'), lty=1, cex=0.8) legend(17, 8,title='Sprott-K', c("u(t)", "v(t)", "w(t)"), col=c('green', 'darkgreen', 'lightgreen'), lty=1, cex=0.8) ``` Since one equation is expected from each variable, vector `nS` is defined as `nS = c(1,1,1,1,1,1)`. To speed up the computation, the fourth order Runge-Kutta numerical integration `method = 'rk4'` is preferred^[package `deSolve` is used for this purpose.]. ```{r, eval = FALSE} # generalized Polynomial modelling out4 <- gPoMo(data, dtFixe = 1/20, dMax = 2, nS = c(1,1,1,1,1,1), show = 0, method = 'rk4', IstepMin = 2, IstepMax = 3, nPmin = 13, nPmax = 13) ``` Due to the huge number of models to be tested, numerical integration is sketched here on a quite short duration and the model search is directly focused on models of 13-parameter size. A larger range of model sizes and a longer numerical integration can be applied to check the ability of the technique to retrieve the original models. This can easily be done, except that a much longer running time will be required. However, to distinguish between integrable and non integrable models (in a numerical sense), it is necessary to consider the numerical integration of large lengths. With longer integration durations (e.g. from `IstepMin = 10` to `IstepMax = 10000`), many of the models will be rejected or lead to trivial solutions (fixed points, periodic cycles) In the present case, only one obtained set of equations (model #347) able to reproduce the original phase portraits of the two original systems (Rössler-1976 and Sprott-K) is found: ```{r, eval = FALSE, fig.show='hold', fig.width=4, fig.height=4} KL <- out4$models$model347 v0 <- as.numeric(head(data,1)) outNumi <- numicano(nVar = 6, dMax = 2, Istep=5000, onestep=1/20, KL=KL, v0=v0) # obtained model #347 plot(outNumi$reconstr[,2], outNumi$reconstr[,3], type='l', col = 'red', xlab = 'x(t)', ylab = 'y(t)', main = 'Phase portrait') # original phase portrait lines(out4$filtdata[,1], out4$filtdata[,2]) legend(4,2,c("original", "model"), col=c('black', 'red'), lty=1, cex=0.5) # original phase portrait plot(outNumi$reconstr[,5], outNumi$reconstr[,6], type='l', col = 'green', xlab = 'u(t)', ylab = 'v(t)', main = 'Phase portrait') # original phase portrait lines(out4$filtdata[,4], out4$filtdata[,5]) legend(-3,1,c("original", "model"), col=c('black', 'green'), lty=1, cex=0.5) ``` ```{r, eval = TRUE, echo = FALSE, fig.show='hold', fig.width=4, fig.height=4} KL <- data_vignetteIII$out4_m347 v0 <- as.numeric(head(data,1)) outNumi <- numicano(nVar = 6, dMax = 2, Istep=5000, onestep=1/20, KL=KL, v0=v0) # obtained model #347 plot(outNumi$reconstr[,2], outNumi$reconstr[,3], type='l', col = 'red', xlab = 'x(t)', ylab = 'y(t)', main = 'Phase portrait') # original phase portrait lines(dbl(data_vignetteIII$out4_filtdata[,1]), dbl(data_vignetteIII$out4_filtdata[,2])) legend(4,2,c("original", "model"), col=c('black', 'red'), lty=1, cex=0.5) # original phase portrait plot(outNumi$reconstr[,5], outNumi$reconstr[,6], type='l', col = 'green', xlab = 'u(t)', ylab = 'v(t)', main = 'Phase portrait') # original phase portrait lines(dbl(data_vignetteIII$out4_filtdata[,4]), dbl(data_vignetteIII$out4_filtdata[,5])) legend(-3,1,c("original", "model"), col=c('black', 'green'), lty=1, cex=0.5) ``` The estimated model reads ```{r, eval = FALSE} visuEq(out4$models$model347, 6, 2, approx = 2, substit = c("x", "y", "z", "u", "v", "w")) ``` ```{r, eval = TRUE, echo = FALSE} visuEq(data_vignetteIII$out4_m347, 6, 2, approx = 2, substit = c("x", "y", "z", "u", "v", "w")) ``` The global modelling technique enables to distinguish two independant sets of equations: one for the Rössler-1976 system (equations one to three), the other one for the Sprott-K system (equations four to six). Moreover, the obtained equations are identical in their structure and very close in their parameterization to the original equations from which the observed time series were derived. This latter point is interesting since it shows that -- in the present case at least -- not only the couplings, but also each of the nonlinear terms of the equations can be retrieved. The approach may thus be used as a retro-modelling technique in order to interpret the model terms. This problem will be investigated in more details in vignette `7 Retromodelling`. ## Time series with gaps When measurements are gathered under real environmental conditions, the time series may have gaps, or may be contaminated by temporary perturbations which should be removed. The `GPoM` package can be applied to such types of data sets^[S. Mangiarotti, F. Le Jean, M. Huc, C. Letellier, 2016. Global modelling of aggregated and associated chaotic dynamics. *Chaos, Solitons & Fractals*, **83**, 82-96]. An example of this kind of situation is given in the present section. The Rössler system is used again in this example: ```{r, eval = TRUE, fig.align='center'} # load data data("Ross76") # time vector tin <- Ross76[1:1500,1] # single time series series0 <- series <- Ross76[1:1500,3] # plot plot(tin, series0, type = 'l', col = 'gray', xlab = 'time', ylab = 'Observational data') ``` To illustrate our purpose, some noise is added to the original time series and a weighting function is defined to indicate on what part of the signal the analysis should focus. This function is equal to 1 when the time series is free of noise, and equal to 0 when it is noise-contaminated. ```{r, eval = TRUE, fig.align='center'} # some noise is added series[1:100] <- series[1:100] + 0.1 * runif(1:100, min = -1, max = 1) series[301:320] <- series[301:320] + 0.5 * runif(1:20, min = -1, max = 1) plot(tin, series, ylab='', type = 'l', col = 'black') # # weighting function W <- tin * 0 + 1 W[1:100] <- 0 # the first fourty values will be discarded W[301:320] <- 0 # twenty other values will be discarded too lines(tin, W, type = 'l', col = 'brown') legend(0, -2,c("observed data", "weighting function"), col=c('black', 'brown'), lty=1, cex=0.8) ``` The global modelling is then applied: ```{r, echo = TRUE, eval = FALSE} # Weighted time series GPout1 <- gPoMo(data = series, tin = tin, weight = W, dMax = 2, nS = 3, winL = 9, show = 1, IstepMin = 10, IstepMax = 6000, nPmin = 5, nPmax = 11, method = 'rk4') ``` ```{r, echo = FALSE, eval = TRUE} # Weighted time series GPout1 <- gPoMo(data = series, tin = tin, weight = W, dMax = 2, nS = 3, winL = 9, show = 0, IstepMin = 10, IstepMax = 6000, nPmin = 5, nPmax = 11, method = 'rk4') ``` The following equations are retrieved (the option `weight` indicates when the observations provides in `series` should be considered in the analysis, or not): ```{r, eval = TRUE} visuEq(GPout1$models$model7, 3, 2, approx = 2) ``` This set of equations can then be integrated numerically. Integration will be launched from the first state for which weight is equal to 1. ```{r, eval = TRUE} # first weight which value not equal to zero: i1 = which(GPout1$Wfiltdata== 1)[1] v0 <- GPout1$filtdata[i1,1:3] rcstr <- numicano(nVar=3, dMax=2, Istep=5000, onestep=1/250, KL = GPout1$models$model7, v0=v0, method="rk4") ``` The best obtained model is #7 that effectively reproduces the dynamic of the original system: ```{r, eval = TRUE, fig.align='center', fig.width=4, fig.height=4} plot(rcstr$reconstr[,2], rcstr$reconstr[,3], type='l', lwd = 3, main='phase portrait', xlab='y', ylab = 'dy/dt', col='orange') # original data: lines(GPout1$filtdata[,1], GPout1$filtdata[,2], type='l', main='phase portrait', col='black') # initial condition lines(v0[1], v0[2], type = 'p', col = 'red') legend(-2,1,c("original", "model"), col=c('black', 'orange'), lty=1, cex=0.5) legend(-2.1,-0.7,"initial conditions : ", cex = 0.5, bty="n") ``` Note that model #3 also produces a good approximation of the original system. ```{r, eval = TRUE} visuEq(GPout1$models$model3, 3, 2, approx = 2) ``` This model converges to a periodic attractor but with a slight modification of its parameterization, a chaotic solutions can be obtained. ```{r, eval = TRUE} # first weight which value not equal to zero: i1 = which(GPout1$Wfiltdata== 1)[1] v0 <- GPout1$filtdata[i1,1:3] KL3bis <- KL3 <- GPout1$models$model7 KL3bis[2,3] <- KL3[2,3] * 1.1 rcstr <- numicano(nVar = 3, dMax = 2, Istep = 40000, onestep = 1/250, KL = KL3bis, v0 = v0, method = "rk4") plot(rcstr$reconstr[35000:40000,2], rcstr$reconstr[35000:40000,3], main='phase portrait', xlab='y', ylab = 'dy/dt', type='l', lwd = 3, col='orange') ``` ## Time series in associassion Equivalent records of the same dynamical behavior may have been performed by several sensors, concurently or not and at same or different locations. It may then be useful, and simetimes necessary, to consider these time series simultaneously in the modelling process. The `GPoM` package can also be applied to such type of data sets^[Ibid]. A set of four time series of different time legnth, all derived from the Rössler system, are made available to show how to apply the global modelling technique to such a set of time series. ```{r, eval = TRUE, fig.align='center'} # load data data("severalTS") # plot plot(svrlTS$data1$TS[,1] - svrlTS$data1$TS[1,1], svrlTS$data1$TS[,2], type = 'l', col = 'gray', xlab = 'time', ylab = 'y(t)', main = 'Observational data', xlim = c(0, 20), ylim = c(-6, 2.2)) lines(svrlTS$data2$TS[,1] - svrlTS$data2$TS[1,1], svrlTS$data2$TS[,2], type = 'l', col = 'blue') lines(svrlTS$data3$TS[,1] - svrlTS$data3$TS[1,1], svrlTS$data3$TS[,2], type = 'l', col = 'orange') lines(svrlTS$data4$TS[,1] - svrlTS$data4$TS[1,1], svrlTS$data4$TS[,2], type = 'l', col = 'brown') ``` The function `concat` concatenates the set of separated time series into a single time series, and provides a corresponding weight vector `W` that takes the bundary conditions between successive time series into account, in order to avoid any discontinuous effect. ```{r, eval = TRUE, fig.align='center'} # Concatenate the data set into a single time series winL = 55 concaTS <- concat(svrlTS, winL = winL) ``` A large value is chosen for parameter `winL` just to highlight its effect on the next plots. The gray line enables to clearly distinguish the discontinuities at the connexion of originally separated time series. Data points rejected at the boudary are plotted as red dots and kept ones as green dots. The corresponding weighted vector is plotted as a black line. ```{r, eval = TRUE, fig.align='center'} # Plot the concatenated time series plot(concaTS$sglTS$TS[,1], concaTS$sglTS$TS[,2], main = 'Concatenated time series', xlab = 'Time (concatenated)', ylab = 'y(t)', type = 'l', col = 'gray') lines(concaTS$sglTS$TS[concaTS$sglTS$W == 1,1], concaTS$sglTS$TS[concaTS$sglTS$W == 1,2], type = 'p', col = 'green', cex = 0.5) lines(concaTS$sglTS$TS[concaTS$sglTS$W == 0,1], concaTS$sglTS$TS[concaTS$sglTS$W == 0,2], type = 'p', col = 'red', cex = 0.5) lines(concaTS$sglTS$TS[,1], concaTS$sglTS$W, type = 'l') ``` The output of function `concat` can directly be used for global modelling: ```{r, echo = TRUE, eval = FALSE} GPout2 <- gPoMo(data = concaTS$sglTS$TS[,2], tin = concaTS$sglTS$TS[,1], dMax = 2, nS = 3, winL = winL, weight = concaTS$sglTS$W, show = 1, IstepMin = 10, IstepMax = 12000, nPmin = 6, nPmax = 12, method = 'rk4') ``` ```{r, echo = TRUE, eval = FALSE} GPout2 <- gPoMo(data = concaTS$sglTS$TS[,2], tin = concaTS$sglTS$TS[,1], dMax = 2, nS = 3, winL = winL, weight = concaTS$sglTS$W, show = 0, IstepMin = 10, IstepMax = 12000, nPmin = 6, nPmax = 12, method = 'rk4') ``` The dynamic of the original system is effectively reproduced by sereral models. ## Output visualization and global models validation The outputs of the `gPoMo` function are gathered in one single list. The next vignette `4 VisualizeOutputs` aims to show how the results are organized in this list. Examples of model validation are then presented in the vignette `5 Predictability`.