Data Fusion with GeoFIS

Introduction

This vignette illustrates the design of a Data Fusion system using the GeoFIS R package.

See Data fusion documentation for more details about concepts of data fusion.

The example is an agronomic case study detailed in “A fuzzy logic based soil chemical quality index for cacao” (Mora-Herrera et al. 2020).

library(GeoFIS)
#> Loading required package: sp
#> Loading required package: data.tree
#> Loading required package: FisPro

Loading data

The data fusion process accepts input dataset of class data.frame or Spatial*DataFrame of the sp R package.
Only the numeric attributes of the dataset can be used.

In this example we use the tolima dataset available in the GeoFIS R package (Mora-Herrera et al. 2020):

data(tolima)
fusion <- NewFusion(tolima)

The aggregation system

The aggregation system is built as a tree, based on the data.tree R package.

  • Each input of the system is a leaf of the tree.
  • Each node of the tree is an aggregation operation.
  • The root of the tree is the result of the aggregation system.

Each aggregation node can also be used as an input for another aggregation step, yielding a hierarchical structure.

Build inputs

The goal of this step is to turn raw data of individual information sources into satisfaction degrees.
The NewFusionInput function that builds an input leaf for the aggregation system takes 3 parameters:

  1. the input name
  2. the membership function (Mf) used to turn raw data into satisfaction degrees.
    several types of Mf are proposed in the FisPro R package.
    • NewMfTrapezoidalInf: low values are preferred.
    • NewMfTrapezoidalSup: high values are preferred.
    • NewMfTrapezoidal: around an interval.
    • NewMfTriangular: about a value.
  3. the name of the attribute in the input dataset (the default is the same as the input name).

Build aggregation nodes

The goal of this step is to summarize the inputs satisfaction degrees into a single one through an aggregation operator.

The NewFusionAggreg function that builds the aggregation node takes several parameters:

  1. the node name.
  2. the aggregation operator.
  3. the leafs / nodes used as inputs of the aggregation operator.

In the GeoFIS R package, 4 aggregation operators are implemented:

Numerical operators

  • NewAggregWam build a WAM operator with the specified weights.
  • NewAggregOwa build an OWA operator with the specified weights.

Aggregation using linguistic rules

NewAggregFis build a linguistic rule-based operator based on a Fuzzy Inference System (Fis), a Fis object of FisPro R package.

The Fis can be built with the FisPro software or with the NewFisFusion function.

The NewFisFusion function helps to generate a Fis and take 5 parameters:

  1. the Fis name.
  2. the Fis inputs names.
  3. the granularity (number of Mf) for each Fis input, in range [2, 5].
    • with granularity = 2, the Mf labels are: “low”, “high”
    • with granularity = 3, the Mf labels are: “low”, “average”, “high”
    • with granularity = 4, the Mf labels are: “very_low”, “low”, “high”, “very_high”
    • with granularity = 5, the Mf labels are: “very_low”, “low”, “average”, “high”, “very_high”
  4. the Fis output name.
  5. the conclusions of the rulebase.
    the rule based is generated in the lexicographic order of inputs Mfs.
    • if the conclusions are a numeric vector, a FisOutCrisp output will be added to the Fis.
    • if the conclusions are a character vector, a FisOutFuzzy output will be added to the Fis.

Aggregation using a function

NewAggregFunction build an operator based on a function.
The function must accepts a numeric vector as parameter, e.g. mean function or user-defined function.

The Application example

In this application example (Mora-Herrera et al. 2020), we build a hierarchical system with 8 inputs, 2 rule-based aggregation nodes and 1 WAM node.

The inputs:

potassium <- NewFusionInput("K", NewMfTrapezoidalSup(0.2, 0.6))
phosphorus <- NewFusionInput("P", NewMfTrapezoidalSup(5, 15))
balance_gap <- NewFusionInput("Bal_Gap", NewMfTrapezoidalInf(0, 0.5), "BalanceGap")
n_gap <- NewFusionInput("N_Gap", NewMfTriangular(0.5, 1, 1.5), "Ngap_N_OpN")
base_sat <- NewFusionInput("Base_Sat", NewMfTrapezoidalSup(0.4, 0.6), "Base_S")
org_matter <- NewFusionInput("OM", NewMfTrapezoidalSup(3, 5))
ph <- NewFusionInput("pH", NewMfTrapezoidal(5, 5.5, 6.5, 7.5))
cadmium <- NewFusionInput("Cd", NewMfTrapezoidalInf(0, 0.43), "Cadmium")

The Macronutrients rulebase:

macronutrients_fis <- NewFisFusion(
  "MacN", # Fis name
  c("Bal_Gap", "K", "P", "N_Gap", "Base_Sat"), # Fis inputs names
  c(2, 2, 2, 2, 2), # Fis inputs granularities
  "MacN", # Fis output name
  c(
    0, 0.1, 0.15, 0.2, 0.25, 0.35, 0.4, 0.45,
    0.3, 0.4, 0.45, 0.5, 0.55, 0.65, 0.7, 0.75,
    0.4, 0.3, 0.4, 0.45, 0.5, 0.55, 0.65, 0.7,
    0.55, 0.6, 0.7, 0.75, 0.8, 0.85, 0.9, 1
  ) # Fis conclusions
)

Print the Macronutrients rulebase:

print(macronutrients_fis)
#> name = "MacN"
#> nb inputs = 5
#> nb outputs = 1
#> nb rules = 32
#> conjunction = "min"
#> 
#> [input 1]
#> name = "Bal_Gap"
#> range = [0, 1]
#> mf1 = mf_trapezoidal_inf("low", 0, 1)
#> mf2 = mf_trapezoidal_sup("high", 0, 1)
#> 
#> [input 2]
#> name = "K"
#> range = [0, 1]
#> mf1 = mf_trapezoidal_inf("low", 0, 1)
#> mf2 = mf_trapezoidal_sup("high", 0, 1)
#> 
#> [input 3]
#> name = "P"
#> range = [0, 1]
#> mf1 = mf_trapezoidal_inf("low", 0, 1)
#> mf2 = mf_trapezoidal_sup("high", 0, 1)
#> 
#> [input 4]
#> name = "N_Gap"
#> range = [0, 1]
#> mf1 = mf_trapezoidal_inf("low", 0, 1)
#> mf2 = mf_trapezoidal_sup("high", 0, 1)
#> 
#> [input 5]
#> name = "Base_Sat"
#> range = [0, 1]
#> mf1 = mf_trapezoidal_inf("low", 0, 1)
#> mf2 = mf_trapezoidal_sup("high", 0, 1)
#> 
#> [output 1]
#> name = "MacN"
#> nature = "crisp"
#> range = [0, 1]
#> defuzzification = sugeno
#> disjunction = max
#> 
#> [Rules]
#> 1, 1, 1, 1, 1, 0
#> 1, 1, 1, 1, 2, 0.1
#> 1, 1, 1, 2, 1, 0.15
#> 1, 1, 1, 2, 2, 0.2
#> 1, 1, 2, 1, 1, 0.25
#> 1, 1, 2, 1, 2, 0.35
#> 1, 1, 2, 2, 1, 0.4
#> 1, 1, 2, 2, 2, 0.45
#> 1, 2, 1, 1, 1, 0.3
#> 1, 2, 1, 1, 2, 0.4
#> 1, 2, 1, 2, 1, 0.45
#> 1, 2, 1, 2, 2, 0.5
#> 1, 2, 2, 1, 1, 0.55
#> 1, 2, 2, 1, 2, 0.65
#> 1, 2, 2, 2, 1, 0.7
#> 1, 2, 2, 2, 2, 0.75
#> 2, 1, 1, 1, 1, 0.4
#> 2, 1, 1, 1, 2, 0.3
#> 2, 1, 1, 2, 1, 0.4
#> 2, 1, 1, 2, 2, 0.45
#> 2, 1, 2, 1, 1, 0.5
#> 2, 1, 2, 1, 2, 0.55
#> 2, 1, 2, 2, 1, 0.65
#> 2, 1, 2, 2, 2, 0.7
#> 2, 2, 1, 1, 1, 0.55
#> 2, 2, 1, 1, 2, 0.6
#> 2, 2, 1, 2, 1, 0.7
#> 2, 2, 1, 2, 2, 0.75
#> 2, 2, 2, 1, 1, 0.8
#> 2, 2, 2, 1, 2, 0.85
#> 2, 2, 2, 2, 1, 0.9
#> 2, 2, 2, 2, 2, 1

The Macronutrients aggregation node:

macronutrients_aggreg <- NewFusionAggreg(
  "MacN",
  NewAggregFis(macronutrients_fis),
  balance_gap, potassium, phosphorus, n_gap, base_sat
)

The Soil Nutritional Balance rulebase:

nutri_balance_fis <- NewFisFusion(
  "Nutri_Bal", # Fis name
  c("pH", "OM", "MacN"), # Fis inputs names
  c(2, 2, 2), # Fis inputs granularities
  "Nutri_Bal", # Fis output name
  c(0, 0.2, 0.3, 0.5, 0.4, 0.6, 0.7, 1.0) # Fis conclusions
)

Print the Macronutrients rulebase:

print(nutri_balance_fis)
#> name = "Nutri_Bal"
#> nb inputs = 3
#> nb outputs = 1
#> nb rules = 8
#> conjunction = "min"
#> 
#> [input 1]
#> name = "pH"
#> range = [0, 1]
#> mf1 = mf_trapezoidal_inf("low", 0, 1)
#> mf2 = mf_trapezoidal_sup("high", 0, 1)
#> 
#> [input 2]
#> name = "OM"
#> range = [0, 1]
#> mf1 = mf_trapezoidal_inf("low", 0, 1)
#> mf2 = mf_trapezoidal_sup("high", 0, 1)
#> 
#> [input 3]
#> name = "MacN"
#> range = [0, 1]
#> mf1 = mf_trapezoidal_inf("low", 0, 1)
#> mf2 = mf_trapezoidal_sup("high", 0, 1)
#> 
#> [output 1]
#> name = "Nutri_Bal"
#> nature = "crisp"
#> range = [0, 1]
#> defuzzification = sugeno
#> disjunction = max
#> 
#> [Rules]
#> 1, 1, 1, 0
#> 1, 1, 2, 0.2
#> 1, 2, 1, 0.3
#> 1, 2, 2, 0.5
#> 2, 1, 1, 0.4
#> 2, 1, 2, 0.6
#> 2, 2, 1, 0.7
#> 2, 2, 2, 1

The Soil Nutritional aggregation node:

nutri_balance_aggreg <- NewFusionAggreg(
  "Nutri_Bal",
  NewAggregFis(nutri_balance_fis),
  ph, org_matter, macronutrients_aggreg
)

The Chemical aggregation node:

chemical_aggreg <- NewFusionAggreg(
  "Chemical",
  NewAggregWam(weights = c(0.3, 0.7)),
  cadmium, nutri_balance_aggreg
)

Print the Chemical aggregation tree with aggregation operators and input leafs:

print(chemical_aggreg, "aggreg", "mf")
#>               levelName              aggreg                                   mf
#> 1  Chemical                   wam(0.3, 0.7)                                     
#> 2   ¦--Cd                                        mf_trapezoidal_inf("", 0, 0.43)
#> 3   °--Nutri_Bal        fis("Nutri_Bal", 1)                                     
#> 4       ¦--pH                               mf_trapezoidal("", 5, 5.5, 6.5, 7.5)
#> 5       ¦--OM                                       mf_trapezoidal_sup("", 3, 5)
#> 6       °--MacN              fis("MacN", 1)                                     
#> 7           ¦--Bal_Gap                            mf_trapezoidal_inf("", 0, 0.5)
#> 8           ¦--K                                mf_trapezoidal_sup("", 0.2, 0.6)
#> 9           ¦--P                                   mf_trapezoidal_sup("", 5, 15)
#> 10          ¦--N_Gap                              mf_triangular("", 0.5, 1, 1.5)
#> 11          °--Base_Sat                         mf_trapezoidal_sup("", 0.4, 0.6)

Use the chemical_aggreg as the root node of the Data Fusion system:

fusion$aggregate <- chemical_aggreg

Perform the Data Fusion process and read output:

fusion$perform()
output <- fusion$output()

The output data frame contains all satisfaction degrees of inputs leafs, the aggregation value of all nodes, defined into the aggregation system.

print(output, digits = 2)
#>    Chemical   Cd Nutri_Bal   pH   OM MacN Bal_Gap     K     P N_Gap Base_Sat
#> 1      0.46 0.00     0.655 0.50 1.00 0.70    0.44 0.728 1.000  0.68      1.0
#> 2      0.41 0.00     0.586 0.76 0.87 0.41    0.63 0.382 0.000  0.56      1.0
#> 3      0.47 0.47     0.467 1.00 0.00 0.34    0.62 0.127 0.000  0.43      1.0
#> 4      0.37 0.00     0.534 1.00 0.00 0.67    0.61 1.000 0.258  0.68      1.0
#> 5      0.77 1.00     0.675 1.00 0.52 0.47    0.75 0.728 0.000  0.62      1.0
#> 6      0.70 0.54     0.770 1.00 0.86 0.48    0.77 0.728 0.000  0.75      1.0
#> 7      0.65 0.65     0.652 1.00 0.35 0.56    0.70 1.000 0.000  0.15      1.0
#> 8      0.55 0.53     0.555 1.00 0.00 0.77    0.61 1.000 1.000  0.00      1.0
#> 9      0.42 0.68     0.309 0.60 0.00 0.38    0.77 0.190 0.000  0.33      1.0
#> 10     0.67 1.00     0.523 1.00 0.00 0.61    0.90 1.000 0.000  0.31      1.0
#> 11     0.44 0.89     0.239 0.38 0.00 0.27    0.74 0.012 0.000  0.10      1.0
#> 12     0.33 0.82     0.120 0.00 0.00 0.60    0.82 0.280 0.321  1.00      1.0
#> 13     0.29 0.58     0.164 0.10 0.00 0.48    0.71 0.448 0.111  0.64      1.0
#> 14     0.47 0.41     0.491 1.00 0.00 0.46    0.43 0.215 0.760  0.00      1.0
#> 15     0.60 0.88     0.487 1.00 0.00 0.43    0.67 0.407 0.075  0.12      1.0
#> 16     0.82 0.71     0.864 1.00 1.00 0.55    0.78 0.100 1.000  0.00      1.0
#> 17     0.53 0.90     0.371 0.30 0.16 0.68    0.68 0.292 1.000  0.81      1.0
#> 18     0.28 0.79     0.061 0.00 0.00 0.31    0.85 0.000 0.125  0.00      1.0
#> 19     0.30 0.81     0.084 0.00 0.00 0.42    0.90 0.140 0.000  0.86      0.4
#> 20     0.05 0.00     0.072 0.00 0.00 0.36    0.66 0.000 0.293  0.32      1.0
#> 21     0.48 0.74     0.368 0.00 1.00 0.34    0.94 0.165 0.000  0.00      1.0
#> 22     0.60 0.60     0.598 0.34 1.00 0.64    0.52 1.000 0.560  0.00      1.0
#> 23     0.43 0.41     0.442 0.00 1.00 0.71    0.97 1.000 0.485  0.00      1.0
#> 24     0.48 0.78     0.356 0.00 1.00 0.28    0.81 0.040 0.000  0.00      1.0
#> 25     0.49 0.65     0.423 0.00 1.00 0.62    0.78 1.000 0.223  0.00      1.0
#> 26     0.68 0.84     0.613 0.52 1.00 0.38    0.94 0.330 0.000  0.00      1.0
#> 27     0.30 0.72     0.114 0.08 0.00 0.26    0.36 0.000 0.000  0.80      1.0
#> 28     0.62 0.71     0.583 0.44 1.00 0.32    0.37 0.407 0.000  0.00      1.0
#> 29     0.52 0.40     0.575 0.92 0.24 0.75    0.89 1.000 0.321  0.82      1.0
#> 30     0.60 0.88     0.480 0.46 0.83 0.22    0.59 0.000 0.000  0.03      1.0

Learning illustration

To model the decision maker preferences, the parameters of the aggregation operator can be learned from data. The user must in this case provide an additional target, for each sample.

The rule conclusions of a FIS operator can be optimized using the FisPro software, the reader may refer to the specific documentation Learning with FisPro.

The weights of the WAM and the OWA can be learned using a least square minimization procedure under two constraints for the weights: they must be positive and their sum should be 1. The process is illustrated using a toy example, the fusion_cars dataset, but it was also applied to the previously mentioned tolima dataset (Mora-Herrera et al. 2020).

The fusion_cars data include four cars (from 1 to 4) described by four attributes:

  • A: the acceleration time (s) from 0 to 100 km/h.
  • V: the volume of the trunk (dm3).
  • S: the maximum speed (km/h).
  • C: the gas consumption (l per 100 km).

The ideal vehicle should minimize A and C while maximizing V and S. The dataset is as follows:

data(fusion_cars)
print(fusion_cars)
#>      A   V   S  C
#> 1  8.8 380 190  9
#> 2 12.0 460 190  9
#> 3  8.8 380 170  7
#> 4  7.2 340 200 10

To turn the raw data into preference degrees, the following transformation data are used:

a <- NewFusionInput("µA", NewMfTrapezoidalInf(4, 20), "A")
v <- NewFusionInput("µV", NewMfTrapezoidalSup(100, 500), "V")
s <- NewFusionInput("µS", NewMfTrapezoidalSup(120, 220), "S")
c <- NewFusionInput("µC", NewMfTrapezoidalInf(6, 16), "C")

This yields the following degrees:

fusion <- NewFusion(fusion_cars)
fusion$aggregate <- list(a, v, s, c)
fusion$perform()
degrees <- fusion$output()
print(degrees)
#>    µA  µV  µS  µC
#> 1 0.7 0.7 0.7 0.7
#> 2 0.5 0.9 0.7 0.7
#> 3 0.7 0.7 0.5 0.9
#> 4 0.8 0.6 0.8 0.6

The first vehicle, which represents a good trade-off, is preferred to the others. This preference can modeled by any target with the highest value in the first location, for example:

target <- c(0.8, 0.6, 0.6, 0.6)

or by the binary following one, that is used in this example:

target <- c(1, 0, 0, 0)

The WAM learning gives the following results:

wam_weights <- LearnWamWeights(degrees, target)
print(wam_weights)
#> [1] 0.5 0.0 0.5 0.0

with inferred values:

wam_aggreg <- NewFusionAggreg("wam", NewAggregWam(wam_weights), a, v, s, c)
fusion$aggregate <- wam_aggreg
fusion$perform()
wam_inferred <- fusion$output()["wam"]
print(wam_inferred)
#>   wam
#> 1 0.7
#> 2 0.6
#> 3 0.6
#> 4 0.8

The result is not the expected one, the car ‘4’ has the highest score, the preferred car ‘1’ is ranked second: the WAM operator is not able to model the compromise

The OWA learning gives the following results:

owa_weights <- LearnOwaWeights(degrees, target)
print(owa_weights)
#> [1] 1 0 0 0

In this toy example, the OWA aggregator returns the minimum: the whole weight is put on the smallest degree. This ensures that the decision maker preference is accurately modeled:

owa_aggreg <- NewFusionAggreg("owa", NewAggregOwa(owa_weights), a, v, s, c)
fusion$aggregate <- owa_aggreg
fusion$perform()
owa_inferred <- fusion$output()["owa"]
print(owa_inferred)
#>   owa
#> 1 0.7
#> 2 0.5
#> 3 0.5
#> 4 0.6

Publications

Guillaume, Serge, Terry Bates, Jean-Luc Lablee, Thom Betts, and James Taylor. 2020. “Combining Spatial Data Layers Using Fuzzy Inference Systems: Application to an Agronomic Case Study.” In Proceedings of the 6th International Conference on Geographical Information Systems Theory, Applications and Management (GISTAM 2020), 62–71. Prague, Czech Republic: SCITEPRESS.
Leroux, Corentin, Hazaël Jones, Léo Pichon, Serge Guillaume, Julien Lamour, James Taylor, Olivier Naud, Thomas Crestey, Jean-Luc Lablee, and Bruno Tisseyre. 2018. “GeoFIS: An Open Source, Decision-Support Tool for Precision Agriculture Data.” Agriculture 8 (6). https://doi.org/10.3390/agriculture8060073.
Mora-Herrera, Denys Yohana, Serge Guillaume, Didier Snoeck, and Orlando Zúñiga Escobar. 2020. “A Fuzzy Logic Based Soil Chemical Quality Index for Cacao.” Computers and Electronics in Agriculture 177: 105624. https://doi.org/10.1016/j.compag.2020.105624.