% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/FuGePSD.R
\name{FUGEPSD}
\alias{FUGEPSD}
\title{Fuzzy Genetic Programming-based learning for Subgroup Discovery (FuGePSD) Algorithm.}
\usage{
FUGEPSD(paramFile = NULL, training = NULL, test = NULL,
  output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"), seed = 0,
  nLabels = 3, t_norm = "product_t-norm", ruleWeight = "Certainty_Factor",
  frm = "Normalized_Sum", numGenerations = 300,
  numberOfInitialRules = 100, crossProb = 0.5, mutProb = 0.2,
  insProb = 0.15, dropProb = 0.15, tournamentSize = 2,
  globalFitnessWeights = c(0.7, 0.1, 0.05, 0.2), minCnf = 0.6,
  ALL_CLASS = TRUE, targetVariable = NA)
}
\arguments{
\item{paramFile}{The path of the parameters file. Use \code{NULL} if you want to supply the training and test data as \code{SDEFSR_Dataset} objects instead.}

\item{training}{A \code{SDEFSR_Dataset} class variable with training data.}

\item{test}{A \code{SDEFSR_Dataset} class variable with test data.}

\item{output}{Character vector with the paths where the information file, the rules file and the test quality measures file are stored, respectively. For the rules and quality measures files, the algorithm generates 4 files, one for each fuzzy confidence filter.}

\item{seed}{An integer used as the seed of the random number generator.}

\item{nLabels}{Number of linguistic labels for numerical variables. By default 3. We recommend an odd number between 3 and 9.}

\item{t_norm}{A string with the t-norm to use when computing the compatibility degree of the rules. Use \code{'Minimum/Maximum'} to select the minimum t-norm; any other value selects the product t-norm, which is the default.}

\item{ruleWeight}{String with the method to calculate the rule weight. Possible values are: 
\itemize{
 \item \code{Certainty_Factor}: It uses the Classic Certainty Factor Weight method.
 \item \code{Average_Penalized_Certainty_Factor}: It uses Penalized Certainty Factor weight II by Ishibuchi.
 \item \code{No_Weights}: No rule weight is calculated.
 \item Default: If none of these is specified, the default method is Penalized Certainty Factor Weight IV by Ishibuchi.
     }}

\item{frm}{A string specifying the Fuzzy Reasoning Method to use. Possible Values are:
\itemize{
 \item \code{Normalized_Sum}: It uses the Normalized Sum or Additive Combination Fuzzy Reasoning Method.
 \item \code{Arithmetic_Mean}: It uses the Arithmetic Mean Fuzzy Reasoning Method.
 \item Default: If none of these is specified, the Winning Rule Fuzzy Reasoning Method is used.
}}

\item{numGenerations}{An integer with the number of generations to perform before stopping the evolutionary process.}

\item{numberOfInitialRules}{An integer with the number of individuals (rules) in the initial population.}

\item{crossProb}{Sets the crossover probability. We recommend a number in [0,1].}

\item{mutProb}{Sets the mutation probability. We recommend a number in [0,1].}

\item{insProb}{Sets the insertion probability. We recommend a number in [0,1].}

\item{dropProb}{Sets the dropping probability. We recommend a number in [0,1].}

\item{tournamentSize}{Sets the number of individuals that will be chosen in the tournament selection procedure. This number must be greater than or equal to 2.}

\item{globalFitnessWeights}{A numeric vector of length 4 specifying the weights used in the computation of the Global Fitness Parameter.}

\item{minCnf}{A value in [0,1] with the minimum confidence a rule must reach to pass the filter.}

\item{ALL_CLASS}{If \code{TRUE}, the algorithm returns at least the best rule for each target class, even if it does not pass the filters. If \code{FALSE}, it returns the best rule only when no rule passes the filters.}

\item{targetVariable}{The name or index position of the target variable (or class). It must be a categorical one.

 @details This function uses the last variable of the \code{SDEFSR_Dataset} object as the target variable. To
    use a different one, set the \code{targetVariable} argument.
    The target variable MUST be categorical; otherwise, an error is thrown. By default, the algorithm finds
    rules for all possible values of the target variable. \code{targetClass} restricts the search to rules for a
    single value of the target variable.
    
    If \code{paramFile} is not \code{NULL}, the rest of the parameters are
    ignored and the algorithm tries to read the specified file. See "Parameters file structure" below 
    if you want to use a parameters file.

    
 @return The algorithm prints the following results to the console:
 \enumerate{
   \item Information about the parameters used in the algorithm.
   \item Results for each filter:
     \enumerate{
       \item The generated rules that pass the filter.
       \item The test quality measures for each rule in that filter.
     }
 }
 These results are also saved to files: one with the rules and another with the quality measures, for each filter.
 
 @section How does this algorithm work?:
 This algorithm is an evolutionary fuzzy system (EFS) based on a genetic programming algorithm. It starts with an initial 
 population generated at random, where individuals are represented through the "chromosome = individual"
 approach, including both the antecedent and the consequent of the rule. Representing the consequent has the advantage
 of obtaining rules for all target classes with only one execution of the algorithm.  
 
 The algorithm employs a cooperative-competitive approach in which the rules of the population cooperate and compete with each other in order to 
 obtain the optimal solution. Therefore, the algorithm performs two evaluations: one of the individual rules, for competition, and another of the whole population, 
 for cooperation.  
 
 The algorithm evolves by generating an offspring population, of the same size as the initial one, through the application of the
 genetic operators to the main population. Once they are applied, both populations are joined and a token competition is performed in order to 
 maintain the diversity of the generated rules. This token competition also reduces the population size by deleting the rules that are not competitive.  
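 
 The token competition step can be illustrated with a minimal R sketch. It is not the package's implementation: \code{token_competition},
 \code{covers} (a logical rules-by-examples coverage matrix) and \code{fitness} are hypothetical names, and the real algorithm may also adjust
 the fitness of the surviving rules.
 \preformatted{
 # Hypothetical inputs: covers[r, e] is TRUE when rule r covers example e,
 # and fitness holds one value per rule.
 token_competition <- function(covers, fitness) {
   ord  <- order(fitness, decreasing = TRUE)  # fitter rules choose first
   free <- rep(TRUE, ncol(covers))            # tokens (examples) not yet seized
   keep <- logical(nrow(covers))
   for (r in ord) {
     seized <- covers[r, ] & free
     keep[r] <- any(seized)                   # a rule with no token is deleted
     free[seized] <- FALSE
   }
   which(keep)                                # indices of the surviving rules
 }

 covers <- rbind(c(TRUE,  TRUE,  FALSE, FALSE),
                 c(TRUE,  FALSE, FALSE, FALSE),
                 c(FALSE, FALSE, TRUE,  FALSE))
 token_competition(covers, fitness = c(0.9, 0.5, 0.4))
 }
 In this toy input the second rule is removed, because the only example it covers has already been seized by the first, fitter rule.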
 
 After the evolutionary process, a screening function is applied to the best population. This screening function keeps the rules that reach a minimum
 level of confidence and sensitivity: the minimum sensitivity is 0.6, and four filters of 0.6, 0.7, 0.8 and 0.9 are applied for fuzzy confidence.  
 
 Also, the user can force the algorithm to return at least one rule for every target class value, even if it does not pass the screening function. This 
 behaviour is controlled by the \code{ALL_CLASS} parameter.
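 
 The screening step and the effect of \code{ALL_CLASS} can be sketched as follows. This is only an illustration under stated assumptions:
 the \code{rules} data frame, the \code{screen} function and its \code{minSens} argument are hypothetical, and "best rule" is taken here
 to mean the rule with the highest confidence, which may differ from the criterion used by the package.
 \preformatted{
 # Hypothetical rules with their quality measures.
 rules <- data.frame(
   class       = c("a", "a", "b"),
   confidence  = c(0.95, 0.72, 0.55),
   sensitivity = c(0.80, 0.65, 0.70),
   stringsAsFactors = FALSE
 )

 screen <- function(rules, minCnf, minSens = 0.6, ALL_CLASS = TRUE) {
   pass <- rules$confidence >= minCnf & rules$sensitivity >= minSens
   if (ALL_CLASS) {
     # Keep the best rule of every class that has no rule passing the filter.
     for (cl in unique(rules$class)) {
       idx <- which(rules$class == cl)
       if (!any(pass[idx])) pass[idx[which.max(rules$confidence[idx])]] <- TRUE
     }
   } else if (!any(pass)) {
     # No rule passed at all: return only the single best rule.
     pass[which.max(rules$confidence)] <- TRUE
   }
   rules[pass, ]
 }

 # One result per fuzzy confidence filter.
 for (cnf in c(0.6, 0.7, 0.8, 0.9)) print(screen(rules, cnf))
 }
 With \code{ALL_CLASS = TRUE} the rule of class "b" is always returned, although its confidence is below every filter.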
 
 
 @section Parameters file structure:
  The \code{paramFile} argument points to a file that contains the necessary parameters to execute FuGePSD.
  This file \strong{must} contain, at least, these parameters (one per line):
  \itemize{
    \item \code{algorithm}  Specify the algorithm to execute. In this case, "FUGEPSD".
    \item \code{inputData}  Specify two paths of KEEL files for training and test. If only the file name is given, the file is looked up in the working directory.
    \item \code{seed}  Sets the seed for the random number generator
    \item \code{nLabels}  Sets the number of fuzzy labels to create when reading the files
    \item \code{nEval}  Sets the maximum number of \strong{evaluations of rules} before stopping the genetic process
    \item \code{popLength}  Sets number of individuals of the main population
    \item \code{eliteLength}  Sets number of individuals of the elite population. Must be less than \code{popLength}  
    \item \code{crossProb}  Crossover probability of the genetic algorithm. Value in [0,1]
    \item \code{mutProb}  Mutation probability of the genetic algorithm. Value in [0,1]
    \item \code{Obj1} Sets the objective number 1. 
    \item \code{Obj2} Sets the objective number 2. 
    \item \code{Obj3} Sets the objective number 3. 
    \item \code{Obj4} Sets the objective number 4.
    \item \code{RulesRep}  Representation of each chromosome of the population. "can" for canonical representation. "dnf" for DNF representation.
    \item \code{targetClass}  Value of the target variable to search for subgroups. The target variable \strong{is always the last variable.} Use \code{null} to search for every value of the target variable
  }
  
  An example of a parameters file could be:
 \preformatted{
 algorithm = FUGEPSD
 inputData = "banana-5-1tra.dat" "banana-5-1tst.dat"
 outputData = "Parameters_INFO.txt" "Rules.txt" "TestMeasures.txt"
 seed = 23783
 Number of Labels = 3
 T-norm/T-conorm for the Computation of the Compatibility Degree = product_t-norm
 Rule Weight = Certainty_Factor
 Fuzzy Reasoning Method = Normalized_Sum
 Number of Generations = 300
 Initial Number of Fuzzy Rules = 100
 Crossover probability = 0.5
 Mutation probability = 0.2
 Insertion probability = 0.15
 Dropping Condition probability = 0.15
 Tournament Selection Size = 2 
 Global Fitness Weight 1 = 0.7
 Global Fitness Weight 2 = 0.1 
 Global Fitness Weight 3 = 0.05
 Global Fitness Weight 4 = 0.2
 All Class = true}}
}
\description{
Performs a subgroup discovery task using the FuGePSD algorithm.
}
\examples{
FUGEPSD(training = habermanTra,
         test = habermanTst,
         output = c("parametersFile.txt", "rulesFile.txt", "testQM.txt"),
         seed = 23783,
         nLabels = 3,
         t_norm = "Minimum/Maximum",
         ruleWeight = "Certainty_Factor",
         frm = "Normalized_Sum",
         numGenerations = 20,
         numberOfInitialRules = 15,
         crossProb = 0.5,
         mutProb = 0.2,
         insProb = 0.15,
         dropProb = 0.15,
         tournamentSize = 2,
         globalFitnessWeights = c(0.7, 0.1, 0.3, 0.2),
         ALL_CLASS = TRUE)
\dontrun{
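# Sketch: create the parameters file used below from R. The keys follow the
# example in the Parameters file structure section. Assumptions: the two KEEL
# files named in inputData exist in the working directory, and the t-norm
# entry accepts the same value as the t_norm argument.
writeLines(c(
  "algorithm = FUGEPSD",
  'inputData = "banana-5-1tra.dat" "banana-5-1tst.dat"',
  'outputData = "Parameters_INFO.txt" "Rules.txt" "TestMeasures.txt"',
  "seed = 23783",
  "Number of Labels = 3",
  "T-norm/T-conorm for the Computation of the Compatibility Degree = product_t-norm",
  "Rule Weight = Certainty_Factor",
  "Fuzzy Reasoning Method = Normalized_Sum",
  "Number of Generations = 300",
  "Initial Number of Fuzzy Rules = 100",
  "Crossover probability = 0.5",
  "Mutation probability = 0.2",
  "Insertion probability = 0.15",
  "Dropping Condition probability = 0.15",
  "Tournament Selection Size = 2",
  "Global Fitness Weight 1 = 0.7",
  "Global Fitness Weight 2 = 0.1",
  "Global Fitness Weight 3 = 0.05",
  "Global Fitness Weight 4 = 0.2",
  "All Class = true"
), "ParamFile.txt")
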
# Execution with a parameters file called ParamFile.txt in the working directory:

FUGEPSD("ParamFile.txt")

}

}
\author{
Written in R by Angel M. Garcia <amgv0009@red.ujaen.es>
}
\references{
Carmona, C.J., Ruiz-Rodado, V., del Jesus, M.J., Weber, A., Grootveld, M., Gonzalez, P., and Elizondo, D. (2015). A fuzzy genetic programming-based algorithm for subgroup discovery and the application to one problem of pathogenesis of acute sore throat conditions in humans. Information Sciences, 298, 180-197.
}

