% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ParticipantGroup.R
\docType{class}
\name{ParticipantGroup-class}
\alias{ParticipantGroup-class}
\alias{ParticipantGroup}
\title{A Reference Class for representing a group of consistency test participants}
\description{
A Reference Class for representing a group of consistency test participants
}
\section{Fields}{

\describe{
\item{\code{participants}}{A list of \code{\link{Participant}}
class instances.}
}}

\section{Methods}{

\describe{
\item{\code{add_participant(participant)}}{Add a passed participant to the participantgroup's list
of participants. The participant's entry in
the list is named based on the participant's
id. Note that if you try to add
a participant with an id that's identical
to one of the participants already in the
participantgroup's list of participants, the
already existing same-id participant
is overwritten.}

\item{\code{add_participants(participant_list)}}{Go through a passed list of Participant instances
and add each one using the add_participant() method.}

\item{\code{check_valid_get_twcv_scores(
  min_complete_graphemes = 5,
  dbscan_eps = 20,
  dbscan_min_pts = 4,
  max_var_tight_cluster = 150,
  max_prop_single_tight_cluster = 0.6,
  safe_num_clusters = 3,
  safe_twcv = 250,
  complete_graphemes_only = TRUE,
  symbol_filter = NULL
)}}{    Checks if participants' data are valid based on passed arguments.
    This method aims to identify participants who had too few responses or
    varied their response colors too little, by marking them as invalid.
    Note that there are no absolutely correct values, as what is 'too little
    variation' is highly subjective. You might need to tweak parameters to be
    in line with your project's criteria, especially if you use a color
    space other than CIELUV, since the default values are based on what seems
    to make sense in a CIELUV context. If you use the results in a
    research article, make sure to reference synr and specify what parameter
    values you passed to the function.

    This method relies heavily on the DBSCAN algorithm and the package
    'dbscan', and involves calculating a synr-specific 'Total Within-Cluster
    Variance' (TWCV) score. You can find more information, including
    what the parameters here mean, in
    the documentation for the function \code{validate_get_twcv}. Note
    that DBSCAN clustering and related calculations are performed on
    a per-participant basis, before they are summarized in the data frame
    returned by this method.
    \subsection{Parameters}{
      \itemize{
        \item{\code{min_complete_graphemes} The minimum number of graphemes
          with complete (all non-NA color) responses that a participant's data
          must have for them to not be categorized as invalid based on
          this criterion. Defaults to 5.
        }
        \item{\code{dbscan_eps} Radius of 'epsilon neighborhood' when applying
          (on a per-participant basis) DBSCAN clustering. Defaults to 20.
        }
        \item{\code{dbscan_min_pts} Minimum number of points required in the
          epsilon neighborhood for core points (including the core point
          itself). Defaults to 4.
        }
        \item{\code{max_var_tight_cluster} Maximum variance for an identified
          DBSCAN cluster to be considered 'tight-knit'. Defaults to 150.
        }
        \item{\code{max_prop_single_tight_cluster} Maximum proportion of
          points allowed to be within a single 'tight-knit' cluster (if a
          participant's data exceed this limit, they are classified as
          invalid). Defaults to 0.6.
        }
        \item{\code{safe_num_clusters} Minimum number of identified DBSCAN
          clusters (including 'noise' cluster only if it consists of at least
          'dbscan_min_pts' points) that guarantees validity of
          a participant's data if points are 'non-tight-knit'. Defaults to 3.
        }
        \item{\code{safe_twcv} Minimum total within-cluster variance (TWCV)
          score that guarantees a participant's data's validity if points are
          'non-tight-knit'. Defaults to 250.
        }
        \item{\code{complete_graphemes_only} A logical vector. If TRUE, 
          only data from graphemes that have all non-NA color responses
          are used; if FALSE, even data from graphemes with some NA color
          responses are used. Defaults to TRUE.
        }
        \item{\code{symbol_filter} A character vector (or NULL) that specifies
          which graphemes' data to use. Defaults to NULL, meaning data from
          all of the participants' graphemes will be used.
        }
      }
    }

    \subsection{Returns}{
      A data frame with columns
      \itemize{
        \item{\code{valid} Holds TRUE for participants whose data were
        classified as valid, FALSE for participants whose data were
        classified as invalid.}
        \item{\code{reason_invalid} Strings which describe for each
          participant why their data were deemed invalid. Participants
          whose data were classified as valid have empty strings here.
        }
        \item{\code{twcv} Numeric column which holds participants'
          calculated TWCV scores (NA for participants who had no/too
          few graphemes with complete responses).
        }
        \item{\code{num_clusters} Numeric column which holds, for each
          participant, the number of identified clusters counting toward
          the tally compared with 'safe_num_clusters' (NA for participants
          who had no/too few graphemes with complete responses).
        }
      }
    }
    }

\item{\code{get_ids()}}{Returns a character vector with all ids for
participants associated with the participantgroup.}

\item{\code{get_mean_consistency_scores(
  method = "euclidean",
  symbol_filter = NULL,
  na.rm = FALSE
)}}{Returns a vector of mean consistency scores for
      participants in the group. If na.rm=FALSE, calculates each
      participant's mean consistency score only if all of that
      participant's graphemes have exclusively non-NA response
      colors; otherwise, puts an NA value for that participant
      in the returned vector. If na.rm=TRUE,
      for each participant calculates the mean consistency score for
      all of the participant's graphemes that only have
      non-NA response colors, while ignoring graphemes
      that have at least one NA response color value. Note that
      for participants whose graphemes ALL have at least one NA
      response color value, an NA is put in the returned vector for
      that participant, regardless of what na.rm is set to.

      If a character vector is passed to symbol_filter, only
      data from graphemes with symbols in the passed vector
      are used when calculating each participant's mean score.

      Use the method argument to specify what kind of color space
      distances should be used when calculating consistency scores
      (usually 'manhattan' or 'euclidean'; see the documentation for
      the base R dist function for all options).}

\item{\code{get_mean_response_times(symbol_filter = NULL, na.rm = FALSE)}}{Returns the mean response times, with respect to
Grapheme instances associated with each participant.
If na.rm=TRUE, for each participant returns mean response time even
if there are missing response times. If na.rm=FALSE, returns
mean response time if there is at least one response time
value for at least one of the participant's graphemes. If a
character vector is passed to symbol_filter, only data from
graphemes with symbols in the passed vector are used when
calculating each participant's mean response time.}

\item{\code{get_numbers_all_colored_graphemes(symbol_filter = NULL)}}{Returns a vector with the number of graphemes
with all-valid (non-NA) response colors that each
participant has. If a character vector is passed to symbol_filter,
only data connected to graphemes with symbols in the passed vector
are used.}

\item{\code{has_participants()}}{Returns TRUE if there is at least one
participant in the participantgroup's participants list,
otherwise returns FALSE.}

\item{\code{save_plots(
  save_dir = NULL,
  file_format = "png",
  dpi = 300,
  cutoff_line = FALSE,
  mean_line = FALSE,
  grapheme_size = 2,
  grapheme_angle = 0,
  foreground_color = "black",
  background_color = "white",
  symbol_filter = NULL,
  ...
)}}{Goes through all participants and for each one produces and saves
    a ggplot2 plot that describes the participant's
    grapheme color responses and per-grapheme consistency scores,
    using the ggsave function.

    If a character vector is passed to symbol_filter, only data for graphemes
    with symbols in the passed vector are used.

    If save_dir is not specified, plots are saved to the current
    working directory. Otherwise, plots are saved to the specified
    directory. Each file is saved using the specified file_format,
    e.g. JPG (see ggplot2::ggsave documentation for list of
    supported formats), and the resolution specified with
    the dpi argument.

    If cutoff_line=TRUE, each plot will include a blue line that
    indicates the value 135.30, which is the synesthesia cut-off score
    recommended by Rothen, Seth, Witzel & Ward (2013) for the L*u*v
    color space. If mean_line=TRUE, the plot will include a green line
    that indicates the participant's mean consistency score for
    graphemes with all-valid response colors (if the participant
    has any such graphemes). If a vector is passed to symbol_filter,
    this green line represents the mean score
    for ONLY the symbols included in the filter.

    Pass a value to grapheme_size to adjust the size of graphemes
    shown at the bottom of the plot, e.g. increasing the size if
    there's empty space otherwise, or decreasing the size if the
    graphemes don't fit. Similarly, you can use the grapheme_angle
    argument to rotate the graphemes, which might help them fit better.

    Apart from the ones above, all other arguments
    that ggsave accepts (e.g. 'scale') also work with this function, since
    any additional arguments are passed on to ggsave.}
}}
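
\section{Sketch: same-id overwrite}{

The overwrite behavior described for \code{add_participant()} can be
illustrated with a plain named list. This is a minimal base R sketch, not
synr's implementation: \code{make_participant} and the free-standing
\code{add_participant} function below are hypothetical stand-ins for the
Reference Class and its method.

\preformatted{
# Hypothetical stand-in for a Participant instance: a plain list with an id.
make_participant <- function(id, note) list(id = id, note = note)

# Hypothetical stand-in for the method: store the entry under the id.
# An existing entry with the same id is replaced.
add_participant <- function(participants, participant) {
  participants[[participant$id]] <- participant
  participants
}

participants <- list()
participants <- add_participant(participants, make_participant("p1", "first"))
participants <- add_participant(participants, make_participant("p2", "second"))
participants <- add_participant(participants, make_participant("p1", "replacement"))

names(participants)   # "p1" "p2"
participants$p1$note  # "replacement"
}

Because entries are keyed by id, the group never holds two participants with
the same id; check \code{get_ids()} first if overwriting would be a problem.
}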

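\section{Sketch: na.rm and mean consistency scores}{

The NA handling described for \code{get_mean_consistency_scores()} can be
summarized for a single participant as follows. This is a minimal base R
sketch under stated assumptions, not synr's implementation: the function
\code{mean_consistency} and the example scores are hypothetical, and NA is
assumed to mark a grapheme with at least one NA response color.

\preformatted{
# Hypothetical per-grapheme consistency scores for one participant.
# NA marks a grapheme with at least one NA response color.
scores <- c(A = 100, B = NA, C = 50)

mean_consistency <- function(grapheme_scores, na.rm = FALSE) {
  usable <- !is.na(grapheme_scores)
  # No grapheme has complete responses: NA regardless of na.rm.
  if (!any(usable)) return(NA_real_)
  # na.rm = FALSE: a single incomplete grapheme yields NA.
  if (!na.rm && !all(usable)) return(NA_real_)
  mean(grapheme_scores[usable])
}

mean_consistency(scores)                         # NA
mean_consistency(scores, na.rm = TRUE)           # 75
mean_consistency(c(A = NA_real_), na.rm = TRUE)  # NA
}

The method returns one such value per participant in the group.
}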
