% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/check_dag.R
\name{check_dag}
\alias{check_dag}
\alias{as.dag}
\title{Check correct model adjustment for identifying causal effects}
\usage{
check_dag(
  ...,
  outcome = NULL,
  exposure = NULL,
  adjusted = NULL,
  latent = NULL,
  effect = c("all", "total", "direct"),
  coords = NULL
)

as.dag(x, ...)
}
\arguments{
\item{...}{One or more formulas, which are converted into \strong{dagitty} syntax.
The first element may also be a model object. If a model object is provided, its
formula is used as the first formula, and all independent variables are used
for the \code{adjusted} argument. See 'Details' and 'Examples'.}

\item{outcome}{Name of the dependent variable (outcome), as character string.
Must be a valid name from the formulas. If not set, the first dependent
variable from the formulas is used.}

\item{exposure}{Name of the exposure variable (as character string), for
which the direct and total causal effect on the \code{outcome} should be checked.
Must be a valid name from the formulas. If not set, the first independent
variable from the formulas is used.}

\item{adjusted}{A character vector with names of variables that are adjusted
for in the model. If a model object is provided in \code{...}, any values in
\code{adjusted} will be overwritten by the model's independent variables.}

\item{latent}{A character vector with names of latent variables in the model.}

\item{effect}{Character string, indicating which effect to check. Can be
\code{"all"} (default), \code{"total"}, or \code{"direct"}.}

\item{coords}{A list with two elements, \code{x} and \code{y}, both of which are named
numeric vectors. The names correspond to the variable names in the DAG,
and the values of \code{x} and \code{y} give the x/y coordinates in the plot.
See 'Examples'.}

\item{x}{An object of class \code{check_dag}, as returned by \code{check_dag()}.}
}
\value{
An object of class \code{check_dag}, which can be visualized with \code{plot()}.
The returned object also inherits from class \code{dagitty} and thus can be used
with all functions from the \strong{ggdag} and \strong{dagitty} packages.
}
\description{
The purpose of \code{check_dag()} is to build, check and visualize
your model based on directed acyclic graphs (DAG). The function checks if a
model is correctly adjusted for identifying specific relationships between
variables, especially directed (possibly "causal") effects of given
exposures on an outcome. In case of incorrect adjustments, the function
suggests the minimally required variables that should be adjusted for (sometimes
also called "controlled for"), i.e. variables that \emph{at least} need to be
included in the model. Depending on the goal of the analysis, it is still
possible to add more variables to the model than just the minimally required
adjustment sets.

\code{check_dag()} is a convenience wrapper around \code{ggdag::dagify()},
\code{dagitty::adjustmentSets()} and \code{dagitty::adjustedNodes()} to check correct
adjustment sets. It returns a \strong{dagitty} object that can be visualized with
\code{plot()}. \code{as.dag()} is a small convenience function that returns the
dagitty string, which can be used in the online tool on the dagitty website.
}
\section{Specifying the DAG formulas}{


The formulas have the following syntax:
\itemize{
\item One-directed paths: The \emph{left-hand side} contains the name of the
variable that the causal effects point to (the direction of the arrows, in
dagitty terms). The \emph{right-hand side} contains all variables from which
causal effects are assumed to originate. For example, for the formula
\code{Y ~ X1 + X2}, paths directed from both \code{X1} and \code{X2} to \code{Y} are assumed.
\item Bi-directed paths: Use \verb{~~} to indicate bi-directed paths. For example,
\code{Y ~~ X} indicates that the path between \code{Y} and \code{X} is bi-directed, and
the arrow points in both directions. Bi-directed paths often indicate an
unmeasured common cause, or unmeasured confounding, of the two involved variables.
}
}

\section{Minimally required adjustments}{


The function checks if the model is correctly adjusted for identifying the
direct and total effects of the exposure on the outcome, and it distinguishes
between the two effects when reporting the result. If the model is correctly
specified, no further adjustment is needed. If the model is not correctly
specified, the function suggests the minimally required variables that should
be adjusted for. If the model is cyclic, the function stops and suggests
removing the cycles from the model.
}

\section{Direct and total effects}{


The direct effect of an exposure on an outcome is the effect that is not
mediated by any other variable in the model. The total effect is the sum of
the direct and indirect effects. The function checks if the model is correctly
adjusted for identifying the direct and total effects of the exposure on the
outcome.
}

\section{Why are DAGs important - the Table 2 fallacy}{


Correctly thinking about and identifying the relationships between variables
is important when it comes to reporting coefficients from regression models
that mutually adjust for "confounders" or include covariates. Different
coefficients might have different interpretations, depending on their
relationship to other variables in the model. Sometimes, a regression
coefficient represents the direct effect of an exposure on an outcome, but
sometimes it must be interpreted as a total effect, due to the involvement
of mediating effects. This problem is also called the "Table 2 fallacy"
(\emph{Westreich and Greenland 2013}). DAGs help to visualize, and thereby
clarify, the relationships between variables in a regression model in order
to detect missing adjustments or over-adjustment.
}

\examples{
\dontshow{if (require("ggdag", quietly = TRUE) && require("dagitty", quietly = TRUE) && require("see", quietly = TRUE) && packageVersion("see") > "0.8.5") (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
# no adjustment needed
check_dag(
  y ~ x + b,
  outcome = "y",
  exposure = "x"
)

# incorrect adjustment
dag <- check_dag(
  y ~ x + b + c,
  x ~ b,
  outcome = "y",
  exposure = "x"
)
dag
plot(dag)

# After adjusting for `b`, the model is correctly specified
dag <- check_dag(
  y ~ x + b + c,
  x ~ b,
  outcome = "y",
  exposure = "x",
  adjusted = "b"
)
dag

# use specific layout for the DAG
dag <- check_dag(
  score ~ exp + b + c,
  exp ~ b,
  outcome = "score",
  exposure = "exp",
  coords = list(
    # x-coordinates for all nodes
    x = c(score = 5, exp = 4, b = 3, c = 3),
    # y-coordinates for all nodes
    y = c(score = 3, exp = 3, b = 2, c = 4)
  )
)
plot(dag)

# Objects returned by `check_dag()` can be used with "ggdag" or "dagitty"
ggdag::ggdag_status(dag)

# Using a model object to extract information about outcome,
# exposure and adjusted variables
data(mtcars)
m <- lm(mpg ~ wt + gear + disp + cyl, data = mtcars)
dag <- check_dag(
  m,
  wt ~ disp + cyl,
  wt ~ am
)
dag
plot(dag)
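
# A minimal sketch, not part of the original examples: `as.dag()` is
# documented above as returning the dagitty string of a `check_dag` object,
# which can be pasted into the online tool on the dagitty website. Here it
# is applied to the DAG built from the model object `m`.
as.dag(dag)

# Also a sketch: restrict the check to the direct effect only, using the
# documented `effect` argument. The variable names are illustrative and
# reuse the DAG from the "incorrect adjustment" example.
check_dag(
  y ~ x + b + c,
  x ~ b,
  outcome = "y",
  exposure = "x",
  effect = "direct"
)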
\dontshow{\}) # examplesIf}
}
\references{
\itemize{
\item Rohrer, J. M. (2018). Thinking clearly about correlations and causation:
Graphical causal models for observational data. Advances in Methods and
Practices in Psychological Science, 1(1), 27–42. \doi{10.1177/2515245917745629}
\item Westreich, D., & Greenland, S. (2013). The Table 2 Fallacy: Presenting and
Interpreting Confounder and Modifier Coefficients. American Journal of
Epidemiology, 177(4), 292–298. \doi{10.1093/aje/kws412}
}
}
