% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/fit_Bayesian_FROC.R
\name{fit_MRMC}
\alias{fit_MRMC}
\title{Fit and Draw the FROC models (curves)}
\usage{
fit_MRMC(dataList, DrawCurve = FALSE, verbose = TRUE,
  PreciseLogLikelihood = FALSE, summary = TRUE, dataList.Name = "",
  prior = 1, ModifiedPoisson = TRUE, mesh.for.drawing.curve = 10000,
  significantLevel = 0.7, cha = 1, war = floor(ite/5), ite = 10000,
  dig = 3, see = 1234569, Null.Hypothesis = FALSE,
  prototype = FALSE, model_reparametrized = FALSE, zz = 1)
}
\arguments{
\item{dataList}{A list of FROC data. It is passed to the function \code{rstan::}\code{\link[rstan]{sampling}}() in \pkg{rstan} as the argument named \code{data}.



 For single-reader, single-modality data, \code{dataList} has the following form:

\preformatted{
dataList.Example <- list(
  h  = c(41, 22, 14, 8, 1),
  f  = c(1, 2, 5, 11, 13),
  NL = 124,
  NI = 63,
  C  = 5)
}

Using this object \code{dataList.Example}, the model is fitted by \code{fit_Bayesian_FROC(dataList.Example)}.






This package provides three functions to create an \R object representing FROC data; use whichever suits your data:


\describe{
\item{  \code{ \link{convertFromJafroc}()}           }{ Converts data in the \emph{\strong{JAFROC xlsx}} format.}
\item{  \code{ \link{dataset_creator_new_version}()} }{ Enter TP and FP data \emph{\strong{by table}}.}
\item{  \code{ \link{create_dataset}()}              }{ Enter TP and FP data \emph{\strong{interactively}}.}
}

This package also includes example FROC datasets.
Before fitting, confirm that a dataset is correctly formatted by using the function \strong{\code{ \link{viewdata}}}.


----------------------------------------------------------------------------------------


  \strong{A Single reader and a single modality (SRSC) case.}


----------------------------------------------------------------------------------------

In the single-reader, single-modality (SRSC) case, \code{dataList} should be a list with the components \code{f, h, NL, NI, C},
that is, the integer vectors \code{f, h} and the integers \code{NL, NI, C}:
\describe{
\item{ \code{f}  }{Non-negative integer vector specifying the number of false alarms associated with each confidence level. The first component corresponds to the highest confidence level.}
\item{ \code{h}  }{Non-negative integer vector specifying the number of hits associated with each confidence level. The first component corresponds to the highest confidence level.}
\item{ \code{NL} }{A positive integer, representing the number of lesions.}
\item{ \code{NI} }{A positive integer, representing the number of images.}
\item{ \code{C}  }{A positive integer, representing the number of confidence levels.}
}
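
The requirements listed above can be checked in base \R before fitting. The following is only a minimal sketch, not part of the package: the object name \code{d} is hypothetical, the values are the example values from this page, and the check that the total number of hits does not exceed \code{NL} is the condition stated for the argument \code{prototype}.

\preformatted{
# Example values from this page; the name d is hypothetical.
d <- list(h  = c(41, 22, 14, 8, 1),
          f  = c(1, 2, 5, 11, 13),
          NL = 124, NI = 63, C = 5)

# h and f need one entry per confidence level.
stopifnot(length(d$h) == d$C, length(d$f) == d$C)

# Counts of hits and false alarms are non-negative.
stopifnot(all(d$h >= 0), all(d$f >= 0))

# The total number of hits cannot exceed the number of lesions.
stopifnot(sum(d$h) <= d$NL)
}

If any check fails, \code{stopifnot()} raises an error that names the failing condition.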




For details, see the example datasets included in this package.
Note that the number of confidence levels, denoted by \code{C}, is part of the data,
but the confidence level vector \code{c } should not be specified. If specified, it is ignored, because the program creates it by \code{  c <-c(rep(C:1))} and does not refer to user input, where \code{C} is the highest number of confidence levels.
So, write your hits and false alarms vectors so that they are compatible with this automatically created \code{c} vector.


\strong{\emph{ data Format:}}

 \emph{            A single reader and a single modality case   }

------------------------------------------------------------------------------------------------------
\tabular{rccc}{
\code{NI=63,NL=124}   \tab \strong{ confidence level } \tab \strong{ No. of false alarms} \tab \strong{No. of hits}  \cr
 In R console ->      \tab \code{ c} \tab   \code{f }  \tab   \code{h}  \cr
  -----------------------\tab ----------------------- \tab ----------------------------- \tab ------------- \cr
\emph{definitely} present  \tab  \code{c[1] = }5 \tab \code{f[1] = }\eqn{F_5} = 1 \tab  \code{h[1] = }\eqn{H_5} = 41 \cr
 \emph{probably} present   \tab  \code{c[2] = }4 \tab \code{f[2] = }\eqn{F_4} = 2 \tab  \code{h[2] = }\eqn{H_4} = 22 \cr
 equivocal                 \tab  \code{c[3] = }3 \tab \code{f[3] = }\eqn{F_3} = 5 \tab  \code{h[3] = }\eqn{H_3} = 14  \cr
 subtle                    \tab  \code{c[4] = }2 \tab \code{f[4] = }\eqn{F_2} = 11 \tab \code{h[4] = }\eqn{H_2} = 8  \cr
 \emph{very} subtle        \tab  \code{c[5] = }1 \tab \code{f[5] = }\eqn{F_1} = 13 \tab \code{h[5] = }\eqn{H_1} = 1  \cr
 }

---------------------------------------------------------------------------------------------------



*  \emph{false alarms} = False Positives = FP

*  \emph{hits} = True Positives = TP

Note that in FROC data every confidence level refers to the \emph{present} (\emph{diseased, lesion}) case only; there is no confidence level indicating absence. Each reader marks only suspicious locations, and these marks generate the hits and false alarms; \emph{thus} each confidence level expresses the reader's belief that a lesion is \emph{present}.
If the reader thinks there is no lesion, the reader does not mark any location, so a confidence level for absence is not needed and does not appear in the dataset.


Note that the confidence level vector \code{c } in this table should not be specified in \code{dataList}. Even if it is specified explicitly, it is ignored, because the program creates it automatically by \code{  c <-c(rep(C:1))} and does not refer to user input, where \code{C} is the highest number of confidence levels.
So, check that your data are compatible with the confidence level vector generated by the program,
using the table displayed by the function \code{\link{viewdata}()}.









---------------------------------------------------------------------------------------

  \strong{Multiple readers and multiple modalities case, i.e., MRMC case}


---------------------------------------------------------------------------------------


In the multiple-reader, multiple-modality (MRMC) case,
the \R list object representing FROC data that is passed to \code{fit_Bayesian_FROC()}
must have the components \code{m,q,c,h,f,NL,C,M,Q}:
\describe{
\item{ \code{C }  }{A positive integer (scalar), representing the \emph{\strong{highest}} confidence level, that is, the number of confidence levels.}
\item{ \code{M }  }{A positive integer, representing the number of \emph{\strong{modalities}}.  }
\item{ \code{Q }  }{A positive integer, representing the number of \emph{\strong{readers}}. }
\item{ \code{c }  }{A vector of positive integers, representing the \emph{\strong{confidence levels}}. This vector must be made by \code{rep(rep(C:1), M*Q)}. }
\item{ \code{m }  }{A vector of positive integers, representing the \emph{\strong{modality}} IDs. }
\item{ \code{q }  }{A vector of positive integers, representing the \emph{\strong{reader}} IDs.}
\item{ \code{h }  }{A vector of non-negative integers, representing the numbers of \emph{\strong{hits}}.}
\item{ \code{f }  }{A vector of non-negative integers, representing the numbers of \emph{\strong{false alarms}}.}
\item{ \code{NL }  }{A positive integer (scalar), representing the total number of \emph{\strong{lesions}} over all images.}
}
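
The confidence level vector described above can be built in base \R as follows. This is a minimal sketch: the sizes \code{C}, \code{M}, \code{Q} are hypothetical values chosen for illustration, the name \code{c_vec} is hypothetical (it avoids masking the base function \code{c()}), and the length check assumes that the components \code{m, q, c, h, f} have one entry per row of the data table.

\preformatted{
# Hypothetical sizes, for illustration only.
C <- 5   # number of confidence levels
M <- 2   # number of modalities
Q <- 3   # number of readers

# Confidence level vector, built as required for the component c.
c_vec <- rep(rep(C:1), M * Q)

# Assumption: m, q, c, h and f all have one entry per table row.
n_rows <- C * M * Q
stopifnot(length(c_vec) == n_rows)

head(c_vec, 10)   # 5 4 3 2 1 5 4 3 2 1
}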



For details, please see the example datasets in this package (the section \strong{See Also} below).



Note that the number of confidence levels, denoted by \code{C}, is part of the data,
and the program also creates the confidence level vector from \code{C}. So, to confirm
that your false positives and hits correspond correctly
to the confidence levels,
check their order with the function \code{\link{viewdata}()}.


\strong{\emph{ Example data: }}

 \emph{ Multiple readers and multiple modalities case, i.e., MRMC case }




---------------------------------------------------------------------------------------------------
\tabular{ccccc}{
 \strong{ Reader ID} \tab   \strong{Modality }  \tab  \strong{ Confidence levels} \tab   \strong{No. of false alarms} \tab   \strong{No. of hits}\cr
  \code{q} \tab  \code{ m}  \tab   \code{c} \tab  \code{ f} \tab \code{ h}\cr
  -------- \tab ------------- \tab ------------------------ \tab  ------------------- \tab ----------------\cr
  1 \tab 1 \tab 5 \tab  1\tab 15\cr
  1 \tab 2 \tab 4  \tab 3\tab 14\cr
  1 \tab 3 \tab 3  \tab 5\tab 5\cr
  1 \tab 1 \tab 2  \tab 5\tab 3\cr
  1 \tab 2 \tab 1  \tab 9\tab 4\cr
  1 \tab 3 \tab 5  \tab 1\tab 14\cr
  1 \tab 1 \tab 4  \tab 2\tab 13\cr
  1 \tab 2 \tab 3  \tab 2\tab 5\cr
  1 \tab 3 \tab 2 \tab 5\tab 3\cr
  2 \tab 1 \tab 1 \tab  6\tab 4\cr
  2 \tab 2 \tab 5  \tab 1\tab 14\cr
  2 \tab 3 \tab 4  \tab 1\tab 4\cr
  2 \tab 1 \tab 3  \tab 1\tab 1\cr
  2 \tab 2 \tab 2  \tab 2\tab 2\cr
  2 \tab 3 \tab 1  \tab 3\tab 2\cr
  2 \tab 1 \tab 5  \tab 1\tab 13\cr
  2 \tab 2 \tab 4 \tab 2\tab 4\cr
  2 \tab 3 \tab 3  \tab 1\tab 2\cr }
---------------------------------------------------------------------------------------------------




*  \emph{false alarms} = False Positives = FP

*  \emph{hits} = True Positives = TP}

\item{DrawCurve}{Logical: \code{TRUE} or \code{FALSE}. Whether the FROC and AFROC curves are drawn. Set \code{DrawCurve =TRUE} to draw them, and \code{DrawCurve =FALSE} otherwise.
Drawing the curves takes a long time in the MRMC case, so the default value is \code{FALSE}.}

\item{verbose}{A logical. If \code{TRUE}, a detailed summary is printed in the \R console.}

\item{PreciseLogLikelihood}{Logical, that is \code{TRUE} or \code{FALSE}. If \code{PreciseLogLikelihood  = TRUE}, then Stan calculates the precise log likelihood with the target formulation.
If \code{PreciseLogLikelihood  = FALSE} (default), then Stan calculates the log likelihood by dropping the constant terms in the likelihood function.
Earlier versions had separate Stan files with and without the target formulation. The non-target formulation caused Jacobian warnings,
so all Stan files uploaded to CRAN use the target formulation.
 Thus this variable is now meaningless.}

\item{summary}{Logical: \code{TRUE} or \code{FALSE}. If \code{TRUE}, a verbose summary is printed in the \R console. If \code{FALSE}, the output is minimal. (A better name for this argument would have been \code{verbose}.)}

\item{dataList.Name}{Not intended for users; used by the author for package development.}

\item{prior}{A positive integer that selects the prior.}

\item{ModifiedPoisson}{Logical, that is \code{TRUE} or \code{FALSE}. If
\code{ModifiedPoisson = TRUE},
then the Poisson rate of false alarms is calculated \emph{per lesion},
and the model is fitted so that the FROC curve is the expected curve of TPF and FPF \emph{per lesion}.
If \code{ModifiedPoisson = FALSE}, then the Poisson rate of false alarms is calculated \emph{per image},
and the model is fitted so that the FROC curve is the expected curve of TPF and FPF \emph{per image}.
For an explanation of \emph{per image} and \emph{per lesion}, see the author's paper (for details of the models, see the \href{https://cran.r-project.org/package=BayesianFROC}{ vignettes  }).

If \code{ModifiedPoisson = TRUE} (default),
 then the \emph{False Positive Fraction (FPF)} is calculated as follows
 (\eqn{f_c} denotes the number of false alarms with confidence level \eqn{c}):


\deqn{ \frac{f_1+f_2+f_3+f_4+f_5}{N_L}, }

\deqn{ \frac{f_2+f_3+f_4+f_5}{N_L}, }

 \deqn{ \frac{f_3+f_4+f_5}{N_L}, }

  \deqn{ \frac{f_4+f_5}{N_L}, }

   \deqn{ \frac{f_5}{N_L}, }

where \eqn{N_L} is the number of lesions (signals).


On the other hand, if \code{ModifiedPoisson = FALSE}, then the FPF is calculated as follows:

\deqn{ \frac{f_1+f_2+f_3+f_4+f_5}{N_I}, }

\deqn{ \frac{f_2+f_3+f_4+f_5}{N_I}, }

 \deqn{ \frac{f_3+f_4+f_5}{N_I}, }

  \deqn{ \frac{f_4+f_5}{N_I}, }

   \deqn{ \frac{f_5}{N_I}, }

where \eqn{N_I} is the number of images (trials).


The model is fitted so that the estimated FROC curve is expressed in terms of the FPF per lesion or per image accordingly.

If \code{ModifiedPoisson = TRUE}, then the FROC curve is the expected pair of \strong{FPF per lesion} and TPF.

On the other hand, if  \code{ModifiedPoisson = FALSE}, then the FROC curve is the expected pair of \strong{FPF per image} and TPF.
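
As a minimal sketch in base \R (not the package's internal code), the FPF values in the formulas above can be computed from the single-reader example data as follows. The names \code{cum_f}, \code{FPF_per_lesion} and \code{FPF_per_image} are hypothetical.

\preformatted{
f  <- c(1, 2, 5, 11, 13)  # f_5, f_4, f_3, f_2, f_1 (highest confidence level first)
NL <- 124                 # number of lesions
NI <- 63                  # number of images

# Cumulative false alarms: f_5, f_4 + f_5, ..., f_1 + ... + f_5.
cum_f <- cumsum(f)        # 1 3 8 19 32

# Reverse so that the order matches the formulas above (largest sum first).
FPF_per_lesion <- rev(cum_f) / NL   # ModifiedPoisson = TRUE
FPF_per_image  <- rev(cum_f) / NI   # ModifiedPoisson = FALSE
}

The two vectors differ only by the constant factor \code{NL / NI}, because only the denominator changes.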




So the FPF and TPF data, and hence the fitted model, depend on whether \code{ModifiedPoisson = TRUE} or \code{FALSE}.
Traditional FROC analysis uses only the per-image (trial) rate. However, one image can be divided into two or more images, so the number of
trials is not essential; the per-signal rate is more important. For this reason, the author also developed FROC theory on a per-signal basis.
The FROC curve is rigid with respect to a change in the number of images, so in that respect it does not matter whether \code{ModifiedPoisson = TRUE} or \code{FALSE}.

Revised 2019 August 28}

\item{mesh.for.drawing.curve}{A positive integer indicating the number of points used to draw the curves; default = 10000.}

\item{significantLevel}{This is a number between 0 and 1. The results are shown if posterior probabilities are greater than this quantity.}

\item{cha}{A positive integer representing the number of chains generated by the Hamiltonian Monte Carlo method; default = 1. It is passed to the function \code{rstan::}\code{\link[rstan]{sampling}}() in \pkg{rstan} as the argument named \code{chains}.}

\item{war}{A positive integer representing the burn-in period, which must be less than \code{ite}. It is passed to the function \code{rstan::}\code{\link[rstan]{sampling}}() in \pkg{rstan} as the argument named \code{warmup}. Defaults to
\code{floor(ite/5)}, that is, 10000/5 = 2000 with the default \code{ite}.}

\item{ite}{A positive integer representing the number of samples generated by the Hamiltonian Monte Carlo method; default = 10000. It is passed to the function \code{rstan::}\code{\link[rstan]{sampling}}() in \pkg{rstan} as the argument named \code{iter}.
If the model does not converge, raise this number; larger values give more reliable estimates.}

\item{dig}{A positive integer representing the number of significant digits used for the Stan output;
default = 3.}

\item{see}{A positive integer representing the seed used in Stan; default = 1234569. It is passed to the function \code{rstan::}\code{\link[rstan]{sampling}}() in \pkg{rstan} as the argument named \code{seed}.}

\item{Null.Hypothesis}{Logical, that is \code{TRUE} or \code{FALSE}.
If \code{Null.Hypothesis  = FALSE} (default),
 then the \emph{alternative model} is fitted to \code{dataList}.
If \code{Null.Hypothesis  = TRUE},
 then the \emph{null model} is fitted to \code{dataList} (for details of the models, see the \href{https://cran.r-project.org/package=BayesianFROC}{ vignettes  }).
 The null model is constructed under the null hypothesis that
 all modalities have the same observer performance ability.
 The alternative model is constructed under the assumption that the modalities are not all the same.
This argument was created to test the null hypothesis by the Bayes factor.
However, the test does not yet give the desired results, so it is still under construction.}

\item{prototype}{A logical. If \code{TRUE}, then the model is no longer
a generative model; namely, a dataset drawn from the model need not satisfy the condition that the sum of hits is not greater than the number of lesions:

\deqn{ \Sigma_c H_c \le N_L }

However, this model (\code{TRUE}) has the advantage that it admits a wide range of initial values for MCMC sampling.

 If \code{FALSE}, then the model is a proper statistical model in the sense that any dataset drawn from the model satisfies the condition that the sum of all hits is
 not greater than the number of lesions. This model is theoretically sound. In practice, however, the calculation can produce undesired results caused by floating-point error.
 The prior can generate very small hit rates such as 0.0000000000000001234, and a ratio of two such small numbers, which becomes a hit rate, is computed inaccurately.
 The result can be a Bernoulli success rate that is not less than 1.
 Avoiding this requires a prior that excludes such very small numbers; the author has an idea for one, but it has not yet succeeded.




If \code{prototype = TRUE}, then the model for hits is the following:

\deqn{H_5 \sim Binomial(p_5,N_L)}
\deqn{H_4 \sim Binomial(p_4,N_L)}
\deqn{H_3 \sim Binomial(p_3,N_L)}
\deqn{H_2 \sim Binomial(p_2,N_L)}
\deqn{H_1 \sim Binomial(p_1,N_L)}


On the other hand,
if \code{prototype = FALSE}, then the model for hits is the following:

\deqn{H_5 \sim Binomial( p_5, N_L )}
\deqn{H_4 \sim Binomial( \frac{p_4}{1-p_5}, N_L - H_5 )}
\deqn{H_3 \sim Binomial( \frac{p_3}{1-p_5-p_4}, N_L - H_5-H_4 )}
\deqn{H_2 \sim Binomial( \frac{p_2}{1-p_5-p_4-p_3}, N_L - H_5-H_4-H_3 )}
\deqn{H_1 \sim Binomial( \frac{p_1}{1-p_5-p_4-p_3-p_2}, N_L - H_5-H_4-H_3-H_2 )}


The number of lesions used at each confidence level is adjusted so that the sum of hits \eqn{\Sigma_c H_c} is not greater than
the number of lesions (signals, targets) \eqn{N_L}.
Hence the model with \code{prototype = FALSE} is a generative model in the sense that
it can replicate FROC datasets.
Note that adjusting the number of lesions in this manner requires adjusting the hit rates as well.
The reason for using hit rates such as \eqn{\frac{p_2}{1-p_5-p_4-p_3}} instead of \eqn{p_c} is that
this ensures the equality \eqn{ E[H_c/N_L] = p_c}, which is very important.
To establish a Bayesian FROC theory that is compatible with the classical FROC theory, we need two equations of expectation,

  \deqn{ E[H_c/N_L] = p_c,}
  \deqn{ E[F_c/N_X] = q_c,}

where \eqn{E} denotes the expectation, \eqn{N_X} is the number of lesions or the number of images, and
\eqn{q_c} is a false alarm rate, namely, \eqn{ F_c \sim Poisson(N_X q_c)}.
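
As a brief check of the first equality in the case \code{prototype = FALSE}, consider \eqn{c = 4}. This is only a sketch; it uses nothing beyond the binomial model for hits given above and the law of total expectation:

\deqn{ E[H_4 \mid H_5] = (N_L - H_5) \frac{p_4}{1-p_5}, }
\deqn{ E[H_4] = (N_L - E[H_5]) \frac{p_4}{1-p_5} = (N_L - N_L p_5) \frac{p_4}{1-p_5} = N_L p_4, }

so that \eqn{E[H_4/N_L] = p_4}. Applying the same argument recursively to \eqn{H_3}, \eqn{H_2} and \eqn{H_1} gives the equality for the remaining confidence levels.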

Using the above two equations, we can establish an alternative Bayesian FROC theory that preserves classical notions and
formulas. For details, please see the author's preprint.




Originally, the author did not notice that the prototype is not a generative model, and
later revised the model so that it is exactly a generative model.

The prototype model (\code{prototype = TRUE}) is retained
because the convergence of MCMC sampling in the MRMC case is not good in the current model (\code{prototype = FALSE}).
The current model uses fractions such as \eqn{\frac{p_1}{1-p_5-p_4-p_3-p_2}}, which are numerically dangerous.
For example, if \eqn{p_1} is very small, then both the numerator and the denominator of \eqn{\frac{p_1}{1-p_5-p_4-p_3-p_2}} can be very small,
like 0.000000000000000123..., and such small numbers cause inaccurate results.
So it sometimes happens that \eqn{\frac{p_1}{1-p_5-p_4-p_3-p_2} >1}, which never occurs in theory but
unfortunately does occur numerically.

The author is trying to avoid this phenomenon by using priors, but this has not yet succeeded.



Here we interpret terms
such as \eqn{N_L - H_5-H_4-H_3} as
the targets remaining after
the reader's hits. Another possible ordering would be \eqn{N_L - H_1 - H_2 - H_3}, but it is not employed,
because the author considers that a reader marks suspicious lesion locations starting from the highest confidence level; from this viewpoint,
targets should be regarded as being found from the highest-confidence suspicious locations first.}

\item{model_reparametrized}{A logical, if TRUE, then a model under construction is used.}

\item{zz}{A real number: parameter of prior}
}
\description{
Fit and Draw the FROC models (curves).
}
