eive

An R package for Errors-in-variables estimation in linear regression

Installation

Install stable version from CRAN

install.packages("eive")

Install development version

Please install devtools package before installing eive:

install.packages("devtools")

then install the package from the github repo using

devtools::install_github(repo = "https://github.com/jbytecode/eive") 

The Problem

Suppose the linear regression model is

\[ y = \beta_0 + \beta_1 x^* + \varepsilon \]

where \(y\) is n-vector of the response variable, \(\beta_0\) and \(\beta_1\) are unknown regression parameteres, \(\varepsilon\) is the iid. error term, \(x^*\) is the unknown n-vector of the independent variable, and \(n\) is the number of observations.

We call \(x^*\) unknown because in some situations the true values of the variable cannot be visible or directly observable, or observable with some measurement error. Now suppose that \(x\) is the observable version of the true values and it is defined as

\[ x = x^* + \delta \]

where \(\delta\) is the measurement error and \(x\) is the erroneous version of the true \(x^*\). If the estimated model is

\[ \hat{y} = \hat{\beta_0} + \hat{\beta_1}x \]

then the ordinary least squares (OLS) estimates are no longer unbiased and even consistent.

Eive-cga is an estimator devised for this problem. The aim is to reduce the errors-in-variable bias with some cost of increasing the variance. At the end, the estimator obtains lower Mean Square Error (MSE) values defined as

\[ MSE(\hat{\beta_1}) = Var(\hat{\beta_1}) + Bias^2(\hat{\beta_1}) \]

for the Eive-cga estimator. For more detailed comparisons, see the original paper given in the Citation part.

Usage

For the single variable case

> eive(dirtyx = dirtyx, y = y, otherx = nothing) 

and for the multiple regression

> eive(dirtyx = dirtyx, y = y, otherx = matrixofotherx) 

and for the multiple regression with formula object

> eive(formula = y ~ x1 + x2 + x3, dirtyx.varname = "x", data = mydata) 

Note that the method assumes there is only one erroneous variable in the set of independent variables.

Citation

@article{satman2015reducing,
  title={Reducing errors-in-variables bias in linear regression using compact genetic algorithms},
  author={Satman, M Hakan and Diyarbakirlioglu, Erkin},
  journal={Journal of Statistical Computation and Simulation},
  volume={85},
  number={16},
  pages={3216--3235},
  year={2015},
  doi={10.1080/00949655.2014.961157}
  publisher={Taylor \& Francis}
}