Including results from other outlier identification methods in O3 plots

Antony Unwin

2020-04-24

Introduction

OutliersO3 provides access to six different methods available in R for identifying possible outliers. If you want to draw an O3 plot using another outlier method (or one of the available methods with different parameters), you need to prepare the results yourself, as described here.

Using a single new method with up to 3 tolerance levels.

Prepare a list called outResults made up of the following elements:
* data: the dataset used
* nw: the number of variable combinations analysed (the default assumes sum(choose(n1, 1:n1)), where n1 is the number of variables)
* mm: the name of the method used
* tols: the set of t tolerance levels (t≤3)
* outList: a multi-level list with t x nw x 3 elements. Each of the t lists is a list of nw lists. Each of the nw lists is a list of three elements: the names of the variables in the combination, the indices of the cases identified as potential outliers, and the outlier distances for every case.

outResults can then be input to the function O3plotT to draw the corresponding O3 plot.

(Note that if your method does not work for one dimension, like, for instance, FastPCS, you should add a univariate approach. OutliersO3 uses boxplot limits in that situation.)
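The structure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not code from the package: the method (squared Mahalanobis distance with a chi-squared cutoff), the helper maha_combo, the label "maha", the element names names, outlierIndices and outlierDistances, and the ordering of the combinations are all hypothetical. Only the order and content of the three elements come from the description above, so compare the result with the output of O3prep (for instance with str()) before relying on it.

```r
# Hypothetical outlier method: squared Mahalanobis distance with a
# chi-squared cutoff. Returns the three-part list for one combination.
# The element names are an assumption; the text only fixes their order.
maha_combo <- function(vars, data, tol) {
  x <- as.matrix(data[, vars, drop = FALSE])
  d2 <- mahalanobis(x, colMeans(x), cov(x))
  list(names = vars,
       outlierIndices = which(d2 > qchisq(1 - tol, df = length(vars))),
       outlierDistances = d2)
}

data <- stackloss                      # example dataset shipped with R
n1 <- ncol(data)

# All non-empty variable combinations, smallest first (ordering assumed)
combos <- unlist(
  lapply(seq_len(n1), function(k) combn(names(data), k, simplify = FALSE)),
  recursive = FALSE
)
nw <- length(combos)                   # equals sum(choose(n1, 1:n1))

tols <- c(0.05, 0.01)                  # t = 2 tolerance levels

# One list per tolerance level, each holding nw three-part lists
outList <- lapply(tols, function(tol) {
  lapply(combos, maha_combo, data = data, tol = tol)
})

outResults <- list(data = data, nw = nw, mm = "maha",
                   tols = tols, outList = outList)
```

If the assumptions hold, O3plotT(outResults) would then draw the plot. Mahalanobis distance also works for a single variable, so no separate univariate approach is needed in this sketch.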

Using one or more new methods and adding their results to results for methods in OutliersO3.

Prepare a list called outResults made up of the following elements:
* data: the dataset used
* nw: the number of variable combinations analysed (the default assumes sum(choose(n1, 1:n1)), where n1 is the number of variables)
* mm: the names of the m methods used
* tols: the set of individual tolerance levels, one for each method
* outList: a multi-level list with m x nw x 3 elements. Each of the m = length(mm) lists is a list of nw lists. Each of the nw lists is a list of three elements: the names of the variables in the combination, the indices of the cases identified as potential outliers, and the outlier distances for every case.

If these results are to be added to results found by the function O3prep in the OutliersO3 package, then the two outList lists can be combined using rlist::list.append.
For example, if the lists are called outList1 and outList2, and outList2 has only two elements:

outList <- rlist::list.append(outList1, outList2[[1]], outList2[[2]])

outResults is then constructed as above and input to the function O3plotM to draw an O3 plot.
NB: outList1 and outList2 must use the same dataset and the same nw variable combinations.
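As a rough illustration of the multi-method layout, the sketch below builds two single-method lists over the same dataset and the same combinations and then joins them. Both methods, their helpers (maha_combo, robz_combo), the labels in mm, and the element names are hypothetical. Base R c() is used here in place of rlist::list.append, on the assumption that all that is needed is to concatenate the top-level elements of the two lists.

```r
data <- stackloss                      # example dataset shipped with R
n1 <- ncol(data)

# Shared set of variable combinations (ordering assumed)
combos <- unlist(
  lapply(seq_len(n1), function(k) combn(names(data), k, simplify = FALSE)),
  recursive = FALSE
)

# Hypothetical method 1: squared Mahalanobis distance, chi-squared cutoff
maha_combo <- function(vars, tol) {
  x <- as.matrix(data[, vars, drop = FALSE])
  d <- mahalanobis(x, colMeans(x), cov(x))
  list(names = vars,
       outlierIndices = which(d > qchisq(1 - tol, df = length(vars))),
       outlierDistances = d)
}

# Hypothetical method 2: largest absolute robust z-score (median/MAD)
# across the variables in the combination, with a normal cutoff
robz_combo <- function(vars, tol) {
  x <- as.matrix(data[, vars, drop = FALSE])
  z <- abs(scale(x, center = apply(x, 2, median), scale = apply(x, 2, mad)))
  d <- apply(z, 1, max)
  list(names = vars,
       outlierIndices = which(d > qnorm(1 - tol / 2)),
       outlierDistances = d)
}

tols <- c(0.05, 0.01)                  # one tolerance level per method

# Each single-method result is a list of length 1 holding nw lists
outList1 <- list(lapply(combos, maha_combo, tol = tols[1]))
outList2 <- list(lapply(combos, robz_combo, tol = tols[2]))

# Concatenate at the top level: m = 2 lists of nw lists
outList <- c(outList1, outList2)

outResults <- list(data = data, nw = length(combos),
                   mm = c("maha", "robz"), tols = tols, outList = outList)
```

The entries of mm and tols follow the same order as the top-level elements of outList, so each method stays paired with its own tolerance level.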

Tips for constructing an outResults list.

There are two issues: each outlier method has its own structure, parameters, and output, and outliers need to be identified for many different combinations of variables. The O3a function in the OutliersO3 package shows how to run through the variable combinations and includes code for six different methods. It is likely that you can use ideas from one or more of these to work out how your method could be coded.
The mapply call used in the O3prep function can then be adapted to produce the outList element of an outResults list, which is used as input for an O3plot function.
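The following sketch shows one way mapply could run a per-combination function over all variable combinations. It is not the code used in O3prep or O3a. The worker one_combo, its arguments, the element names, and the choice of Mahalanobis distance are assumptions made for illustration. The worker also falls back to boxplot limits when a combination contains a single variable, in the spirit of the note on univariate cases, although the coefficient 1.5 is simply the default of boxplot.stats and not a package setting.

```r
data <- stackloss                      # example dataset shipped with R
n1 <- ncol(data)

# All non-empty variable combinations (ordering assumed)
combos <- unlist(
  lapply(seq_len(n1), function(k) combn(names(data), k, simplify = FALSE)),
  recursive = FALSE
)

# Hypothetical worker for one combination. A single variable is handled
# with boxplot limits; larger combinations use squared Mahalanobis distance.
one_combo <- function(vars, tol, data, coef) {
  x <- as.matrix(data[, vars, drop = FALSE])
  if (length(vars) == 1) {
    v <- x[, 1]
    lims <- boxplot.stats(v, coef = coef)$stats[c(1, 5)]  # whisker ends
    d <- pmax(lims[1] - v, v - lims[2], 0)   # distance beyond the whiskers
    out <- which(d > 0)
  } else {
    d <- mahalanobis(x, colMeans(x), cov(x))
    out <- which(d > qchisq(1 - tol, df = length(vars)))
  }
  list(names = vars, outlierIndices = out, outlierDistances = d)
}

# mapply iterates over the combinations; the fixed arguments go in MoreArgs.
# SIMPLIFY = FALSE keeps the result as a list of nw three-part lists.
perCombo <- mapply(one_combo, vars = combos,
                   MoreArgs = list(tol = 0.05, data = data, coef = 1.5),
                   SIMPLIFY = FALSE)

outList <- list(perCombo)              # one tolerance level, so t = 1
```

Keeping the per-combination logic in a single worker means the loop over combinations stays the same when the outlier method changes.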