
<!-- README.md is generated from README.Rmd. Please edit that file -->

# ActiveLearning4SPM

![R-CMD-check](https://github.com/unina-sfere/ActiveLearning4SPM/actions/workflows/R-CMD-check.yaml/badge.svg)
[![Codecov](https://codecov.io/gh/unina-sfere/ActiveLearning4SPM/branch/main/graph/badge.svg)](https://app.codecov.io/gh/unina-sfere/ActiveLearning4SPM)
<!-- [![CRAN status](https://www.r-pkg.org/badges/version/ActiveLearning4SPM)](https://CRAN.R-project.org/package=ActiveLearning4SPM) -->

The `ActiveLearning4SPM` package implements the methodology of Capezza,
Lepore, and Paynabar (2025) for  
**stream-based active learning for process monitoring**:

- Capezza, C., Lepore, A., & Paynabar, K. (2025). Stream-Based Active
  Learning for Process Monitoring. *Technometrics*.
  <https://doi.org/10.1080/00401706.2025.2561744>.

The package provides tools to:

- **Simulate** multivariate data streams with true hidden states.  
- **Fit** partially hidden Markov models (pHMMs) with user-specified or
  automatically initialized parameters.  
- **Perform stream-based active learning**, balancing *exploration* and
  *exploitation* when deciding whether to acquire new labels under a
  budget constraint.

The methodology is motivated by process monitoring in industrial
applications where obtaining labels is costly.

## Installation

You can install the development version from GitHub:

``` r
# install.packages("devtools")
devtools::install_github("unina-sfere/ActiveLearning4SPM")
```

Once the package is on CRAN, you will be able to install it with:

``` r
install.packages("ActiveLearning4SPM")
```

## Example

### Simulate a data stream

``` r
library(ActiveLearning4SPM)

set.seed(123)
dat <- simulate_stream(T0 = 100, TT = 500, d = 10)
str(dat)
```

    ## List of 2
    ##  $ x: num [1:600] 1 1 1 1 1 1 1 1 1 1 ...
    ##  $ y: num [1:600, 1:10] 0.756 -0.872 -0.279 1.43 0.618 ...

### Fit a pHMM with user-defined initialization

``` r
y <- dat$y
d <- ncol(y)
xlabeled <- dat$x
# Hide 400 of the 600 labels by setting them to NA (partially labeled stream)
xlabeled[sample(seq_along(xlabeled), 400)] <- NA

fit <- fit_pHMM(
  y = y,
  xlabeled = xlabeled,
  nstates = 3,
  mean_start = list(rep(0, d), rep(1, d), rep(-1, d)),
  equal_covariance = TRUE
)
fit$AIC
```

    ## [1] 12711.52
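
To make the partial labeling concrete, here is a minimal base-R sketch on a toy state sequence (not produced by `simulate_stream()`). It assumes, as the comment in the chunk above suggests, that `NA` in `xlabeled` marks a time point whose state is unobserved. The names `x_toy`, `xlabeled_toy`, and `hidden` are hypothetical and used only for illustration.

``` r
set.seed(1)

# Toy state sequence of length 600 (hypothetical, for illustration only)
x_toy <- rep(c(1, 2, 1), times = c(450, 50, 100))

# Hide 400 labels at random; the remaining 200 stay observed
xlabeled_toy <- x_toy
hidden <- sample(seq_along(x_toy), 400)
xlabeled_toy[hidden] <- NA

# Share of time points that still carry a label (200 / 600)
labeled_fraction <- mean(!is.na(xlabeled_toy))

# Labeled counts per state, plus the number of hidden labels
table(xlabeled_toy, useNA = "ifany")
```

Checking `labeled_fraction` before fitting is a quick way to confirm how much supervision the model actually receives.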

### Fit a pHMM with automatic initialization

``` r
fit_auto <- fit_pHMM_auto(y = y, xlabeled = xlabeled, max_nstates = 3)
fit_auto$AIC
```

    ## [1] 12711.52

### Perform stream-based active learning

``` r
y <- dat$y[1:200, ]
true_x <- dat$x[1:200]
out <- active_learning_pHMM(y = y,
                            true_x = true_x,
                            T0 = 100, 
                            B = 0.1,
                            verbose = TRUE)
```

    ## t=101 | Available labels: 10 | Explor. p-value = 0.212 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=102 | Available labels: 10 | Explor. p-value = 0.314 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=103 | Available labels: 10 | Explor. p-value = 0.696 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=104 | Available labels: 10 | Explor. p-value = 0.744 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=105 | Available labels: 10 | Explor. p-value = 0.254 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=106 | Available labels: 10 | Explor. p-value = 0.174 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=107 | Available labels: 10 | Explor. p-value = 0.064 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=108 | Available labels: 10 | Explor. p-value = 0.571 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=109 | Available labels: 10 | Explor. p-value = 0.046 | Exploit. p-value = 1.000 | True state = 1 | Decision = label_exploration

    ## t=110 | Available labels: 9 | Explor. p-value = 0.062 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=111 | Available labels: 9 | Explor. p-value = 0.221 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=112 | Available labels: 9 | Explor. p-value = 0.167 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=113 | Available labels: 9 | Explor. p-value = 0.872 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=114 | Available labels: 9 | Explor. p-value = 0.799 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=115 | Available labels: 9 | Explor. p-value = 0.489 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=116 | Available labels: 9 | Explor. p-value = 0.234 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=117 | Available labels: 9 | Explor. p-value = 0.567 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=118 | Available labels: 9 | Explor. p-value = 0.392 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=119 | Available labels: 9 | Explor. p-value = 0.648 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=120 | Available labels: 9 | Explor. p-value = 0.959 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=121 | Available labels: 9 | Explor. p-value = 0.902 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=122 | Available labels: 9 | Explor. p-value = 0.519 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=123 | Available labels: 9 | Explor. p-value = 0.524 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=124 | Available labels: 9 | Explor. p-value = 0.506 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=125 | Available labels: 9 | Explor. p-value = 0.695 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=126 | Available labels: 9 | Explor. p-value = 0.891 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=127 | Available labels: 9 | Explor. p-value = 0.928 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=128 | Available labels: 9 | Explor. p-value = 0.841 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=129 | Available labels: 9 | Explor. p-value = 0.736 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=130 | Available labels: 9 | Explor. p-value = 0.899 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=131 | Available labels: 9 | Explor. p-value = 0.828 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=132 | Available labels: 9 | Explor. p-value = 0.178 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=133 | Available labels: 9 | Explor. p-value = 0.086 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=134 | Available labels: 9 | Explor. p-value = 0.275 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=135 | Available labels: 9 | Explor. p-value = 0.483 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=136 | Available labels: 9 | Explor. p-value = 0.851 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=137 | Available labels: 9 | Explor. p-value = 0.879 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=138 | Available labels: 9 | Explor. p-value = 0.090 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=139 | Available labels: 9 | Explor. p-value = 0.008 | Exploit. p-value = 1.000 | True state = 1 | Decision = label_exploration

    ## t=140 | Available labels: 8 | Explor. p-value = 0.271 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=141 | Available labels: 8 | Explor. p-value = 0.578 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=142 | Available labels: 8 | Explor. p-value = 0.295 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=143 | Available labels: 8 | Explor. p-value = 0.400 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=144 | Available labels: 8 | Explor. p-value = 0.477 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=145 | Available labels: 8 | Explor. p-value = 0.491 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=146 | Available labels: 8 | Explor. p-value = 0.416 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=147 | Available labels: 8 | Explor. p-value = 0.783 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=148 | Available labels: 8 | Explor. p-value = 0.503 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=149 | Available labels: 8 | Explor. p-value = 0.160 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=150 | Available labels: 8 | Explor. p-value = 0.309 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=151 | Available labels: 8 | Explor. p-value = 0.163 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=152 | Available labels: 8 | Explor. p-value = 0.180 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=153 | Available labels: 8 | Explor. p-value = 0.243 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=154 | Available labels: 8 | Explor. p-value = 0.700 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=155 | Available labels: 8 | Explor. p-value = 0.960 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=156 | Available labels: 8 | Explor. p-value = 0.503 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=157 | Available labels: 8 | Explor. p-value = 0.518 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=158 | Available labels: 8 | Explor. p-value = 0.217 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=159 | Available labels: 8 | Explor. p-value = 0.934 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=160 | Available labels: 8 | Explor. p-value = 0.883 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=161 | Available labels: 8 | Explor. p-value = 0.791 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=162 | Available labels: 8 | Explor. p-value = 0.570 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=163 | Available labels: 8 | Explor. p-value = 0.237 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=164 | Available labels: 8 | Explor. p-value = 0.035 | Exploit. p-value = 1.000 | True state = 1 | Decision = label_exploration

    ## t=165 | Available labels: 7 | Explor. p-value = 0.240 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=166 | Available labels: 7 | Explor. p-value = 0.378 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=167 | Available labels: 7 | Explor. p-value = 0.374 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=168 | Available labels: 7 | Explor. p-value = 0.234 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=169 | Available labels: 7 | Explor. p-value = 0.251 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=170 | Available labels: 7 | Explor. p-value = 0.344 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=171 | Available labels: 7 | Explor. p-value = 0.789 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=172 | Available labels: 7 | Explor. p-value = 0.608 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=173 | Available labels: 7 | Explor. p-value = 0.482 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=174 | Available labels: 7 | Explor. p-value = 0.497 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=175 | Available labels: 7 | Explor. p-value = 0.706 | Exploit. p-value = 1.000 | True state = 1 | Decision = 1

    ## t=176 | Available labels: 7 | Explor. p-value = 0.093 | Exploit. p-value = 1.000 | True state = 2 | Decision = label_exploration

    ## t=177 | Available labels: 6 | Explor. p-value = 0.008 | Exploit. p-value = 0.500 | True state = 2 | Decision = label_exploration

    ## t=178 | Available labels: 5 | Explor. p-value = 0.353 | Exploit. p-value = 0.300 | True state = 2 | Decision = 2

    ## t=179 | Available labels: 5 | Explor. p-value = 0.722 | Exploit. p-value = 0.200 | True state = 2 | Decision = 2

    ## t=180 | Available labels: 5 | Explor. p-value = 0.949 | Exploit. p-value = 0.650 | True state = 2 | Decision = 2

    ## t=181 | Available labels: 5 | Explor. p-value = 0.947 | Exploit. p-value = 0.250 | True state = 1 | Decision = 2

    ## t=182 | Available labels: 5 | Explor. p-value = 0.029 | Exploit. p-value = 0.650 | True state = 1 | Decision = label_exploration

    ## t=183 | Available labels: 4 | Explor. p-value = 0.018 | Exploit. p-value = 0.100 | True state = 1 | Decision = label_exploitation

    ## t=184 | Available labels: 3 | Explor. p-value = 0.105 | Exploit. p-value = 0.455 | True state = 1 | Decision = 1

    ## t=185 | Available labels: 3 | Explor. p-value = 0.060 | Exploit. p-value = 0.381 | True state = 1 | Decision = label_exploration

    ## t=186 | Available labels: 2 | Explor. p-value = 0.401 | Exploit. p-value = 0.033 | True state = 1 | Decision = label_exploitation

    ## t=187 | Available labels: 1 | Explor. p-value = 0.861 | Exploit. p-value = 0.804 | True state = 1 | Decision = 1

    ## t=188 | Available labels: 1 | Explor. p-value = 0.618 | Exploit. p-value = 0.058 | True state = 1 | Decision = 1

    ## t=189 | Available labels: 1 | Explor. p-value = 0.513 | Exploit. p-value = 0.146 | True state = 1 | Decision = 1

    ## t=190 | Available labels: 1 | Explor. p-value = 0.154 | Exploit. p-value = 0.182 | True state = 1 | Decision = 1

    ## t=191 | Available labels: 1 | Explor. p-value = 0.602 | Exploit. p-value = 0.525 | True state = 1 | Decision = 1

    ## t=192 | Available labels: 1 | Explor. p-value = 0.441 | Exploit. p-value = 0.056 | True state = 1 | Decision = 1

    ## t=193 | Available labels: 1 | Explor. p-value = 0.582 | Exploit. p-value = 0.469 | True state = 1 | Decision = 1

    ## t=194 | Available labels: 1 | Explor. p-value = 0.502 | Exploit. p-value = 0.857 | True state = 1 | Decision = 1

    ## t=195 | Available labels: 1 | Explor. p-value = 0.803 | Exploit. p-value = 0.792 | True state = 1 | Decision = 1

    ## t=196 | Available labels: 1 | Explor. p-value = 0.371 | Exploit. p-value = 0.200 | True state = 1 | Decision = 1

    ## t=197 | Available labels: 1 | Explor. p-value = 0.703 | Exploit. p-value = 0.700 | True state = 1 | Decision = 1

    ## t=198 | Available labels: 1 | Explor. p-value = 0.676 | Exploit. p-value = 0.500 | True state = 1 | Decision = 1

    ## t=199 | Available labels: 1 | Explor. p-value = 0.611 | Exploit. p-value = 0.400 | True state = 1 | Decision = 1

    ## t=200 | Available labels: 1 | Explor. p-value = 0.182 | Exploit. p-value = 0.150 | True state = 1 | Decision = label_exploitation
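
The log can be checked against the labeling budget. The sketch below transcribes the time points at which the log reports a label request and assumes that `B` is the fraction of the monitoring phase (the observations after `T0`) that may be labeled. That reading is consistent with the initial `Available labels: 10`, but it is an inference from the output rather than documented behavior.

``` r
# Time points at which the verbose log above reports a label request
label_exploration  <- c(109, 139, 164, 176, 177, 182, 185)
label_exploitation <- c(183, 186, 200)

T0 <- 100   # length of the initial phase
n  <- 200   # total number of observations passed to active_learning_pHMM()
B  <- 0.1

# Assumption: the budget is B times the number of monitoring observations
budget <- round(B * (n - T0))

used      <- length(label_exploration) + length(label_exploitation)
remaining <- budget - used

c(budget = budget, used = used, remaining = remaining)
```

Under this assumption the run spends its whole budget, with the last label requested at `t = 200`.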

``` r
table(out$xhat, true_x)
```

    ##    true_x
    ##       1   2
    ##   1 194   0
    ##   2   1   5

``` r
out$scores
```

    ## $accuracy
    ## [1] 0.99
    ## 
    ## $precision
    ## [1] 0.8333333
    ## 
    ## $recall
    ## [1] 1
    ## 
    ## $f1
    ## [1] 0.9090909
    ## 
    ## $auc
    ## [1] 0.9947368
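
As a sanity check, the reported scores can be reproduced by hand from the confusion table. The sketch below assumes that state 2 is the positive class and that the scores are computed only on the 100 monitoring observations after `T0`, so the 100 initial observations (all in the predicted 1, true 1 cell in this run) are removed. Both assumptions are inferred from the numbers above, not taken from the package documentation.

``` r
# Confusion table from above: rows = predicted state, columns = true state
conf <- matrix(c(194, 1, 0, 5), nrow = 2,
               dimnames = list(xhat = c("1", "2"), true_x = c("1", "2")))

# Assumption: drop the T0 = 100 initial observations from the (1, 1) cell
conf["1", "1"] <- conf["1", "1"] - 100

tp <- conf["2", "2"]  # predicted 2, true 2
fp <- conf["2", "1"]  # predicted 2, true 1
fn <- conf["1", "2"]  # predicted 1, true 2
tn <- conf["1", "1"]  # predicted 1, true 1

accuracy  <- (tp + tn) / sum(conf)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

round(c(accuracy = accuracy, precision = precision,
        recall = recall, f1 = f1), 4)
```

Under these assumptions the values agree with `out$scores`: accuracy 0.99, precision 5/6, recall 1, and F1 10/11.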
