Bayesian Changepoint Detection

Introduction

Bayesian changepoint detection offers several advantages over frequentist methods:

Probabilistic uncertainty: Full posterior distributions over changepoint locations
Prior information: Incorporate domain knowledge
Natural online updates: Sequential updating as new data arrives
Existence probability: Probability that a changepoint exists at each location

BOCPD: Bayesian Online Changepoint Detection

BOCPD (Adams & MacKay, 2007) is the flagship Bayesian method:

library(RegimeChange)

# Generate example data
set.seed(123)
data <- c(rnorm(100, 0, 1), rnorm(100, 3, 1))

# Basic BOCPD
result <- detect_regimes(data, method = "bocpd")
print(result)
#> 
#> Regime Change Detection Results
#> ================================
#> 
#> Method: bocpd 
#> Change type: both 
#> Mode: offline 
#> 
#> Data: n = 200 observations
#> 
#> Changepoints detected: 0

Prior Specification

Normal-Gamma Prior

For unknown mean and variance (most common case):

# Define prior
prior <- normal_gamma(
  mu0 = 0,      # Prior mean for the mean
  kappa0 = 1,   # Prior precision for the mean
  alpha0 = 1,   # Shape for inverse-gamma on variance
  beta0 = 1     # Rate for inverse-gamma on variance
)

print(prior)
#> Regime Prior Specification
#> ==========================
#> Type: normal_gamma 
#> Description: Normal-Gamma prior for unknown mean and variance 
#> 
#> Parameters:
#>   mu0: 0
#>   kappa0: 1
#>   alpha0: 1
#>   beta0: 1

Using the prior:

result <- detect_regimes(data, method = "bocpd", prior = prior)

Normal Prior with Known Variance

When variance is known:

prior_known <- normal_known_var(
  mu0 = 0,         # Prior mean
  sigma0 = 1,      # Prior standard deviation for mean
  known_var = 1    # Known variance
)

result <- detect_regimes(data, method = "bocpd", prior = prior_known)

Poisson-Gamma Prior

For count data:

prior_poisson <- poisson_gamma(alpha0 = 1, beta0 = 1)

Hazard Prior

The hazard prior controls the expected frequency of changepoints:

Geometric Hazard

Constant hazard rate:

# Expected run length = 1/lambda = 100
hazard <- geometric_hazard(lambda = 0.01)
print(hazard)
#> Hazard Prior Specification
#> ==========================
#> Type: geometric 
#> Description: Geometric hazard (expected run length: 100.0) 
#> 
#> Parameters:
#>   lambda: 0.01
#>   Expected run length: 100.0

Constant Hazard

Fixed probability per time step:

hazard_const <- constant_hazard(lambda = 0.05)

Negative Binomial Hazard

For overdispersed run lengths:

hazard_nb <- negbin_hazard(r = 5, p = 0.1)

Posterior Analysis

Probability of Change at Each Point

result <- detect_regimes(data, method = "bocpd")

# Access changepoint probabilities
prob_change <- result$prob_change

# Plot posterior probability
plot(prob_change, type = "l", 
     main = "Posterior Probability of Changepoint",
     xlab = "Time", ylab = "P(changepoint)")
abline(v = 100, col = "red", lty = 2)

Posterior Visualization

plot(result, type = "posterior")

Run Length Distribution

plot(result, type = "runlength")

Shiryaev-Roberts Procedure

An alternative Bayesian approach optimal for minimizing detection delay:

result_sr <- detect_regimes(data, method = "shiryaev",
                            mu0 = 0,      # Pre-change mean
                            mu1 = 3,      # Post-change mean
                            sigma = 1,    # Known standard deviation
                            threshold = 100)
print(result_sr)
#> 
#> Regime Change Detection Results
#> ================================
#> 
#> Method: shiryaev 
#> Change type: both 
#> Mode: offline 
#> 
#> Data: n = 200 observations
#> 
#> Changepoints detected: 5 
#> Locations: 107, 112, 134, 142, 200 
#> 
#> Segments:
#>   Segment 1: [1, 107] (n=107) | mean=0.254, sd=1.085
#>   Segment 2: [108, 112] (n=5) | mean=2.781, sd=1.029
#>   Segment 3: [113, 134] (n=22) | mean=2.870, sd=0.796
#>   Segment 4: [135, 142] (n=8) | mean=2.908, sd=1.434
#>   Segment 5: [143, 200] (n=58) | mean=2.944, sd=1.014
#>   Segment 6: [201, 200] (n=0) | mean=NA, sd=NA

The Shiryaev-Roberts statistic:

plot(result_sr, type = "diagnostic")
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_hline()`).

Uncertainty Quantification

Credible Intervals

BOCPD provides natural credible intervals from the posterior:

# Highest posterior density interval
if (length(result$changepoints) > 0) {
  cat("Changepoint estimate:", result$changepoints[1], "\n")
  
  # Find 95% credible interval
  prob <- result$prob_change
  cumprob <- cumsum(prob) / sum(prob)
  lower <- which(cumprob >= 0.025)[1]
  upper <- which(cumprob >= 0.975)[1]
  cat("95% Credible interval: [", lower, ",", upper, "]\n")
}

Existence Probability

Probability that at least one changepoint exists:

existence_prob <- max(result$prob_change)
cat("Maximum changepoint probability:", round(existence_prob, 3), "\n")
#> Maximum changepoint probability: 0.02

Online Mode

BOCPD is naturally suited for online detection:

# Create online detector
detector <- regime_detector(
  method = "bocpd",
  prior = normal_gamma(),
  hazard = geometric_hazard(0.01),
  threshold = 0.5
)

# Simulate online processing
set.seed(123)
stream <- c(rnorm(100, 0, 1), rnorm(50, 3, 1))

alarm_time <- NULL
for (i in seq_along(stream)) {
  detector <- update(detector, stream[i])
  
  if (isTRUE(detector$last_result$alarm) && is.null(alarm_time)) {
    alarm_time <- i
    cat("Alarm at observation:", i, "\n")
    cat("Probability of change:", round(detector$last_result$prob_change, 3), "\n")
    break
  }
}

Multivariate BOCPD

For multivariate time series:

# Generate bivariate data
set.seed(42)
n <- 200
cp <- 100
data_mv <- rbind(
  matrix(rnorm(cp * 2, 0), ncol = 2),
  matrix(rnorm((n - cp) * 2, 2), ncol = 2)
)

# Normal-Wishart prior for multivariate data
prior_mv <- normal_wishart(
  mu0 = c(0, 0),
  kappa0 = 1,
  nu0 = 4,
  Psi0 = diag(2)
)

result_mv <- detect_regimes(data_mv, method = "bocpd", prior = prior_mv)
print(result_mv)
#> 
#> Regime Change Detection Results
#> ================================
#> 
#> Method: bocpd 
#> Change type: both 
#> Mode: offline 
#> 
#> Data: n = 200 observations
#> 
#> Changepoints detected: 0

Comparison: BOCPD vs Frequentist

# Compare BOCPD with PELT
comparison <- compare_methods(
  data = data,
  methods = c("bocpd", "pelt"),
  true_changepoints = 100
)
print(comparison)
#> 
#> Changepoint Detection Method Comparison
#> ========================================
#> 
#>  method n_changepoints   time_sec f1 hausdorff adj_rand
#>   bocpd              0 0.10733008  0       Inf        0
#>    pelt              1 0.02529216  1         0        1
#> 
#> True changepoints: 100 
#> 
#> Detected changepoints by method:
#>   bocpd: (none)
#>   pelt: 100

Advantages and Limitations

Advantages

Full posterior uncertainty quantification
Natural sequential updates
Principled prior incorporation
Works well with limited data

Limitations

Requires prior specification
Computationally more intensive
Sensitive to prior misspecification

Best Practices

Choose appropriate priors: Use domain knowledge when available
Calibrate hazard rate: Set based on expected changepoint frequency
Check sensitivity: Try different priors to assess robustness
Use truncation: For long series, truncate run length for efficiency
Visualize posterior: Always examine the full posterior distribution

References

Adams, R. P., & MacKay, D. J. C. (2007). Bayesian online changepoint detection. arXiv preprint arXiv:0710.3742.

Tartakovsky, A. G., & Moustakides, G. V. (2010). State-of-the-art in Bayesian changepoint detection. Sequential Analysis, 29(2), 125-145.