G-Computation or standardization for the Cox, Fine-Gray and binomial regression models for survival data

Klaus Holst & Thomas Scheike

2024-02-16

G-computation for the Cox and Fine-Gray models

Computing the standardized estimate (G-estimation) based on the Cox or Fine-Gray model : \[ \hat S(t,A=a) = n^{-1} \sum_i S(t,A=a,X_i) \] and this estimator has influence function \[ S(t,A=a,X_i) - S(t,A=a) + E( D_{A_0(t), \beta} S(t,A=a,X_i) ) \epsilon_i(t) \] where \(\epsilon_i(t)\) is the iid decomposition of \((\hat A(t) - A(t), \hat \beta- \beta)\).

These estimates have a causal interpration under the assumption of no-unmeasured confounders, and even without the causal assumptions this standardization can still be a useful summary measure.

set.seed(100)

data(bmt); bmt$time <- bmt$time+runif(nrow(bmt))*0.001
dfactor(bmt) <- tcell~tcell
bmt$event <- (bmt$cause!=0)*1

fg1 <- cifreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,
          cox.prep=TRUE,propodds=NULL)
summary(survivalG(fg1,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.4331 0.02749 0.3793 0.4870 6.321e-56
#> risk1   0.2727 0.05863 0.1577 0.3876 3.313e-06
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err   2.5%    97.5% P-value
#> ps0  -0.1605 0.06353 -0.285 -0.03597 0.01153
#> 
#> Average Treatment effect ratio (G-estimator) :
#>        Estimate  Std.Err      2.5%     97.5%    P-value
#> [ps0] 0.6295004 0.139248 0.3565794 0.9024214 0.00779742

fg2 <- cifreg(Event(time,cause)~tcell+platelet+age,bmt,cause=2,
          cox.prep=TRUE,propodds=NULL)
summary(survivalG(fg2,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.2127 0.02314 0.1674 0.2581 3.757e-20
#> risk1   0.3336 0.06799 0.2003 0.4668 9.281e-07
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err     2.5%  97.5% P-value
#> ps0   0.1208 0.07189 -0.02009 0.2617 0.09285
#> 
#> Average Treatment effect ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%    97.5%   P-value
#> [ps0] 1.567915 0.3627528 0.8569321 2.278897 0.1174496

ss <- phreg(Surv(time,event)~tcell+platelet+age,bmt)
summary(survivalG(ss,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%    P-value
#> risk0   0.6539 0.02709 0.6008 0.7070 9.218e-129
#> risk1   0.5640 0.05971 0.4470 0.6811  3.531e-21
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err    2.5%   97.5% P-value
#> ps0 -0.08992  0.0629 -0.2132 0.03337  0.1529
#> 
#> Average Treatment effect ratio (G-estimator) :
#>        Estimate    Std.Err      2.5%    97.5%   P-value
#> [ps0] 0.8624974 0.09446477 0.6773499 1.047645 0.1455042

G-computation for the binomial regression

We compare with the similar estimates using the Doubly Robust estimating equations using binregATE. The standardization from the G-computation can also be computed using a specialized function that takes less memory and is quicker (for large data).


## survival situation
sr1 <- binregATE(Event(time,event)~tcell+platelet+age,bmt,cause=1,
         time=40, treat.model=tcell~platelet+age)
summary(sr1)
#> 
#>    n events
#>  408    241
#> 
#>  408 clusters
#> coeffients:
#>              Estimate   Std.Err      2.5%     97.5% P-value
#> (Intercept)  0.679693  0.138551  0.408138  0.951248  0.0000
#> tcell1      -0.032018  0.353415 -0.724698  0.660662  0.9278
#> platelet    -0.504940  0.248245 -0.991492 -0.018387  0.0419
#> age          0.315033  0.117786  0.084178  0.545889  0.0075
#> 
#> exp(coeffients):
#>             Estimate    2.5%  97.5%
#> (Intercept)  1.97327 1.50401 2.5889
#> tcell1       0.96849 0.48447 1.9361
#> platelet     0.60354 0.37102 0.9818
#> age          1.37030 1.08782 1.7261
#> 
#> Average Treatment effects (G-formula) :
#>             Estimate    Std.Err       2.5%      97.5% P-value
#> treat0     0.6233534  0.0274214  0.5696085  0.6770983   0.000
#> treat1     0.6161006  0.0748225  0.4694512  0.7627499   0.000
#> treat:1-0 -0.0072528  0.0802736 -0.1645862  0.1500805   0.928
#> 
#> Average Treatment effects (double robust) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.623341  0.027505  0.569433  0.677249   0.000
#> treat1     0.645159  0.085872  0.476853  0.813465   0.000
#> treat:1-0  0.021818  0.090254 -0.155076  0.198711   0.809

## relative risk effect 
estimate(coef=sr1$riskDR,vcov=sr1$var.riskDR,f=function(p) p[2]/p[1],null=1)
#>          Estimate Std.Err   2.5% 97.5% P-value
#> [treat1]    1.035  0.1453 0.7503  1.32  0.8096
#> 
#>  Null Hypothesis: 
#>   [treat1] = 1

## competing risks 
br1 <- binregATE(Event(time,cause)~tcell+platelet+age,bmt,cause=1,
         time=40,treat.model=tcell~platelet+age)
summary(br1)
#> 
#>    n events
#>  408    157
#> 
#>  408 clusters
#> coeffients:
#>              Estimate   Std.Err      2.5%     97.5% P-value
#> (Intercept) -0.188365  0.130922 -0.444967  0.068237  0.1502
#> tcell1      -0.715361  0.352473 -1.406195 -0.024527  0.0424
#> platelet    -0.537310  0.244804 -1.017117 -0.057502  0.0282
#> age          0.417814  0.107282  0.207545  0.628084  0.0001
#> 
#> exp(coeffients):
#>             Estimate    2.5%  97.5%
#> (Intercept)  0.82831 0.64085 1.0706
#> tcell1       0.48902 0.24507 0.9758
#> platelet     0.58432 0.36164 0.9441
#> age          1.51864 1.23065 1.8740
#> 
#> Average Treatment effects (G-formula) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.417795  0.027029  0.364819  0.470771  0.0000
#> treat1     0.266393  0.062041  0.144795  0.387991  0.0000
#> treat:1-0 -0.151402  0.067763 -0.284214 -0.018589  0.0255
#> 
#> Average Treatment effects (double robust) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.417337  0.027120  0.364184  0.470491  0.0000
#> treat1     0.231224  0.060718  0.112218  0.350229  0.0001
#> treat:1-0 -0.186114  0.066117 -0.315700 -0.056527  0.0049

and using the specialized function

br1 <- binreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,time=40)
Gbr1 <- binregG(br1,data=bmt)
summary(Gbr1)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.4178 0.02703 0.3648 0.4708 6.752e-54
#> risk1   0.2664 0.06204 0.1448 0.3880 1.756e-05
#> 
#> Average Treatment effects (G-estimator) :
#>    Estimate Std.Err    2.5%    97.5% P-value
#> p1  -0.1514 0.06776 -0.2842 -0.01859 0.02546
#> 
#> Average Treatment effect ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%     97.5%    P-value
#> [p1] 0.6376167 0.1542628 0.3352673 0.9399661 0.01881733

## contrasting average age to +2-sd age, Avalues
Gbr2 <- binregG(br1,data=bmt,varname="age",Avalues=c(0,2))
summary(Gbr2)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.3935 0.02529 0.3439 0.4431 1.432e-54
#> risk2   0.5929 0.05580 0.4836 0.7023 2.261e-26
#> 
#> Average Treatment effects (G-estimator) :
#>    Estimate Std.Err   2.5%  97.5%   P-value
#> p1   0.1994 0.05019 0.1011 0.2978 7.069e-05
#> 
#> Average Treatment effect ratio (G-estimator) :
#>      Estimate  Std.Err     2.5%    97.5%     P-value
#> [p1] 1.506855 0.132196 1.247756 1.765954 0.000126016

SessionInfo

sessionInfo()
#> R version 4.3.2 (2023-10-31)
#> Platform: aarch64-apple-darwin22.6.0 (64-bit)
#> Running under: macOS Sonoma 14.3.1
#> 
#> Matrix products: default
#> BLAS:   /Users/kkzh/.asdf/installs/R/4.3.2/lib/R/lib/libRblas.dylib 
#> LAPACK: /Users/kkzh/.asdf/installs/R/4.3.2/lib/R/lib/libRlapack.dylib;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Europe/Copenhagen
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] splines   stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#> [1] ggplot2_3.4.4  cowplot_1.1.1  mets_1.3.4     timereg_2.0.5  survival_3.5-7
#> 
#> loaded via a namespace (and not attached):
#>  [1] sass_0.4.7          utf8_1.2.4          future_1.33.1      
#>  [4] generics_0.1.3      lattice_0.22-5      listenv_0.9.1      
#>  [7] digest_0.6.34       magrittr_2.0.3      evaluate_0.23      
#> [10] grid_4.3.2          mvtnorm_1.2-4       fastmap_1.1.1      
#> [13] jsonlite_1.8.8      Matrix_1.6-5        fansi_1.0.6        
#> [16] scales_1.2.1        isoband_0.2.7       codetools_0.2-19   
#> [19] numDeriv_2016.8-1.1 jquerylib_0.1.4     lava_1.7.4         
#> [22] cli_3.6.2           rlang_1.1.3         parallelly_1.37.0  
#> [25] future.apply_1.11.1 munsell_0.5.0       withr_3.0.0        
#> [28] cachem_1.0.8        yaml_2.3.7          tools_4.3.2        
#> [31] parallel_4.3.2      ucminf_1.2.0        dplyr_1.1.3        
#> [34] colorspace_2.1-0    globals_0.16.2      vctrs_0.6.5        
#> [37] R6_2.5.1            lifecycle_1.0.4     MASS_7.3-60        
#> [40] pkgconfig_2.0.3     bslib_0.5.1         pillar_1.9.0       
#> [43] gtable_0.3.4        glue_1.7.0          Rcpp_1.0.12        
#> [46] tidyselect_1.2.0    xfun_0.41           tibble_3.2.1       
#> [49] highr_0.10          knitr_1.45          farver_2.1.1       
#> [52] htmltools_0.5.6.1   labeling_0.4.3      rmarkdown_2.25     
#> [55] compiler_4.3.2