Quick Start Guide

Emre Gönülateş

2024-02-20

library(irt)

Basic Objects

The irt package contains many useful functions commonly used in psychometrics.

Item parameters are stored in three main object types: Item, Testlet and Itempool.

Item

To create an Item object, the psychometric model and the item parameter values are sufficient. An item_id is required if the Item will be used within an Itempool or Testlet.

3PL Model

A three-parameter logistic (3PL) model item requires the a, b and c parameters to be specified:

item1 <- item(a = 1.2, b = -.8, c = .33, model = "3PL")
item1
#> A '3PL' item.
#> Model:   3PL (Three-Parameter Logistic Model)
#> Model Parameters:
#>   a = 1.2
#>   b = -0.8
#>   c = 0.33
#>   D = 1
#> 
#> --------------------------

a is the item discrimination, b is the item difficulty and c is the pseudo-guessing parameter.

By default, the scaling constant D is set to 1, but it can be overridden:

item1 <- item(a = 1.2, b = -.8, c = .33, D = 1.7, model = "3PL")
item1
#> A '3PL' item.
#> Model:   3PL (Three-Parameter Logistic Model)
#> Model Parameters:
#>   a = 1.2
#>   b = -0.8
#>   c = 0.33
#>   D = 1.7
#> 
#> --------------------------
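To make the roles of a, b, c and D concrete, here is a minimal base-R sketch of the standard 3PL response function, P(theta) = c + (1 - c) / (1 + exp(-D * a * (theta - b))). The helper name p_3pl is hypothetical and not part of the irt package; the sketch assumes the package follows this textbook form.

```r
# Hypothetical helper (not part of the irt package): standard 3PL
# probability of a correct response at ability `theta`.
p_3pl <- function(theta, a, b, c, D = 1) {
  c + (1 - c) / (1 + exp(-D * a * (theta - b)))
}

# At theta = b the logistic term equals 0.5, so P = c + (1 - c) / 2.
p_3pl(theta = -0.8, a = 1.2, b = -0.8, c = 0.33, D = 1.7)
#> [1] 0.665

# A larger D steepens the curve around b without moving the asymptotes.
p_3pl(theta = 0, a = 1.2, b = -0.8, c = 0.33, D = 1)
p_3pl(theta = 0, a = 1.2, b = -0.8, c = 0.33, D = 1.7)
```

Under this form, c is the lower asymptote of the curve and b is the ability at which the probability is halfway between c and 1.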

The item_id and content fields can be specified as well:

item1 <- item(a = 1.2, b = -.8, c = .33, D = 1.7, model = "3PL", 
              item_id = "ITM384", content = "Quadratic Equations")
item1
#> A '3PL' item.
#> Item ID:      ITM384
#> Model:   3PL (Three-Parameter Logistic Model)
#> Content: Quadratic Equations
#> Model Parameters:
#>   a = 1.2
#>   b = -0.8
#>   c = 0.33
#>   D = 1.7
#> 
#> --------------------------

Additional fields can be added through the misc field:

item1 <- item(a = 1.2, b = -.8, c = .33, D = 1.7, model = "3PL", 
              item_id = "ITM384", content = "Quadratic Equations", 
              misc = list(key = "A", 
                          enemies = c("ITM664", "ITM964"), 
                          seed_year = 2020, 
                          target_grade = "11")
              )
item1
#> A '3PL' item.
#> Item ID:      ITM384
#> Model:   3PL (Three-Parameter Logistic Model)
#> Content: Quadratic Equations
#> Model Parameters:
#>   a = 1.2
#>   b = -0.8
#>   c = 0.33
#>   D = 1.7
#> 
#> Misc: 
#>   key: "A"
#>   enemies: "ITM664", "ITM964"
#>   seed_year: 2020
#>   target_grade: "11"
#> --------------------------

An item characteristic curve can be plotted using the plot function:

plot(item1)

Rasch Model

A Rasch model item requires the b parameter to be specified:

item2 <- item(b = -.8, model = "Rasch")
item2
#> A 'Rasch' item.
#> Model:   Rasch (Rasch Model)
#> Model Parameters:
#>   b = -0.8
#> 
#> --------------------------

For the Rasch model, the D parameter cannot be specified.

1PL Model

A one-parameter model item requires the b parameter to be specified:

item3 <- item(b = -.8, D = 1.7, model = "1PL")
item3
#> A '1PL' item.
#> Model:   1PL (One-Parameter Logistic Model)
#> Model Parameters:
#>   b = -0.8
#>   D = 1.7
#> 
#> --------------------------

2PL Model

A two-parameter model item requires the a and b parameters to be specified:

item4 <- item(a = 1.2, b = -.8, D = 1.702, model = "2PL")
item4
#> A '2PL' item.
#> Model:   2PL (Two-Parameter Logistic Model)
#> Model Parameters:
#>   a = 1.2
#>   b = -0.8
#>   D = 1.702
#> 
#> --------------------------

4PL Model

A four-parameter model item requires the a, b, c and d parameters to be specified:

item5 <- item(a = 1.06, b = 1.76, c = .13, d = .98, model = "4PL", 
              item_id = "itm-5")
item5
#> A '4PL' item.
#> Item ID:      itm-5
#> Model:   4PL (Four-Parameter Logistic Model)
#> Model Parameters:
#>   a = 1.06
#>   b = 1.76
#>   c = 0.13
#>   d = 0.98
#>   D = 1
#> 
#> --------------------------

d is the upper-asymptote parameter.
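For reference, the textbook four-parameter logistic response function can be written as below. It is assumed, not confirmed by this guide, that the package uses this form. Under it, the 3PL model is the special case d = 1, the 2PL model additionally sets c = 0, the 1PL model additionally fixes a = 1, and the Rasch model is the 1PL model without a scaling constant (equivalently D = 1).

```latex
P(X = 1 \mid \theta) = c + (d - c)\,\frac{e^{D a (\theta - b)}}{1 + e^{D a (\theta - b)}}
```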

Graded Response Model (GRM)

A Graded Response Model item requires the a and b parameters to be specified. The b parameter is an ascending vector of threshold parameters:

item6 <- item(a = 1.22, b = c(-1.9, -0.37, 0.82, 1.68), model = "GRM", 
              item_id = "itm-6")
item6
#> A 'GRM' item.
#> Item ID:      itm-6
#> Model:   GRM (Graded Response Model)
#> Model Parameters:
#>   a = 1.22
#>   b = -1.9;  -0.37;  0.82;  1.68
#>   D = 1
#> 
#> --------------------------
plot(item6)

The D parameter can also be specified.
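As a rough illustration of how the threshold vector b maps to category probabilities, the sketch below implements the standard graded response model in base R. Each threshold defines a cumulative probability P(X >= k), and category probabilities are differences of adjacent cumulative probabilities. The helper p_grm is hypothetical (it is not an irt function), and it is an assumption that the package uses this standard parameterization.

```r
# Hypothetical helper (not part of the irt package): category probabilities
# under the standard graded response model.
p_grm <- function(theta, a, b, D = 1) {
  # Cumulative probabilities P(X >= k) for k = 1, ..., length(b)
  p_star <- 1 / (1 + exp(-D * a * (theta - b)))
  # P(X = k) = P(X >= k) - P(X >= k + 1), with P(X >= 0) = 1 and
  # P(X >= K + 1) = 0
  p <- c(1, p_star) - c(p_star, 0)
  names(p) <- seq_along(p) - 1
  p
}

p <- p_grm(theta = 0, a = 1.22, b = c(-1.9, -0.37, 0.82, 1.68))
p        # five category probabilities (scores 0 to 4)
sum(p)   # the probabilities sum to 1
```

Four thresholds produce five score categories. The ascending order of b is what keeps every category probability positive in this formulation.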

Generalized Partial Credit Model (GPCM)

A Generalized Partial Credit Model item requires the a and b parameters to be specified. The b parameter is an ascending vector of threshold parameters:

item7 <- item(a = 1.22, b = c(-1.9, -0.37, 0.82, 1.68), D = 1.7, model = "GPCM", 
              item_id = "itm-7")
item7
#> A 'GPCM' item.
#> Item ID:      itm-7
#> Model:   GPCM (Generalized Partial Credit Model)
#> Model Parameters:
#>   a = 1.22
#>   b = -1.9;  -0.37;  0.82;  1.68
#>   D = 1.7
#> 
#> --------------------------
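The following base-R sketch shows how category probabilities are usually derived under the generalized partial credit model: the numerator for category k is the exponential of the cumulative sum of D * a * (theta - b_v) over the first k thresholds, normalized over all categories. The helper p_gpcm is hypothetical (not an irt function), and the parameterization is assumed to be the standard one rather than taken from the package source.

```r
# Hypothetical helper (not part of the irt package): category probabilities
# under the standard generalized partial credit model.
p_gpcm <- function(theta, a, b, D = 1) {
  # Exponent for category k is sum_{v <= k} D * a * (theta - b_v);
  # the exponent for category 0 is defined as 0.
  z <- c(0, cumsum(D * a * (theta - b)))
  # Subtract the maximum before exponentiating for numerical stability
  num <- exp(z - max(z))
  p <- num / sum(num)
  names(p) <- seq_along(p) - 1
  p
}

p <- p_gpcm(theta = 0, a = 1.22, b = c(-1.9, -0.37, 0.82, 1.68), D = 1.7)
p        # five category probabilities (scores 0 to 4)
sum(p)   # the probabilities sum to 1
```

In this form, each threshold b_k is the ability at which categories k - 1 and k are equally likely.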

Partial Credit Model (PCM)

A Partial Credit Model item requires the b parameter to be specified. The b parameter is an ascending vector of threshold parameters:

item8 <- item(b = c(-1.9, -0.37, 0.82, 1.68), model = "PCM")
item8
#> A 'PCM' item.
#> Model:   PCM (Partial Credit Model)
#> Model Parameters:
#>   b = -1.9;  -0.37;  0.82;  1.68
#> 
#> --------------------------

Generating Random Item Parameters

An item with random item parameters can be generated using the generate_item function:

generate_item("3PL")
#> A '3PL' item.
#> Model:   3PL (Three-Parameter Logistic Model)
#> Model Parameters:
#>   a = 0.4768
#>   b = 0.5809
#>   c = 0.0192
#>   D = 1
#> 
#> Misc: 
#>   key: "D"
#>   possible_options: "A", "B", "C", "D"
#> --------------------------
generate_item("2PL")
#> A '2PL' item.
#> Model:   2PL (Two-Parameter Logistic Model)
#> Model Parameters:
#>   a = 0.9363
#>   b = -2.2367
#>   D = 1
#> 
#> Misc: 
#>   key: "D"
#>   possible_options: "A", "B", "C", "D"
#> --------------------------
generate_item("Rasch")
#> A 'Rasch' item.
#> Model:   Rasch (Rasch Model)
#> Model Parameters:
#>   b = 1.0838
#> 
#> Misc: 
#>   key: "C"
#>   possible_options: "A", "B", "C", "D"
#> --------------------------
generate_item("GRM")
#> A 'GRM' item.
#> Model:   GRM (Graded Response Model)
#> Model Parameters:
#>   a = 1.5541
#>   b = -2.1701;  -0.5095;  0.3358
#>   D = 1
#> 
#> --------------------------
# The number of categories of polytomous items can be specified:
generate_item("GPCM", n_categories = 5)
#> A 'GPCM' item.
#> Model:   GPCM (Generalized Partial Credit Model)
#> Model Parameters:
#>   a = 1.1223
#>   b = -0.2897;  0.5058;  0.8368;  1.5294
#>   D = 1
#> 
#> --------------------------

Testlet

A testlet is simply a collection of Item objects:

item1 <- item(a = 1.2, b = -.8, c = .33, D = 1.7, model = "3PL", 
              item_id = "ITM384", content = "Quadratic Equations")
item2 <- item(a = 0.75, b = 1.8, c = .21, D = 1.7, model = "3PL", 
              item_id = "ITM722", content = "Quadratic Equations")
item3 <- item(a = 1.06, b = 1.76, c = .13, d = .98, model = "4PL", 
              item_id = "itm-5")
t1 <- testlet(c(item1, item2, item3))
t1
#> An object of class 'Testlet'.
#> Model:   BTM
#> 
#> Item List:
#> 
#>   item_id model     a     b     c     d     D content            
#>   <chr>   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>              
#> 1 ITM384  3PL    1.2  -0.8   0.33 NA      1.7 Quadratic Equations
#> 2 ITM722  3PL    0.75  1.8   0.21 NA      1.7 Quadratic Equations
#> 3 itm-5   4PL    1.06  1.76  0.13  0.98   1   <NA>

A testlet_id field is required if the testlet will be used in an item pool:

t1 <- testlet(item1, item2, item3, testlet_id = "T1")
t1
#> An object of class 'Testlet'.
#> Testlet ID:      T1
#> Model:   BTM
#> 
#> Item List:
#> 
#>   item_id model     a     b     c     d     D content            
#>   <chr>   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>              
#> 1 ITM384  3PL    1.2  -0.8   0.33 NA      1.7 Quadratic Equations
#> 2 ITM722  3PL    0.75  1.8   0.21 NA      1.7 Quadratic Equations
#> 3 itm-5   4PL    1.06  1.76  0.13  0.98   1   <NA>

Itempool

An Itempool object is the most frequently used object type in the irt package. It is a collection of Item and Testlet objects.

item1 <- generate_item("3PL", item_id = "I1") 
item2 <- generate_item("3PL", item_id = "I2") 
item3 <- generate_item("3PL", item_id = "I3") 
ip1 <- itempool(item1, item2, item3)

Item pools can be composed of items from different psychometric models and testlets:

item4 <- generate_item("GRM", item_id = "I4") 
item5 <- generate_item("3PL", item_id = "T1-I1") 
item6 <- generate_item("3PL", item_id = "T1-I2") 
t1 <- testlet(item5, item6, item_id = "T1")
ip2 <- itempool(item1, item2, item3, item4, t1)

Most of the time, item pools are created from data frames:

n_item <- 6 # Number of items
ipdf <- data.frame(a = rlnorm(n_item), b = rnorm(n_item), 
                   c = runif(n_item, 0, .3))
ip3 <- itempool(ipdf)
ip3
#> An object of class 'Itempool'.
#> Model of items: 3PL
#> D = 1
#> 
#>   item_id      a      b      c
#>   <chr>    <dbl>  <dbl>  <dbl>
#> 1 Item_1   0.458 -1.15  0.289 
#> 2 Item_2   0.742  0.193 0.0461
#> 3 Item_3   0.825 -0.737 0.0190
#> 4 Item_4   1.20  -0.692 0.0372
#> 5 Item_5   1.14  -0.333 0.231 
#> 6 Item_6  16.3    1.06  0.187

# Scaling constant can be specified
ip4 <- itempool(ipdf, D = 1.7)
ip4
#> An object of class 'Itempool'.
#> Model of items: 3PL
#> D = 1.7
#> 
#>   item_id      a      b      c
#>   <chr>    <dbl>  <dbl>  <dbl>
#> 1 Item_1   0.458 -1.15  0.289 
#> 2 Item_2   0.742  0.193 0.0461
#> 3 Item_3   0.825 -0.737 0.0190
#> 4 Item_4   1.20  -0.692 0.0372
#> 5 Item_5   1.14  -0.333 0.231 
#> 6 Item_6  16.3    1.06  0.187
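
A data frame can also describe a pool that mixes models, using a model column together with model-specific parameter columns:

ipdf <- data.frame(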
  item_id = c("Item_1", "Item_2", "Item_3", "Item_4", "Item_5", "Item_6"), 
  model = c("3PL", "3PL", "3PL", "GPCM", "GPCM", "GPCM"), 
  a = c(1.0253, 1.3609, 1.6617, 1.096, 0.9654, 1.3995), 
  b1 = c(NA, NA, NA, -1.112, -0.1709, -1.1324), 
  b2 = c(NA, NA, NA, -0.4972, 0.2778, -0.5242), 
  b3 = c(NA, NA, NA, -0.0077, 0.9684, NA), 
  D = c(1.7, 1.7, 1.7, 1.7, 1.7, 1.7), 
  b = c(0.7183, -0.4107, -1.5452, NA, NA, NA), 
  c = c(0.0871, 0.0751, 0.0589, NA, NA, NA), 
  content = c("Geometry", "Algebra", "Algebra", "Geometry", "Algebra", 
              "Algebra") 
)

ip5 <- itempool(ipdf)

Itempool objects can also be converted to a data frame:

as.data.frame(ip2)
#>   item_id testlet_id model      a       b      c      b1     b2     b3 D  key
#> 1      I1       <NA>   3PL 0.9027 -0.0463 0.1484      NA     NA     NA 1    C
#> 2      I2       <NA>   3PL 1.2123  0.7124 0.2154      NA     NA     NA 1    D
#> 3      I3       <NA>   3PL 1.0751  1.1153 0.2230      NA     NA     NA 1    D
#> 4      I4       <NA>   GRM 1.1190      NA     NA -1.5937 0.3181 1.3589 1 <NA>
#> 5   T1-I1  Testlet_1   3PL 1.2368 -0.5051 0.1002      NA     NA     NA 1    D
#> 6   T1-I2  Testlet_1   3PL 0.8357  0.8960 0.0016      NA     NA     NA 1    C
#>   possible_options
#> 1       A, B, C, D
#> 2       A, B, C, D
#> 3       A, B, C, D
#> 4               NA
#> 5       A, B, C, D
#> 6       A, B, C, D

Basic IRT Functions

Probability

The probability of a correct response (for dichotomous items) and the probability of each category (for polytomous items) can be calculated using the prob function:

item1 <- generate_item("3PL")
theta <- 0.84
# The probability of correct and incorrect response for `item1` at theta = 0.84
prob(item1, theta)
#>              0         1
#> [1,] 0.3103985 0.6896015

# Multiple theta values
prob(item1, theta = c(-1, 1))
#>              0         1
#> [1,] 0.7168617 0.2831383
#> [2,] 0.2783276 0.7216724

# Polytomous items:
item2 <- generate_item(model = "GPCM")
prob(item2, theta = 1)
#>              0         1         2         3
#> [1,] 0.0181105 0.1707979 0.5781817 0.2329099
prob(item2, theta = c(-1, 0, 1))
#>              0         1          2           3
#> [1,] 0.6553320 0.2958140 0.04792984 0.000924134
#> [2,] 0.2115960 0.4365783 0.32333048 0.028495280
#> [3,] 0.0181105 0.1707979 0.57818171 0.232909901

The probability of a correct response (or of each category) for every item in an item pool can be calculated as:

ip <- generate_ip(model = "3PL", n = 7)
ip
#> An object of class 'Itempool'.
#> Model of items: 3PL
#> D = 1
#> possible_options = c("A", "B", "C", "D")
#> 
#>   item_id     a      b      c key  
#>   <chr>   <dbl>  <dbl>  <dbl> <chr>
#> 1 Item_1  0.716  0.217 0.253  D    
#> 2 Item_2  0.874  0.861 0.214  A    
#> 3 Item_3  1.86   0.974 0.231  C    
#> 4 Item_4  0.840 -0.258 0.160  A    
#> 5 Item_5  0.911  1.98  0.266  C    
#> 6 Item_6  1.46   0.296 0.177  A    
#> 7 Item_7  0.634  0.870 0.0836 B
prob(ip, theta = 0)
#>                0         1
#> Item_1 0.4024527 0.5975473
#> Item_2 0.5344001 0.4655999
#> Item_3 0.6608946 0.3391054
#> Item_4 0.3744612 0.6255388
#> Item_5 0.6299144 0.3700856
#> Item_6 0.4988078 0.5011922
#> Item_7 0.5815182 0.4184818
# When there are multiple theta values, a list is returned in which each
# element corresponds to one theta value.
prob(ip, theta = c(-2, 0, 1))
#> [[1]]
#>                0         1
#> Item_1 0.6201176 0.3798824
#> Item_2 0.7266214 0.2733786
#> Item_3 0.7662261 0.2337739
#> Item_4 0.6817440 0.3182560
#> Item_5 0.7147542 0.2852458
#> Item_6 0.7946347 0.2053653
#> Item_7 0.7886959 0.2113041
#> 
#> [[2]]
#>                0         1
#> Item_1 0.4024527 0.5975473
#> Item_2 0.5344001 0.4655999
#> Item_3 0.6608946 0.3391054
#> Item_4 0.3744612 0.6255388
#> Item_5 0.6299144 0.3700856
#> Item_6 0.4988078 0.5011922
#> Item_7 0.5815182 0.4184818
#> 
#> [[3]]
#>                0         1
#> Item_1 0.2714881 0.7285119
#> Item_2 0.3691917 0.6308083
#> Item_3 0.3754810 0.6245190
#> Item_4 0.2165827 0.7834173
#> Item_5 0.5203189 0.4796811
#> Item_6 0.2171664 0.7828336
#> Item_7 0.4393019 0.5606981

Item characteristic curves (ICC) can be plotted:

# Plot ICC of each item in the item pool
plot(ip)

# Plot test characteristic curve
plot(ip, type = "tcc")
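A test characteristic curve is conventionally the expected raw score at each ability level, that is, the sum over items of the probability of a correct response. The base-R sketch below illustrates this with the rounded 3PL parameters printed for the item pool above, so its results agree with the prob output only to about three decimals. The helper tcc_3pl is hypothetical and not part of the irt package.

```r
# Rounded 3PL parameters copied from the item pool printed above (D = 1)
pars <- data.frame(
  a = c(0.716, 0.874, 1.86, 0.840, 0.911, 1.46, 0.634),
  b = c(0.217, 0.861, 0.974, -0.258, 1.98, 0.296, 0.870),
  c = c(0.253, 0.214, 0.231, 0.160, 0.266, 0.177, 0.0836)
)

# Hypothetical helper (not part of the irt package): expected raw score
# at each theta, i.e. the sum of the 3PL probabilities over items.
tcc_3pl <- function(theta, pars, D = 1) {
  sapply(theta, function(th) {
    sum(pars$c + (1 - pars$c) / (1 + exp(-D * pars$a * (th - pars$b))))
  })
}

# Expected raw score at theta = -2, 0 and 1
tcc_3pl(c(-2, 0, 1), pars)
```

At theta = 0 the sketch gives approximately 3.32, which is the sum of the "1" column of prob(ip, theta = 0) shown earlier.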

Information

The information value of an item at a given \(\theta\) value can also be calculated:

item1 <- generate_item("3PL")
info(item1, theta = -2)
#> [1] 0.02406235

# Multiple theta values
info(item1, theta = c(-1, 1))
#> [1] 0.04919700 0.08960753

# Polytomous items:
item2 <- generate_item(model = "GPCM")
info(item2, theta = 1)
#> [1] 0.5427016
info(item2, theta = c(-1, 0, 1))
#> [1] 0.8909455 1.1019585 0.5427016

Information values for each item in an item pool can be calculated as:

ip <- generate_ip(model = "3PL", n = 7)
ip
#> An object of class 'Itempool'.
#> Model of items: 3PL
#> D = 1
#> possible_options = c("A", "B", "C", "D")
#> 
#>   item_id     a      b      c key  
#>   <chr>   <dbl>  <dbl>  <dbl> <chr>
#> 1 Item_1  0.833 -0.951 0.0354 C    
#> 2 Item_2  0.934  0.970 0.184  B    
#> 3 Item_3  1.39  -0.112 0.288  A    
#> 4 Item_4  1.43   0.257 0.218  C    
#> 5 Item_5  1.02  -0.638 0.0673 B    
#> 6 Item_6  1.11  -0.100 0.246  A    
#> 7 Item_7  1.50   1.12  0.220  D
info(ip, theta = 0)
#>         Item_1    Item_2    Item_3    Item_4    Item_5    Item_6    Item_7
#> [1,] 0.1413498 0.1001709 0.2730636 0.2959566 0.2114078 0.1887165 0.1056933
info(ip, theta = c(-2, 0, 1))
#>          Item_1     Item_2     Item_3      Item_4    Item_5     Item_6
#> [1,] 0.12822222 0.00994675 0.01750681 0.008938492 0.1221402 0.02964977
#> [2,] 0.14134978 0.10017088 0.27306358 0.295956630 0.2114078 0.18871654
#> [3,] 0.09134627 0.15085687 0.18723309 0.285300661 0.1277243 0.15170483
#>           Item_7
#> [1,] 0.000624366
#> [2,] 0.105693322
#> [3,] 0.346033174
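For dichotomous 3PL items, these values can be reproduced with the standard Fisher information formula, I(theta) = D^2 * a^2 * ((P - c) / (1 - c))^2 * (1 - P) / P, where P is the probability of a correct response. The sketch below is a base-R illustration using the rounded parameters printed for Item_1, so it matches the output above only to about three decimals. The helper info_3pl is hypothetical and not part of the irt package.

```r
# Hypothetical helper (not part of the irt package): Fisher information of
# a 3PL item, I(theta) = D^2 a^2 * ((P - c) / (1 - c))^2 * (1 - P) / P.
info_3pl <- function(theta, a, b, c, D = 1) {
  p <- c + (1 - c) / (1 + exp(-D * a * (theta - b)))
  (D * a)^2 * ((p - c) / (1 - c))^2 * (1 - p) / p
}

# Item_1 from the pool above, using its rounded printed parameters;
# compare with the Item_1 column of info(ip, theta = c(-2, 0, 1)).
info_3pl(theta = c(-2, 0, 1), a = 0.833, b = -0.951, c = 0.0354)
```

Test information at a given theta is the sum of the item information values at that theta, which is what a test information function displays.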

Information functions can be plotted:

# Plot information function of each item
plot_info(ip)

# Plot test information function
plot_info(ip, tif = TRUE)

Ability Estimation

For a given set of item parameters and item responses, ability (\(\theta\)) estimates can be calculated using the est_ability function:

# Generate an item pool 
ip <- generate_ip(model = "2PL", n = 10)
true_theta <- rnorm(5)
resp <- sim_resp(ip = ip, theta = true_theta, output = "matrix")

# Calculate raw scores
est_ability(resp = resp, ip = ip, method = "sum_score")
#> $est
#> S1 S2 S3 S4 S5 
#>  4  3  3  5  7 
#> 
#> $se
#> S1 S2 S3 S4 S5 
#> NA NA NA NA NA
# Estimate ability using maximum likelihood estimation:
est_ability(resp = resp, ip = ip, method = "ml")
#> $est
#>        S1        S2        S3        S4        S5 
#> -0.941935 -1.394198 -1.664643 -0.145236  0.732982 
#> 
#> $se
#>       S1       S2       S3       S4       S5 
#> 0.778740 0.776946 0.792755 0.816266 0.887546
# Estimate ability using EAP estimation:
est_ability(resp = resp, ip = ip, method = "eap")
#> $est
#>        S1        S2        S3        S4        S5 
#> -0.570637 -0.863875 -1.034963 -0.068463  0.442611 
#> 
#> $se
#>       S1       S2       S3       S4       S5 
#> 0.626353 0.623287 0.623245 0.637943 0.655944
# Estimate ability using EAP estimation with a different prior 
# (prior mean = 0, prior standard deviation = 2):
est_ability(resp = resp, ip = ip, method = "eap", prior_pars = c(0, 2))
#> $est
#>        S1        S2        S3        S4        S5 
#> -0.804940 -1.226069 -1.479286 -0.086785  0.681779 
#> 
#> $se
#>       S1       S2       S3       S4       S5 
#> 0.747158 0.752559 0.764971 0.768755 0.822592
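To show what the "ml" and "eap" methods compute conceptually, the sketch below estimates ability for a single examinee in base R. It is not the package's implementation: the item parameters and responses are made up for illustration, the 2PL model with D = 1 is assumed, maximum likelihood is done with a one-dimensional search, and EAP uses a simple equally spaced grid with a normal prior.

```r
# Hypothetical 2PL item parameters and one response vector (illustrative only)
a <- c(1.1, 0.8, 1.4, 0.9, 1.2)
b <- c(-1.0, -0.3, 0.0, 0.6, 1.2)
resp <- c(1, 1, 0, 1, 0)

# Log-likelihood of the response vector at a single theta (2PL, D = 1)
log_lik <- function(theta) {
  p <- 1 / (1 + exp(-a * (theta - b)))
  sum(resp * log(p) + (1 - resp) * log(1 - p))
}

# Maximum likelihood: maximize the log-likelihood over a bounded interval
ml <- optimize(log_lik, interval = c(-5, 5), maximum = TRUE)$maximum

# EAP: posterior mean and posterior SD over a grid with a normal prior
eap <- function(prior_mean = 0, prior_sd = 1) {
  grid <- seq(-6, 6, length.out = 241)
  post <- exp(sapply(grid, log_lik)) * dnorm(grid, prior_mean, prior_sd)
  post <- post / sum(post)
  est <- sum(grid * post)
  c(est = est, se = sqrt(sum((grid - est)^2 * post)))
}

ml
eap()              # standard normal prior
eap(prior_sd = 2)  # wider prior, weaker pull toward the prior mean
```

The same pattern is visible in the est_ability output above: the EAP estimates obtained with a prior standard deviation of 2 fall between the standard-normal EAP estimates and the ML estimates, which is consistent with a weaker prior pulling the estimates less toward the prior mean.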