rm.sdt.Rd
This function estimates a version of the hierarchical rater model (HRM) based on signal detection theory (HRM-SDT; DeCarlo, 2005; DeCarlo, Kim & Johnson, 2011; Robitzsch & Steinfeld, 2018). The model is estimated by means of an EM algorithm adapted from multilevel latent class analysis (Vermunt, 2008).
rm.sdt(dat, pid, rater, Qmatrix=NULL, theta.k=seq(-9, 9, len=30),
    est.a.item=FALSE, est.c.rater="n", est.d.rater="n", est.mean=FALSE,
    est.sigma=TRUE, skillspace="normal", tau.item.fixed=NULL,
    a.item.fixed=NULL, d.min=0.5, d.max=100, d.start=3, c.start=NULL,
    tau.start=NULL, sd.start=1, d.prior=c(3,100), c.prior=c(3,100),
    tau.prior=c(0,1000), a.prior=c(1,100), link_item="GPCM",
    max.increment=1, numdiff.parm=0.00001, maxdevchange=0.1, globconv=.001,
    maxiter=1000, msteps=4, mstepconv=0.001, optimizer="nlminb" )

# S3 method for rm.sdt
summary(object, file=NULL, ...)

# S3 method for rm.sdt
plot(x, ask=TRUE, ...)

# S3 method for rm.sdt
anova(object,...)

# S3 method for rm.sdt
logLik(object,...)

# S3 method for rm.sdt
IRT.factor.scores(object, type="EAP", ...)

# S3 method for rm.sdt
IRT.irfprob(object,...)

# S3 method for rm.sdt
IRT.likelihood(object,...)

# S3 method for rm.sdt
IRT.posterior(object,...)

# S3 method for rm.sdt
IRT.modelfit(object,...)

# S3 method for IRT.modelfit.rm.sdt
summary(object,...)
Argument | Description |
---|---|
dat | Original data frame. Ratings on variables must be in rows, i.e. every row corresponds to a person-rater combination. |
pid | Person identifier. |
rater | Rater identifier. |
Qmatrix | An optional Q-matrix. If this matrix is not provided, then by default the ordinary scoring of categories (from 0 to the maximum score of \(K\)) is used. |
theta.k | A grid of theta values for the ability distribution. |
est.a.item | Should item parameters \(a_i\) be estimated? |
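est.c.rater | Type of estimation for item-rater parameters \(c_{ir}\) in the signal detection model. Options are "n" (no estimation, the default), "e" (equal across items and raters), "i" (item-specific), "r" (rater-specific) and "a" (item- and rater-specific); see the model description and the examples. |
est.d.rater | Type of estimation of \(d\) parameters. Options are the same as in est.c.rater. |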
est.mean | Optional logical indicating whether the mean of the trait distribution should be estimated. |
est.sigma | Optional logical indicating whether the standard deviation of the trait distribution should be estimated. |
skillspace | Specified \(\theta\) distribution type. It can be "normal" (the default) or "discrete" (see Example 3). |
tau.item.fixed | Optional matrix with three columns specifying fixed \(\tau\) parameters. The first two columns denote item and category indices, the third the fixed value. See Example 3. |
a.item.fixed | Optional matrix with two columns specifying fixed \(a\) parameters. First column: Item index. Second column: Fixed \(a\) parameter. |
d.min | Minimal \(d\) parameter to be estimated |
d.max | Maximal \(d\) parameter to be estimated |
d.start | Starting value(s) of \(d\) parameters |
c.start | Starting values of \(c\) parameters |
tau.start | Starting values of \(\tau\) parameters |
sd.start | Starting value for trait standard deviation |
d.prior | Normal prior \(N(M,S^2)\) for \(d\) parameters |
c.prior | Normal prior for \(c\) parameters. The prior mean for parameter \(c_{irk}\) is defined as \(M \cdot ( k - 0.5) \) where \(M\) is c.prior[1]. |
tau.prior | Normal prior for \(\tau\) parameters |
a.prior | Normal prior for \(a\) parameters |
link_item | Type of item response function for latent responses. Can be "GPCM" (generalized partial credit model, the default) or "GRM" (graded response model). |
max.increment | Maximum increment of item parameters during estimation |
numdiff.parm | Numerical differentiation step width |
maxdevchange | Maximum relative deviance change as a convergence criterion |
globconv | Maximum parameter change |
maxiter | Maximum number of iterations |
msteps | Maximum number of iterations during an M step |
mstepconv | Convergence criterion in an M step |
optimizer | Choice of optimization function in the M-step for item parameters. The default is "nlminb". |
object | Object of class rm.sdt |
file | Optional file name in which summary should be written. |
x | Object of class rm.sdt |
ask | Optional logical indicating whether a new plot should be asked for. |
type | Factor score estimation method. Up to now, only "EAP" is available. |
... | Further arguments to be passed |
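The layout expected for dat (one row per person-rater combination, with integer ratings starting at 0) can be illustrated with a small made-up data set. This is only a sketch: the column names idstud, rater, k1 and k2 and all values are hypothetical, and the model call is left as a comment because it requires the sirt package.

```r
# Toy data in the long person-by-rater layout (all values invented)
toy <- data.frame(
  idstud = rep(1:4, each = 2),            # person identifier, repeated once per rater
  rater  = rep(c(11, 12), times = 4),     # rater identifier
  k1     = c(0, 1, 2, 2, 1, 1, 3, 2),     # ratings on item 1 (categories 0..3)
  k2     = c(1, 1, 2, 3, 0, 1, 3, 3)      # ratings on item 2 (categories 0..3)
)

# Every row is exactly one person-rater combination
n_combinations <- nrow(unique(toy[, c("idstud", "rater")]))
n_combinations == nrow(toy)

# Item columns are passed as dat; identifiers are passed as separate vectors:
# mod <- sirt::rm.sdt(dat = toy[, c("k1", "k2")], pid = toy$idstud, rater = toy$rater)
```

In a fully crossed design every person appears once per rater, as here; the examples at the end of this page use the same pattern with dat$idstud and dat$rater.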
The specification of the model follows DeCarlo et al. (2011). The second level models the ideal rating (latent response) \(\eta=0, \ldots, K\) of person \(p\) on item \(i\). The option link_item='GPCM' follows the generalized partial credit model
$$ P( \eta_{pi}=\eta | \theta_p ) \propto \exp( a_{i} q_{i \eta } \theta_p - \tau_{i \eta } ) $$
The option link_item='GRM' employs the graded response model
$$ P( \eta_{pi}=\eta | \theta_p )= \Psi( \tau_{i,\eta + 1} - a_i \theta_p ) - \Psi( \tau_{i,\eta} - a_i \theta_p ) $$
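The two link functions for the latent response can be sketched in base R for a single item. This is an illustration under stated assumptions, not the package's internal code: the function names gpcm_probs and grm_probs are hypothetical, \(\Psi\) is assumed to be the logistic distribution function, the GPCM reference category is assumed to have \(\tau_{i0}=0\), the scores \(q_{i\eta}\) default to \(0,\ldots,K\), and all parameter values are invented.

```r
# Latent response probabilities P(eta = 0..K | theta) for one item

gpcm_probs <- function(theta, a, tau, q = seq_along(tau) - 1) {
  # tau: one value per latent category 0..K (tau[1] = 0 assumed for category 0)
  z <- a * q * theta - tau        # unnormalized log-probabilities
  p <- exp(z - max(z))            # subtract the maximum for numerical stability
  p / sum(p)                      # normalize so that probabilities sum to 1
}

grm_probs <- function(theta, a, tau) {
  # tau: K increasing thresholds; padding with -Inf and Inf makes the
  # cumulative probabilities run from 0 to 1
  cum <- plogis(c(-Inf, tau, Inf) - a * theta)
  diff(cum)                       # differences of adjacent cumulative probabilities
}

p_gpcm <- gpcm_probs(theta = 0.5, a = 1, tau = c(0, -0.5, 0.2, 1.0))
p_grm  <- grm_probs(theta = 0.5, a = 1, tau = c(-1, 0, 1.5))
```

Both calls return a vector of four probabilities (latent categories 0 to 3) that sums to one.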
At the first level, the ratings \(X_{pir}\) for person \(p\) on item \(i\) and rater \(r\) are modeled by a signal detection model $$ P( X_{pir} \le k | \eta_{pi} )= G( c_{irk} - d_{ir} \eta_{pi} )$$ where \(G\) is the logistic distribution function and the categories are \(k=1,\ldots, K+1\). Because \(G(-x)=1-G(x)\), the model can be equivalently written as $$ P( X_{pir} > k | \eta_{pi} )= G( d_{ir} \eta_{pi} - c_{irk})$$
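The rater model can be turned into category probabilities by differencing the cumulative probabilities. The following base R sketch assumes a single item-rater pair with invented values: the precision d is set to 3 and the thresholds are placed at \(d \cdot (k - 0.5)\), i.e. midway between adjacent latent categories; the function name sdt_probs is hypothetical.

```r
# P(X = k | eta) for k = 1..K+1, given latent category eta
sdt_probs <- function(eta, d, c_thr) {
  # c_thr: increasing thresholds c_1 < ... < c_K; P(X <= K+1) equals 1
  cum <- c(plogis(c_thr - d * eta), 1)   # P(X <= k | eta) for k = 1..K+1
  diff(c(0, cum))                        # P(X = k | eta)
}

K <- 3
d <- 3                                   # rater precision (invented value)
c_thr <- d * ((1:K) - 0.5)               # thresholds midway between latent categories

# Rows: latent categories eta = 0..K; columns: observed ratings 1..K+1
P <- t(sapply(0:K, sdt_probs, d = d, c_thr = c_thr))
dimnames(P) <- list(eta = 0:K, rating = 1:(K + 1))
round(P, 3)
```

With these values the most likely rating for latent category \(\eta\) is \(\eta + 1\); a larger d concentrates each row more strongly on that rating, while a d close to zero makes the ratings nearly independent of \(\eta\).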
The thresholds \(c_{irk}\) can be further restricted to \(c_{irk}=c_{k}\) (est.c.rater='e'), \(c_{irk}=c_{ik}\) (est.c.rater='i') or \(c_{irk}=c_{rk}\) (est.c.rater='r'). The same holds for the rater precision parameters \(d_{ir}\).
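To see what these restrictions imply, the number of distinct threshold parameters can be counted for a hypothetical design. The sketch below assumes that every item has the same number of thresholds K and uses invented sizes; the label "a" (item- and rater-specific thresholds) is taken from Example 2, and the counts describe the restrictions as index patterns rather than the package's internal bookkeeping.

```r
# Distinct threshold parameters under each restriction
# (I items, R raters, K thresholds per item-rater pair; sizes invented)
I <- 3; R <- 4; K <- 2

n_thresholds <- c(
  a = I * R * K,   # separate thresholds for every item-rater pair
  i = I * K,       # item-specific: shared across raters
  r = R * K,       # rater-specific: shared across items
  e = K            # equal: shared across items and raters
)
n_thresholds
```

The same counting applies to the precision parameters after dropping the factor K, since there is one \(d\) per item-rater pair.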
A list with the following entries:
Deviance
Information criteria and number of parameters
Data frame with item parameters. The columns N and M denote the number of observed ratings and the observed mean of all ratings, respectively. In addition to item parameters \(\tau_{ik}\) and \(a_i\), the mean of the latent response (latM) is computed as \(E( \eta_i )=\sum_p \sum_k P( \theta_p ) q_{ik} P( \eta_i=k | \theta_p ) \), which provides an item parameter on the original metric of the ratings. The latent standard deviation (latSD) is computed in the same manner.
Data frame with rater parameters. Transformed \(c\) parameters (c_x.trans) are computed as \(c_{irk} / ( d_{ir} )\).
Data frame with person parameters: EAP and corresponding standard errors
EAP reliability
Mean of the trait distribution
Standard deviation of the trait distribution
Item parameters \(\tau_{ik}\)
Standard error of item parameters \(\tau_{ik}\)
Item slopes \(a_i\)
Standard error of item slopes \(a_i\)
Rater parameters \(c_{irk}\)
Standard error of rater severity parameter \(c_{irk}\)
Rater slope parameter \(d_{ir}\)
Standard error of rater slope parameter \(d_{ir}\)
Individual likelihood
Individual posterior distribution
Item probabilities at grid theta.k. Note that these probabilities are calculated on the pseudo items \(i \times r\), i.e. the interaction of item and rater.
Probabilities \(P( \eta_i=\eta | \theta )\) of latent item responses evaluated at theta grid \(\theta_p\).
Expected counts
Estimated trait distribution \(P(\theta_p)\).
Maximum number of categories
Processed data
Number of iterations
Further values
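The latent mean (latM), the latent standard deviation (latSD) and the transformed thresholds (c_x.trans) described in the list above can be reproduced by hand for a single item. The sketch below is a rough illustration with invented parameter values: it assumes a discretized standard normal trait distribution on the default grid, GPCM probabilities with scores \(q_{ik}=k\), and it takes "in the same manner" to mean the square root of the corresponding variance.

```r
# Trait distribution P(theta_p) on the default grid (discretized normal, assumed)
theta_k <- seq(-9, 9, len = 30)
w <- dnorm(theta_k, mean = 0, sd = 1)
w <- w / sum(w)                        # weights sum to 1

# GPCM latent response probabilities for one item (invented parameters)
a <- 1
tau <- c(0, -0.5, 0.2, 1.0)
q <- 0:3
probs <- t(sapply(theta_k, function(th) {
  z <- a * q * th - tau
  p <- exp(z - max(z))
  p / sum(p)
}))                                    # rows: grid points, columns: categories

# Sum over grid points p and categories k
latM  <- sum(w * (probs %*% q))
latSD <- sqrt(sum(w * (probs %*% q^2)) - latM^2)

# Transformed thresholds c / d put rater thresholds on the latent category metric
d <- 3
c_thr <- c(1.5, 4.5, 7.5)
c_trans <- c_thr / d
```

Here c_trans equals 0.5, 1.5 and 2.5, i.e. thresholds lying midway between the latent categories 0 to 3.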
DeCarlo, L. T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53-76.
DeCarlo, L. T. (2010). Studies of a latent-class signal-detection model for constructed response scoring II: Incomplete and hierarchical designs. ETS Research Report ETS RR-10-08. Princeton NJ: ETS.
DeCarlo, L. T., Kim, Y., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48, 333-356.
Robitzsch, A., & Steinfeld, J. (2018). Item response models for human ratings: Overview, estimation methods, and implementation in R. Psychological Test and Assessment Modeling, 60(1), 101-139.
Vermunt, J. K. (2008). Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research, 17, 33-51.
The facets rater model can be estimated with rm.facets.
#############################################################################
# EXAMPLE 1: Hierarchical rater model (HRM-SDT) data.ratings1
#############################################################################

data(data.ratings1)
dat <- data.ratings1

if (FALSE) {
# Model 1: Partial Credit Model: no rater effects
mod1 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", d.start=100, est.d.rater="n" )
summary(mod1)

# Model 2: Generalized Partial Credit Model: no rater effects
mod2 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", est.d.rater="n",
            est.a.item=TRUE, d.start=100)
summary(mod2)

# Model 3: Equal effects in SDT
mod3 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="e", est.d.rater="e")
summary(mod3)

# Model 4: Rater effects in SDT
mod4 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="r", est.d.rater="r")
summary(mod4)

#############################################################################
# EXAMPLE 2: HRM-SDT data.ratings3
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: item- and rater-specific effects
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a" )
summary(mod1)
plot(mod1)

# Model 2: Differing number of categories per variable
mod2 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4,6)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a")
summary(mod2)
plot(mod2)

#############################################################################
# EXAMPLE 3: Hierarchical rater model with discrete skill spaces
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: Discrete theta skill space with values of 0,1,2 and 3
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], theta.k=0:3, rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod1)
plot(mod1)

# Model 2: Modelling of one item by using a discrete skill space and
#          fixed item parameters

# fixed tau and a parameters
tau.item.fixed <- cbind( 1, 1:3, 100*cumsum( c( 0.5, 1.5, 2.5)) )
a.item.fixed <- cbind( 1, 100 )

# fit HRM-SDT
mod2 <- sirt::rm.sdt( dat[, "crit2", drop=FALSE], theta.k=0:3, rater=dat$rater,
            tau.item.fixed=tau.item.fixed, a.item.fixed=a.item.fixed, pid=dat$idstud,
            est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod2)
plot(mod2)
}