CohenKappa.Rd
Computes the agreement rates Cohen's kappa and weighted kappa, together with their confidence intervals.
CohenKappa(x, y = NULL, weights = c("Unweighted", "Equal-Spacing", "Fleiss-Cohen"),
conf.level = NA, ...)
x: can either be a numeric vector or a confusion matrix. In the latter case x must be a square matrix.

y: NULL (default) or a vector with compatible dimensions to x. If y is provided, table(x, y, ...) is calculated. In order to get a square matrix, x and y are coerced to factors with synchronized levels. (Note that the vector interface cannot be used together with weights.)

weights: either one of "Unweighted" (default), "Equal-Spacing" or "Fleiss-Cohen", which will calculate the weights accordingly, or a user-specified matrix having the same dimensions as x containing the weights for each cell.

conf.level: confidence level of the interval. If set to NA (the default), no confidence intervals will be calculated.

...: further arguments are passed to the function table, allowing e.g. to set useNA. This refers only to the vector interface.
Cohen's kappa is the diagonal sum of the (possibly weighted) relative frequencies, corrected for expected values and standardized by its maximum value.
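Written out, with p_o denoting the observed (weighted) proportion of agreement and p_e the proportion expected by chance from the marginal totals, this is $$\kappa = \frac{p_o - p_e}{1 - p_e}$$ so that 1 indicates perfect agreement and 0 agreement no better than chance.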
The equal-spacing weights (see Cicchetti and Allison 1971) are defined by $$1 - \frac{|i - j|}{r - 1},$$ r being the number of columns/rows, and the Fleiss-Cohen weights by $$1 - \frac{(i - j)^2}{(r - 1)^2}.$$ The latter attaches greater importance to closer disagreements.
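For instance, the Fleiss-Cohen weights for r = 4 categories can be generated directly from this formula (a minimal sketch using outer()):

r <- 4
1 - outer(1:r, 1:r, function(i, j) (i - j)^2) / (r - 1)^2
#>           [,1]      [,2]      [,3]      [,4]
#> [1,] 1.0000000 0.8888889 0.5555556 0.0000000
#> [2,] 0.8888889 1.0000000 0.8888889 0.5555556
#> [3,] 0.5555556 0.8888889 1.0000000 0.8888889
#> [4,] 0.0000000 0.5555556 0.8888889 1.0000000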
Data can be passed to the function either as matrix or data.frame in x, or as two numeric vectors x and y. In the latter case table(x, y, ...) is calculated, and so NAs are handled the same way as table does. Note that tables are by default calculated without NAs; the argument useNA can be passed via the ... argument.
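As a small sketch with made-up data, forwarding useNA lets NA count as a rating category of its own:

x <- c("a", "b", NA, "a")   # illustrative data, not from the examples below
y <- c("a", NA, "b", "a")
CohenKappa(x, y, useNA="ifany")
#> [1] 0.2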
The vector interface (x, y) is only supported for the calculation of unweighted kappa, because a confusion table for two factors with different levels cannot be constructed safely in a way that is independent of the order of the levels in x and y; weights might therefore lead to inconsistent results. The function will raise an error in this case.
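Should weighted kappa nevertheless be needed for two raw vectors x and y, one workaround (a sketch; the category labels lvl are hypothetical and their order is fixed by hand) is to build the square confusion table explicitly and use the matrix interface:

lvl <- c("low", "mid", "high")   # hypothetical category labels in a fixed order
tab <- table(factor(x, levels=lvl), factor(y, levels=lvl))
CohenKappa(tab, weights="Equal-Spacing")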
if no confidence intervals are requested: the estimate as numeric value

else a named numeric vector with 3 elements:

kappa: the estimate
lwr.ci: lower bound of the confidence interval
upr.ci: upper bound of the confidence interval
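For instance (a small sketch, assuming the square confusion matrix m defined in the examples below), the elements can be picked out by name:

ck <- CohenKappa(m, conf.level=0.95)
ck["lwr.ci"]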
Cohen, J. (1960) A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
Everitt, B.S. (1968) Moments of the statistics kappa and weighted kappa. The British Journal of Mathematical and Statistical Psychology, 21, 97-103.
Fleiss, J.L., Cohen, J., Everitt, B.S. (1969) Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, 72, 323-327.
Cicchetti, D.V., Allison, T. (1971) A new procedure for assessing reliability of scoring EEG sleep recordings. American Journal of EEG Technology, 11, 101-109.
# from Bortz et al. (1990) Verteilungsfreie Methoden in der Biostatistik, Springer, p. 459
m <- matrix(c(53, 5, 2,
11, 14, 5,
1, 6, 3), nrow=3, byrow=TRUE,
dimnames = list(rater1 = c("V","N","P"), rater2 = c("V","N","P")) )
# confusion matrix interface
CohenKappa(m, weights="Unweighted")
#> [1] 0.4285714
# vector interface
x <- Untable(m)
CohenKappa(x$rater1, x$rater2, weights="Unweighted")
#> [1] 0.4285714
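# consistent with the Details above, weights are refused by the vector interface;
# the following call (commented out) would raise an error:
# CohenKappa(x$rater1, x$rater2, weights="Equal-Spacing")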
# pairwise Kappa
rating <- data.frame(
rtr1 = c(4,2,2,5,2, 1,3,1,1,5, 1,1,2,1,2, 3,1,1,2,1, 5,2,2,1,1, 2,1,2,1,5),
rtr2 = c(4,2,3,5,2, 1,3,1,1,5, 4,2,2,4,2, 3,1,1,2,3, 5,4,2,1,4, 2,1,2,3,5),
rtr3 = c(4,2,3,5,2, 3,3,3,4,5, 4,4,2,4,4, 3,1,1,4,3, 5,4,4,4,4, 2,1,4,3,5),
rtr4 = c(4,5,3,5,4, 3,3,3,4,5, 4,4,3,4,4, 3,4,1,4,5, 5,4,5,4,4, 2,1,4,3,5),
rtr5 = c(4,5,3,5,4, 3,5,3,4,5, 4,4,3,4,4, 3,5,1,4,5, 5,4,5,4,4, 2,5,4,3,5),
rtr6 = c(4,5,5,5,4, 3,5,4,4,5, 4,4,3,4,5, 5,5,2,4,5, 5,4,5,4,5, 4,5,4,3,5)
)
PairApply(rating, FUN=CohenKappa, symmetric=TRUE)
#> rtr1 rtr2 rtr3 rtr4 rtr5 rtr6
#> rtr1 1.00000000 0.6511628 0.3838254 0.2583436 0.1881919 0.08088235
#> rtr2 0.65116279 1.0000000 0.6311475 0.4392523 0.3633952 0.17105263
#> rtr3 0.38382542 0.6311475 1.0000000 0.7260274 0.6401799 0.33333333
#> rtr4 0.25834363 0.4392523 0.7260274 1.0000000 0.8569157 0.51923077
#> rtr5 0.18819188 0.3633952 0.6401799 0.8569157 1.0000000 0.64824121
#> rtr6 0.08088235 0.1710526 0.3333333 0.5192308 0.6482412 1.00000000
# Weighted Kappa
cats <- c("<10%", "11-20%", "21-30%", "31-40%", "41-50%", ">50%")
m <- matrix(c(5,8,1,2,4,2, 3,5,3,5,5,0, 1,2,6,11,2,1,
0,1,5,4,3,3, 0,0,1,2,5,2, 0,0,1,2,1,4), nrow=6, byrow=TRUE,
dimnames = list(rater1 = cats, rater2 = cats) )
CohenKappa(m, weights="Equal-Spacing")
#> [1] 0.3156685
# supply an explicit weight matrix
ncol(m)
#> [1] 6
(wm <- outer(1:ncol(m), 1:ncol(m), function(x, y) {
1 - ((abs(x-y)) / (ncol(m)-1)) } ))
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1.0 0.8 0.6 0.4 0.2 0.0
#> [2,] 0.8 1.0 0.8 0.6 0.4 0.2
#> [3,] 0.6 0.8 1.0 0.8 0.6 0.4
#> [4,] 0.4 0.6 0.8 1.0 0.8 0.6
#> [5,] 0.2 0.4 0.6 0.8 1.0 0.8
#> [6,] 0.0 0.2 0.4 0.6 0.8 1.0
CohenKappa(m, weights=wm, conf.level=0.95)
#> kappa lwr.ci upr.ci
#> 0.3156685 0.1968117 0.4345252
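# the explicit weight matrix reproduces the equal-spacing estimate above,
# now with a confidence interval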
# however, Fleiss, Cohen and Everitt weight similarities
fleiss <- matrix(c(
106, 10, 4,
22, 28, 10,
2, 12, 6
), ncol=3, byrow=TRUE)
# Fleiss weights the similarities
weights <- matrix(c(
1.0000, 0.0000, 0.4444,
0.0000, 1.0000, 0.6666,
0.4444, 0.6666, 1.0000
), ncol=3)
CohenKappa(fleiss, weights=weights)
#> [1] 0.5070508
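# as a quick sanity check, this value can be reproduced by hand from the
# definition of weighted kappa (a base-R sketch):
n  <- sum(fleiss)
po <- sum(weights * fleiss) / n                                    # observed weighted agreement
pe <- sum(weights * outer(rowSums(fleiss), colSums(fleiss))) / n^2 # chance-expected weighted agreement
(po - pe) / (1 - pe)
#> [1] 0.5070508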