UncertCoef.Rd
The uncertainty coefficient U(C|R) measures the proportion of uncertainty (entropy) in the column variable Y that is explained by the row variable X. The function has interfaces for a table, a matrix, a data.frame and for single vectors.
a numeric vector, a factor, matrix or data frame.
NULL (default) or a vector, an ordered factor, matrix or data frame with compatible dimensions to x.
direction of the calculation. Can be "symmetric" (default), "row" or "column", where "row" calculates UncertCoef (R|C) ("column dependent") and "column" calculates UncertCoef (C|R).
confidence level of the interval. If set to NA (which is the default), no confidence interval will be calculated.
slightly nudge zero values so that their logarithm can be calculated
further arguments passed on to the function table, e.g. to set useNA. This applies only to the vector interface.
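The zero-value adjustment matters because an empty cell contributes the term 0 * log(0) to the entropy, which R evaluates as NaN. The following base R sketch illustrates the problem and one possible workaround; the replacement value `eps` is a hypothetical choice made for illustration and is not necessarily the correction the function applies.

```r
# A probability vector with one empty category.
p <- c(0.5, 0.5, 0)

# Naive entropy: 0 * log(0) is 0 * -Inf, so the sum becomes NaN.
h_naive <- -sum(p * log(p))

# Hypothetical fix: replace exact zeros by a tiny positive value.
eps <- 1e-10                       # assumed value, for illustration only
p_adj <- ifelse(p == 0, eps, p)
h_adj <- -sum(p_adj * log(p_adj))  # finite, very close to log(2)
```

With a sufficiently small replacement value the nudged cell contributes almost nothing, so the result stays close to the entropy of the non-empty categories.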
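The uncertainty coefficient is computed as $$U(C|R) = \frac{H(X) + H(Y) - H(XY)}{H(Y)} $$ where H(X) and H(Y) are the entropies of the row and column variables and H(XY) is their joint entropy. It takes values in the interval [0, 1].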
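As a cross-check of the formula, the entropies can be computed by hand in base R. This is a minimal sketch and not the package's implementation: the helper `entropy` is a hypothetical name, it assumes a table without zero cells, and the symmetric form shown is the usual symmetric uncertainty coefficient, assumed here to correspond to direction = "symmetric".

```r
# Contingency table from the Goodman-Kruskal example (rows = X, columns = Y).
m <- cbind(c(1768, 946, 115), c(807, 1387, 438),
           c(189, 746, 288), c(47, 53, 16))

# Shannon entropy (natural log) of a set of counts; assumes no zero counts.
entropy <- function(counts) {
  p <- counts / sum(counts)
  -sum(p * log(p))
}

hx  <- entropy(rowSums(m))   # H(X): entropy of the row variable
hy  <- entropy(colSums(m))   # H(Y): entropy of the column variable
hxy <- entropy(m)            # H(XY): joint entropy over all cells

mi <- hx + hy - hxy          # numerator shared by all variants

uc_col <- mi / hy                # U(C|R), the formula given above
uc_row <- mi / hx                # U(R|C), normalised by H(X) instead
uc_sym <- 2 * mi / (hx + hy)     # symmetric variant (assumed form)

round(c(column = uc_col, row = uc_row, symmetric = uc_sym), 4)
```

Because each variant is a ratio of entropies, the base of the logarithm cancels and does not affect the result.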
Either a single numeric value, if no confidence interval is required,
or a vector with 3 elements for the estimate and the lower and upper confidence limits.
Theil, H. (1972), Statistical Decomposition Analysis, Amsterdam: North-Holland Publishing Company.
# example from Goodman Kruskal (1954)
m <- as.table(cbind(c(1768,946,115), c(807,1387,438), c(189,746,288), c(47,53,16)))
dimnames(m) <- list(paste("A", 1:3), paste("B", 1:4))
m
#>      B 1  B 2 B 3 B 4
#> A 1 1768  807 189  47
#> A 2  946 1387 746  53
#> A 3  115  438 288  16
# direction default is "symmetric"
UncertCoef(m)
#> [1] 0.0799
UncertCoef(m, conf.level=0.95)
#>     uc lwr.ci upr.ci
#> 0.0799 0.0713 0.0885
UncertCoef(m, direction="row")
#> [1] 0.0851
UncertCoef(m, direction="column")
#> [1] 0.0753