Lambda.Rd
Calculate symmetric and asymmetric Goodman-Kruskal lambda and their confidence intervals. Lambda is a measure of proportional reduction in error in cross-tabulation analysis. For any sample with a nominal independent variable and a nominal dependent variable (or ones that can be treated as nominal), it indicates the extent to which the modal categories and frequencies for each value of the independent variable differ from the overall modal category and frequency, i.e. the one for all values of the independent variable together.
Lambda(x, y = NULL, direction = c("symmetric", "row", "column"), conf.level = NA, ...)
a numeric vector, a matrix or a table.
NULL (default) or a vector with compatible dimensions to x. If y is provided, table(x, y, ...) is calculated.
type of lambda. Can be one of "symmetric" (default), "row" or "column" (abbreviations are allowed). If direction is set to "row", then Lambda(R|C) will be reported, i.e. the row variable is predicted from the column variable. See details.
confidence level for the returned confidence interval, restricted to lie between 0 and 1. If NA (the default), no confidence interval is calculated.
further arguments are passed to the function table, allowing e.g. to set useNA = c("no", "ifany", "always").
Asymmetric lambda is interpreted as the probable improvement in predicting the column variable Y given knowledge of the row variable X.
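The nondirectional lambda is a weighted average of the two asymmetric lambdas, Lambda(C|R) and Lambda(R|C): their numerators and their denominators are each summed before the ratio is taken, so it always lies between the two but is in general not their simple mean.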
Lambda (asymmetric and symmetric) has a scale ranging from 0 to 1.
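The three variants can be reproduced by hand from the row and column maxima of the table. The following is a minimal base R sketch, not the package's implementation; the helper name lambda_by_hand is hypothetical, and the counts are those of the Goodman and Kruskal (1954) table used in the examples.

```r
# Hypothetical helper (not part of the package): lambda from a contingency table
lambda_by_hand <- function(tab, direction = c("symmetric", "row", "column")) {
  direction <- match.arg(direction)
  n <- sum(tab)
  rmax <- apply(tab, 1, max)      # modal frequency within each row
  cmax <- apply(tab, 2, max)      # modal frequency within each column
  max_rsum <- max(rowSums(tab))   # frequency of the overall modal row
  max_csum <- max(colSums(tab))   # frequency of the overall modal column
  switch(direction,
    # Lambda(R|C): predict the row category from the column category
    row = (sum(cmax) - max_rsum) / (n - max_rsum),
    # Lambda(C|R): predict the column category from the row category
    column = (sum(rmax) - max_csum) / (n - max_csum),
    # symmetric: numerators and denominators of the two are pooled
    symmetric = (sum(rmax) + sum(cmax) - max_csum - max_rsum) /
      (2 * n - max_csum - max_rsum)
  )
}

m <- cbind(c(1768, 946, 115), c(807, 1387, 438),
           c(189, 746, 288), c(47, 53, 16))

lambda_by_hand(m, "row")        # 822 / 3668
lambda_by_hand(m, "column")     # 764 / 3971
lambda_by_hand(m, "symmetric")  # 1586 / 7639
```

These ratios agree with the values printed for Lambda(m, direction = "row"), Lambda(m, direction = "column") and Lambda(m) in the examples.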
Data can be passed to the function either as a matrix or data.frame in x, or as two numeric vectors x and y. In the latter case table(x, y, ...) is calculated, so NAs are handled the same way as table handles them. Note that tables are by default calculated without NAs (which breaks the package's general rule not to omit NAs silently). The specific argument useNA can be passed via the ... argument.
PairApply can be used to calculate pairwise lambdas.
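To illustrate the NA handling described above, here is a small base R sketch of what table does with and without useNA. The vectors are made-up illustration data; according to this page, Lambda(x, y, useNA = "ifany") would forward the argument to table in the same way.

```r
# Two coded nominal vectors; x has one missing value (illustrative data)
x <- c(1, 1, 2, 2, NA, 1)
y <- c(1, 2, 1, 1, 2, 2)

# Default: the pair containing NA is dropped without a warning
tab_default <- table(x, y)
sum(tab_default)    # 5 of the 6 pairs are counted

# useNA = "ifany" keeps NA as a category of its own
tab_na <- table(x, y, useNA = "ifany")
sum(tab_na)         # all 6 pairs are counted
rownames(tab_na)    # "1" "2" NA
```

Whether NA should count as a category in a lambda calculation depends on the analysis; the sketch only shows that the default drops such pairs silently.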
If no confidence intervals are requested: the estimate as a numeric value.
Otherwise a named numeric vector with 3 elements:
lambda: estimate
lwr.ci: lower confidence limit
upr.ci: upper confidence limit
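When a confidence level is supplied, the result can be handled like any named numeric vector. The sketch below does not call the package; it copies the values printed for Lambda(m, conf.level = 0.95) in the examples and shows how the elements might be accessed.

```r
# Values copied from the documented output of Lambda(m, conf.level = 0.95)
res <- c(lambda = 0.2076188, lwr.ci = 0.1871747, upr.ci = 0.2280629)

res[["lambda"]]                         # point estimate
res[c("lwr.ci", "upr.ci")]              # interval limits
(res[["upr.ci"]] - res[["lwr.ci"]]) / 2 # half-width of the interval
```

For these values the estimate sits midway between the two limits, i.e. the interval is symmetric around the estimate.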
Agresti, A. (2002) Categorical Data Analysis. John Wiley & Sons.
Goodman, L. A., Kruskal, W. H. (1979) Measures of Association for Cross Classifications. New York: Springer-Verlag (contains articles appearing in J. Amer. Statist. Assoc. in 1954, 1959, 1963, 1972).
http://www.nssl.noaa.gov/users/brooks/public_html/feda/papers/goodmankruskal1.pdf (might be outdated)
Liebetrau, A. M. (1983) Measures of Association, Sage University Papers Series on Quantitative Applications in the Social Sciences, 07-004. Newbury Park, CA: Sage, pp. 17–24
# example from Goodman Kruskal (1954)
m <- as.table(cbind(c(1768,946,115), c(807,1387,438), c(189,746,288), c(47,53,16)))
dimnames(m) <- list(paste("A", 1:3), paste("B", 1:4))
m
#>      B 1  B 2  B 3  B 4
#> A 1 1768  807  189   47
#> A 2  946 1387  746   53
#> A 3  115  438  288   16
# direction default is "symmetric"
Lambda(m)
#> [1] 0.2076188
Lambda(m, conf.level=0.95)
#>    lambda    lwr.ci    upr.ci
#> 0.2076188 0.1871747 0.2280629
Lambda(m, direction="row")
#> [1] 0.2241003
Lambda(m, direction="column")
#> [1] 0.1923949