Calculate Goodman Kruskal's tau statistic, a measure of association for ordinal factors in a two-way table.
The function has interfaces for a table (matrix) and for single vectors.

GoodmanKruskalTau(x, y = NULL, direction = c("row", "column"), conf.level = NA, ...)

Arguments

x

a numeric vector or a table. A matrix will be treated as table.

y

NULL (default) or a vector with compatible dimensions to x. If y is provided, table(x, y, ...) is calculated.

direction

direction of the calculation. Can be "row" (default) or "column", where "row" calculates Goodman Kruskal's tau-a (R|C) ("column dependent").

conf.level

confidence level of the interval. If set to NA (which is the default) no confidence interval will be calculated.

...

further arguments are passed to the function table, allowing i.e. to set useNA. This refers only to the vector interface.

Details

Goodman-Kruskal tau measures association for cross tabulations of nominal level variables. Goodman-Kruskal tau is based on random category assignment. It measures the percentage improvement in predictability of the dependent variable (column or row variable) given the value of other variables (row or column variables). Goodman-Kruskal tau is the same as Goodman-Kruskal lambda except the calculations of the tau statistic are based on assignment probabilities specified by marginal or conditional proportions. Misclassification probabilities are based on random category assignment with probabilities specified by marginal or conditional proportion.

Goodman Kruskal tau reduces to \(\phi^2\) (see: Phi) in the 2x2-table case.

Value

a single numeric value if no confidence intervals are requested,
and otherwise a numeric vector with 3 elements for the estimate, the lower and the upper confidence interval

References

Agresti, A. (2002) Categorical Data Analysis. John Wiley & Sons, pp. 57-59.

Goodman, L. A., & Kruskal, W. H. (1954) Measures of association for cross classifications. Journal of the American Statistical Association, 49, 732-764.

Somers, R. H. (1962) A New Asymmetric Measure of Association for Ordinal Variables, American Sociological Review, 27, 799-811.

Goodman, L. A., & Kruskal, W. H. (1963) Measures of association for cross classifications III: Approximate sampling theory. Journal of the American Statistical Association, 58, 310-364.

Liebetrau, A. M. (1983) Measures of Association, Sage University Papers Series on Quantitative Applications in the Social Sciences, 07-004. Newbury Park, CA: Sage, pp. 24–30

Author

Andri Signorell <andri@signorell.net>, based on code from Antti Arppe <antti.arppe@helsinki.fi>

See also

ConDisPairs yields concordant and discordant pairs

Other association measures:
KendallTauA (Tau a), cor (method="kendall") for Tau b, StuartTauC, GoodmanKruskalGamma
Lambda, UncertCoef, MutInf

Examples


# example in:
# http://support.sas.com/documentation/cdl/en/statugfreq/63124/PDF/default/statugfreq.pdf
# pp. S. 1821

tab <- as.table(rbind(c(26,26,23,18,9),c(6,7,9,14,23)))

# Goodman Kruskal's tau C|R
GoodmanKruskalTau(tab, direction="column", conf.level=0.95)
#>        tauA      lwr.ci      upr.ci 
#> 0.041216580 0.009920576 0.072512583 
# Goodman Kruskal's tau R|C
GoodmanKruskalTau(tab, direction="row", conf.level=0.95)
#>       tauA     lwr.ci     upr.ci 
#> 0.16523315 0.04921484 0.28125146 

# http://support.sas.com/documentation/cdl/en/statugfreq/63124/PDF/default/statugfreq.pdf
# pp. 1814 (143)
tab <- as.table(cbind(c(11,2),c(4,6)))

GoodmanKruskalTau(tab, direction="row", conf.level=0.95)
#>       tauA     lwr.ci     upr.ci 
#>  0.2156410 -0.1224903  0.5537724 
GoodmanKruskalTau(tab, direction="column", conf.level=0.95)
#>       tauA     lwr.ci     upr.ci 
#>  0.2156410 -0.1224903  0.5537724 
# reduce both to:
Phi(tab)^2
#> [1] 0.215641


# example 1 in Liebetrau (1983)

tt <- matrix(c(549,93,233,119,225,455,402,  
               212,124,78,42,41,12,132,
               54,54,33,13,46,7,153), ncol=3,
             dimnames=list(rownames=c("Gov", "Mil", "Edu", "Eco", "Intel", "Rel", "For"), 
                           colnames=c("One", "Two", "Multi")))

GoodmanKruskalTau(tt, direction = "row", conf.level = 0.95)
#>       tauA     lwr.ci     upr.ci 
#> 0.02580507 0.02094902 0.03066113 
GoodmanKruskalTau(tt, direction = "column", conf.level = 0.95)
#>       tauA     lwr.ci     upr.ci 
#> 0.08601920 0.07243943 0.09959897 


# SPSS
ttt <- matrix(c(225,53,206,3,1,12), nrow=3,
              dimnames=list(rownames=c("right","center", "left"), 
                            colnames=c("us","ussr")))

round(GoodmanKruskalTau(ttt, direction = "r", con=0.95), d=3)
#>   tauA lwr.ci upr.ci 
#>  0.010 -0.004  0.023 
round(GoodmanKruskalTau(ttt, direction = "c"), d=3)
#> [1] 0.013