Exact Version of Jonckheere-Terpstra Test — JonckheereTerpstraTest • DescTools

Performs the Jonckheere-Terpstra test for ordered differences among classes.

JonckheereTerpstraTest(x, ...)

# Default S3 method
JonckheereTerpstraTest(x, g, alternative = c("two.sided", "increasing", "decreasing"), 
                       nperm = NULL, exact = NULL, ...)

# S3 method for class 'formula'
JonckheereTerpstraTest(formula, data, subset, na.action, ...)

Arguments

x: a numeric vector of data values, or a list of numeric data vectors.
g: a vector or factor object giving the group for the corresponding elements of x. Ignored if x is a list.
alternative: the ordered alternative to test: means are monotonic in either direction ("two.sided", the default), "increasing", or "decreasing".
nperm: number of permutations for the reference distribution. The default is NULL, in which case the permutation p-value is not computed. Setting nperm to 1000 or higher is recommended if a permutation p-value is desired.
formula: a formula of the form lhs ~ rhs where lhs gives the data values and rhs the corresponding groups.
data: an optional matrix or data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).
subset: an optional vector specifying a subset of observations to be used.
na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
exact: logical, defining whether the exact test should be calculated. If left as NULL, the function uses the exact test up to a sample size of 100 and falls back to the normal approximation for larger samples. The exact procedure cannot be applied to samples containing ties.
...: further arguments to be passed to methods.
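To illustrate what a permutation reference distribution (the nperm argument) means, here is a minimal base R sketch. It is not the DescTools implementation: jt_stat() and perm_pvalue_increasing() are hypothetical helper names, and the one-sided p-value definition with the +1 correction is an assumption chosen for simplicity.

```r
# Hypothetical helper: JT statistic for values x and grouping g
# (groups ordered by the sorted levels of g).
jt_stat <- function(x, g) {
  s <- split(x, g)
  k <- length(s)
  jt <- 0
  for (a in seq_len(k - 1)) {
    for (b in (a + 1):k) {
      # pairwise differences: group a in rows, group b in columns
      d <- outer(s[[a]], s[[b]], "-")
      jt <- jt + sum(d < 0) + 0.5 * sum(d == 0)
    }
  }
  jt
}

# Hypothetical helper: one-sided permutation p-value for an increasing trend.
perm_pvalue_increasing <- function(x, g, nperm = 1000) {
  observed <- jt_stat(x, g)
  # reference distribution: JT recomputed after shuffling observations across groups
  permuted <- replicate(nperm, jt_stat(sample(x), g))
  # large JT supports an increasing trend; +1 counts the observed arrangement itself
  (sum(permuted >= observed) + 1) / (nperm + 1)
}

set.seed(1)
speed    <- c(20, 25, 25, 25, 25, 30, 30, 30, 35, 35)
distance <- c(48, 33, 59, 48, 56, 60, 101, 67, 85, 107)

p <- perm_pvalue_increasing(distance, speed, nperm = 2000)
p
```

Because the p-value is estimated from random shuffles, it varies slightly from run to run unless a seed is set, and a larger nperm reduces that variation.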

Details

JonckheereTerpstraTest is the exact (permutation) version of the Jonckheere-Terpstra test. It uses the statistic $$\sum_{k<l} \sum_{i,j} \left[ I(X_{ik} < X_{jl}) + 0.5\, I(X_{ik} = X_{jl}) \right],$$ where $i$ and $j$ index the observations in groups $k$ and $l$, respectively. The asymptotic version is equivalent to cor.test(x, g, method="k"). The exact calculation requires that there be no ties and that the sample size be at most 100. When the data contain ties or the sample size exceeds 100, the p-value is based on the normal approximation unless nperm is specified, in which case a permutation p-value is computed.
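The statistic can be computed by hand as a check on the formula. The following is a minimal base R sketch, not the package's internal code; jt_statistic() is a hypothetical helper name, and the list is assumed to be ordered from the first to the last group.

```r
# Hypothetical helper: JT statistic from a list of numeric samples,
# one list element per group, in the hypothesised order.
jt_statistic <- function(samples) {
  k <- length(samples)
  jt <- 0
  for (a in seq_len(k - 1)) {
    for (b in (a + 1):k) {
      # all pairwise differences between group a (rows) and group b (columns)
      d <- outer(samples[[a]], samples[[b]], "-")
      # count pairs with X_a < X_b, plus one half for each tied pair
      jt <- jt + sum(d < 0) + 0.5 * sum(d == 0)
    }
  }
  jt
}

coffee <- list(
  c_4 = c(447, 396, 383, 410),
  c_2 = c(438, 521, 468, 391, 504, 472),
  c_0 = c(513, 543, 506, 489, 407))

jt_statistic(coffee)
```

For the coffee data this gives 59, which matches the JT value reported in the Examples.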

If x is a list, its elements are taken as the samples to be compared, and hence have to be numeric data vectors. In this case, g is ignored, and one can simply use JonckheereTerpstraTest(x) to perform the test. If the samples are not yet contained in a list, use JonckheereTerpstraTest(list(x, ...)).

Otherwise, x must be a numeric data vector, and g must be a vector or factor object of the same length as x giving the group for the corresponding elements of x.
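The two input forms described above can be converted into each other with base R. The sketch below uses only split(), unlist(), rep() and ordered(); the variable names samples, x and g are illustrative, and the final call to JonckheereTerpstraTest runs only if DescTools is installed.

```r
breaking <- data.frame(
  speed    = c(20, 25, 25, 25, 25, 30, 30, 30, 35, 35),
  distance = c(48, 33, 59, 48, 56, 60, 101, 67, 85, 107))

# vector + grouping form -> list form:
# one numeric vector per group, ordered by the sorted group values
samples <- split(breaking$distance, breaking$speed)

# list form -> vector + grouping form
x <- unlist(samples, use.names = FALSE)
g <- ordered(rep(names(samples), lengths(samples)), levels = names(samples))

if (requireNamespace("DescTools", quietly = TRUE)) {
  # the data contain ties, so the normal-approximation warning is expected here
  print(suppressWarnings(DescTools::JonckheereTerpstraTest(samples)))
}
```

Both forms carry the same information. The list form fixes the group order by the position of the list elements, whereas the vector form takes it from the grouping variable.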

Note

The function was previously published as jonckheere.test() in the clinfun package and has been integrated here without logical changes. Some argument checks and a formula interface were added.

Author

Venkatraman E. Seshan <seshanv@mskcc.org>, with minor adaptations by Andri Signorell

References

Jonckheere, A. R. (1954). A distribution-free k-sample test against ordered alternatives. Biometrika 41:133-145.

Terpstra, T. J. (1952). The asymptotic normality and consistency of Kendall's test against trend, when ties are present in one ranking. Indagationes Mathematicae 14:327-333.

Examples

set.seed(1234)
g <- ordered(rep(1:5, rep(10,5)))
x <- rnorm(50) + 0.3 * as.numeric(g)

JonckheereTerpstraTest(x, g)
#> 
#> 	Jonckheere-Terpstra test
#> 
#> data:  x by g
#> JT = 629, p-value = 0.02734
#> alternative hypothesis: two.sided
#> 

x[1:2] <- mean(x[1:2]) # tied data

JonckheereTerpstraTest(x, g)
#> Warning: Sample size > 100 or data with ties 
#>  p-value based on normal approximation. Specify nperm for permutation p-value
#> 
#> 	Jonckheere-Terpstra test
#> 
#> data:  x by g
#> JT = 639, p-value = 0.01741
#> alternative hypothesis: two.sided
#> 
JonckheereTerpstraTest(x, g, nperm=5000)
#> Warning: Sample size > 100 or data with ties 
#>  p-value based on normal approximation. Specify nperm for permutation p-value
#> 
#> 	Jonckheere-Terpstra test
#> 
#> data:  x by g
#> JT = 639, p-value = 0.0136
#> alternative hypothesis: two.sided
#> 

# Duller, p. 222
coffee <- list(
  c_4=c(447,396,383,410),
  c_2=c(438,521,468,391,504,472),
  c_0=c(513,543,506,489,407))  

# the list interface:
JonckheereTerpstraTest(coffee)
#> 
#> 	Jonckheere-Terpstra test
#> 
#> data:  coffee
#> JT = 59, p-value = 0.01973
#> alternative hypothesis: two.sided
#> 

# the formula interface
breaking <- data.frame(
  speed=c(20,25,25,25,25,30,30,30,35,35),
  distance=c(48,33,59,48,56,60,101,67,85,107))

JonckheereTerpstraTest(distance ~ speed, breaking, alternative="increasing")
#> Warning: Sample size > 100 or data with ties 
#>  p-value based on normal approximation. Specify nperm for permutation p-value
#> 
#> 	Jonckheere-Terpstra test
#> 
#> data:  distance by speed
#> JT = 32.5, p-value = 0.002263
#> alternative hypothesis: increasing
#>