Clean data by means of trimming, i.e., by omitting outlying observations.

Trim(x, trim = 0.1, na.rm = FALSE)

Arguments

x

a numeric vector to be trimmed.

trim

the fraction (0 to 0.5) of observations to be trimmed from each end of x. Values of trim outside that range (and < 1) are taken as the nearest endpoint. If trim is set to a value > 1, it is interpreted as the number of elements to be cut off at each tail of x.

na.rm

a logical value indicating whether NA values should be stripped before the computation proceeds.

Details

The function returns a symmetrically trimmed vector x, with the fraction trim of the observations (or the given number of observations) deleted from each end. If trim is set to a value > 0.5 or to an integer value > n/2, where n is the length of x, the result will be NA.
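
The behaviour described above can be approximated in base R. The following is a minimal sketch, not the package source: trim_sketch is a hypothetical helper, the rule floor(n * trim) for converting a fraction into a count is an assumption borrowed from base mean, and treating every value of 1 or more as a count is likewise an assumption. The NA cases follow this Details section.

```r
# Hypothetical helper that mimics the documented behaviour of Trim();
# it is NOT the DescTools implementation.
trim_sketch <- function(x, trim = 0.1) {
  n <- length(x)
  # Assumption: a fraction is converted with floor(n * trim), as in mean();
  # values of 1 or more are treated as a count per tail.
  k <- if (trim < 1) floor(n * trim) else trim
  # NA cases as stated in the Details section
  if ((trim < 1 && trim > 0.5) || k > n / 2) return(NA_real_)
  if (k <= 0) return(x)                       # nothing to trim
  ord <- order(x)                             # indices from smallest to largest
  cut <- c(ord[seq_len(k)], ord[seq(n - k + 1, n)])
  res <- x[-cut]                              # keep the original order of x
  attr(res, "trim") <- cut                    # indices of the removed values
  res
}

# Values copied (rounded) from the Examples section
x <- c(-12.071, 0.277, 1.084, -2.346, 0.429, 0.506, -0.575, -0.547, -0.564, -0.890)

attr(trim_sketch(x, trim = 0.1), "trim")   # indices 1 and 3
attr(trim_sketch(x, trim = 3), "trim")     # indices 1, 4, 10, 5, 6, 3
trim_sketch(x, trim = 6)                   # NA, since 6 > n/2
```

On this data the sketch removes the same indices as shown in the Examples section; it may differ from Trim in edge cases that the documentation does not spell out.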

Value

The trimmed vector x. The indices of the trimmed values are attached as an attribute named "trim".

Note

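This function is essentially an excerpt from the base function mean, which allows the vector x to be trimmed before the mean is calculated. Trim makes that trimming step available on its own, so that other trimmed statistics, such as a trimmed standard deviation, can be computed as well.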
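
As an illustration of the trimmed standard deviation mentioned here, the following base R sketch trims by hand and then calls sd. It assumes the same floor(n * trim) rule that base mean uses for the number of values dropped per tail; the variable names are illustrative only.

```r
# Values copied (rounded) from the Examples section
x <- c(-12.071, 0.277, 1.084, -2.346, 0.429, 0.506, -0.575, -0.547, -0.564, -0.890)

trim <- 0.1
n <- length(x)
k <- floor(n * trim)                    # values dropped per tail (assumed rule, as in mean())

x_trimmed <- sort(x)[(k + 1):(n - k)]   # drop the k smallest and the k largest values

sd(x)          # inflated by the outlier
sd(x_trimmed)  # trimmed standard deviation

# Consistency check against the trimmed mean of base R
all.equal(mean(x, trim = trim), mean(x_trimmed))

# With DescTools attached, sd(Trim(x, trim = 0.1)) is expected to give the
# same value as sd(x_trimmed), since sd does not depend on element order.
```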

Author

R-Core (function mean), Andri Signorell <andri@signorell.net>

Examples

## generate data
set.seed(1234)     # for reproducibility
x <- rnorm(10)     # standard normal
x[1] <- x[1] * 10  # introduce outlier

## Trim data
x
#>  [1] -12.071   0.277   1.084  -2.346   0.429   0.506  -0.575  -0.547  -0.564
#> [10]  -0.890
Trim(x, trim=0.1)
#> [1]  0.277 -2.346  0.429  0.506 -0.575 -0.547 -0.564 -0.890
#> attr(,"trim")
#> [1] 1 3

## Trim a fixed number: cut the 3 most extreme elements from each end
Trim(x, trim=3)
#> [1]  0.277 -0.575 -0.547 -0.564
#> attr(,"trim")
#> [1]  1  4 10  5  6  3

## check: trimmed and retained values together reproduce the original set
s <- sample(10:20)
s.tr <- Trim(s, trim = 2)
setequal(c(s[attr(s.tr, "trim")], s.tr), s)
#> [1] TRUE