Sample Twins

Draw a twin sample from a population for a given set of records, matching on the chosen strata variables.

Usage

SampleTwins(x, stratanames = NULL, twins, 
            method = c("srswor", "srswr", "poisson", "systematic"), 
            pik, description = FALSE)

Arguments

x: the data (the population) to draw the sample from
stratanames: names of the variables in x that define the strata to match on
twins: the sample for which a twin sample is to be drawn
method: method to select units; the following methods are implemented: simple random sampling without replacement (srswor), simple random sampling with replacement (srswr), Poisson sampling (poisson), systematic sampling (systematic); if "method" is missing, the default method is "srswor". See Strata.
pik: vector of inclusion probabilities or auxiliary information used to compute them; this argument is only used for unequal probability sampling (Poisson and systematic). If auxiliary information is provided, the function uses the inclusionprobabilities function to compute these probabilities. If the method is "srswr" and the sample size is larger than the population size, this vector is normalized to one.
description: a message is printed if its value is TRUE; the message gives the number of selected units and the number of the units in the population. By default, the value is FALSE.
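
As a rough illustration of how auxiliary information can be turned into inclusion probabilities for the pik argument, the following base R sketch computes probabilities proportional to a size variable and caps them at 1. The helper pps_probs is hypothetical and written only for this illustration; it is not the inclusionprobabilities function itself, and it assumes a non-negative size vector and a sample size smaller than the number of units with positive size.

```r
# Hypothetical helper (not part of any package): inclusion probabilities
# proportional to an auxiliary size variable `a`, for expected sample size `n`.
pps_probs <- function(a, n) {
  # Assumption: sizes are non-negative and n is smaller than the number
  # of units with positive size.
  stopifnot(all(a >= 0), n < sum(a > 0))
  pik   <- numeric(length(a))
  fixed <- rep(FALSE, length(a))   # units whose probability was capped at 1
  repeat {
    free <- !fixed
    # Spread the remaining expected sample size over the uncapped units
    pik[free] <- (n - sum(fixed)) * a[free] / sum(a[free])
    over <- free & pik >= 1
    if (!any(over)) break
    pik[over] <- 1
    fixed <- fixed | over
  }
  pik
}

income <- c(10, 20, 30, 40)
pik <- pps_probs(income, n = 2)
pik
sum(pik)   # the probabilities sum to the expected sample size
```

Units with a larger auxiliary value get a proportionally larger chance of selection, and the probabilities add up to the requested sample size.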

Value

The function produces an object, which contains the following information:

id: the identifier of the selected units.
stratum: the unit stratum.
prob: the final unit inclusion probability.
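
The idea of a twin sample can be sketched in base R: count how many records the reference sample has in each stratum, then draw the same number of units from the matching stratum of the population. The helper draw_twins below is hypothetical and covers only simple random sampling without replacement; it is a minimal sketch under these assumptions, not the implementation used by SampleTwins, and it does not exclude the reference records themselves from the draw.

```r
set.seed(1)

# Small population with two stratification variables
pop <- data.frame(state  = rep(c("nc", "sc"), times = c(6, 4)),
                  region = c(1, 1, 1, 2, 2, 2, 1, 1, 2, 2))

# Reference sample whose strata composition should be mirrored
ref <- pop[c(1, 4, 5, 9), ]

# Hypothetical helper: returns row indices of x forming a twin sample
draw_twins <- function(x, twins, stratanames) {
  # One key per row, built from the strata variables
  key_x <- do.call(paste, c(x[stratanames], sep = "|"))
  key_t <- do.call(paste, c(twins[stratanames], sep = "|"))
  need  <- table(key_t)            # records required per stratum
  ids <- unlist(lapply(names(need), function(k) {
    pool <- which(key_x == k)      # candidate units in this stratum
    if (length(pool) == 0) return(integer(0))
    n <- min(need[[k]], length(pool))
    pool[sample.int(length(pool), n)]
  }))
  if (length(ids) < nrow(twins))
    warning("Could not find a twin for all records.")
  ids
}

ids <- draw_twins(pop, ref, stratanames = c("state", "region"))
pop[ids, ]
```

The selected rows have the same number of records per state and region combination as ref, which is the property a twin sample is meant to have.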

Author

Andri Signorell <andri@signorell.net>

Examples

m <- rbind(matrix(rep("nc",165), 165, 1, byrow=TRUE), 
           matrix(rep("sc", 70), 70, 1, byrow=TRUE))
m <- cbind.data.frame(m, c(rep(1, 100), rep(2,50), rep(3,15), 
                           rep(1,30), rep(2,40)), 1000*runif(235))
names(m) <- c("state","region","income")

# this would be our sample to be reproduced by a twin sample
d.smp <- m[sample(nrow(m), size=10, replace=TRUE),]

# draw the sample
s <- SampleTwins(x = m, stratanames=c("state","region"), twins = d.smp, method="srswor")
#> Warning: Could not find a twin for all records. Enlighten the restrictions!

d.twin <- m[s$id,]
d.twin
#>     state region     income
#> 186    sc      1  71.904097
#> 173    sc      1 648.818124
#> 195    sc      1   9.429905
#> 107    nc      2 674.376388
#> 215    sc      2 295.895455
#> 213    sc      2 856.885420
#> 224    sc      2 617.235270
#> 212    sc      2 353.986103
#> 151    nc      3 404.399181
#> 152    nc      3 471.576278
