Draw a twin sample from a population for a given set of records, matched on the given strata variables.

SampleTwins(x, stratanames = NULL, twins, 
            method = c("srswor", "srswr", "poisson", "systematic"), 
            pik, description = FALSE)

Arguments

x

the data to draw the sample from

stratanames

character vector with the names of the stratification variables used for matching, e.g. c("state", "region")

twins

the reference sample to be reproduced by the twin sample

method

method to select units; the following methods are implemented: simple random sampling without replacement (srswor), simple random sampling with replacement (srswr), Poisson sampling (poisson), systematic sampling (systematic); if "method" is missing, the default method is "srswor". See Strata.

pik

vector of inclusion probabilities, or auxiliary information used to compute them; this argument is only used for unequal probability sampling (Poisson and systematic). If auxiliary information is provided, the function uses the inclusionprobabilities function to compute these probabilities. If the method is "srswr" and the sample size is larger than the population size, this vector is normalized to one.

description

logical; if TRUE, a message is printed giving the number of selected units and the number of units in the population. The default is FALSE.
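
The selection methods and the role of pik described above can be illustrated with a minimal base R sketch. It is not the code used by SampleTwins, Strata or inclusionprobabilities; the population size, the sample size and the auxiliary vector a are hypothetical values chosen for illustration, and probabilities above 1 (which would need capping) are not handled.

```r
set.seed(1)

# Hypothetical population of N = 6 units and target sample size n = 3
N <- 6
n <- 3

# Equal-probability methods
srswor <- sample(N, n)                  # without replacement: no unit twice
srswr  <- sample(N, n, replace = TRUE)  # with replacement: repeats possible

# Hypothetical auxiliary size measure; probabilities proportional to size.
# Simplification: values above 1 would need capping, which is not done here.
a   <- c(1, 1, 2, 2, 1, 1)
pik <- n * a / sum(a)                   # sums to n

# Poisson sampling: one independent Bernoulli trial per unit,
# so the realized sample size is random
poisson <- which(runif(N) < pik)

# Systematic sampling: one random start u, then unit steps along cumsum(pik);
# unit k is selected if a point u, u + 1, ... falls into its interval
u   <- runif(1)
cum <- cumsum(pik)
systematic <- which(floor(cum - u) > floor(c(0, cum[-N]) - u))
```

In this sketch only Poisson sampling yields a random sample size; the other three methods always return n units.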

Value

The function produces an object, which contains the following information:

id

the identifier of the selected units.

stratum

the stratum of the unit.

prob

the final unit inclusion probability.

Author

Andri Signorell <andri@signorell.net>

See also

Strata

Examples

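# build a population of 235 units: 165 in state "nc" and 70 in state "sc",
# each with a region code and a random income
m <- rbind(matrix(rep("nc",165), 165, 1, byrow=TRUE), 
           matrix(rep("sc", 70), 70, 1, byrow=TRUE))
m <- cbind.data.frame(m, c(rep(1, 100), rep(2,50), rep(3,15), 
                           rep(1,30), rep(2,40)), 1000*runif(235))
names(m) <- c("state","region","income")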

# this would be our sample to be reproduced by a twin sample
d.smp <- m[sample(nrow(m), size=10, replace=TRUE),]

# draw the sample
s <- SampleTwins(x = m, stratanames=c("state","region"), twins = d.smp, method="srswor")
#> Warning: Could not find a twin for all records. Enlighten the restrictions!

d.twin <- m[s$id,]
d.twin
#>     state region     income
#> 186    sc      1  71.904097
#> 173    sc      1 648.818124
#> 195    sc      1   9.429905
#> 107    nc      2 674.376388
#> 215    sc      2 295.895455
#> 213    sc      2 856.885420
#> 224    sc      2 617.235270
#> 212    sc      2 353.986103
#> 151    nc      3 404.399181
#> 152    nc      3 471.576278
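
The idea of matching a reference sample on strata can also be sketched in base R without the package. This is an illustration under stated assumptions, not the implementation of SampleTwins: the data frames pop and ref and the helper key() are hypothetical, and units are drawn by simple random sampling without replacement within each stratum.

```r
set.seed(42)

# Hypothetical population with two stratification variables
pop <- data.frame(state  = rep(c("nc", "sc"), times = c(6, 4)),
                  region = c(1, 1, 1, 2, 2, 2, 1, 1, 2, 2))

# Hypothetical reference sample whose stratum counts are to be reproduced
ref <- pop[c(1, 2, 8, 9), ]

# Hypothetical helper: one label per row, combining the strata variables
key <- function(d) paste(d$state, d$region, sep = "|")

# Number of records required per stratum (named integer vector)
need    <- lengths(split(seq_len(nrow(ref)), key(ref)))
pop_key <- key(pop)

# Within each stratum, draw the required number of population rows
twin_id <- unlist(lapply(names(need), function(k) {
  cand <- which(pop_key == k)               # population rows in stratum k
  size <- min(need[[k]], length(cand))      # cannot draw more than available
  cand[sample.int(length(cand), size)]
}))

twin <- pop[twin_id, ]
```

If a stratum holds fewer population units than the reference sample requires, this sketch silently returns fewer rows for that stratum, so comparing table(key(twin)) with table(key(ref)) is a useful check.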