Extract the part of a string that matches a regular expression. StrExtractBetween() is a convenience function that extracts the part between a left and a right delimiter.

StrExtract(x, pattern, ...)

StrExtractBetween(x, left, right, greedy = FALSE)

Arguments

x

a character vector where matches are sought, or an object which can be coerced by as.character to a character vector.

pattern

character string containing a regular expression (or character string for fixed = TRUE) to be matched in the given character vector. Coerced by as.character to a character string if possible. If a character vector of length 2 or more is supplied, the first element is used with a warning. Missing values are not allowed.

left

the left delimiter: the character(s) that bound the string to be extracted on the left. Special regular-expression characters must be escaped.

right

the right delimiter: the character(s) that bound the string to be extracted on the right. Special regular-expression characters must be escaped.

greedy

logical. Determines whether the extraction ends at the first match found for right (FALSE, the default) or at the last one (TRUE).

...

the dots are passed to the internally used function regexpr(), which makes it possible to use, e.g., Perl-like regular expressions (perl = TRUE).
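To illustrate what forwarding arguments to regexpr() makes possible, the following sketch calls regexpr() directly in base R. The vector x and the patterns are made up for illustration. If the dots reach regexpr() unchanged, as described above, the same perl or fixed arguments could be supplied to StrExtract().

```r
x <- c("price: 12.50 EUR", "no number", NA)

# perl = TRUE enables PCRE features such as lookahead:
# match the digits that are directly followed by a literal dot
m <- regexpr("\\d+(?=\\.)", x, perl = TRUE)
regmatches(x, m)
#> [1] "12"

# fixed = TRUE treats the pattern as a literal string,
# so "." matches only a real dot instead of any character
pos <- as.integer(regexpr(".", x, fixed = TRUE))
pos
#> [1] 10 -1 NA
```

Note that regmatches() on its own drops the elements without a match, which is why only one value is returned in the first call.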

Details

StrExtract() wraps regexpr() and regmatches().
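How regexpr() and regmatches() can be combined is sketched below. This is a minimal base R illustration, not the package source: the helper name extract_first is hypothetical, and the NA handling is an assumption chosen to reproduce the output shown in the Examples.

```r
# Hypothetical helper, not the DescTools implementation:
# return the first match of `pattern` in each element of `x`, NA where none is found
extract_first <- function(x, pattern, ...) {
  x <- as.character(x)
  m <- regexpr(pattern, x, ...)        # start positions; -1 = no match, NA for NA input
  res <- rep(NA_character_, length(x))
  hit <- !is.na(m) & m != -1
  res[hit] <- regmatches(x, m)         # regmatches() returns the matched elements only
  res
}

txt <- c("G1:E001", "No points here", "G2:E002", "G3:E003", NA)
extract_first(txt, ":.*")
#> [1] ":E001" NA      ":E002" ":E003" NA
```

The explicit padding step is needed because regmatches() returns a shorter vector whenever some elements do not match.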

Value

A character vector of the same length as x, with NA for elements in which no match is found (see Examples).

Author

Andri Signorell <andri@signorell.net>

See also

Examples

txt <- c("G1:E001", "No points here", "G2:E002", "G3:E003", NA)

# extract everything after the :
StrExtract(x=txt, pattern=":.*")
#> [1] ":E001" NA      ":E002" ":E003" NA     

# extract everything between "left" and "right"
z <- c("yBS (23A) 890", "l 89Z) 890.?/", "WS (55X) 8(90)", "123 abc", "none", NA)
# everything enclosed by spaces
StrExtractBetween(z, " ", " ")
#> [1] "(23A)" "89Z)"  "(55X)" NA      NA      NA     

# note: special characters must be escaped
StrExtractBetween(z, "\\(", "\\)")
#> [1] "23A" NA    "55X" NA    NA    NA
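The difference between a lazy and a greedy match for the right delimiter can be shown in base R. The helper between() below is hypothetical and is not how StrExtractBetween() is necessarily implemented. It assumes Perl-style lookarounds, which restricts left to fixed-length patterns.

```r
# Hypothetical helper: lookbehind/lookahead keep the delimiters out of the match
between <- function(x, left, right, greedy = FALSE) {
  quant <- if (greedy) ".*" else ".*?"   # ".*?" stops at the first `right`
  pattern <- paste0("(?<=", left, ")", quant, "(?=", right, ")")
  m <- regexpr(pattern, x, perl = TRUE)
  res <- rep(NA_character_, length(x))
  hit <- !is.na(m) & m != -1
  res[hit] <- regmatches(x, m)
  res
}

z2 <- c("WS (55X) 8(90)", "none", NA)

# lazy: ends at the first ")"
between(z2, "\\(", "\\)")
#> [1] "55X" NA    NA

# greedy: runs up to the last ")"
between(z2, "\\(", "\\)", greedy = TRUE)
#> [1] "55X) 8(90" NA          NA
```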