% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/grepl.R
\name{grepl2}
\alias{grepl2}
\alias{grepv2}
\alias{grepv2<-}
\alias{grepl}
\alias{grep}
\title{Detect Pattern Occurrences}
\usage{
grepl2(x, pattern, ..., ignore_case = FALSE, fixed = FALSE, invert = FALSE)

grepv2(x, pattern, ..., ignore_case = FALSE, fixed = FALSE, invert = FALSE)

grepv2(x, pattern, ..., ignore_case = FALSE, fixed = FALSE, invert = FALSE) <- value

grepl(
  pattern,
  x,
  ...,
  ignore.case = FALSE,
  fixed = FALSE,
  invert = FALSE,
  perl = FALSE,
  useBytes = FALSE
)

grep(
  pattern,
  x,
  ...,
  ignore.case = FALSE,
  fixed = FALSE,
  value = FALSE,
  invert = FALSE,
  perl = FALSE,
  useBytes = FALSE
)
}
\arguments{
\item{x}{character vector whose elements are to be examined}

\item{pattern}{character vector of nonempty search patterns;
for \code{grepv2} and \code{grep}, must not be longer than \code{x}}

\item{...}{further arguments to \code{\link[stringi]{stri_detect}},
e.g., \code{max_count}, \code{locale}, \code{dotall}}

\item{ignore_case, ignore.case}{single logical value; indicates whether matching
should be case-insensitive}

\item{fixed}{single logical value;
\code{FALSE} for matching with regular expressions
    (see \link[stringi]{about_search_regex});
\code{TRUE} for fixed pattern matching
    (\link[stringi]{about_search_fixed});
\code{NA} for the Unicode collation algorithm
    (\link[stringi]{about_search_coll})}

\item{invert}{single logical value; indicates whether the elements that
do not match should be selected instead}

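\item{value}{for the replacement version of \code{grepv2}: character vector
of replacement strings;
for \code{grep}: single logical value; \code{FALSE} to return the indexes
of the strings in \code{x} that match the patterns,
\code{TRUE} to return the matching strings themselves}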

\item{perl, useBytes}{not used; a warning is issued if either
is supplied [DEPRECATED]}
}
\value{
\code{grepl2} and [DEPRECATED] \code{grepl} return a logical vector.
They preserve the attributes of the longest inputs (unless they are
dropped due to coercion). Missing values in the inputs are propagated
consistently.

\code{grepv2} and [DEPRECATED] \code{grep} with \code{value=TRUE} return
a subset of \code{x} with elements matching the corresponding patterns.
[DEPRECATED] \code{grep} with \code{value=FALSE} returns the indexes
in \code{x} where a match occurred.
Missing values are not included in the outputs and only the \code{names}
attribute is preserved, because the length of the result may be different
from that of \code{x}.

The replacement version of \code{grepv2} modifies \code{x} 'in-place'.
}
\description{
\code{grepl2} indicates whether a string matches the corresponding pattern
or not.

\code{grepv2} returns a subset of \code{x} matching the corresponding
patterns. Its replacement version allows for substituting such a subset with
new content.
}
\details{
These functions are fully vectorised with respect to \code{x} and
\code{pattern}.

The [DEPRECATED] \code{grepl} simply calls
\code{grepl2}, which has a cleaned-up argument list.

The [DEPRECATED] \code{grep} with \code{value=FALSE} is actually redundant --
it can be trivially reproduced with \code{grepl} and
\code{\link[base]{which}}.

\code{grepv2} and \code{grep} with \code{value=TRUE} combine
pattern matching and subsetting; some users may find them convenient
in conjunction with the forward pipe operator, \code{\link[base]{|>}}.
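
As a minimal sketch of the two points above (the vector \code{fruits} is
made up for illustration and is not part of the package):

\preformatted{
library(stringx)
fruits <- c("apple", "banana", "cherry")  # hypothetical data
# indexes of the matching elements, without the deprecated grep():
which(grepl2(fruits, "an"))
# matching and subsetting in one step, with the haystack piped in first:
fruits |> grepv2("an")
}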
}
\section{Differences from Base R}{

\code{grepl} and \code{grep} are [DEPRECATED] replacements for base
\code{\link[base]{grep}} and \code{\link[base]{grepl}}
implemented with \code{\link[stringi]{stri_detect}}.

\itemize{
\item there are inconsistencies between the argument order and naming
    in \code{\link[base]{grepl}}, \code{\link[base]{strsplit}},
    and \code{\link[base]{startsWith}} (amongst others); e.g.,
    where the needle can precede the haystack, the use of the forward
    pipe operator, \code{\link[base]{|>}}, is less convenient
    \bold{[fixed by introducing \code{grepl2}]}
\item base R implementation is not portable as it is based on
    the system PCRE or TRE library
    (e.g., some Unicode classes may not be available or matching thereof
    can depend on the current \code{LC_CTYPE} category)
    \bold{[fixed here]}
\item not suitable for natural language processing
    \bold{[fixed here -- use \code{fixed=NA}]}
\item two different regular expression libraries are used
    (and historically, ERE was used in place of TRE)
    \bold{[here, only the \pkg{ICU} Java-like regular expression engine
    is available, hence the \code{perl} argument has no meaning]}
\item not vectorised w.r.t. \code{pattern}
    \bold{[fixed here, however, in \code{grep}, \code{pattern} cannot be
    longer than \code{x}]}
\item missing values in haystack will result in a no-match
    \bold{[fixed in \code{grepl}; see Value]}
\item \code{ignore.case=TRUE} cannot be used with \code{fixed=TRUE}
    \bold{[fixed here]}
\item no attributes are preserved
    \bold{[fixed here; see Value]}
}
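
A few of the differences listed above can be sketched as follows
(the inputs are made up for illustration; \code{base::grepl} is called
explicitly because attaching \pkg{stringx} masks it):

\preformatted{
library(stringx)
h <- c("abc", NA)     # hypothetical haystack with a missing value
base::grepl("a", h)   # base R: the missing value is reported as a no-match
grepl2(h, "a")        # stringx: the missing value is propagated
# case-insensitive matching combined with a fixed pattern:
grepl2("abc", "ABC", ignore_case=TRUE, fixed=TRUE)
# one pattern per haystack element:
grepl2(c("abc", "xyz"), c("a", "z"))
}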
}

\examples{
x <- c("abc", "1237", "\U0001f602", "\U0001f603", "stringx\U0001f970", NA)
grepl2(x, "\\\\p{L}")
which(grepl2(x, "\\\\p{L}"))  # like grep

# at least 1 letter / at least 1 digit (one column per pattern):
p <- c("\\\\p{L}", "\\\\p{N}")
`dimnames<-`(outer(x, p, grepl2), list(x, p))

x |> grepv2("\\\\p{L}")
grepv2(x, "\\\\p{L}", invert=TRUE) <- "\U0001F496"
print(x)
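
# a further sketch of the matching options documented above;
# the vector 'y' is made up for illustration:
y <- c("Spam", "spam.txt", "SPAMMER", NA)
grepl2(y, "spam", ignore_case=TRUE)  # case-insensitive regex matching
grepl2(y, ".", fixed=TRUE)           # '.' is a literal dot, not a wildcard
grepv2(y, "spam", invert=TRUE)       # elements that do not match the pattern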

}
\seealso{
The official online manual of \pkg{stringx} at \url{https://stringx.gagolewski.com/}

Related function(s): \code{\link{paste}}, \code{\link{nchar}},
    \code{\link{strsplit}}, \code{\link{gsub2}},
    \code{\link{gregexpr2}}, \code{\link{gregextr2}},
    \code{\link{gsubstr}}
}
\author{
\href{https://www.gagolewski.com/}{Marek Gagolewski}
}
