% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/textreuse-package.r
\docType{package}
\name{textreuse-package}
\alias{textreuse}
\alias{textreuse-package}
\title{textreuse: Detect Text Reuse and Document Similarity}
\description{
Tools for measuring similarity among documents and detecting
    passages which have been reused. Implements shingled n-gram, skip n-gram,
    and other tokenizers; similarity/dissimilarity functions; pairwise
    comparisons; minhash and locality sensitive hashing algorithms; and a
    version of the Smith-Waterman local alignment algorithm suitable for
    natural language.
}
\details{
The best place to begin with this package is the introductory vignette.

\code{vignette("textreuse-introduction", package = "textreuse")}

After reading that vignette, the "pairwise", "minhash", and "alignment"
vignettes introduce specific paths for working with the package.

\code{vignette("textreuse-pairwise", package = "textreuse")}

\code{vignette("textreuse-minhash", package = "textreuse")}

\code{vignette("textreuse-alignment", package = "textreuse")}

Another good place to begin with the package is the documentation for loading
documents (\code{\link{TextReuseTextDocument}} and
\code{\link{TextReuseCorpus}}), for \link{tokenizers},
\link[=similarity-functions]{similarity functions}, and
\link[=lsh]{locality-sensitive hashing}.
}
\references{
The sample data provided in the \code{extdata/ats} directory is
  taken from a
  \href{http://lincolnmullen.com/blog/corpus-of-american-tract-society-publications/}{corpus
   of American Tract Society publications} from the nineteenth century,
  gathered from the \href{https://archive.org/}{Internet Archive}.

  The sample data provided in the \code{extdata/legal} directory is taken
  from the following nineteenth-century codes of civil procedure from
  California and New York.

  \emph{Final Report of the Commissioners on Practice and Pleadings}, in 2
  \emph{Documents of the Assembly of New York}, 73rd Sess., No. 16 (1850):
  243-250, sections 597-613.
  \href{http://books.google.com/books?id=9HEbAQAAIAAJ&pg=PA243#v=onepage&q&f=false}{Google
   Books}.

  \emph{An Act To Regulate Proceedings in Civil Cases}, 1851 \emph{California
  Laws} 51, 51-53, sections 4-17; 101, sections 313-316.
  \href{http://books.google.com/books?id=4PHEAAAAIAAJ&pg=PA51#v=onepage&q&f=false}{Google
   Books}.
}
\seealso{
Useful links:
\itemize{
  \item \url{https://docs.ropensci.org/textreuse}
  \item \url{https://github.com/ropensci/textreuse}
  \item Report bugs at \url{https://github.com/ropensci/textreuse/issues}
}

}
\author{
\strong{Maintainer}: Lincoln Mullen \email{lincoln@lincolnmullen.com} (\href{https://orcid.org/0000-0001-5103-6917}{ORCID})

}
