Package: wordpiece
Type: Package
Title: R Implementation of Wordpiece Tokenization
Version: 1.0.2
Authors@R: c(
    person(given = "Jonathan",
           family = "Bratt",
           role = c("aut", "cre"),
           email = "jonathan.bratt@macmillan.com",
           comment = c(ORCID = "0000-0003-2859-0076")),
    person(given = "Jon",
           family = "Harmon",
           role = c("aut"),
           email = "jonthegeek@gmail.com",
           comment = c(ORCID = "0000-0003-4781-4346")),
    person(given = "Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning",
           role = c("cph"))
    )
Description: Apply 'Wordpiece' (<arXiv:1609.08144>) tokenization to input text,
 given an appropriate vocabulary. The 'BERT' (<arXiv:1810.04805>) tokenization
 conventions are used by default.
Encoding: UTF-8
LazyData: true
URL: https://github.com/jonathanbratt/wordpiece
BugReports: https://github.com/jonathanbratt/wordpiece/issues
Depends: R (>= 3.3.0)
License: Apache License (>= 2)
RoxygenNote: 7.1.1
Imports: digest (>= 0.6.5), purrr (>= 0.2.3), rappdirs (>= 0.3),
        stringi (>= 1.0)
Suggests: testthat (>= 2.1.0), knitr, rmarkdown, covr
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2021-02-08 16:41:02 UTC; jonathan.bratt
Author: Jonathan Bratt [aut, cre] (<https://orcid.org/0000-0003-2859-0076>),
  Jon Harmon [aut] (<https://orcid.org/0000-0003-4781-4346>),
  Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer: Jonathan Bratt <jonathan.bratt@macmillan.com>
Repository: CRAN
Date/Publication: 2021-02-11 15:40:06 UTC
