piecemaker: Tools for Preparing Text for Tokenizers
Tokenizers break text into pieces that are more usable by machine learning models. Many tokenizers share some preparation steps. This package provides those shared steps, along with a simple tokenizer.
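
As a rough illustration of what "shared preparation steps" followed by a simple tokenizer can look like, here is a minimal base-R sketch. It is an assumption-laden stand-in, not the package's API: the helper names `squish_ws`, `pad_punct`, and `simple_tokenize` are hypothetical and exist only for this example, and the actual steps piecemaker performs may differ.

```r
# Illustrative only: base-R stand-ins for typical text preparation steps.
# These helper names are hypothetical and are NOT piecemaker functions.

squish_ws <- function(text) {
  # Collapse each run of whitespace to a single space, then trim the ends.
  trimws(gsub("[[:space:]]+", " ", text))
}

pad_punct <- function(text) {
  # Surround every punctuation character with spaces so that it
  # becomes its own piece when the text is split on spaces.
  gsub("([[:punct:]])", " \\1 ", text)
}

simple_tokenize <- function(text) {
  # Prepare the text, then split on single spaces.
  # Returns a list with one character vector of tokens per input string.
  prepared <- squish_ws(pad_punct(text))
  strsplit(prepared, " ", fixed = TRUE)
}

tokens <- simple_tokenize("Hello,   world!")
print(tokens[[1]])
```

Keeping preparation and splitting as separate functions mirrors the idea in the description: the preparation steps can be reused in front of a different tokenizer.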
Version: | 1.0.2 |
---|---|
Depends: | R (≥ 2.10) |
Imports: | cli, glue, rlang (≥ 0.4.2), stringi, stringr |
Suggests: | covr, testthat (≥ 3.0.0) |
Published: | 2023-06-02 |
DOI: | 10.32614/CRAN.package.piecemaker |
Author: | Jon Harmon |
Maintainer: | Jon Harmon |
BugReports: | https://github.com/macmillancontentscience/piecemaker/issues |
License: | Apache License (≥ 2) |
URL: | https://github.com/macmillancontentscience/piecemaker, https://macmillancontentscience.github.io/piecemaker/ |
NeedsCompilation: | no |
Materials: | README, NEWS |
CRAN checks: | piecemaker results |
Linking:
Please use the canonical form https://CRAN.R-project.org/package=piecemaker to link to this page.