pdftools: Text Extraction, Rendering and Converting of PDF Documents

Utilities based on 'libpoppler' <https://poppler.freedesktop.org> for extracting text, fonts, attachments and metadata from a PDF file. Also supports high quality rendering of PDF documents into PNG, JPEG, TIFF format, or into raw bitmap vectors for further processing in R.
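
The capabilities listed above can be sketched in a few lines of R. This is a minimal illustration, not an excerpt from the package documentation: it assumes pdftools and the poppler system library are installed, and it generates its own one-page sample PDF with base R graphics so that it does not depend on any external file. The variable names (pdf_file, png_file, and so on) are illustrative only.

```r
# Minimal sketch; assumes pdftools and its poppler system dependency are installed.
library(pdftools)

# Create a small one-page sample PDF with base R graphics (illustrative input only).
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file)
plot(1:10, main = "pdftools demo")
dev.off()

info  <- pdf_info(pdf_file)         # metadata: page count, PDF version, dates, ...
text  <- pdf_text(pdf_file)         # character vector, one element per page
fonts <- pdf_fonts(pdf_file)        # data frame describing the fonts in the document
files <- pdf_attachments(pdf_file)  # list of embedded files (none in this sample)

# Render page 1 into a raw bitmap array for further processing in R.
bitmap <- pdf_render_page(pdf_file, page = 1, dpi = 72)

# Convert page 1 to a PNG file on disk.
png_file <- tempfile(fileext = ".png")
pdf_convert(pdf_file, format = "png", pages = 1, filenames = png_file, dpi = 72)
```

Here info$pages gives the page count and text[1] holds the text of the first page. The bitmap returned by pdf_render_page can be passed on to packages such as png or webp (both listed under Suggests below) to write image files.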

Version: 3.5.0
Imports: Rcpp (≥ 0.12.12), qpdf
LinkingTo: Rcpp
Suggests: png, webp, tesseract, testthat
Published: 2025-03-03
DOI: 10.32614/CRAN.package.pdftools
Author: Jeroen Ooms [aut, cre]
Maintainer: Jeroen Ooms
BugReports: https://github.com/ropensci/pdftools/issues
License: MIT + file LICENSE
URL: https://ropensci.r-universe.dev/pdftools, https://docs.ropensci.org/pdftools/
NeedsCompilation: yes
SystemRequirements: Poppler C++ API: libpoppler-cpp-dev (deb) or poppler-cpp-devel (rpm), and poppler-data (rpm/deb) package.
CRAN checks: pdftools results

Reverse dependencies:

Reverse imports: bridger, chatAI4R, daiR, disclosuR, doconv, dtrackr, eurlex, findR, gdiff, huito, IDEATools, iheiddown, LJexm, LLMAgentR, mapscanner, mlts, OMICsPCA, PacketLLM, pdfsearch, pooledpeaks, RAGFlowChainR, readtext, revise, speech, staplr, SwimmeR, tall, tesseract, texor, TextForecast, tidyllm, timeLineGraphics, vmeasur
Reverse suggests: bagyo, caracas, easyr, fairadapt, fixest, flextable, fplot, gMOIP, gridGraphics, inlpubs, LexisNexisTools, magick, orderanalyzer, pagedown, patientProfilesVis, piecepackr, plotgardener, poldis, ricu, rjtools, RRphylo, seqArchRplus, slickR, spelling, TBox, texPreview, tm, troopdata, xmpdf

Linking:

Please use the canonical form https://CRAN.R-project.org/package=pdftools to link to this page.