Title: | PDF Tools Based on Poppler |
---|---|
Description: | PDF tools based on the Poppler PDF rendering library. See <http://poppler.freedesktop.org/> for more information on Poppler. |
Authors: | Kurt Hornik [aut, cre] |
Maintainer: | Kurt Hornik <[email protected]> |
License: | GPL-2 |
Version: | 0.1-3 |
Built: | 2024-10-31 20:34:37 UTC |
Source: | CRAN |
Create a reference to a Portable Document Format (PDF) file for use in subsequent information extraction from the file.
PDF_doc(file)
file |
A character string giving the path to a PDF file. |
A reference to a PDF file (external pointer object).
file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
doc <- PDF_doc(file)
## Can now use the reference for information extraction, avoiding
## the creation of new PopplerDocument objects when doing so.
PDF_info(doc)
PDF_fonts(doc)
Obtain the fonts used in a Portable Document Format (PDF) file and further information about these fonts.
PDF_fonts(file)
file |
A character string giving the path to a PDF file, or a
reference to a PDF file as returned by PDF_doc. |
A data frame inheriting from PDF_fonts (which has a useful print
method), with the following variables:
name |
the full name of the font (character) |
type |
the font type (Type 1, Type 3, etc.; character) |
file |
the file name of the font (character; empty if the font is embedded) |
emb |
whether the font is embedded in the PDF file or not (logical) |
sub |
whether the font is a subset of another font (logical) |
file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
PDF_fonts(file)
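A common use of the font table is to check whether a PDF relies on fonts that are not embedded. The following is a minimal sketch, not part of the package: `non_embedded_fonts` is a hypothetical helper, and the data frame is made-up illustration data that only mimics the variables listed above (`name`, `type`, `file`, `emb`, `sub`). The commented call at the end assumes the package is installed under the name Rpoppler.

```r
## Hypothetical helper (not part of the package): keep only the fonts
## that are not embedded, given a data frame with the documented columns.
non_embedded_fonts <- function(fonts) {
    stopifnot(is.data.frame(fonts),
              all(c("name", "type", "file", "emb", "sub") %in% names(fonts)))
    fonts[!fonts$emb, c("name", "type", "file"), drop = FALSE]
}

## Made-up stand-in for a PDF_fonts() result, for illustration only.
fonts <- data.frame(
    name = c("ABCDEF+CMR10", "Helvetica"),
    type = c("Type 1", "Type 1"),
    file = c("", "/usr/share/fonts/helvetica.pfb"),
    emb  = c(TRUE, FALSE),
    sub  = c(TRUE, FALSE),
    stringsAsFactors = FALSE
)
not_embedded <- non_embedded_fonts(fonts)
print(not_embedded)

## With the package installed (assumed here to be Rpoppler):
## non_embedded_fonts(Rpoppler::PDF_fonts(file))
```

An empty result suggests that every font the file uses is embedded.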
Extract document information from a Portable Document Format (PDF) file.
PDF_info(file)
file |
A character string giving the path to a PDF file, or a
reference to a PDF file as returned by PDF_doc. |
An object of class PDF_info (which has useful format and print
methods), containing the information in the PDF Info dictionary
(title, subject, keywords, author, creator, producer, creation date,
modification date) as well as the number of pages and the page sizes,
whether the document is optimized (linearized), and the PDF version it
uses.
file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
PDF_info(file)
Extract text from a Portable Document Format (PDF) file.
PDF_text(file)
file |
A character string giving the path to a PDF file, or a
reference to a PDF file as returned by PDF_doc. |
A character vector with the extracted text of each page.
file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
PDF_text(file)
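Because the result holds the text of each page, locating the pages that mention a term comes down to ordinary character vector operations. The sketch below assumes one vector element per page; `pages_containing` is a hypothetical helper, not part of the package, and the page texts are made up for illustration. The commented call at the end assumes the package is installed under the name Rpoppler.

```r
## Hypothetical helper (not part of the package): page numbers whose
## text contains a fixed (non-regex) string.
pages_containing <- function(pages, pattern) {
    stopifnot(is.character(pages),
              is.character(pattern), length(pattern) == 1L)
    which(grepl(pattern, pages, fixed = TRUE))
}

## Made-up page texts standing in for a PDF_text() result.
pages <- c("Sweave User Manual",
           "Noweb syntax and code chunks",
           "Tangling and weaving code")
hits <- pages_containing(pages, "code")
print(hits)

## With the package installed (assumed here to be Rpoppler):
## pages_containing(Rpoppler::PDF_text(file), "Sweave")
```

Matching is case-sensitive here; lower-casing both arguments with tolower() first would relax that.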