add_content {inlpubs}    R Documentation

Add Content from PDF Documents

Description

Incorporate the text or cover image from a PDF document into the inlpubs package.

Usage

add_content(
  pub_id,
  year,
  type = c("text", "image"),
  ...,
  srcdir = "archive",
  destdir = tempdir(),
  ignore = NULL,
  pubs = inlpubs::pubs,
  overwrite = FALSE
)

Arguments

pub_id

'character' vector. Unique identifier for the publication. Publications may alternatively be specified by year of publication (see the year argument).

year

'integer' vector. Year of publication.

type

'character' string. Type of content to extract from the PDF file. Specify as either "text" (the default) or "image".

...

Additional arguments passed to the extraction function: extract_pdf_text when type is "text", or extract_pdf_image when type is "image".

srcdir

'character' string. Source directory. The PDF document is located in a subdirectory of the source directory that is named after the publication year. Defaults to the "archive" directory in the working directory.

destdir

'character' string. Destination directory for the cover image, which is saved in JPEG format. Defaults to the temporary directory.

ignore

'character' vector. Publication identifier(s) to ignore.

pubs

'pub' table. Publications of the INLPO; see the pubs dataset for the data format.

overwrite

'logical' flag. Whether to overwrite an existing text or image file.
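
Details

The Usage signature gives type = c("text", "image"), which is the conventional R idiom for an argument resolved with match.arg, where the first choice acts as the default. The sketch below illustrates that idiom and the mapping from type to the extraction function named under the ... argument. It is an assumption about how such an argument is typically handled, not the package's implementation; resolve_type is a hypothetical helper.

```r
# Hypothetical stand-in for illustration; not part of inlpubs.
resolve_type <- function(type = c("text", "image")) {
  # match.arg() picks the first choice when 'type' is not supplied
  # and raises an error for values outside the allowed set.
  type <- match.arg(type)

  # Name of the extraction function documented for each content type.
  extractor <- switch(type,
    text = "extract_pdf_text",
    image = "extract_pdf_image"
  )

  list(type = type, extractor = extractor)
}

default_choice <- resolve_type()
image_choice <- resolve_type("image")
```

Under this idiom, a value such as "figure" would raise an error rather than fall back silently to the default.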

Value

Returns the path to the saved text or image file, invisibly.

Author(s)

J.C. Fisher, U.S. Geological Survey, Idaho Water Science Center
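
Examples

A minimal sketch, not taken from the package's own examples, of the directory layout described for srcdir and of a call that uses the documented arguments. The identifier "ExamplePub2020" is hypothetical, and the assumption that the PDF file is named after pub_id is not confirmed by this page. The call itself is left commented out because it requires the inlpubs package, a real PDF at the assumed location, and an identifier that exists in the pubs table.

```r
# Hypothetical publication identifier and year, for illustration only.
pub_id <- "ExamplePub2020"
year <- 2020L

# Layout described for 'srcdir': a subdirectory named after the publication year.
srcdir <- file.path(tempdir(), "archive")
yeardir <- file.path(srcdir, as.character(year))
dir.create(yeardir, recursive = TRUE, showWarnings = FALSE)

# Assumption: the PDF file is named after the publication identifier.
pdf_path <- file.path(yeardir, paste0(pub_id, ".pdf"))

# Not run: needs inlpubs, a real PDF at 'pdf_path', and a valid identifier.
# path <- inlpubs::add_content(
#   pub_id = pub_id,
#   year = year,
#   type = "image",
#   srcdir = srcdir,
#   destdir = tempdir(),
#   overwrite = TRUE
# )
```

Because the path is returned invisibly, assigning the result (as in the commented call) is the way to capture the location of the saved file.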


[Package inlpubs version 1.1.3 Index]