class CSVH::Reader

Sequentially and lazily reads from CSV-formatted data that has a header row. The headers can be accessed before any subsequent data rows are read, and even when the data contains no rows after the header.
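The behavior described above can be sketched with Ruby's standard `csv` library alone. This is not the class's own implementation: `header_names` is a hypothetical helper, and the `headers: true, return_headers: true` options are an assumption about what the reader's defaults amount to.

```ruby
require 'csv'

# Hypothetical helper: returns the header names from CSV text by reading
# only the first row, whether or not any data rows follow it.
def header_names(data)
  csv = CSV.new(data, headers: true, return_headers: true)
  first = csv.shift # with :return_headers, the first row is the header row
  first&.header_row? ? first.headers : nil
end

header_names("name,age\n")          # header-only input still yields the headers
header_names("name,age\nAda,36\n")  # the data row is left unread
```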

Constants

DEFAULT_CSV_OPTS

Public Class Methods

foreach(file_path, **opts)
Alias for: from_file
from_file(file_path, **opts) { |instance| ... }

When called without a block argument, returns an open reader for data from the file at the given file_path.

When called with a block argument, passes an open reader for data from the file to the given block, closes the reader (and its underlying file IO channel) before returning, and then returns the value that was returned by the block.

By default, the underlying CSV object is initialized with default options for data with a header row and to return the header row. Any additional options you supply will be added to those defaults or will override them.
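As a rough illustration of the block form and of how supplied options override defaults, here is a sketch built on the standard `csv` library. The names `assumed_defaults`, `with_csv_file`, and `count_data_rows` are hypothetical, and the default option values are an assumption, since the actual value of `DEFAULT_CSV_OPTS` is not shown on this page.

```ruby
require 'csv'
require 'tempfile'

# Assumed defaults; the real DEFAULT_CSV_OPTS may differ.
def assumed_defaults
  { headers: true, return_headers: true }
end

# Hypothetical stand-in for the block form: open the file, yield, always
# close, and return whatever the block returned.
def with_csv_file(path, **opts)
  io = File.open(path, 'r')
  csv = CSV.new(io, **assumed_defaults.merge(opts)) # caller's options win
  begin
    yield csv
  ensure
    csv.close unless csv.closed?
  end
end

# Counts the rows that follow the header row.
def count_data_rows(path, **opts)
  with_csv_file(path, **opts) do |csv|
    csv.shift # consume the header row
    csv.count # remaining rows
  end
end

Tempfile.create('csvh_sketch') do |f|
  f.write("a;b\n1;2\n3;4\n")
  f.flush
  puts count_data_rows(f.path, col_sep: ';') # col_sep is added to the defaults
end
```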

A [Reader] created using this method delegates all of the same IO methods that a `CSV` created using `CSV.open` does, except `close_write`, `flush`, `fsync`, `sync`, `sync=`, and `truncate`. You may call:

  • binmode()

  • binmode?()

  • close()

  • close_read()

  • closed?()

  • eof()

  • eof?()

  • external_encoding()

  • fcntl()

  • fileno()

  • flock()

  • flush()

  • internal_encoding()

  • ioctl()

  • isatty()

  • path()

  • pid()

  • pos()

  • pos=()

  • reopen()

  • seek()

  • fstat()

  • tell()

  • to_i()

  • to_io()

  • tty?()

@param file_path [String] the path of the file to read.
@param opts options for `CSV.new`.
@yieldparam [Reader] the new reader.
@return [Reader, object] the new reader or the value returned from the given block.
# File lib/csvh/reader.rb, line 71
def from_file(file_path, **opts)
  opts = default_csv_opts.merge(opts)
  io = File.open(file_path, 'r')
  csv = CSV.new(io, **opts)
  instance = new(csv)

  if block_given?
    begin
      yield instance
    ensure
      instance.close unless instance.closed?
    end
  else
    instance
  end
end
Also aliased as: foreach
from_string_or_io(data, **opts)

Returns an open reader for data from the given string or readable IO stream.

@param data [String, IO] the source of the data to read.
@param opts options for `CSV.new`.
@return [Reader] the new reader.

# File lib/csvh/reader.rb, line 94
def from_string_or_io(data, **opts)
  opts = default_csv_opts.merge(opts)
  csv = CSV.new(data, **opts)
  new(csv)
end
Also aliased as: parse
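The source above passes `data` straight to `CSV.new`, which accepts either a String or a readable IO. A small sketch with the standard library shows both inputs behaving the same way; `first_data_row` is a hypothetical helper and the default options are assumed.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper: returns the first row after the header row,
# from either a String or a readable IO.
def first_data_row(data, **opts)
  csv = CSV.new(data, **{ headers: true, return_headers: true }.merge(opts))
  csv.shift # header row
  csv.shift # first data row, or nil if there is none
end

first_data_row("a,b\n1,2\n").to_h               # from a String
first_data_row(StringIO.new("a,b\n1,2\n")).to_h # from an IO
```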
new(csv)

Returns a new reader based on the given CSV object. The CSV object must be configured to return a header row (a `CSV::Row` that returns true from its `#header_row?` method) as its first item. The header row must also not have been read yet.

@param csv [CSV] a Ruby `::CSV` object.

# File lib/csvh/reader.rb, line 116
def initialize(csv)
  unless csv.return_headers?
    raise \
      InappropreateCsvInstanceError,
      "#{self.class} requires a CSV instance that returns headers." \
      " It needs to have been initialized with non-false/nil values" \
      " for :headers and :return_headers options."
  end
  @csv = csv
end
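The guard in the constructor relies on `CSV#return_headers?`. The following sketch shows which standard `CSV` objects pass such a check. `ensure_returns_headers!` is a hypothetical helper, and `ArgumentError` stands in for the library's own error class.

```ruby
require 'csv'

# Hypothetical guard mirroring the constructor's check.
def ensure_returns_headers!(csv)
  return csv if csv.return_headers?

  raise ArgumentError, 'expected a CSV created with :headers and :return_headers'
end

ok  = CSV.new("a,b\n1,2\n", headers: true, return_headers: true)
bad = CSV.new("a,b\n1,2\n", headers: true) # :return_headers defaults to false

ensure_returns_headers!(ok) # returns the CSV object unchanged
begin
  ensure_returns_headers!(bad)
rescue ArgumentError => e
  puts e.message
end
```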
parse(data, **opts)
Alias for: from_string_or_io

Private Class Methods

default_csv_opts()
# File lib/csvh/reader.rb, line 105
def default_csv_opts
  DEFAULT_CSV_OPTS
end

Public Instance Methods

each() { |row| ... }

When given a block, yields each remaining data row of the data source in turn as a `CSV::Row` instance. When called without a block, returns an Enumerator over those rows.

Never yields the header row. The headers are available via the headers method of either the reader or the row object.

@yieldparam [CSV::Row]

# File lib/csvh/reader.rb, line 210
def each
  headers
  if block_given?
    @csv.each { |row| yield row }
  else
    @csv.each
  end
end
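The two calling styles can be tried out with the standard `csv` library. In this sketch, `data_row_enum` is a hypothetical helper that consumes the header row first (as the source above does by calling `headers`) and then returns the Enumerator from the block-less `each`.

```ruby
require 'csv'

# Hypothetical helper: reads the header row, then returns an Enumerator
# over the remaining data rows.
def data_row_enum(data)
  csv = CSV.new(data, headers: true, return_headers: true)
  csv.shift # header row is consumed here, so it is never enumerated
  csv.each  # no block given: returns an Enumerator
end

rows = data_row_enum("name,age\nAda,36\nAlan,41\n")
names = rows.map { |row| row['name'] } # each row still knows its headers
puts names.inspect
```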
gets()
Alias for: shift
headers()

Returns the list of column header values from the CSV data.

If any rows have already been read, then the result is immediately returned, having been recorded when the header row was initially encountered.

If no rows have been read yet, then the first row is read from the data in order to return the result.

@return [Array<String>] the column header names.

# File lib/csvh/reader.rb, line 142
def headers
  @headers ||= begin
    row = @csv.readline
    unless row.header_row?
      raise \
        CsvPrematurelyShiftedError,
        "the header row was prematurely read from the underlying CSV object."
    end
    row.headers
  end
end
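To make the memoization concrete, here is a stripped-down stand-in for the pattern in the source above. `HeaderPeek` is a hypothetical class, not part of the library, and it omits the premature-read check; it assumes non-empty input.

```ruby
require 'csv'

# Hypothetical, simplified stand-in: reads the header row at most once and
# never hands it out from #shift.
class HeaderPeek
  def initialize(csv)
    @csv = csv
  end

  # The first call reads the header row; later calls reuse the stored value.
  def headers
    @headers ||= @csv.shift.headers
  end

  # Fetches the headers first, so the row returned is always a data row.
  def shift
    headers
    @csv.shift
  end
end

peek = HeaderPeek.new(CSV.new("a,b\n1,2\n", headers: true, return_headers: true))
peek.headers  # reads the header row
peek.headers  # no further read
peek.shift    # the first data row, not the header row
```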
read()

Slurps the remaining data rows and returns a `CSV::Table`.

This is essentially the same behavior as `CSV#read`, but ensures that the header info has been fetched first, and the resulting table will never include the header row.

Note that the Ruby documentation for `CSV#read` (at least as of 2.2.2) is incomplete: it says only that the method returns “an Array of Arrays”, but it actually returns a table if a truthy `:headers` option was used when creating the `CSV` object.

@return [CSV::Table] a table of remaining unread rows

# File lib/csvh/reader.rb, line 254
def read
  headers
  @csv.read
end
Also aliased as: readlines
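The note about `CSV#read` returning a table can be checked directly against the standard library. In this sketch, `read_remaining` is a hypothetical helper that consumes the header row and then slurps the rest.

```ruby
require 'csv'

# Hypothetical helper: reads the header row, then slurps the remaining rows.
def read_remaining(data)
  csv = CSV.new(data, headers: true, return_headers: true)
  csv.shift # header row
  csv.read  # a CSV::Table, because :headers is truthy
end

table = read_remaining("name,age\nAda,36\nAlan,41\n")
puts table.class
puts table['name'].inspect # column access by header name
```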
readline()
Alias for: shift
readlines()
Alias for: read
shift()

Pulls a single data row from the data source, parses it, and returns it as a `CSV::Row`.

This is essentially the same behavior as `CSV#shift`, but ensures that the header info has been fetched first, and shift will never return the header row.

@return [CSV::Row] the next previously unread row

# File lib/csvh/reader.rb, line 269
def shift
  headers
  @csv.shift
end
Also aliased as: gets, readline
to_csvh_reader()

@return [Reader] the target of the method call.

# File lib/csvh/reader.rb, line 128
def to_csvh_reader
  self
end