module CSVDecision::Data
Methods to load data from a file, CSV string or an array of arrays.
@api private
Constants
- CSV_OPTIONS
Options passed to CSV.parse and CSV.read.
Public Class Methods
Returns true if the input is a File or Pathname object, otherwise false. A plain String is never treated as a file name.
@param data (see Parse.parse)
@return [Boolean] true if the input data is passed as a File or Pathname.
# File lib/csv_decision/data.rb, line 34
def self.input_file?(data)
  data.is_a?(Pathname) || data.is_a?(File)
end
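As a quick illustration of the check above, the following standalone sketch copies the predicate into a top-level method (the file name `rates.csv` is made up for the example). It shows that only `File` and `Pathname` objects count as file input, while a String that happens to look like a path does not.

```ruby
require 'pathname'

# Standalone copy of the predicate, for illustration only.
def input_file?(data)
  data.is_a?(Pathname) || data.is_a?(File)
end

# 'rates.csv' is a hypothetical name; Pathname.new does not touch the disk.
pathname_result = input_file?(Pathname.new('rates.csv'))
string_result   = input_file?('rates.csv')
array_result    = input_file?([%w[a b]])

puts pathname_result # true
puts string_result   # false
puts array_result    # false
```

Under this rule, a caller who wants a file to be read would wrap the path in `Pathname` (or pass an open `File`) rather than pass the path as a bare String.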
Strip the empty columns from the input data rows.
@param data (see Parse.parse)
@param empty_columns [Array<Index>]
@return [Array<Array<String>>] Data array stripped of empty columns.
# File lib/csv_decision/data.rb, line 43
def self.strip_columns(data:, empty_columns:)
  # Adjust column indices as we delete columns the rest shift to the left by 1
  empty_columns.map!.with_index { |col, index| col - index }

  # Delete all empty columns from the array of arrays
  empty_columns.each { |col| data.each_index { |row| data[row].delete_at(col) } }
end
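The index adjustment is easier to follow with concrete numbers. The sketch below is a standalone variant of the same technique, not the library's method: it assumes `empty_columns` is sorted in ascending order, leaves the caller's index list unmodified, and returns `data` explicitly. In the listing above, the last expression is `empty_columns.each`, which evaluates to `empty_columns`, so the rows are changed in place rather than returned.

```ruby
# Standalone sketch of the column-stripping technique (illustration only).
def strip_columns(data:, empty_columns:)
  # Each deletion shifts the remaining columns left by one, so the
  # i-th index to delete (in ascending order) must be reduced by i.
  adjusted = empty_columns.each_with_index.map { |col, index| col - index }

  # Delete every adjusted column from every row, in place.
  adjusted.each { |col| data.each { |row| row.delete_at(col) } }
  data
end

rows = [
  ['a', '', 'b', '', 'c'],
  ['1', '', '2', '', '3']
]

# Columns 1 and 3 are empty; after adjustment they become 1 and 2.
stripped = strip_columns(data: rows, empty_columns: [1, 3])
p stripped # [["a", "b", "c"], ["1", "2", "3"]]
```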
Parse the input data which may either be a file path name, CSV string or array of arrays. Strips out empty columns/rows and comment cells.
@param data (see Parse.parse)
@return [Array<Array<String>>] Data array stripped of empty rows.
# File lib/csv_decision/data.rb, line 26
def self.to_array(data:)
  strip_rows(data: data_array(data))
end
Private Class Methods
Parse the input data which may either be a file path name, CSV string or array of arrays.
# File lib/csv_decision/data.rb, line 53
def self.data_array(input)
  return CSV.read(input, CSV_OPTIONS) if input_file?(input)
  return input.deep_dup if input.is_a?(Array) && input[0].is_a?(Array)
  return CSV.parse(input, CSV_OPTIONS) if input.is_a?(String)

  raise ArgumentError,
        "#{input.class} input invalid; " \
        'input must be a file path name, CSV string or array of arrays'
end
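The dispatch on input type can be tried without the gem using only the standard `csv` library. The sketch below is a rough approximation under stated assumptions: `load_rows` and `SKETCH_CSV_OPTIONS` are hypothetical names, the option values are a guess because the real `CSV_OPTIONS` value is not shown in this section, and `map(&:dup)` is a shallow stand-in for `deep_dup`, which comes from ActiveSupport. The options are splatted as keyword arguments, which recent Ruby versions expect.

```ruby
require 'csv'
require 'pathname'

# Assumed options for this sketch; the real CSV_OPTIONS value is not shown here.
SKETCH_CSV_OPTIONS = { skip_blanks: true }.freeze

# Hypothetical stand-in for data_array.
def load_rows(input)
  # File and Pathname both respond to #to_path.
  if input.is_a?(Pathname) || input.is_a?(File)
    return CSV.read(input.to_path, **SKETCH_CSV_OPTIONS)
  end

  # Copy each row so later in-place stripping does not touch the caller's rows.
  return input.map(&:dup) if input.is_a?(Array) && input[0].is_a?(Array)
  return CSV.parse(input, **SKETCH_CSV_OPTIONS) if input.is_a?(String)

  raise ArgumentError, "#{input.class} input invalid"
end

from_string = load_rows("in :country,out :rate\n\nUS,0.1\n")
from_array  = load_rows([['in :country', 'out :rate'], ['US', '0.1']])

p from_string
p from_array
```

Both calls produce the same two rows; the blank line in the CSV string is dropped by the `skip_blanks` option assumed here. Any other input type, such as an Integer, raises `ArgumentError`.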
# File lib/csv_decision/data.rb, line 82
def self.strip_cell(cell)
  return '' unless cell.is_a?(String)
  cell = cell.force_encoding('UTF-8')
  return '' unless cell.ascii_only?
  return '' if cell.lstrip[0] == COMMENT_CHARACTER
  cell.strip
end
Strip cells of leading/trailing spaces; treat comments as empty cells. Non-string values are treated as empty cells. Non-ASCII strings are treated as empty cells by default.
# File lib/csv_decision/data.rb, line 77
def self.strip_cells(row:)
  row.map! { |cell| strip_cell(cell) }
end
# File lib/csv_decision/data.rb, line 64
def self.strip_rows(data:)
  rows = []
  data.each do |row|
    row = strip_cells(row: row)
    rows << row if row.find { |cell| cell != '' }
  end
  rows
end
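Taken together, the cell and row helpers above amount to a small cleaning pass. The following sketch shows that pass on sample data; `clean_cell` and `clean_rows` are hypothetical names, the sample rows are made up, and the comment character is assumed to be `#` because the value of `COMMENT_CHARACTER` is not shown in this section. Unlike the listings above, it builds new arrays instead of modifying the input in place.

```ruby
# Assumed value of COMMENT_CHARACTER; not shown in this section.
SKETCH_COMMENT_CHARACTER = '#'

# Hypothetical stand-in for strip_cell.
def clean_cell(cell)
  return '' unless cell.is_a?(String)

  text = cell.dup.force_encoding('UTF-8')
  return '' unless text.ascii_only?
  return '' if text.lstrip.start_with?(SKETCH_COMMENT_CHARACTER)

  text.strip
end

# Hypothetical stand-in for strip_rows: clean every cell,
# then drop rows whose cells are all empty.
def clean_rows(data)
  data.map { |row| row.map { |cell| clean_cell(cell) } }
      .reject { |row| row.all?(&:empty?) }
end

cleaned = clean_rows(
  [
    ['  in :country ', 'out :rate', '# a comment'],
    [nil, '', '   '],
    ["caf\u00E9", ' 0.2 ', 42]
  ]
)

p cleaned # [["in :country", "out :rate", ""], ["", "0.2", ""]]
```

The second input row disappears because every cell cleans to an empty string. The third row survives with only its middle cell, since the non-ASCII string and the Integer both become empty cells.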