class FileConcat::PrePostConcat

A file concatenator that supports optional pre- and post-processing hooks: callables that can supply an extra line before and after each source file and each batch file. The concatenation itself is delegated to the system's cat command wherever possible, so the work benefits from the I/O optimizations already built into the operating system.

Attributes

batch_count[R]
batch_first_line[RW]
batch_last_line[RW]
file_first_line[RW]
file_last_line[RW]
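
The source listings on this page suggest how each attribute is invoked: batch_first_line receives the batch file name, batch_last_line receives the batch file path, and file_first_line and file_last_line each receive the source file path. A minimal sketch of hooks matching those call shapes follows. The XML wrapper text and the lambda names are hypothetical, chosen only for illustration.

```ruby
# Hypothetical hooks; each is any object that responds to #call.
# Returning nil (or, for the per-file hooks, a blank string) writes nothing.
batch_header = ->(file_name) { "<batch name=\"#{file_name}\">" }
batch_footer = ->(_file_path) { '</batch>' }
file_header  = ->(source) { "<!-- begin #{File.basename(source)} -->" }
file_footer  = ->(source) { "<!-- end #{File.basename(source)} -->" }

# Call them by hand to see what each would contribute to a batch file.
puts batch_header.call('Batch-00001.xml')
puts file_header.call('/data/in/a.xml')
puts file_footer.call('/data/in/a.xml')
puts batch_footer.call('/path/to/output/Batch-00001.xml')
```

Assuming an instance named concat, the hooks would be assigned through the attribute writers, for example concat.batch_first_line = batch_header.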

Public Class Methods

new(file_provider, target_dir, target_ext, batch_size)

This method does:

Creates a new instance of our PrePostConcat processor.

With params:

@param file_provider <Proc> - the provider of files to concatenate
@param target_dir <String> - the path to store the batch files in
@param target_ext <String> - the extension for the batch files
@param batch_size <Integer> - the total number of files to concatenate into one

And returns:

@return instance <FileConcat::PrePostConcat> - the new processor, with batch_count initialized to 0

And is used like:

FileConcat::PrePostConcat.new my_proc, '/path/to/output', 'xml', 25

# File lib/file_concat/pre_post_concat.rb, line 36
def initialize(file_provider, target_dir, target_ext, batch_size)
  @file_provider = file_provider
  @target_dir = target_dir
  @target_ext = target_ext
  @batch_size = batch_size
  @batch_count = 0
end
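
The run method calls file_provider with no arguments and iterates over the result with each, so the provider can be any callable that returns a collection of file paths. The sketch below builds one such provider from Dir.glob. The input directory, file names, and the decision to sort are assumptions made for the example, not requirements of the class.

```ruby
require 'tmpdir'

names = Dir.mktmpdir do |input_dir|
  # Create a few empty stand-in files; only the .xml ones should be picked up.
  %w[b.xml a.xml notes.txt].each do |name|
    File.write(File.join(input_dir, name), '')
  end

  # Hypothetical provider: a lambda returning the sorted .xml paths.
  file_provider = -> { Dir.glob(File.join(input_dir, '*.xml')).sort }

  file_provider.call.map { |path| File.basename(path) }
end

puts names.inspect
```

Sorting makes the concatenation order deterministic, since Dir.glob does not guarantee an order on every platform.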

Public Instance Methods

run()

This method does:

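Executes the concatenation. The target directory is deleted if it already exists and then recreated, and the provided files are appended into batch files holding up to batch_size source files each.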

With params:

N/A

And returns:

@return batch_count <Integer> - the total number of batch files created

And is used like:

batches = instance.run

# File lib/file_concat/pre_post_concat.rb, line 60
def run
  # reset the batch count
  @batch_count = 0

  # if the target doesn't exist, create it... if it does, empty it
  FileUtils.remove_dir @target_dir if Dir.exist? @target_dir
  Dir.mkdir @target_dir

  # loop over all the files
  current_file_index = 0
  active_batch_file = nil

  @file_provider.call.each do |next_file|
    # create a new file to concatenate to
    current_file_index += 1
    if (current_file_index - 1) % @batch_size == 0
      @batch_count += 1
      end_batch_file(active_batch_file) unless active_batch_file.nil?
      active_batch_file = new_batch_file
    end

    # concatenate the file
    concat_to active_batch_file, next_file
  end
  end_batch_file(active_batch_file) unless active_batch_file.nil?

  # return the batch count
  @batch_count
end
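
The batching rule in run opens a new batch file whenever the zero-based position of the current file is a multiple of batch_size. The following sketch reproduces only that arithmetic on stand-in data (the file names and the batch size of 3 are hypothetical) and compares it with Enumerable#each_slice, which produces the same grouping.

```ruby
# Stand-in data: seven hypothetical source paths, three files per batch.
files = %w[a.xml b.xml c.xml d.xml e.xml f.xml g.xml]
batch_size = 3

# Same trigger as in run: a new batch opens on files 1, 4, 7, ...
batch_count = 0
batches = Hash.new { |hash, key| hash[key] = [] }
files.each_with_index do |file, index|
  batch_count += 1 if (index % batch_size).zero?
  batches[batch_count] << file
end

batches.each { |number, members| puts "Batch #{number}: #{members.join(', ')}" }

# each_slice yields the same grouping in a single call.
grouped = files.each_slice(batch_size).to_a
```

The final batch may hold fewer than batch_size files, which is why run closes the active batch once more after the loop.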

Private Instance Methods

blank?(string)

This method does:

Utility method to test whether a value is nil, an empty string, or a string containing only whitespace.

With params:

@param string <String> - the string to test

And returns:

@return blank <Boolean> - whether or not the string is blank

And is used like:

blank? 'something'

# File lib/file_concat/pre_post_concat.rb, line 195
def blank?(string)
  string.to_s.strip.empty?
end
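
Because the helper calls to_s before stripping, it accepts nil and non-string values as well as strings. The sketch below uses a standalone copy of the method, reproduced here only for illustration, with a few sample inputs.

```ruby
# Standalone copy of the helper, for illustration only.
def blank?(string)
  string.to_s.strip.empty?
end

# nil, the empty string, and whitespace-only strings count as blank;
# any value whose string form has visible characters does not.
[nil, '', "  \n", 'something', 0].each do |value|
  puts "#{value.inspect} => #{blank?(value)}"
end
```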

concat_to(target_file, source_file)

This method does:

Concatenates the source file to the target file.

With params:

@param target_file <String> - path to the batch file to append to
@param source_file <String> - path to the file to append to the batch file

And returns:

N/A

And is used like:

concat_to(target_file, source_file)

# File lib/file_concat/pre_post_concat.rb, line 164
def concat_to(target_file, source_file)
  # get the first line
  first_line = file_first_line.nil? ? nil : file_first_line.call(source_file)
  File.open(target_file, 'a') { |f| f.puts first_line } unless blank? first_line

  # concat the file
  system("cat \"#{source_file}\" >> \"#{target_file}\"")
  exit_code = $?.exitstatus
  raise StandardError, "Error concatenating files : Target[#{target_file}], Source[#{source_file}]" unless exit_code.zero?

  # get the last line
  last_line = file_last_line.nil? ? nil : file_last_line.call(source_file)
  File.open(target_file, 'a') { |f| f.puts last_line } unless blank? last_line
end
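
The listing above passes both paths through a shell command string, so a file name containing a double quote or a "$" could be misread by the shell. As a hedged alternative, not the library's implementation, the sketch below appends a file with IO.copy_stream, which involves no shell. The helper name append_file and the sample file names are hypothetical.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of FileConcat): appends source_path to
# target_path without a shell, so quotes or "$" in a name need no escaping.
def append_file(target_path, source_path)
  File.open(target_path, 'ab') do |target|
    IO.copy_stream(source_path, target)
  end
end

result = Dir.mktmpdir do |dir|
  source = File.join(dir, 'has $dollar.txt')
  target = File.join(dir, 'Batch-00001.txt')
  File.write(source, "hello\n")

  # Append the same source twice to show that the target accumulates content.
  2.times { append_file(target, source) }
  File.read(target)
end

puts result
```

With this approach a missing source file raises an Errno::ENOENT exception instead of producing a non-zero exit code, so the explicit exit status check would no longer be needed.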

end_batch_file(file)

This method does:

Ends a batch file.

With params:

@param file <String> - path to the batch file we are finishing work with

And returns:

N/A

And is used like:

end_batch_file(active_batch_file)

# File lib/file_concat/pre_post_concat.rb, line 140
def end_batch_file(file)
  unless batch_last_line.nil?
    last_line = batch_last_line.call(file)
    File.open(file, 'a') { |f| f.puts last_line } unless last_line.nil?
  end
end

new_batch_file()

This method does:

Sets up the new batch file to use for processing.

With params:

N/A

And returns:

@return file_path <String> - the complete path to the next batch file

And is used like:

next_file = new_batch_file

# File lib/file_concat/pre_post_concat.rb, line 108
def new_batch_file
  # basic setup
  formatted_index = batch_count.to_s.rjust(5, '0')
  file_name = "Batch-#{formatted_index}.#{@target_ext}"
  file_path = File.join(@target_dir, file_name)

  # if we have a batch first line, use it
  unless batch_first_line.nil?
    first_line = batch_first_line.call(file_name)
    File.open(file_path, 'w') { |f| f.puts first_line } unless first_line.nil?
  end

  # exit
  file_path
end
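
The naming scheme pads the batch number to five digits and joins it with the target directory and extension. The sketch below applies the same expressions to a few batch numbers. The directory and extension values are hypothetical.

```ruby
target_dir = '/path/to/output' # hypothetical values
target_ext = 'xml'

# Same expressions as in new_batch_file, applied to sample batch numbers.
paths = [1, 2, 10].map do |batch_count|
  formatted_index = batch_count.to_s.rjust(5, '0')
  File.join(target_dir, "Batch-#{formatted_index}.#{target_ext}")
end

puts paths
```

Zero-padding keeps the batch files in numeric order when they are sorted by name. Since rjust never truncates, a batch number above 99999 would simply produce a longer name.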