class EncodingEstimator::Conversion

Class that represents the conversion of a string to or from another encoding.

Constants

DEFAULT_TARGET_ENCODING

Attributes

encoding[R]
operation[R]

Public Class Methods

default()

Get the default (UTF-8) conversion. Performing it returns the string unchanged.

@return [EncodingEstimator::Conversion] A conversion object representing the default conversion

# File lib/encoding_estimator/conversion.rb, line 64
def self.default
  EncodingEstimator::Conversion.new
end

generate( encodings = %w(utf-8 iso-8859-1 Windows-1251), operations = [Operation::ENCODE, Operation::DECODE ], include_no_change = true )

Generate all conversions for the given encodings and operations. Note: this produces encodings * operations conversions if the default is not included, and encodings * operations + 1 if the default is included.

@param [Array<String>] encodings Names of the encodings to generate conversions for
@param [Array<Symbol>] operations Operations describing which conversions (encode/decode/keep) to include
@param [Boolean] include_no_change Include the default conversion (keep UTF-8) in the list
@return [Array<Conversion>] List of conversions generated from the encodings and operations

# File lib/encoding_estimator/conversion.rb, line 76
def self.generate(
    encodings  = %w(utf-8 iso-8859-1 Windows-1251),
    operations = [Operation::ENCODE, Operation::DECODE ],
    include_no_change = true
)

  conversions = include_no_change ? [ Conversion.new ] : []

  encodings.each do |encoding|
    conversions = conversions + operations.map { |operation| Conversion.new( encoding, operation )  }
  end

  conversions
end
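
The count stated in the note can be checked with a small stand-alone sketch. It does not load the gem: `SketchConversion`, `sketch_generate`, and the `:encode`/`:decode`/`:keep` symbols are hypothetical stand-ins for `Conversion`, `generate`, and the `Operation` constants, whose actual values are not shown on this page.

```ruby
# Hypothetical stand-in for EncodingEstimator::Conversion (assumption, not the gem's class).
SketchConversion = Struct.new(:encoding, :operation)

# Same loop as the documented generate, written against the stand-in.
def sketch_generate(encodings, operations, include_no_change)
  # 'utf-8' and :keep are assumed defaults for the "no change" conversion.
  conversions = include_no_change ? [SketchConversion.new('utf-8', :keep)] : []
  encodings.each do |encoding|
    conversions += operations.map { |operation| SketchConversion.new(encoding, operation) }
  end
  conversions
end

encodings  = %w(utf-8 iso-8859-1 Windows-1251)
operations = [:encode, :decode] # assumed symbols

with_default    = sketch_generate(encodings, operations, true)
without_default = sketch_generate(encodings, operations, false)

puts with_default.size    # 3 encodings * 2 operations + 1
puts without_default.size # 3 encodings * 2 operations
```

Note that the default arguments include utf-8 itself, so the generated list contains UTF-8 encode and decode conversions in addition to the optional no-change entry.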

new( encoding = DEFAULT_TARGET_ENCODING, operation = Operation::KEEP )

Initialize a new conversion object from an encoding and tell it whether to convert from it or to it

@param [String] encoding Encoding to convert to/from
@param [Symbol] operation Whether to convert from that encoding or to it

# File lib/encoding_estimator/conversion.rb, line 21
def initialize( encoding = DEFAULT_TARGET_ENCODING, operation = Operation::KEEP )
  @encoding  = encoding
  @operation = operation
end

Public Instance Methods

equals?( other )

Check whether two conversions represent the same operation.

@param [EncodingEstimator::Conversion] other Conversion to compare this instance to
@return [Boolean] True if equal, false if not

# File lib/encoding_estimator/conversion.rb, line 30
def equals?( other )

  # Not the same encoding? Cannot be equal
  return false if other.encoding.to_s != self.encoding.to_s

  # If the encoding is the default target encoding, the operation doesn't matter
  # because the conversion does nothing at all
  return true if self.encoding.to_s == DEFAULT_TARGET_ENCODING.to_s

  # Not the default encoding, so check if the operation is the same
  self.operation == other.operation
end
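
The three branches of the comparison can be illustrated with a stand-alone sketch. `SketchComparable` is a hypothetical stand-in class, and both the `'utf-8'` value for `DEFAULT_TARGET_ENCODING` and the operation symbols are assumptions, since this page does not show them.

```ruby
# Hypothetical stand-in mirroring the documented equals? logic (not the gem's class).
class SketchComparable
  DEFAULT_TARGET_ENCODING = 'utf-8' # assumed value

  attr_reader :encoding, :operation

  def initialize(encoding = DEFAULT_TARGET_ENCODING, operation = :keep)
    @encoding  = encoding
    @operation = operation
  end

  def equals?(other)
    # Different encodings can never be equal
    return false if other.encoding.to_s != encoding.to_s
    # For the default encoding the operation is ignored
    return true if encoding.to_s == DEFAULT_TARGET_ENCODING.to_s
    # Otherwise the operations must match
    operation == other.operation
  end
end

utf8_encode   = SketchComparable.new('utf-8', :encode)
utf8_decode   = SketchComparable.new('utf-8', :decode)
latin1_encode = SketchComparable.new('iso-8859-1', :encode)
latin1_decode = SketchComparable.new('iso-8859-1', :decode)

same_default       = utf8_encode.equals?(utf8_decode)     # operation ignored for the default
different_op       = latin1_encode.equals?(latin1_decode) # same encoding, other operation
different_encoding = utf8_encode.equals?(latin1_encode)   # encodings differ
```

Because the first check compares the names as plain strings, two names that differ only in letter case are treated as different encodings.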

key()

Get the internal name (unique key) for this conversion. Useful when storing/referencing conversions in hashes.

@return [String] Unique key of this conversion

# File lib/encoding_estimator/conversion.rb, line 57
def key
  @key ||= "#{operation}_#{encoding}"
end
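
As a rough illustration of the hash use case, the sketch below keys a table of scores by conversion. `KeyedConversion` is a hypothetical stand-in, the operation symbols are assumed, and the score values are invented purely for the example.

```ruby
# Hypothetical stand-in with the same key format as the documented method.
KeyedConversion = Struct.new(:encoding, :operation) do
  def key
    @key ||= "#{operation}_#{encoding}"
  end
end

decode_latin1 = KeyedConversion.new('iso-8859-1', :decode) # :decode is an assumed symbol
encode_cp1251 = KeyedConversion.new('Windows-1251', :encode)

# Made-up scores, stored under each conversion's key
scores = {
  decode_latin1.key => 0.75,
  encode_cp1251.key => 0.25
}

best_key = scores.max_by { |_key, score| score }.first
puts best_key
```

The actual key strings depend on how the `Operation` constants render in string interpolation, which this page does not show.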

perform( data )

Perform the conversion with the current settings on a given string

@param [String] data String to encode/decode
@return [String] The encoded/decoded string

# File lib/encoding_estimator/conversion.rb, line 47
def perform( data )
  return encode( data, encoding ) if operation == Operation::ENCODE
  return decode( data, encoding ) if operation == Operation::DECODE
  data
end

Private Instance Methods

decode( str, encoding )

Decode a given string from a given encoding to the default (UTF-8).

@param [String] str String to decode
@param [String] encoding Name of the encoding used to decode the string
@return [String] The decoded string

# File lib/encoding_estimator/conversion.rb, line 109
def decode( str, encoding )
  str.clone.force_encoding( encoding ).encode(
      DEFAULT_TARGET_ENCODING.to_s, invalid: :replace, undef: :replace, replace: ''
  )
end

encode( str, encoding )

Encode a given string from the default (UTF-8) to a given encoding.

@param [String] str String to encode
@param [String] encoding Name of the encoding used to encode the string
@return [String] The encoded string

# File lib/encoding_estimator/conversion.rb, line 98
def encode( str, encoding )
  str.clone.force_encoding( DEFAULT_TARGET_ENCODING.to_s ).encode(
      encoding, invalid: :replace, undef: :replace, replace: ''
  ).force_encoding( DEFAULT_TARGET_ENCODING.to_s )
end
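
The effect of the two private helpers can be observed using only Ruby's built-in `String#force_encoding` and `String#encode`. The sketch below copies their bodies into the hypothetical top-level methods `sketch_encode` and `sketch_decode`, and assumes `'utf-8'` as the value of `DEFAULT_TARGET_ENCODING`.

```ruby
# Assumed value; the documentation only states that the default is UTF-8.
DEFAULT_TARGET = 'utf-8'

# Mirrors the documented decode: reinterpret the bytes, then transcode to UTF-8.
def sketch_decode(str, encoding)
  str.clone.force_encoding(encoding).encode(
    DEFAULT_TARGET, invalid: :replace, undef: :replace, replace: ''
  )
end

# Mirrors the documented encode: transcode, then relabel the result as UTF-8.
def sketch_encode(str, encoding)
  str.clone.force_encoding(DEFAULT_TARGET).encode(
    encoding, invalid: :replace, undef: :replace, replace: ''
  ).force_encoding(DEFAULT_TARGET)
end

original = "caf\u00E9"

encoded       = sketch_encode(original, 'iso-8859-1')
encoded_bytes = encoded.bytes           # the accented letter is now the single byte 0xE9
still_valid   = encoded.valid_encoding? # the UTF-8 label no longer matches the bytes

round_trip = sketch_decode(encoded, 'iso-8859-1')

# The euro sign has no ISO-8859-1 mapping, so replace: '' drops it
dropped = sketch_encode("5\u20AC", 'iso-8859-1')
```

Relabeling the encoded bytes as UTF-8 means the result of an encode can be an invalid UTF-8 string, and characters without a mapping in the target encoding are removed silently rather than raising an error.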