module DataMetaAvro

DataMetaDOM and Avro Schemas.

For command-line details, check either the source of the new method or the usage section of the README.

Constants

AVRO_TYPES

Mapping from a DataMeta DOM type to a matching renderer of Avro schema JSON. Each lambda expects a whole DataMetaDom::Field instance and must return the whole specification that you would put under the "type" JSON tag, such as:

"int"

or, for a type with a size:

{ "type": "fixed", "name": "theFieldName", "size": 16}

Note that wrapping this type into an optional specification, i.e. a union with "null", is done by calling the avroType method.
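
To illustrate the shape of such a renderer table, here is a minimal sketch. It does not reproduce the actual AVRO_TYPES contents: the names SketchType, SketchField, SKETCH_AVRO_TYPES and sketch_render, as well as the symbol keys, are assumptions made for this example only.

```ruby
require 'json'

# Hypothetical stand-ins for DataMetaDom::Field and its data type; the real classes are richer.
SketchType  = Struct.new(:type, :length)
SketchField = Struct.new(:name, :dataType)

# Hypothetical renderer table: each lambda receives the whole field and
# returns the text that would go under the "type" JSON tag.
SKETCH_AVRO_TYPES = {
  int:   ->(_fld) { '"int"' },
  fixed: ->(fld) { %({ "type": "fixed", "name": "#{fld.name}", "size": #{fld.dataType.length} }) }
}

# Looks up the renderer for the field's type and calls it with the field.
def sketch_render(fld)
  renderer = SKETCH_AVRO_TYPES[fld.dataType.type]
  raise ArgumentError, "Unsupported type #{fld.dataType.type}" unless renderer
  renderer.call(fld)
end

int_spec   = sketch_render(SketchField.new('count', SketchType.new(:int, 4)))
fixed_spec = sketch_render(SketchField.new('theFieldName', SketchType.new(:fixed, 16)))
puts int_spec
puts fixed_spec
```

Passing the whole field rather than only the type lets a renderer use the field name, which Avro requires for named types such as "fixed".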

GEM_ROOT

The root of the gem.

TMPL_ROOT

Location of templates.

VERSION

Current version

Public Class Methods

assertMapKeyType(fld, type)
# File lib/dataMetaAvro.rb, line 141
def assertMapKeyType(fld, type)
    raise ArgumentError, %<Field "#{fld.name}": Avro supports only strings as map keys, "#{
        type}" is not supported as a map key by Avro> unless type == DataMetaDom::STRING
end
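
The check above exists because Avro map keys are always strings. A small self-contained sketch of how such a guard behaves follows; SKETCH_STRING_TYPE, SketchMapField and sketch_assert_map_key_type are hypothetical names standing in for DataMetaDom::STRING, the field class and the method itself.

```ruby
# Hypothetical stand-in for DataMetaDom::STRING.
SKETCH_STRING_TYPE = :string
# Hypothetical stand-in for a field; only the name is needed here.
SketchMapField = Struct.new(:name)

# Raises ArgumentError unless the map key type is the string type.
def sketch_assert_map_key_type(fld, type)
  return if type == SKETCH_STRING_TYPE
  raise ArgumentError, %(Field "#{fld.name}": Avro supports only strings as map keys, "#{type}" is not supported)
end

fld = SketchMapField.new('lookup')
sketch_assert_map_key_type(fld, :string) # passes silently

# Capture the error message for a non-string key type.
message = begin
  sketch_assert_map_key_type(fld, :int)
  nil
rescue ArgumentError => e
  e.message
end
puts message
```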
assertNamespace(fullName)

Splits the full name of a class into the namespace and the base, returns an array of the namespace (empty string if there is no namespace on the name) and the base name.

Examples:

  • 'BaseNameAlone' -> ['', 'BaseNameAlone']

  • 'one.package.another.pack.FinallyTheName' -> ['one.package.another.pack', 'FinallyTheName']

# File lib/dataMetaAvro.rb, line 113
def assertNamespace(fullName)
  ns, base = DataMetaDom::splitNameSpace(fullName)
  [DataMetaDom.validNs?(ns, base) ? ns : '', base]
end
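
The two examples above can be reproduced with plain Ruby string handling. This is only a sketch of the splitting step: sketch_split_namespace is a hypothetical helper, and it omits the DataMetaDom.validNs? check that the real method applies.

```ruby
# Splits a dotted full name at the last dot into [namespace, base].
# String#rpartition returns ["", "", str] when no dot is present,
# so a bare name yields an empty namespace.
def sketch_split_namespace(full_name)
  ns, _dot, base = full_name.rpartition('.')
  [ns, base]
end

bare      = sketch_split_namespace('BaseNameAlone')
qualified = sketch_split_namespace('one.package.another.pack.FinallyTheName')
p bare
p qualified
```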
avroType(dataMetaType)

Converts DataMeta DOM type to Avro schema type.

# File lib/dataMetaAvro.rb, line 73
def avroType(dataMetaType)
    renderer = AVRO_TYPES[dataMetaType.type]
    raise "Unsupported type #{dataMetaType}" unless renderer
    renderer.call(dataMetaType)
end
genRecordJson(model, outFile, rec, nameSpace, base)

Generates an Avro Schema for the given model's record.

Some parameters may appear to be unused, but they are not: the ERB template uses them as part of the method's binding.

The parameters nameSpace and base could be derived from rec, but since they have already been evaluated by calling assertNamespace, they are simply reused.

  • Params:

    • model - DataMetaDom::Model

    • outFile - output file name

    • rec - DataMetaDom::Record

    • nameSpace - the namespace for the record

    • base - base name of the record

# File lib/dataMetaAvro.rb, line 100
def genRecordJson(model, outFile, rec, nameSpace, base)
    vars =  OpenStruct.new # for template's local variables. ERB does not make them visible to the binding
    IO.write(outFile, "#{ERB.new(IO.read("#{TMPL_ROOT}/dataClass.avsc.erb"), 0, '-').result(binding)}", {:mode => 'wb'})
end
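
The point about the binding can be seen in isolation with a small sketch. The template text, SKETCH_TEMPLATE and sketch_render_record are assumptions for this example and do not reflect the contents of dataClass.avsc.erb; the sketch also uses the keyword form trim_mode: of ERB.new rather than the positional arguments shown above.

```ruby
require 'erb'
require 'json'

# Hypothetical inline template; it refers to the method's parameters by name.
SKETCH_TEMPLATE = <<~TMPL
  {
    "type": "record",
    "namespace": "<%= name_space %>",
    "name": "<%= base %>"
  }
TMPL

# The parameters look unused in the method body, but the template
# reads them through the binding passed to ERB#result.
def sketch_render_record(name_space, base)
  ERB.new(SKETCH_TEMPLATE, trim_mode: '-').result(binding)
end

rendered = sketch_render_record('one.pkg', 'Thing')
puts rendered
```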
genSchema(model, outRoot)

Generates the Avro schema, one avsc file per record.

# File lib/dataMetaAvro.rb, line 121
def genSchema(model, outRoot)
  model.records.values.each { |rec| # loop through all the records in the model
    nameSpace, base = assertNamespace(rec.name)
    FileUtils.mkdir_p outRoot # write schema files named as one.package.another.package.ClassName.avsc in one dir
    outFile = File.join(outRoot, "#{rec.name}.avsc")
      case
        when rec.kind_of?(DataMetaDom::Record)
            genRecordJson model, outFile, rec, nameSpace, base
        else # since we are cycling through records, should never get here
          raise "Unsupported Entity: #{rec.inspect}"
      end
  }
end
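
The one-file-per-record layout can be sketched without a DataMeta DOM model. Here sketch_gen_schema is a hypothetical helper that takes plain record names and writes a placeholder schema for each; the real method renders each file from the ERB template instead.

```ruby
require 'fileutils'
require 'json'
require 'tmpdir'

# Writes one <full name>.avsc file per record name into out_root
# and returns the list of paths written.
def sketch_gen_schema(record_names, out_root)
  FileUtils.mkdir_p(out_root)
  record_names.map do |full_name|
    ns, _dot, base = full_name.rpartition('.')
    out_file = File.join(out_root, "#{full_name}.avsc")
    File.write(out_file, JSON.generate({ 'type' => 'record', 'namespace' => ns, 'name' => base }))
    out_file
  end
end

# Run against a temporary directory and collect the names of the files that exist.
written = Dir.mktmpdir do |dir|
  files = sketch_gen_schema(%w[one.pkg.First one.pkg.Second], dir)
  files.select { |f| File.exist?(f) }.map { |f| File.basename(f) }
end
p written
```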
helpAvroSchemaGen(file, errorText=nil)

Shortcut to help for the Avro Schema generator.

# File lib/dataMetaAvro.rb, line 136
def helpAvroSchemaGen(file, errorText=nil)
    DataMetaDom::help(file, "DataMeta DOM Avro Schema Generation ver #{VERSION}",
                      '<DataMeta DOM source> <Avro Schemas target dir>', errorText)
end
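wrapReqOptional(field, baseType)

Wraps the base type according to whether the field is required or optional: a required field keeps the base type as is, an optional field gets a union of the base type with "null".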

# File lib/dataMetaAvro.rb, line 80
def wrapReqOptional(field, baseType)
    field.isRequired ? baseType : %Q^[#{baseType}, "null"]^
end
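
A short sketch of the two outcomes, using a hypothetical SketchOptField struct in place of DataMetaDom::Field and a hypothetical sketch_wrap helper that mirrors the expression above:

```ruby
require 'json'

# Hypothetical stand-in for a field; only the required flag matters here.
SketchOptField = Struct.new(:name, :isRequired)

# Required fields keep the base type; optional ones become a union with "null".
def sketch_wrap(field, base_type)
  field.isRequired ? base_type : %([#{base_type}, "null"])
end

required_type = sketch_wrap(SketchOptField.new('id', true), '"int"')
optional_type = sketch_wrap(SketchOptField.new('note', false), '"string"')
puts required_type
puts optional_type
```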
