module Lccnorm

Constants

VERSION

Public Class Methods

normalize(lccn)

Normalize LCCN according to Library of Congress rules (www.loc.gov/marc/lccn-namespace.html)

An LCCN is to be normalized to its canonical form described in the syntax description above, as follows:

  1. Remove all blanks.

  2. If there is a forward slash (/) in the string, remove it, and remove all characters to the right of the forward slash.

  3. If there is a hyphen in the string:

    • Remove it.

    • Inspect the substring following (to the right of) the (removed) hyphen. Then (and assuming that steps 1 and 2 have been carried out):

      *  All these characters should be digits, and there should be six
         or less.
      *  If the length of the substring is less than 6, left-fill the
         substring with zeros until the length is six.
# File lib/lccnorm.rb, line 27
def self.normalize(lccn)
  l, r = lccn.tr(' ', '').split('/')[0].split('-', 2)

  # No hyphen? Easy
  if r.nil?
    return l
  end

  # Make sure there are between one and six digits to the right of the
  # hyphen (an empty or non-numeric part is rejected below), then pad to
  # the left with zeros
  if r =~ /\A\d{1,6}\z/
    return l + r.rjust(6, '0')
  end

  # If we get this far, something is wrong with the number
  raise InvalidLccnError.new(
          "%s is not a valid LCCN " % lccn +
          "(part to the right of hyphen should be 6 numeric characters " +
          "or fewer)")
end
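The normalization steps can also be traced one at a time. The following is a minimal standalone sketch, not part of the Lccnorm API: the name normalize_lccn_sketch is hypothetical, and it raises ArgumentError instead of InvalidLccnError so that it runs without the gem loaded.

```ruby
# Hypothetical standalone helper that walks through the three steps
# explicitly. It is an illustration only, not the library's method.
def normalize_lccn_sketch(lccn)
  # Step 1: remove all blanks.
  s = lccn.delete(' ')

  # Step 2: remove the first forward slash and everything to its right.
  s = s.sub(%r{/.*\z}m, '')

  # Step 3: split on the first hyphen only.
  prefix, suffix = s.split('-', 2)
  return prefix.to_s if suffix.nil? # no hyphen: nothing left to do

  # The part after the hyphen must be one to six digits.
  unless suffix.match?(/\A\d{1,6}\z/)
    raise ArgumentError, "#{lccn.inspect} is not a valid LCCN"
  end

  # Left-fill the suffix with zeros to a width of six.
  prefix + suffix.rjust(6, '0')
end

normalize_lccn_sketch('n78-890351')         # => "n78890351"
normalize_lccn_sketch('85-2 ')              # => "85000002"
normalize_lccn_sketch('75-425165//r75')     # => "75425165"
normalize_lccn_sketch(' 79139101 /AC/r932') # => "79139101"
```

An input such as '85-1234567' raises, because seven digits follow the hyphen.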
valid?(lccn)

Validate LCCN according to Library of Congress rules (www.loc.gov/marc/lccn-namespace.html)

A normalized LCCN is a character string eight to twelve characters in length. (For purposes of this description characters are ordered from left to right – “first” means “leftmost”.)

  • The rightmost eight characters are always digits.

  • If the length is 9, then the first character must be alphabetic.

  • If the length is 10, then the first two characters must be either both digits or both alphabetic.

  • If the length is 11, then the first character must be alphabetic and the next two characters must be either both digits or both alphabetic.

  • If the length is 12, then the first two characters must be alphabetic and the remaining characters digits.

# File lib/lccnorm.rb, line 65
def self.valid?(lccn)
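  # Use [A-Za-z] rather than [A-z]: the range A-z also matches the
  # punctuation characters [ \ ] ^ _ ` that sit between Z and a in ASCII
  if lccn =~ /\A([A-Za-z]{2}\d{2}|([A-Za-z]?([A-Za-z]{2}|\d{2}))|[A-Za-z])?\d{8}\z/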
    return true
  end

  return false
end
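The length rules listed for valid? can be checked one case at a time, which makes each bullet easier to compare against the code. The following is a minimal standalone sketch under that reading of the rules. The name valid_lccn_sketch? is hypothetical and not part of the Lccnorm API.

```ruby
# Hypothetical standalone helper: one branch per rule in the list above.
def valid_lccn_sketch?(lccn)
  # Overall shape: zero to four leading characters, then eight digits.
  return false unless lccn.is_a?(String) &&
                      lccn.match?(/\A[A-Za-z0-9]{0,4}\d{8}\z/)

  prefix = lccn[0...-8] # everything before the rightmost eight digits

  case prefix.length
  when 0 then true                                              # length 8
  when 1 then prefix.match?(/\A[A-Za-z]\z/)                     # length 9
  when 2 then prefix.match?(/\A(?:[A-Za-z]{2}|\d{2})\z/)        # length 10
  when 3 then prefix.match?(/\A[A-Za-z](?:[A-Za-z]{2}|\d{2})\z/) # length 11
  when 4 then prefix.match?(/\A[A-Za-z]{2}\d{2}\z/)             # length 12
  else false
  end
end

valid_lccn_sketch?('85000002')     # => true
valid_lccn_sketch?('n78890351')    # => true
valid_lccn_sketch?('2001000002')   # => true
valid_lccn_sketch?('a278890351')   # => false (one letter plus one digit)
valid_lccn_sketch?('_78890351')    # => false (underscore is not alphabetic)
```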