class NewspaperWorks::Ingest::PageImage
Represents a TIFF/JP2 page image, providing access to the file and to page-numbering metadata.
Attributes
issue[RW]
path[RW]
sequence[RW]
Public Class Methods
new(path, issue, sequence)
    # File lib/newspaper_works/ingest/page_image.rb, line 9
    def initialize(path, issue, sequence)
      # path to image:
      @path = path
      validate_path
      # Issue is NewspaperWorks::Ingest::IssueImages object
      @issue = issue
      # sequence is page sequence number (Integer)
      @sequence = sequence.to_i
    end
Public Instance Methods
named_page_number()
Page number inferred from image filename, or nil, presuming that:
- The page number follows the actual word "page" (case-insensitive) in the filename, possibly separated by a dash or underscore.
- The page number is terminated by the period-plus-file-extension.
- Both of the above can be determined by regular expression match.
- Extraneous leading information in the filename (e.g. datestamp) will be ignored.
- Examples:
  - 'Page1.tiff'
  - '2019091801-page_1.jp2'
  - 'page_C2.tiff'
@return [String, NilClass] page number string, or nil if indecipherable
    # File lib/newspaper_works/ingest/page_image.rb, line 31
    def named_page_number
      pattern = /(page)([_-]?)([^.]+)([.])/i
      match = pattern.match(path)
      match.nil? ? nil : match[3]
    end
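The matching rules above can be tried outside the class. The following is a minimal sketch that reuses the same regular expression; `PAGE_PATTERN` and `extract_page_number` are hypothetical names introduced for illustration and are not part of NewspaperWorks.

```ruby
# Same regular expression as named_page_number uses.
PAGE_PATTERN = /(page)([_-]?)([^.]+)([.])/i

# Hypothetical helper (not part of NewspaperWorks): returns the captured
# page number string, or nil when the name does not match the pattern.
def extract_page_number(name)
  match = PAGE_PATTERN.match(name)
  match.nil? ? nil : match[3]
end

# The three documented examples, plus one name without the word "page".
%w[Page1.tiff 2019091801-page_1.jp2 page_C2.tiff scan0001.tiff].each do |name|
  puts "#{name} => #{extract_page_number(name).inspect}"
end
```

Note that the method matches against the whole `path`, not only the basename, so a directory component containing the word "page" (for example `/data/pages/img1.tiff`) can also match and produce an unexpected capture.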
page_number()
    # File lib/newspaper_works/ingest/page_image.rb, line 37
    def page_number
      named_page_number || @sequence.to_s
    end
title()
    # File lib/newspaper_works/ingest/page_image.rb, line 41
    def title
      ["#{@issue.title.first}: Page #{page_number}"]
    end
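To illustrate how `page_number` falls back to the sequence number and how `title` is composed, here is a simplified sketch. `FakeIssue` and `SketchPage` are hypothetical stand-ins, not library classes: the stand-in skips path validation, and the issue object is assumed only to respond to `title` with an Array of Strings. The issue title used is made up.

```ruby
# Hypothetical stand-in for the issue object; only #title is assumed.
FakeIssue = Struct.new(:title)

# Simplified stand-in for PageImage that omits path validation.
class SketchPage
  def initialize(path, issue, sequence)
    @path = path
    @issue = issue
    @sequence = sequence.to_i # accepts "2" as well as 2
  end

  # Page number taken from the filename, or nil if the pattern fails.
  def named_page_number
    match = /(page)([_-]?)([^.]+)([.])/i.match(@path)
    match.nil? ? nil : match[3]
  end

  # Prefer the filename-derived number; otherwise use the sequence.
  def page_number
    named_page_number || @sequence.to_s
  end

  # Single-element Array, mirroring the multi-valued title of the issue.
  def title
    ["#{@issue.title.first}: Page #{page_number}"]
  end
end

issue = FakeIssue.new(['Daily Example (2019-09-18)'])
named = SketchPage.new('page_C2.tiff', issue, 2)
unnamed = SketchPage.new('scan0002.tiff', issue, '2')

puts named.title.first   # uses "C2" from the filename
puts unnamed.title.first # falls back to the sequence number "2"
```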
validate_path()
    # File lib/newspaper_works/ingest/page_image.rb, line 45
    def validate_path
      # expect path to be regular file, that exists:
      raise ArgumentError unless File.exist?(path)
      raise ArgumentError unless File.ftype(path) == 'file'
    end
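Because `validate_path` runs inside the constructor, a missing path or a directory raises `ArgumentError` before the object is built. The sketch below reproduces the two checks in a standalone form; `check_regular_file` and `valid_path?` are hypothetical helpers, and `/no/such/page.tiff` is assumed not to exist on the machine running it.

```ruby
require 'tempfile'
require 'tmpdir'

# Hypothetical helper performing the same two checks as validate_path.
def check_regular_file(path)
  raise ArgumentError unless File.exist?(path)
  raise ArgumentError unless File.ftype(path) == 'file'
  true
end

# Hypothetical wrapper: turns the ArgumentError into a boolean result.
def valid_path?(path)
  check_regular_file(path)
rescue ArgumentError
  false
end

# A temporary regular file passes both checks.
Tempfile.create(['page_1', '.tiff']) do |file|
  puts "regular file: #{valid_path?(file.path)}"
end

# A directory exists, but File.ftype reports "directory", so it is rejected.
puts "directory:    #{valid_path?(Dir.tmpdir)}"

# A path assumed not to exist fails the first check.
puts "missing path: #{valid_path?('/no/such/page.tiff')}"
```

Callers that build `PageImage` objects from a directory listing may want to rescue `ArgumentError` in a similar way so that one bad entry does not abort a whole ingest; whether that is appropriate depends on the surrounding ingest code.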