class CombinePDF::PDF

The PDF class represents a complete PDF document. It can save itself to a file and serves as a container for the full PDF file data, including the version, document information, etc.

PDF objects can be used to combine or to inject data.

Combine/Merge PDF files or Pages

To combine PDF files (or data):

pdf = CombinePDF.new
pdf << CombinePDF.load("file1.pdf") # one way to combine, very fast.
pdf << CombinePDF.load("file2.pdf")
pdf.save "combined.pdf"

or even a one liner:

(CombinePDF.load("file1.pdf") << CombinePDF.load("file2.pdf") << CombinePDF.load("file3.pdf")).save("combined.pdf")

you can also add just odd or even pages:

pdf = CombinePDF.new
i = 0
CombinePDF.load("file.pdf").pages.each do |page|
  i += 1
  pdf << page if i.even?
end
pdf.save "even_pages.pdf"

Notice that adding all the pages one by one is slower than adding the whole file.
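
Because the page list is a plain Array, the same selection can also be written with Array methods. The following is a minimal sketch that uses an array of symbols as a stand-in for real page objects (the sample data is hypothetical, not a loaded PDF):

```ruby
# Stand-in for CombinePDF.load("file.pdf").pages (assumption: any Array behaves the same way).
pages = [:page1, :page2, :page3, :page4, :page5]

# Index 0 holds page 1, so odd indexes are the even-numbered pages.
even_pages = pages.select.with_index { |_page, index| index.odd? }
odd_pages  = pages.select.with_index { |_page, index| index.even? }

p even_pages
p odd_pages
```

With a real file, the selected array could then be appended in a single call, since the << operator also accepts an Array of pages.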

Add content to existing pages (Stamp / Watermark)

To add content to existing PDF pages, first import the new content from an existing PDF file. After that, add the content to each of the pages in your existing PDF.

in this example, we will add a company logo to each page:

company_logo = CombinePDF.load("company_logo.pdf").pages[0]
pdf = CombinePDF.load "content_file.pdf"
pdf.pages.each {|page| page << company_logo} # notice the << operator is on a page and not a PDF object.
pdf.save "content_with_logo.pdf"

Notice the << operator is on a page and not a PDF object. The << operator acts differently on PDF objects and on Pages.

The << operator defaults to secure injection by renaming references to avoid conflicts. For overlaying pages using compressed data that might not be editable (due to limited filter support), you can use:

pdf.pages(nil, false).each {|page| page << stamp_page}

Page Numbering

adding page numbers to a PDF object or file is as simple as can be:

pdf = CombinePDF.load "file_to_number.pdf"
pdf.number_pages
pdf.save "file_with_numbering.pdf"

numbering can be done with many different options, with different formatting, with or without a box object, and even with opacity values.

Loading PDF data

Loading PDF data can be done from the file system or directly from memory.

Loading data from a file is easy:

pdf = CombinePDF.load("file.pdf")

you can also parse PDF files from memory:

pdf_data = IO.read 'file.pdf' # for this demo, load a file to memory
pdf = CombinePDF.parse(pdf_data)

Loading from memory is especially effective for importing PDF data received through the internet or from a different authoring library such as Prawn.

Constants

HASH_MERGE_NEW_NO_PAGE

@private (JRuby alternative) This method reviews a Hash and updates it by merging Hash data, preferring the new over the old.

POSSIBLE_NAME_TREES
PRIVATE_HASH_KEYS

lists the Hash keys used for PDF objects

the CombinePDF library doesn't use special classes for its objects (PDFPage class, PDFStream class or anything like that).

there is only one PDF class which represents the whole of the PDF file.

this Hash lists the private Hash keys that the CombinePDF library uses to differentiate between complex PDF objects.

Attributes

forms_data[R]

the forms_data attribute is a Hash that corresponds to the PDF form data (if any).

info[R]

the info attribute is a Hash that sets the Info data for the PDF. Use, for example:

pdf.info[:Title] = "title"

names[R]

Access the Names PDF object Hash (or reference). Use with care.

objects[R]

the objects attribute is an Array containing all the PDF sub-objects for the class.

outlines[R]

Access the Outlines PDF object Hash (or reference). Use with care.

version[RW]

set/get the PDF version of the file (1.1-1.7). Should be of type Float.

viewer_preferences[R]

the viewer_preferences attribute is a Hash that sets the ViewerPreferences data for the PDF. use, for example:

pdf.viewer_preferences[:HideMenubar] = true

Public Class Methods

new(parser = nil)
# File lib/combine_pdf/pdf_public.rb, line 89
def initialize(parser = nil)
  # default before setting
  @objects = []
  @version = 0
  @viewer_preferences = {}
  @info = {}
  parser ||= PDFParser.new('')
  raise TypeError, "initialization error, expecting CombinePDF::PDFParser or nil, but got #{parser.class.name}" unless parser.is_a? PDFParser
  @objects = parser.parse
  # remove any existing id's
  remove_old_ids
  # set data from parser
  @version = parser.version if parser.version.is_a? Float
  @info = parser.info_object || {}
  @names = parser.names_object || {}
  @forms_data = parser.forms_object || {}
  @outlines = parser.outlines_object || {}
  # rebuild the catalog, to fix wkhtmltopdf's use of static page numbers
  rebuild_catalog

  # general globals
  @set_start_id = 1
  @info[:Producer] = "Ruby CombinePDF #{CombinePDF::VERSION} Library"
  @info.delete :CreationDate
  @info.delete :ModDate
end

Public Instance Methods

<<(data)

add the pages (or file) to the PDF (combine/merge) and RETURNS SELF, for nesting. For example:

pdf = CombinePDF.load "first_file.pdf"

pdf << CombinePDF.load("second_file.pdf")

pdf.save "both_files_merged.pdf"

data

is a PDF page (Hash), an Array of PDF pages or a parsed PDF object to be added.

# File lib/combine_pdf/pdf_public.rb, line 280
def <<(data)
  insert(-1, data)
end

>>(data)

add the pages (or file) to the BEGINNING of the PDF (combine/merge) and RETURNS SELF for nesting operators. For example:

pdf = CombinePDF.load "second_file.pdf"

pdf >> CombinePDF.load("first_file.pdf")

pdf.save "both_files_merged.pdf"

data

is a PDF page (Hash), an Array of PDF pages or a parsed PDF object to be added.

# File lib/combine_pdf/pdf_public.rb, line 293
def >>(data)
  insert 0, data
end

author()

get the author value for the pdf. The author is stored in the information dictionary and isn't required

# File lib/combine_pdf/pdf_public.rb, line 142
def author
  @info[:Author]
end

author=(new_author = nil)

set the author value for the pdf. The author is stored in the information dictionary and isn't required.

new_author

a string that is the new author value.

# File lib/combine_pdf/pdf_public.rb, line 150
def author=(new_author = nil)
  @info[:Author] = new_author
end

clear_forms_data()

Clears any existing form data.

# File lib/combine_pdf/pdf_public.rb, line 155
def clear_forms_data
  @forms_data.nil? || @forms_data.clear
end

fonts(limit_to_type0 = false)

returns an array with the different fonts used in the file.

Type0 font objects (“font[:Subtype] == :Type0”) can be registered with the font library for use in PDFWriter objects (font numbering / table creation, etc.).

limit_to_type0

true or false; limits the list to Type0 fonts.

# File lib/combine_pdf/pdf_public.rb, line 256
def fonts(limit_to_type0 = false)
  fonts_array = []
  pages.each do |pg|
    if pg[:Resources][:Font]
      pg[:Resources][:Font].values.each do |f|
        f = f[:referenced_object] if f[:referenced_object]
        if (limit_to_type0 || f[:Subtype] == :Type0) && f[:Type] == :Font && !fonts_array.include?(f)
          fonts_array << f
        end
      end
    end
  end
  fonts_array
end

insert(location, data)

add PDF pages (or PDF files) into a specific location.

returns self, so calls can be chained. If the data is not a PDF object, a PDF page or an Array of PDF pages, a warning is issued and false is returned.

location

the location for the added page(s). Could be any number. Negative numbers represent a count backwards (-1 being the end of the page array and 0 being the beginning). If the location is beyond bounds, the pages will be added to the end of the PDF object (or at the beginning, if the out of bounds was a negative number).

data

a PDF page, a PDF file (CombinePDF.load("filename.pdf")) or an array of pages (CombinePDF.load("filename.pdf").pages[0..3]).

# File lib/combine_pdf/pdf_public.rb, line 303
def insert(location, data)
  pages_to_add = nil
  if data.is_a? PDF
    @version = [@version, data.version].max
    pages_to_add = data.pages
    actual_value(@names ||= {}.dup).update data.names, &HASH_MERGE_NEW_NO_PAGE
    merge_outlines((@outlines ||= {}.dup), actual_value(data.outlines), location) unless actual_value(data.outlines).empty?
    if actual_value(@forms_data)
      actual_value(@forms_data).update actual_value(data.forms_data), &HASH_MERGE_NEW_NO_PAGE if data.forms_data
    else
      @forms_data = data.forms_data
    end
    warn 'Form data might be lost when combining PDF forms (possible conflicts).' unless data.forms_data.nil? || data.forms_data.empty?
  elsif data.is_a?(Array) && (data.select { |o| !(o.is_a?(Hash) && o[:Type] == :Page) }).empty?
    pages_to_add = data
  elsif data.is_a?(Hash) && data[:Type] == :Page
    pages_to_add = [data]
  else
    warn "Shouldn't add objects to the file unless they are PDF objects or PDF pages (an Array or a single PDF page)."
    return false # return false, which will also stop any chaining.
  end
  # pages_to_add.map! {|page| page.copy }
  catalog = rebuild_catalog
  pages_array = catalog[:Pages][:referenced_object][:Kids]
  page_count = pages_array.length
  if location < 0 && (page_count + location < 0)
    location = 0
  elsif location > 0 && (location > page_count)
    location = page_count
  end
  pages_array.insert location, pages_to_add
  pages_array.flatten!
  self
end
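
The location handling in the listing above can be tried out on a plain Array. This is only a sketch of the clamping and flattening steps, using symbols in place of page objects; the helper names `clamp_location` and `insert_pages` are hypothetical and not part of the library:

```ruby
# Clamp an insert location the same way the listing above does (hypothetical helper).
def clamp_location(location, page_count)
  if location < 0 && (page_count + location < 0)
    0
  elsif location > 0 && location > page_count
    page_count
  else
    location
  end
end

# Insert one page or an array of pages, then flatten any nested array.
def insert_pages(pages, location, new_pages)
  pages.dup.insert(clamp_location(location, pages.length), new_pages).flatten
end

kids = [:a, :b, :c]
p insert_pages(kids, -1, :z)       # appended at the end
p insert_pages(kids, 0, [:x, :y])  # added at the beginning
p insert_pages(kids, 99, :z)       # beyond bounds: clamped to the end
p insert_pages(kids, -99, :z)      # far negative: clamped to the beginning
```

Without the clamping step, Array#insert would pad the array with nil values for a location past the end and raise an IndexError for a location too far before the start.
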

new_page(mediabox = [0, 0, 612.0, 792.0], _location = -1)

adds a new page to the end of the PDF object.

returns the new page object.

unless the media box is specified, it defaults to US Letter: [0, 0, 612.0, 792.0]

# File lib/combine_pdf/pdf_public.rb, line 121
def new_page(mediabox = [0, 0, 612.0, 792.0], _location = -1)
  p = PDFWriter.new(mediabox)
  insert(-1, p)
  p
end
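
Media box values are given in PDF points (72 points per inch), which is why US Letter (8.5 x 11 inches) becomes [0, 0, 612.0, 792.0]. A small sketch for computing other sizes follows; the helper names are hypothetical and only illustrate the unit conversion:

```ruby
# Convert a page size in inches to a media box in PDF points (72 points per inch).
def mediabox_from_inches(width, height)
  [0, 0, width * 72.0, height * 72.0]
end

# Convert a page size in millimeters (25.4 mm per inch), rounded to one decimal.
def mediabox_from_mm(width, height)
  [0, 0, (width / 25.4 * 72.0).round(1), (height / 25.4 * 72.0).round(1)]
end

p mediabox_from_inches(8.5, 11)  # US Letter
p mediabox_from_mm(210, 297)     # A4
```

The resulting Array is the kind of value that could be passed as the mediabox argument of new_page.
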

number_pages(options = {})

add page numbers to the PDF

For unicode text, a unicode font(s) must first be registered. The registered font(s) must supply the subset of characters used in the text. UNICODE IS AN ISSUE WITH THE PDF FORMAT - USE CAUTION.

options

a Hash of options setting the behavior and format of the page numbers:

  • :number_format a string representing the format for page number. defaults to ' - %s - ' (allows for letter numbering as well, such as “a”, “b”…).

  • :location an Array containing the location for the page numbers, can be :top, :bottom, :top_left, :top_right, :bottom_left, :bottom_right or :center (:center == full page). defaults to [:top, :bottom].

  • :start_at an Integer that sets the number for first page number. also accepts a letter (“a”) for letter numbering. defaults to 1.

  • :margin_from_height a number (PDF points) for the top and bottom margins. defaults to 45.

  • :margin_from_side a number (PDF points) for the left and right margins. defaults to 15.

  • :page_range a range of pages to be numbered (e.g. (2..-1)). Defaults to all the pages (nil). Remember to set the :start_at to the correct value.

the options Hash can also take all the options for {Page_Methods#textbox}. defaults to font: :Helvetica, font_size: 12 and no box (:border_width => 0, :box_color => nil).
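
How :number_format and :start_at interact can be previewed without a PDF. The sketch below mirrors the formatting steps of the method (one value per '%' in the format, then `succ` to advance); `page_labels` is a hypothetical helper written for this illustration:

```ruby
# Build the label text for each page, following the same steps as number_pages:
# fill every '%' placeholder with the current number, then advance with succ.
def page_labels(page_count, number_format: ' - %s - ', start_at: 1)
  placeholders = number_format.count('%')
  current = start_at
  Array.new(page_count) do
    label = number_format % Array.new(placeholders) { current }
    current = current.succ
    label
  end
end

p page_labels(3)
p page_labels(3, start_at: 'y')
p page_labels(2, number_format: 'page %s', start_at: 5)
```

Since `succ` is used to advance the counter, a String such as "y" moves on to "z" and then "aa", which is what makes letter numbering possible.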

# File lib/combine_pdf/pdf_public.rb, line 367
def number_pages(options = {})
  opt = {
    number_format: ' - %s - ',
    start_at: 1,
    font: :Helvetica,
    margin_from_height: 45,
    margin_from_side: 15
  }
  opt.update options
  opt[:location] ||= opt[:number_location] ||= opt[:stamp_location] ||= [:top, :bottom]
  opt[:location] = [opt[:location]] unless opt[:location].is_a? Array

  page_number = opt[:start_at]
  format_repeater = opt[:number_format].count('%')
  just_center = [:center]
  small_font_size = opt[:font_size] || 12

  # some common computations can be done only once.
  from_height = opt[:margin_from_height]
  from_side = opt[:margin_from_side]
  left_position = from_side

  (opt[:page_range] ? pages[opt[:page_range]] : pages).each do |page|
    # Get page dimensions
    mediabox = page[:CropBox] || page[:MediaBox] || [0, 0, 595.3, 841.9]
    # set stamp text
    text = opt[:number_format] % (Array.new(format_repeater) { page_number })
    if opt[:location].include? :center
      add_opt = {}
      if opt[:margin_from_height] && !opt[:height] && !opt[:y]
        add_opt[:height] = mediabox[3] - mediabox[1] - (2 * opt[:margin_from_height].to_f)
        add_opt[:y] = opt[:margin_from_height]
      end
      if opt[:margin_from_side] && !opt[:width] && !opt[:x]
        add_opt[:width] = mediabox[2] - mediabox[0] - (2 * opt[:margin_from_side].to_f)
        add_opt[:x] = opt[:margin_from_side]
      end
      page.textbox text, opt.merge(add_opt)
    end
    unless opt[:location] == just_center
      add_opt = { font_size: small_font_size }.merge(opt)
      # text = opt[:number_format] % page_number
      # compute locations for text boxes
      text_dimantions = Fonts.dimensions_of(text, opt[:font], small_font_size)
      box_width = text_dimantions[0] * 1.2
      box_height = text_dimantions[1] * 2
      page_width = mediabox[2]
      page_height = mediabox[3]

      add_opt[:width] ||= box_width
      add_opt[:height] ||= box_height

      center_position = (page_width - box_width) / 2
      right_position = page_width - from_side - box_width
      top_position = page_height - from_height
      bottom_position = from_height + box_height

      if opt[:location].include? :top
        page.textbox text, { x: center_position, y: top_position }.merge(add_opt)
      end
      if opt[:location].include? :bottom
        page.textbox text, { x: center_position, y: bottom_position }.merge(add_opt)
      end
      if opt[:location].include? :top_left
        page.textbox text, { x: left_position, y: top_position, font_size: small_font_size }.merge(add_opt)
      end
      if opt[:location].include? :bottom_left
        page.textbox text, { x: left_position, y: bottom_position, font_size: small_font_size }.merge(add_opt)
      end
      if opt[:location].include? :top_right
        page.textbox text, { x: right_position, y: top_position, font_size: small_font_size }.merge(add_opt)
      end
      if opt[:location].include? :bottom_right
        page.textbox text, { x: right_position, y: bottom_position, font_size: small_font_size }.merge(add_opt)
      end
    end
    page_number = page_number.succ
  end
end
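
The :page_range option is applied with ordinary Array indexing (`pages[opt[:page_range]]` in the listing above), and numbering always begins at :start_at for the first page in that range. A sketch with strings standing in for pages (the sample data is hypothetical):

```ruby
# Stand-ins for the pages of a document (assumption: real pages index the same way).
pages = %w[cover toc chapter1 chapter2 appendix]

page_range = (2..-1) # skip the first two pages
start_at = 3         # so the printed number matches the physical page position

numbered = pages[page_range].each_with_index.map do |page, offset|
  [page, start_at + offset]
end

p numbered
```

Leaving :start_at at its default of 1 here would label the third physical page as page 1.
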

pages(catalogs = nil)

this method returns all the pages cataloged in the catalog.

if no catalog is passed, it seeks the existing catalog(s) and searches for any registered Page objects.

Page objects are Hash class objects. the page methods are added using a mixin or inheritance.

catalogs

a catalog, or an Array of catalog objects. defaults to the existing catalog.

# File lib/combine_pdf/pdf_public.rb, line 224
def pages(catalogs = nil)
  page_list = []
  catalogs ||= get_existing_catalogs

  if catalogs.is_a?(Array)
    catalogs.each { |c| page_list.concat pages(c) unless c.nil? }
  elsif catalogs.is_a?(Hash)
    if catalogs[:is_reference_only]
      if catalogs[:referenced_object]
        page_list.concat pages(catalogs[:referenced_object])
      else
        warn "couldn't follow reference!!! #{catalogs} not found!"
      end
    else
      case catalogs[:Type]
      when :Page
        page_list << catalogs
      when :Pages
        page_list.concat pages(catalogs[:Kids]) unless catalogs[:Kids].nil?
      when :Catalog
        page_list.concat pages(catalogs[:Pages]) unless catalogs[:Pages].nil?
      end
    end
  end
  page_list
end
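
The traversal in the listing above can be followed on a small hand-built page tree. This is a simplified sketch using plain Hashes shaped like the ones in the listing (:Type, :Kids, :Pages, :is_reference_only, :referenced_object); `collect_pages` and `ref` are hypothetical helpers, not library methods:

```ruby
# Walk a Catalog / Pages / Page structure and return the Page hashes in order.
def collect_pages(node)
  case node
  when Array
    node.flat_map { |child| collect_pages(child) }
  when Hash
    if node[:is_reference_only]
      # Follow the reference if it resolves; otherwise contribute nothing.
      node[:referenced_object] ? collect_pages(node[:referenced_object]) : []
    else
      case node[:Type]
      when :Page then [node]
      when :Pages then collect_pages(node[:Kids])
      when :Catalog then collect_pages(node[:Pages])
      else []
      end
    end
  else
    []
  end
end

# Wrap an object in a reference Hash.
def ref(object)
  { is_reference_only: true, referenced_object: object }
end

page_a = { Type: :Page, MediaBox: [0, 0, 612.0, 792.0] }
page_b = { Type: :Page, MediaBox: [0, 0, 595.3, 841.9] }
inner = { Type: :Pages, Kids: [ref(page_b)] }
root = { Type: :Pages, Kids: [ref(page_a), ref(inner)] }
catalog = { Type: :Catalog, Pages: ref(root) }

found = collect_pages(catalog)
p found.length
```

Nested :Pages nodes are flattened in document order, so the result is a flat list of the Page hashes themselves rather than copies.
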

remove(page_index)

removes a PDF page from the file and the catalog

returns the removed page.

returns nil if failed or if out of bounds.

page_index

the page's index in the zero (0) based page array. Negative numbers represent a count backwards (-1 being the end of the page array and 0 being the beginning).

# File lib/combine_pdf/pdf_public.rb, line 345
def remove(page_index)
  catalog = rebuild_catalog
  pages_array = catalog[:Pages][:referenced_object][:Kids]
  removed_page = pages_array.delete_at page_index
  catalog[:Pages][:referenced_object][:Count] = pages_array.length
  removed_page
end
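
The index rules come from Array#delete_at, which the listing above calls on the page array. A short sketch on a plain Array (symbols stand in for pages):

```ruby
kids = [:first, :second, :third]

removed_last = kids.delete_at(-1) # negative index counts from the end
missing = kids.delete_at(10)      # out of bounds: nothing removed, nil returned

p removed_last
p missing
p kids
```
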

save(file_name, options = {})

Save the PDF to file.

file_name

is a string or path object for the output.

**Notice!** if the file exists, it WILL be overwritten.

# File lib/combine_pdf/pdf_public.rb, line 164
def save(file_name, options = {})
  IO.binwrite file_name, to_pdf(options)
end

stamp_pages(stamp, options = {})

This method stamps all (or some) of the pages in the PDF with the requested stamp.

The method accepts:

stamp

either a String or a PDF page. If this is a String, you can add formatting to add page numbering (e.g. “page number %i”). Otherwise, remember to escape any literal percent ('%') sign by doubling it (e.g. “100%% complete”).

options

an options Hash.

If the stamp is a PDF page, only :page_range and :underlay (to reverse-stamp) are valid options.

If the stamp is a String, then all the options used by {#number_pages} or {Page_Methods#textbox} can be used.

The default :location option is :center, meaning the stamp will be stamped all across the page unless the :x, :y, :width or :height options are specified.
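
A String stamp is treated as a Ruby format string, so the percent rules are those of String#%. A quick sketch of a numbering placeholder and an escaped literal percent sign (the stamp texts are made up for the illustration):

```ruby
page_number = 7

numbered = 'page number %i' % [page_number]
escaped  = '100%% draft, page %i' % [page_number]

p numbered
p escaped
```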

# File lib/combine_pdf/pdf_public.rb, line 458
def stamp_pages(stamp, options = {})
  case stamp
  when String
    options[:location] ||= [:center]
    number_pages({ number_format: stamp }.merge(options))
  when Page_Methods
    # stamp = stamp.copy(true)
    if options[:underlay]
      (options[:page_range] ? pages[options[:page_range]] : pages).each { |p| p >> stamp }
    else
      (options[:page_range] ? pages[options[:page_range]] : pages).each { |p| p << stamp }
    end
  else
    raise TypeError, 'expecting a String or a PDF page as the stamp.'
  end
end

title()

get the title for the pdf. The title is stored in the information dictionary and isn't required.

# File lib/combine_pdf/pdf_public.rb, line 129
def title
  @info[:Title]
end

title=(new_title = nil)

set the title for the pdf. The title is stored in the information dictionary and isn't required.

new_title

a string that is the new title value.

# File lib/combine_pdf/pdf_public.rb, line 136
def title=(new_title = nil)
  @info[:Title] = new_title
end

to_pdf(options = {})

Formats the data to the PDF format and returns a binary string that represents the PDF file content.

This method is used by the save(file_name) method to save the content to a file.

Use this to export the PDF file without saving to disk (such as sending through HTTP, etc.).

# File lib/combine_pdf/pdf_public.rb, line 173
def to_pdf(options = {})
  # reset version if not specified
  @version = 1.5 if @version.to_f == 0.0
  # set info for merged file
  @info[:ModDate] = @info[:CreationDate] = Time.now.strftime "D:%Y%m%d%H%M%S%:::z'00"
  @info[:Subject] = options[:subject] if options[:subject]
  @info[:Producer] = options[:producer] if options[:producer]
  # rebuild_catalog
  catalog = rebuild_catalog_and_objects
  # add ID and generation numbers to objects
  renumber_object_ids

  out = []
  xref = []
  indirect_object_count = 1 # the first object is the null object
  # write head (version and binanry-code)
  out << "%PDF-#{@version}\n%\xFF\xFF\xFF\xFF\xFF\x00\x00\x00\x00".force_encoding(Encoding::ASCII_8BIT)

  # collect objects and set xref table locations
  loc = 0
  out.each { |line| loc += line.bytesize + 1 }
  @objects.each do |o|
    indirect_object_count += 1
    xref << loc
    out << object_to_pdf(o)
    loc += out.last.bytesize + 1
  end
  xref_location = loc
  # xref_location = 0
  # out.each { |line| xref_location += line.bytesize + 1}
  out << "xref\n0 #{indirect_object_count}\n0000000000 65535 f \n"
  xref.each { |offset| out << (out.pop + ("%010d 00000 n \n" % offset)) }
  out << out.pop + 'trailer'
  out << "<<\n/Root #{false || "#{catalog[:indirect_reference_id]} #{catalog[:indirect_generation_number]} R"}"
  out << "/Size #{indirect_object_count}"
  out << "/Info #{@info[:indirect_reference_id]} #{@info[:indirect_generation_number]} R"
  out << ">>\nstartxref\n#{xref_location}\n%%EOF"
  # when finished, remove the numbering system and keep only pointers
  remove_old_ids
  # output the pdf stream
  out.join("\n".force_encoding(Encoding::ASCII_8BIT)).force_encoding(Encoding::ASCII_8BIT)
end
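
The offset bookkeeping in the listing above (each chunk's byte size plus one for the joining newline) can be checked in isolation. This sketch uses two made-up object strings and a shortened header; it only illustrates how the xref offsets line up with the joined output and is not a complete PDF writer:

```ruby
# A shortened header and two hand-written objects (sample data for illustration).
header = "%PDF-1.5".b
objects = [
  "1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj".b,
  "2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj".b
]

# Each chunk is followed by one "\n" when joined, hence the + 1.
offset = header.bytesize + 1
offsets = objects.map do |object|
  start = offset
  offset += object.bytesize + 1
  start
end

body = ([header] + objects).join("\n")

# Same entry format as the listing: 10-digit offset, generation, in-use flag.
xref_entries = offsets.map { |start| format("%010d 00000 n \n", start) }

p offsets
p xref_entries.map(&:bytesize)
```

Every entry produced by this format string is exactly 20 bytes long, and each recorded offset points at the first byte of its object in the joined output.
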

Protected Instance Methods

add_referenced()

@private Some PDF objects contain references to other PDF objects.

this function adds the references contained in these objects.

this is used for internal operations, such as injecting data using the << operator.

# File lib/combine_pdf/pdf_protected.rb, line 21
def add_referenced()
  # an existing object map
  resolved = {}.dup
  existing = {}.dup
  should_resolve = [].dup
  #set all existing objects as resolved and register their children for future resolution
  @objects.each { |obj| existing[obj] = obj ; resolved[obj.object_id] = obj; should_resolve << obj.values}
  # loop until should_resolve is empty
  while should_resolve.any?
    obj = should_resolve.pop
    next if resolved[obj.object_id] # the object exists
    if obj.is_a?(Hash)
      referenced = obj[:referenced_object]
      if referenced && referenced.any?
        tmp = resolved[referenced.object_id]
        if !tmp && referenced[:raw_stream_content]
          tmp = existing[referenced[:raw_stream_content]]
          # Avoid endless recursion by limiting it to a number of layers (default == 2)
          tmp = nil unless equal_layers(tmp, referenced)
        end
        if tmp
          obj[:referenced_object] = tmp
        else
          resolved[obj.object_id] = referenced
          #        existing[referenced] = referenced
          existing[referenced[:raw_stream_content]] = referenced
          should_resolve << referenced
          @objects << referenced
        end
      else
        resolved[obj.object_id] = obj
        obj.keys.each { |k| should_resolve << obj[k] unless !obj[k].is_a?(Enumerable) || resolved[obj[k].object_id] }
      end
    elsif obj.is_a?(Array)
      resolved[obj.object_id] = obj
      should_resolve.concat obj
    end
  end
  resolved.clear
  existing.clear
end

get_existing_catalogs()
# File lib/combine_pdf/pdf_protected.rb, line 165
def get_existing_catalogs
  (@objects.select { |obj| obj.is_a?(Hash) && obj[:Type] == :Catalog }) || (@objects.select { |obj| obj.is_a?(Hash) && obj[:Type] == :Page })
end

merge_outlines(old_data, new_data, position)

Merges 2 outlines by appending one to the end or start of the other.

old_data - the main outline, which is also the one that will be used in the resulting PDF.
new_data - the outline to be appended.
position - an integer representing the position where a PDF is being inserted.

This method only differentiates between inserted at the beginning, or not.
Not at the beginning means the new outline will be added to the end of the original outline.

An outline base node (tree base) has :Type, :Count, :First, :Last.

Every node within the outline base node's :First or :Last can also have the following pointers to other nodes:

:First or :Last (only if the node has a subtree / subsection)
:Parent (the node's parent)
:Prev, :Next (previous and next node)

Non-node-pointer data in these nodes:

:Title - the node's title displayed in the PDF outline
:Count - number of nodes in its subtree (0 if no subtree)
:Dest - node link destination (if the node is linking to something)

# File lib/combine_pdf/pdf_protected.rb, line 292
def merge_outlines(old_data, new_data, position)
  old_data = actual_object(old_data)
  new_data = actual_object(new_data)
  if old_data.nil? || old_data.empty? || old_data[:First].nil?
    # old_data is a reference to the actual object,
    # so if we update old_data, we're done, no need to take any further action
    old_data.update new_data
  elsif new_data.nil? || new_data.empty? || new_data[:First].nil?
    return old_data
  else
    new_data = new_data.dup # avoid old data corruption
    # number of outline nodes, after the merge
    old_data[:Count] = old_data[:Count].to_i + new_data[:Count].to_i
    # walk the Hash here ...
    # I'm just using the start / end insert-position for now...
    # first  - is going to be the start of the outline base node's :First, after the merge
    # last   - is going to be the end   of the outline base node's :Last,  after the merge
    # median - the start of what will be appended to the end of the outline base node's :First
    # parent - the outline base node of the resulting merged outline
    # FIXME implement the possibility to insert somewhere in the middle of the outline
    prev = nil
    pos = first = actual_object((position.nonzero? ? old_data : new_data)[:First])
    last = actual_object((position.nonzero? ? new_data : old_data)[:Last])
    median = { is_reference_only: true, referenced_object: actual_object((position.nonzero? ? new_data : old_data)[:First]) }
    old_data[:First] = { is_reference_only: true, referenced_object: first }
    old_data[:Last] = { is_reference_only: true, referenced_object: last }
    parent = { is_reference_only: true, referenced_object: old_data }
    while pos
      # walking through old_data here and updating the :Parent as we go,
      # this updates the inserted new_data :Parent's as well once it is appended and the
      # loop keeps walking the appended data.
      pos[:Parent] = parent if pos[:Parent]
      # connect the two outlines
      # if there is no :Next, the end of the outline base node's :First is reached and this is
      # where the new data gets appended, the same way you would append to a two-way linked list.
      if pos[:Next].nil?
        median[:referenced_object][:Prev] = { is_reference_only: true, referenced_object: prev } if median
        pos[:Next] = median
        # midian becomes 'nil' because this loop keeps going after the appending is done,
        # to update the parents of the appended tree and we wouldn't want to keep appending it infinitely.
        median = nil
      end
      # iterating over the outlines main nodes (this is not going into subtrees)
      # while keeping every rotations previous node saved
      prev = pos
      pos = actual_object(pos[:Next])
    end
    # make sure the last object doesn't have the :Next and the first no :Prev property
    prev.delete :Next
    actual_object(old_data[:First]).delete :Prev
  end
end
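
The core idea, appending one two-way linked list of outline nodes to another and re-parenting the appended nodes, can be sketched on plain Hashes. This simplified version links nodes directly instead of through the `is_reference_only` wrappers used in the listing, only handles appending at the end, and uses hypothetical helper names:

```ruby
# Build a flat outline from a list of titles (hypothetical helper for this sketch).
def build_outline(titles)
  outline = { Type: :Outlines, Count: titles.length }
  nodes = titles.map { |title| { Title: title, Parent: outline } }
  nodes.each_cons(2) do |left, right|
    left[:Next] = right
    right[:Prev] = left
  end
  outline[:First] = nodes.first
  outline[:Last] = nodes.last
  outline
end

# Append the top-level nodes of `extra` after the last top-level node of `main`.
def append_outline(main, extra)
  tail = main[:Last]
  head = extra[:First]
  tail[:Next] = head
  head[:Prev] = tail
  main[:Last] = extra[:Last]
  main[:Count] = main[:Count].to_i + extra[:Count].to_i
  # Re-parent the appended nodes so they belong to the merged outline.
  node = head
  while node
    node[:Parent] = main
    node = node[:Next]
  end
  main
end

# Collect the titles by following :Next from :First.
def top_level_titles(outline)
  titles = []
  node = outline[:First]
  while node
    titles << node[:Title]
    node = node[:Next]
  end
  titles
end

merged = append_outline(build_outline(%w[Intro Usage]), build_outline(%w[API Changelog]))
p top_level_titles(merged)
p merged[:Count]
```
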

names_object()

Deprecation Notice

# File lib/combine_pdf/pdf_protected.rb, line 140
def names_object
  puts "CombinePDF Deprecation Notice: the protected method `names_object` will be deprecated in the upcoming version. Use `names` instead."
  @names
end

outlines_object()
# File lib/combine_pdf/pdf_protected.rb, line 145
def outlines_object
  puts "CombinePDF Deprecation Notice: the protected method `outlines_object` will be deprecated in the upcoming version. Use `oulines` instead."
  @outlines
end

print_outline_to_file(outline, file)

Prints the whole outline hash to a file, with basic indentation and replacing raw streams with “RAW STREAM” (subbing doesn't always work that great for big streams).

outline - outline hash
file - “filename.filetype” string

rebuild_catalog(*with_pages)

@private

# File lib/combine_pdf/pdf_protected.rb, line 64
def rebuild_catalog(*with_pages)
  # # build page list v.1 Slow but WORKS
  # # Benchmark testing value: 26.708394
  # old_catalogs = @objects.select {|obj| obj.is_a?(Hash) && obj[:Type] == :Catalog}
  # old_catalogs ||= []
  # page_list = []
  # PDFOperations._each_object(old_catalogs,false) { |p| page_list << p if p.is_a?(Hash) && p[:Type] == :Page }

  # build page list v.2 faster, better, and works
  # Benchmark testing value: 0.215114
  page_list = pages

  # add pages to catalog, if requested
  page_list.concat(with_pages) unless with_pages.empty?

  # duplicate any non-unique pages - This is a special case to resolve Adobe Acrobat Reader issues (see issues #19 and #81)
  uniqueness = {}.dup
  page_list.each { |page| page = page[:referenced_object] || page; page = page.dup if uniqueness[page.object_id]; uniqueness[page.object_id] = page }
  page_list.clear
  page_list = uniqueness.values
  uniqueness.clear

  # build new Pages object
  page_object_kids = [].dup
  pages_object = { Type: :Pages, Count: page_list.length, Kids: page_object_kids }
  pages_object_reference = { referenced_object: pages_object, is_reference_only: true }
  page_list.each { |pg| pg[:Parent] = pages_object_reference; page_object_kids << ({ referenced_object: pg, is_reference_only: true }) }

  # rebuild/rename the names dictionary
  rebuild_names
  # build new Catalog object
  catalog_object = { Type: :Catalog,
                     Pages: { referenced_object: pages_object, is_reference_only: true } }
  # pages_object[:Parent] = { referenced_object: catalog_object, is_reference_only: true } # causes AcrobatReader to fail
  catalog_object[:ViewerPreferences] = @viewer_preferences unless @viewer_preferences.empty?

  # point old Pages pointers to new Pages object
  ## first point known pages objects - enough?
  pages.each { |p| p[:Parent] = { referenced_object: pages_object, is_reference_only: true } }
  ## or should we, go over structure? (fails)
  # each_object {|obj| obj[:Parent][:referenced_object] = pages_object if obj.is_a?(Hash) && obj[:Parent].is_a?(Hash) && obj[:Parent][:referenced_object] && obj[:Parent][:referenced_object][:Type] == :Pages}

  # # remove old catalog and pages objects
  # @objects.reject! { |obj| obj.is_a?(Hash) && (obj[:Type] == :Catalog || obj[:Type] == :Pages) }
  # remove old objects list and trees
  @objects.clear

  # inject new catalog and pages objects
  @objects << @info if @info
  @objects << catalog_object
  # @objects << pages_object

  # rebuild/rename the forms dictionary
  if @forms_data.nil? || @forms_data.empty?
    @forms_data = nil
  else
    @forms_data = { referenced_object: (@forms_data[:referenced_object] || @forms_data), is_reference_only: true }
    catalog_object[:AcroForm] = @forms_data
    @objects << @forms_data[:referenced_object]
  end

  # add the names dictionary
  if @names && @names.length > 1
    @objects << @names
    catalog_object[:Names] = { referenced_object: @names, is_reference_only: true }
  end
  # add the outlines dictionary
  if @outlines && @outlines.any?
    @objects << @outlines
    catalog_object[:Outlines] = { referenced_object: @outlines, is_reference_only: true }
  end

  catalog_object
end

rebuild_catalog_and_objects()

@private this is an alternative to the rebuild_catalog method. It is used by the to_pdf method, for streamlining the PDF output. There is no point in calling the method before preparing the output.

# File lib/combine_pdf/pdf_protected.rb, line 157
def rebuild_catalog_and_objects
  catalog = rebuild_catalog
  catalog[:Pages][:referenced_object][:Kids].each { |e| @objects << e[:referenced_object]; e[:referenced_object] }
  # adds every referenced object to the @objects (root), addition is performed as pointers rather then copies
  add_referenced()
  catalog
end

rebuild_names(name_tree = nil, base = 'CombinePDF_0000000')
# File lib/combine_pdf/pdf_protected.rb, line 187
def rebuild_names(name_tree = nil, base = 'CombinePDF_0000000')
  if name_tree
    return nil unless name_tree.is_a?(Hash)
    name_tree = name_tree[:referenced_object] || name_tree
    dic = []
    # map a names tree and return a valid name tree. Do not recourse.
    should_resolve = [name_tree[:Kids], name_tree[:Names]]
    resolved = [].to_set
    while should_resolve.any?
      pos = should_resolve.pop
      if pos.is_a? Array
        next if resolved.include?(pos.object_id)
        if pos[0].is_a? String
          (pos.length / 2).times do |i|
            dic << (pos[i * 2].clear << base.next!)
            pos[(i * 2) + 1][0] = {is_reference_only: true, referenced_object: pages[pos[(i * 2) + 1][0]]} if(pos[(i * 2) + 1].is_a?(Array) && pos[(i * 2) + 1][0].is_a?(Numeric))
            dic << (pos[(i * 2) + 1].is_a?(Array) ? { is_reference_only: true, referenced_object: { indirect_without_dictionary: pos[(i * 2) + 1] } } : pos[(i * 2) + 1])
            # dic << pos[(i * 2) + 1]
          end
        else
          should_resolve.concat pos
        end
      elsif pos.is_a? Hash
        pos = pos[:referenced_object] || pos
        next if resolved.include?(pos.object_id)
        should_resolve << pos[:Kids] if pos[:Kids]
        should_resolve << pos[:Names] if pos[:Names]
      end
      resolved << pos.object_id
    end
    return { referenced_object: { Names: dic }, is_reference_only: true }
  end
  @names ||= @names[:referenced_object]
  new_names = { Type: :Names }.dup
  POSSIBLE_NAME_TREES.each do |ntree|
    if @names[ntree]
      new_names[ntree] = rebuild_names(@names[ntree], base)
      @names[ntree].clear
    end
  end
  @names.clear
  @names = new_names
end

remove_old_ids()
# File lib/combine_pdf/pdf_protected.rb, line 181
def remove_old_ids
  @objects.each { |obj| obj.delete(:indirect_reference_id); obj.delete(:indirect_generation_number) }
end

renumber_object_ids(start = nil)

@private

# File lib/combine_pdf/pdf_protected.rb, line 171
def renumber_object_ids(start = nil)
  @set_start_id = start || @set_start_id
  start = @set_start_id
  # history = {}
  @objects.each do |obj|
    obj[:indirect_reference_id] = start
    start += 1
  end
end

Private Instance Methods

equal_layers(obj1, obj2, layer = CombinePDF.eq_depth_limit)
# File lib/combine_pdf/pdf_protected.rb, line 374
def equal_layers obj1, obj2, layer = CombinePDF.eq_depth_limit
  return true if obj1.object_id == obj2.object_id
  if obj1.is_a? Hash
    return false unless obj2.is_a? Hash
    return false unless obj1.length == obj2.length
    keys = obj1.keys;
    keys2 = obj2.keys;
    return false if (keys - keys2).any? || (keys2 - keys).any?
    return (warn("CombinePDF nesting limit reached") || true) if(layer == 0)
    keys.each {|k| return false unless equal_layers( obj1[k], obj2[k], layer-1) }
  elsif obj1.is_a? Array
    return false unless obj2.is_a? Array
    return false unless obj1.length == obj2.length
    (obj1-obj2).any? || (obj2-obj1).any?
  else
    obj1 == obj2
  end
end

rename_object(object, _dictionary)
# File lib/combine_pdf/pdf_protected.rb, line 403
def rename_object(object, _dictionary)
  case object
  when Array
    object.length.times { |i| }
  when Hash
  end
end

renaming_dictionary(object = nil, dictionary = {})
# File lib/combine_pdf/pdf_protected.rb, line 393
def renaming_dictionary(object = nil, dictionary = {})
  object ||= @names
  case object
  when Array
    object.length.times { |i| object[i].is_a?(String) ? (dictionary[object[i]] = (dictionary.last || 'Random_0001').next) : renaming_dictionary(object[i], dictionary) }
  when Hash
    object.values.each { |v| renaming_dictionary v, dictionary }
  end
end