Class SynchronizedMetaDataValidation

java.lang.Object
org.apache.pdfbox.preflight.metadata.SynchronizedMetaDataValidation

public class SynchronizedMetaDataValidation extends Object
Checks whether the document information available in a PDF document is synchronized with its XMP metadata.
  • Constructor Details

    • SynchronizedMetaDataValidation

      public SynchronizedMetaDataValidation()
  • Method Details

    • analyzeTitleProperty

      protected void analyzeTitleProperty(PDDocumentInformation dico, DublinCoreSchema dc, List<ValidationResult.ValidationError> ve)
      Analyzes whether the Title embedded in the Document Information dictionary and the one in the XMP properties are synchronized.
      Parameters:
      dico - the Document Information Dictionary.
      dc - the Dublin Core Schema.
      ve - the list of validation errors.
    • analyzeAuthorProperty

      protected void analyzeAuthorProperty(PDDocumentInformation dico, DublinCoreSchema dc, List<ValidationResult.ValidationError> ve)
      Analyzes whether the Author(s) embedded in the Document Information dictionary and those in the XMP properties are synchronized.
      Parameters:
      dico - Document Information Dictionary
      dc - Dublin Core Schema
      ve - The list of validation errors
    • analyzeSubjectProperty

      protected void analyzeSubjectProperty(PDDocumentInformation dico, DublinCoreSchema dc, List<ValidationResult.ValidationError> ve)
      Analyzes whether the Subject(s) embedded in the Document Information dictionary and those in the XMP properties are synchronized.
      Parameters:
      dico - Document Information Dictionary
      dc - Dublin Core Schema
      ve - The list of validation errors
    • analyzeKeywordsProperty

      protected void analyzeKeywordsProperty(PDDocumentInformation dico, AdobePDFSchema pdf, List<ValidationResult.ValidationError> ve)
      Analyzes whether the Keyword(s) embedded in the Document Information dictionary and those in the XMP properties are synchronized.
      Parameters:
      dico - Document Information Dictionary
      pdf - PDF Schema
      ve - The list of validation errors
    • analyzeProducerProperty

      protected void analyzeProducerProperty(PDDocumentInformation dico, AdobePDFSchema pdf, List<ValidationResult.ValidationError> ve)
      Analyzes whether the Producer embedded in the Document Information dictionary and the one in the XMP properties are synchronized.
      Parameters:
      dico - Document Information Dictionary
      pdf - PDF Schema
      ve - The list of validation errors
    • analyzeCreatorToolProperty

      protected void analyzeCreatorToolProperty(PDDocumentInformation dico, XMPBasicSchema xmp, List<ValidationResult.ValidationError> ve)
      Analyzes whether the creator tool embedded in the Document Information dictionary and the one in the XMP properties are synchronized.
      Parameters:
      dico - Document Information Dictionary
      xmp - XMP Basic Schema
      ve - The list of validation errors
    • analyzeCreationDateProperty

      protected void analyzeCreationDateProperty(PDDocumentInformation dico, XMPBasicSchema xmp, List<ValidationResult.ValidationError> ve) throws ValidationException
      Analyzes whether the CreationDate embedded in the Document Information dictionary and the one in the XMP properties are synchronized.
      Parameters:
      dico - Document Information Dictionary
      xmp - XMP Basic Schema
      ve - The list of validation errors
      Throws:
      ValidationException
    • analyzeModifyDateProperty

      protected void analyzeModifyDateProperty(PDDocumentInformation dico, XMPBasicSchema xmp, List<ValidationResult.ValidationError> ve) throws ValidationException
      Analyzes whether the ModifyDate embedded in the Document Information dictionary and the one in the XMP properties are synchronized.
      Parameters:
      dico - Document Information Dictionary
      xmp - XMP Basic Schema
      ve - The list of validation errors
      Throws:
      ValidationException
    • validateMetadataSynchronization

      public List<ValidationResult.ValidationError> validateMetadataSynchronization(PDDocument document, XMPMetadata metadata) throws ValidationException
      Check if document information entries and XMP information are synchronized
      Parameters:
      document - the PDF Document
      metadata - the XMP MetaData
      Returns:
      List of validation errors
      Throws:
      ValidationException
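      The kind of comparison this method coordinates can be illustrated without PDFBox types. The sketch below is only an assumption about the general idea (a value present in the Document Information dictionary should have an equal counterpart in XMP); the class name MetadataSyncSketch, the compare helper, and the plain-string error messages are hypothetical and are not part of the PDFBox API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a synchronization check; it does not use PDFBox types.
class MetadataSyncSketch {

    // Compares one Document Information value with its XMP counterpart and
    // records a message when the two disagree.
    static void compare(String property, String infoValue, String xmpValue, List<String> errors) {
        if (infoValue == null) {
            // Assumption: nothing to synchronize when the Info entry is absent.
            return;
        }
        if (xmpValue == null) {
            errors.add(property + " is present in Document Information but absent from XMP");
        } else if (!infoValue.equals(xmpValue)) {
            errors.add(property + " differs between Document Information and XMP");
        }
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        compare("Title", "Annual Report", "Annual Report", errors); // synchronized
        compare("Producer", "ToolA", "ToolB", errors);              // mismatch
        compare("Subject", "Finance", null, errors);                // missing in XMP
        errors.forEach(System.out::println);
    }
}
```

      In the real class the errors are ValidationResult.ValidationError instances collected in the list that validateMetadataSynchronization returns; the strings here only stand in for them.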
    • unexpectedPrefixFoundError

      protected ValidationResult.ValidationError unexpectedPrefixFoundError(String prefFound, String prefExpected, String schema)
      Returns a formatted validation error when a schema does not have the expected prefix.
      Parameters:
      prefFound - the prefix that was found
      prefExpected - the prefix that was expected
      schema - the schema concerned
      Returns:
      the generated validation error.
    • schemaAccessException

      protected ValidationException schemaAccessException(String target, Throwable cause)
      Returns a formatted exception for an IOException raised while accessing a metadata schema.
      Parameters:
      target - the name of the schema
      cause - the raised IOException
      Returns:
      the generated exception
    • unsynchronizedMetaDataError

      protected ValidationResult.ValidationError unsynchronizedMetaDataError(String target)
      Return a formatted validation error when metadata are not synchronized
      Parameters:
      target - the concerned property
      Returns:
      the generated validation error
    • absentSchemaMetaDataError

      protected ValidationResult.ValidationError absentSchemaMetaDataError(String target, String schema)
      Return a formatted validation error when a specific metadata schema can't be found
      Parameters:
      target - the concerned property
      schema - the XMP schema which can't be found
      Returns:
      the generated validation error
    • absentXMPPropertyError

      protected ValidationResult.ValidationError absentXMPPropertyError(String target, String details)
      Return a formatted validation error when a specific XMP property can't be found
      Parameters:
      target - the concerned property
      details - comments about the XMP property
      Returns:
      the generated validation error
    • removeTrailingNul

      private String removeTrailingNul(String string)
      A string from the Document Information dictionary may end with trailing NUL characters, which have to be stripped.
      Parameters:
      string - to be stripped
      Returns:
      the stripped string
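      A minimal sketch of such stripping, assuming "Nul" means the character U+0000 and that the input is non-null. The class and method names are hypothetical, and this is not necessarily how the private PDFBox method is written.

```java
// Hypothetical helper illustrating trailing-NUL stripping.
class TrailingNulSketch {

    // Removes every NUL character (U+0000) from the end of the string;
    // NUL characters elsewhere in the string are left untouched.
    static String stripTrailingNul(String value) {
        int end = value.length();
        while (end > 0 && value.charAt(end - 1) == '\0') {
            end--;
        }
        return value.substring(0, end);
    }

    public static void main(String[] args) {
        String raw = "Annual Report\0\0";
        System.out.println(stripTrailingNul(raw).length()); // length without the two NULs
    }
}
```

      Stripping before comparison matters because a dictionary value padded with NUL characters would otherwise never compare equal to its XMP counterpart.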
    • hasTimeZone

      private boolean hasTimeZone(Object date)
      Verify if the date string has time zone information.

      This method does not fully parse the date; it is a helper meant to be used only after the date has been proven valid.

      Parameters:
      date - the date to check
      Returns:
      true if the date has time zone information, false otherwise
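      One way such a lightweight check could look, assuming the argument is either a java.util.Calendar or a string already known to be in PDF date format. The class name and the pattern are illustrative assumptions, not the actual PDFBox implementation.

```java
import java.util.Calendar;
import java.util.regex.Pattern;

// Hypothetical sketch of a time zone check on an already validated date.
class TimeZoneCheckSketch {

    // After "D:" and the date/time digits, zone information starts with 'Z', '+' or '-'.
    private static final Pattern ZONE = Pattern.compile("^D:\\d{4,14}[Z+\\-].*$");

    static boolean hasTimeZone(Object date) {
        if (date instanceof Calendar) {
            // A Calendar always carries a TimeZone.
            return true;
        }
        if (date instanceof String) {
            return ZONE.matcher((String) date).matches();
        }
        // Assumption: any other type is treated as having no zone information.
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasTimeZone("D:20240131120000+01'00'"));
        System.out.println(hasTimeZone("D:20240131120000"));
    }
}
```

      The pattern deliberately skips range and structure checks, which is only safe because the date is expected to have passed full format validation beforehand.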
    • isValidPDFDateFormat

      private boolean isValidPDFDateFormat(COSBase item)
      Verifies that a date item is a COSString and has the format "D:YYYYMMDDHHmmSSOHH'mm'", where D:YYYY is mandatory and each subsequent field is optional but may appear only if all of its preceding fields are also present. This strict check is needed because the other date utilities are too lenient.
      Parameters:
      item - the date item that is to be checked.
      Returns:
      true if the date format is assumed to be valid, false if not.
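      The "optional only if all preceding fields are present" rule maps naturally onto nested optional groups in a regular expression. The sketch below works on a plain String rather than a COSBase, treats O as one of 'Z', '+' or '-', and does not range-check field values; the class name and pattern are assumptions for illustration, not the PDFBox source.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of a strict PDF date format check.
class PdfDateFormatSketch {

    // Each group nests inside the previous one, so a field can only
    // match when every field before it has matched.
    private static final Pattern PDF_DATE = Pattern.compile(
            "D:\\d{4}"              // "D:" and the year, both mandatory
            + "(\\d{2}"             // month
            + "(\\d{2}"             // day
            + "(\\d{2}"             // hour
            + "(\\d{2}"             // minute
            + "(\\d{2}"             // second
            + "([Z+\\-]"            // O: relationship to UT
            + "(\\d{2}'\\d{2}')?"   // HH'mm' offset
            + ")?)?)?)?)?)?");      // close zone, second, minute, hour, day, month

    static boolean isValidPdfDate(String value) {
        return value != null && PDF_DATE.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidPdfDate("D:20240131120000+01'00'"));
        System.out.println(isValidPdfDate("D:2024+01'00'"));
    }
}
```

      Because matches() requires the whole string to fit the pattern, a value such as "D:20241" (a half-written month) or a zone that appears before the seconds field is rejected.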