Class: HexaPDF::Document
- Inherits: Object
- Defined in:
- lib/hexapdf/document.rb,
lib/hexapdf/document/files.rb,
lib/hexapdf/document/fonts.rb,
lib/hexapdf/document/pages.rb,
lib/hexapdf/document/images.rb,
lib/hexapdf/document/layout.rb,
lib/hexapdf/document/signatures.rb,
lib/hexapdf/document/destinations.rb
Overview
HexaPDF::Document
Represents one PDF document.
A PDF document consists of (indirect) objects, so the main job of this class is to provide methods for working with these objects. However, since a PDF document may also be incrementally updated and can therefore contain one or more revisions, there are also methods for working with these revisions.
Note: This class provides everything to work on PDF documents on a low-level basis. This means that there are no convenience methods for higher PDF functionality. Those can be found in the objects linked from here, like #catalog.
Known Messages
The document object provides a basic message dispatch system via #register_listener and #dispatch_message.
Following are the messages that are used by HexaPDF itself:
- :complete_objects
-
This message is called before the first step of writing a document. Listeners should complete PDF objects that are missing some information.
For example, the font system uses this message to complete the font objects with information that is only available once all the used glyphs are known.
- :before_write
-
This message is called before a document is actually serialized and written.
Defined Under Namespace
Classes: Destinations, Files, Fonts, Images, Layout, Pages, Signatures
Constant Summary collapse
- UNSET = ::Object.new
Instance Attribute Summary collapse
-
#config ⇒ Object
readonly
The configuration for the document.
-
#revisions ⇒ Object
readonly
The revisions of the document.
Class Method Summary collapse
-
.open(filename, **kwargs) ⇒ Object
:call-seq: Document.open(filename, **docargs) -> doc Document.open(filename, **docargs) {|doc| block} -> obj.
Instance Method Summary collapse
-
#acro_form(create: false) ⇒ Object
Returns the main AcroForm object for dealing with interactive forms.
-
#add(obj, **wrap_opts) ⇒ Object
:call-seq: doc.add(obj, **wrap_opts) -> indirect_object.
-
#cache(pdf_data, key, value = UNSET, update: false) ⇒ Object
Caches and returns the given value or the value of the given block using the given pdf_data and key arguments as composite cache key.
-
#cached?(pdf_data, key) ⇒ Boolean
Returns true if there is a value cached for the composite key consisting of the given pdf_data and key objects.
-
#catalog ⇒ Object
Returns the document's catalog, the root of the object tree.
-
#clear_cache(pdf_data = nil) ⇒ Object
Clears all cached data or, if a Object::PDFData object is given, just the cache for this one object.
-
#delete(ref) ⇒ Object
:call-seq: doc.delete(ref) doc.delete(oid).
-
#deref(obj) ⇒ Object
Dereferences the given object.
-
#destinations ⇒ Object
Returns the Destinations object that provides convenience methods for working with destination objects.
-
#dispatch_message(name, *args) ⇒ Object
Dispatches the message name with the given arguments to all registered listeners.
-
#each(only_current: true, only_loaded: false, &block) ⇒ Object
:call-seq: doc.each(only_current: true, only_loaded: false) {|obj| block } doc.each(only_current: true, only_loaded: false) {|obj, rev| block } doc.each(only_current: true, only_loaded: false) -> Enumerator.
-
#encrypt(name: :Standard, **options) ⇒ Object
Encrypts the document.
-
#encrypted? ⇒ Boolean
Returns true if the document is encrypted.
-
#files ⇒ Object
Returns the Files object that provides convenience methods for working with files.
-
#fonts ⇒ Object
Returns the Fonts object that provides convenience methods for working with fonts.
-
#images ⇒ Object
Returns the Images object that provides convenience methods for working with images.
-
#import(obj) ⇒ Object
:call-seq: doc.import(obj) -> imported_object.
-
#initialize(io: nil, decryption_opts: {}, config: {}) ⇒ Document
constructor
Creates a new PDF document, either an empty one or one read from the provided io.
-
#inspect ⇒ Object
:nodoc:.
-
#layout ⇒ Object
Returns the Layout object that provides convenience methods for working with the HexaPDF::Layout classes for document layout.
-
#object(ref) ⇒ Object
:call-seq: doc.object(ref) -> obj or nil doc.object(oid) -> obj or nil.
-
#object?(ref) ⇒ Boolean
:call-seq: doc.object?(ref) -> true or false doc.object?(oid) -> true or false.
-
#pages ⇒ Object
Returns the Pages object that provides convenience methods for working with pages.
-
#register_listener(name, callable = nil, &block) ⇒ Object
:call-seq: doc.register_listener(name, callable) -> callable doc.register_listener(name) {|*args| block} -> block.
-
#security_handler ⇒ Object
Returns the security handler that is used for decrypting or encrypting the document, or nil if none is set.
-
#sign(file_or_io, handler: :default, signature: nil, write_options: {}, **handler_options) ⇒ Object
Signs the document and writes it to the given file or IO object.
-
#signatures ⇒ Object
Returns the Signatures object that provides convenience methods for working with the digital signatures of this document.
-
#signed? ⇒ Boolean
Returns true if the document is signed, i.e. contains digital signatures.
-
#task(name, **opts, &block) ⇒ Object
Executes the given task and returns its result.
-
#trailer ⇒ Object
Returns the trailer dictionary for the document.
-
#unwrap(object, seen = {}) ⇒ Object
:call-seq: document.unwrap(obj) -> unwrapped_obj.
-
#validate(auto_correct: true, only_loaded: false, &block) ⇒ Object
Validates all objects, or, if only_loaded is true, only loaded objects, with optional auto-correction, and returns true if everything is fine.
-
#version ⇒ Object
Returns the PDF document's version as string (e.g. '1.4').
-
#version=(value) ⇒ Object
Sets the version of the PDF document.
-
#wrap(obj, type: nil, subtype: nil, oid: nil, gen: nil, stream: nil) ⇒ Object
Wraps the given object inside a HexaPDF::Object class which allows one to use convenience functions to work with the object.
-
#write(file_or_io, incremental: false, validate: true, update_fields: true, optimize: false) ⇒ Object
:call-seq: doc.write(filename, incremental: false, validate: true, update_fields: true, optimize: false) doc.write(io, incremental: false, validate: true, update_fields: true, optimize: false).
Constructor Details
#initialize(io: nil, decryption_opts: {}, config: {}) ⇒ Document
Creates a new PDF document, either an empty one or one read from the provided io.
When an IO object is provided and it contains an encrypted PDF file, it is automatically decrypted behind the scenes. The decryption_opts argument has to be set appropriately in this case.
Options:
- io
-
If an IO object is provided, then this document can read PDF objects from this IO object, otherwise it can only contain created PDF objects.
- decryption_opts
-
A hash with options for decrypting the PDF objects loaded from the IO.
- config
-
A hash with configuration options that is deep-merged into the default configuration (see HexaPDF::DefaultDocumentConfiguration), meaning that direct sub-hashes are merged instead of overwritten.
# File 'lib/hexapdf/document.rb', line 164
def initialize(io: nil, decryption_opts: {}, config: {})
  @config = Configuration.with_defaults(config)
  @version = '1.2'
  @revisions = Revisions.from_io(self, io)
  @security_handler = if encrypted? && @config['document.auto_decrypt']
                        Encryption::SecurityHandler.set_up_decryption(self, **decryption_opts)
                      else
                        nil
                      end
  @listeners = {}
  @cache = Hash.new {|h, k| h[k] = {} }
end
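The merge rule for the config option ("direct sub-hashes are merged instead of overwritten") can be illustrated without HexaPDF. The following is only a sketch of that rule, not the library's Configuration class; merge_config, DEFAULTS and the 'example.*' keys are invented for illustration.

```ruby
# Illustrative defaults; these option names are hypothetical.
DEFAULTS = {
  'example.flag' => true,
  'example.map' => {'a' => 1, 'b' => 2},
}.freeze

# Merges overrides into defaults. When both sides hold a hash for the same
# key, the two hashes are merged one level deep instead of being replaced.
def merge_config(defaults, overrides)
  defaults.merge(overrides) do |_key, old, new|
    old.kind_of?(Hash) && new.kind_of?(Hash) ? old.merge(new) : new
  end
end

config = merge_config(DEFAULTS, {'example.map' => {'a' => 10}})
```

With a plain Hash#merge the 'b' entry of 'example.map' would be lost; with the block it survives and only 'a' is replaced.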
Instance Attribute Details
#config ⇒ Object (readonly)
The configuration for the document.
# File 'lib/hexapdf/document.rb', line 142
def config
  @config
end
#revisions ⇒ Object (readonly)
The revisions of the document.
# File 'lib/hexapdf/document.rb', line 145
def revisions
  @revisions
end
Class Method Details
.open(filename, **kwargs) ⇒ Object
:call-seq:
Document.open(filename, **docargs) -> doc
Document.open(filename, **docargs) {|doc| block} -> obj
Creates a new PDF Document object for the given file.
Depending on whether a block is provided, the functionality is different:
-
If no block is provided, the whole file is instantly read into memory and the PDF Document created for it is returned.
-
If a block is provided, the file is opened and a PDF Document is created for it. The created document is passed as an argument to the block and when the block returns the associated file object is closed. The value of the block will be returned.
The block version is useful, for example, when you are dealing with a large file and you only need a small portion of it.
The provided keyword arguments (except io) are passed on unchanged to Document.new.
# File 'lib/hexapdf/document.rb', line 131
def self.open(filename, **kwargs)
  if block_given?
    File.open(filename, 'rb') do |file|
      yield(new(**kwargs, io: file))
    end
  else
    new(**kwargs, io: StringIO.new(File.binread(filename)))
  end
end
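The difference between the two calling styles can be tried out with a stand-in class. This is a sketch under assumptions: FakeDocument is a hypothetical class that only stores its IO object, and the temporary file holds a few bytes rather than a real PDF.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical stand-in for a document class; it only remembers its IO object.
class FakeDocument
  attr_reader :io

  def initialize(io:)
    @io = io
  end

  # Same control flow as the Document.open listing above.
  def self.open(filename)
    if block_given?
      File.open(filename, 'rb') {|file| yield(new(io: file)) }
    else
      new(io: StringIO.new(File.binread(filename)))
    end
  end
end

eager = block_io = block_value = nil
Tempfile.create('sketch') do |tmp|
  tmp.write('%PDF-1.4')
  tmp.flush

  # Without a block: the file content is copied into memory.
  eager = FakeDocument.open(tmp.path)

  # With a block: the file stays open only while the block runs.
  block_value = FakeDocument.open(tmp.path) do |doc|
    block_io = doc.io
    doc.io.read(4)
  end
end
```

After the block form returns, the file handle is closed, so a document that escapes the block can no longer load data from it. The non-block form has no such restriction because everything is already in a StringIO.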
Instance Method Details
#acro_form(create: false) ⇒ Object
Returns the main AcroForm object for dealing with interactive forms.
See HexaPDF::Type::Catalog#acro_form for details on the arguments.
# File 'lib/hexapdf/document.rb', line 491
def acro_form(create: false)
  catalog.acro_form(create: create)
end
#add(obj, **wrap_opts) ⇒ Object
:call-seq:
doc.add(obj, **wrap_opts) -> indirect_object
Adds the object to the document and returns the wrapped indirect object.
The object can either be a native Ruby object (Hash, Array, Integer, …) or a HexaPDF::Object. If it is not the latter, #wrap is called with the object and the additional keyword arguments.
See: Revisions#add_object
# File 'lib/hexapdf/document.rb', line 228
def add(obj, **wrap_opts)
  obj = wrap(obj, **wrap_opts) unless obj.kind_of?(HexaPDF::Object)

  if obj.document? && obj.document != self
    raise HexaPDF::Error, "Can't add object that is already attached to another document"
  end
  obj.document = self

  @revisions.add_object(obj)
end
#cache(pdf_data, key, value = UNSET, update: false) ⇒ Object
Caches and returns the given value or the value of the given block using the given pdf_data and key arguments as composite cache key. If a cached value already exists and update is false, the cached value is just returned.
Set update to true to force an update of the cached value.
This facility can be used to cache expensive operations in PDF objects that are easy to compute again.
Use #clear_cache to clear the cache if necessary.
# File 'lib/hexapdf/document.rb', line 430
def cache(pdf_data, key, value = UNSET, update: false)
  return @cache[pdf_data][key] if cached?(pdf_data, key) && !update
  @cache[pdf_data][key] = (value == UNSET ? yield : value)
end
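The caching behaviour can be reproduced in isolation. The sketch below copies the logic of #cache and #cached? into a hypothetical TinyCache class (not part of HexaPDF) and uses a plain Ruby object in place of a PDFData instance.

```ruby
# Stand-alone sketch of the two-level cache; TinyCache is a made-up name.
class TinyCache
  UNSET = ::Object.new # sentinel meaning "no value argument was passed"

  def initialize
    # Outer key: owning object, inner key: cache key.
    @cache = Hash.new {|h, k| h[k] = {} }
  end

  def cache(owner, key, value = UNSET, update: false)
    return @cache[owner][key] if cached?(owner, key) && !update
    @cache[owner][key] = (value == UNSET ? yield : value)
  end

  def cached?(owner, key)
    @cache.key?(owner) && @cache[owner].key?(key)
  end
end

store = TinyCache.new
owner = Object.new
calls = 0

first  = store.cache(owner, :width) { calls += 1; 42 }  # block runs, value stored
second = store.cache(owner, :width) { calls += 1; 99 }  # cached, block is skipped
forced = store.cache(owner, :width, 7, update: true)    # update: true overwrites
empty  = store.cache(owner, :flag, nil)                 # nil is a valid cached value
```

Using a dedicated sentinel object instead of nil as the default is what allows nil and false to be cached as ordinary values.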
#cached?(pdf_data, key) ⇒ Boolean
Returns true if there is a value cached for the composite key consisting of the given pdf_data and key objects.
Also see: #cache
# File 'lib/hexapdf/document.rb', line 439
def cached?(pdf_data, key)
  @cache.key?(pdf_data) && @cache[pdf_data].key?(key)
end
#catalog ⇒ Object
Returns the document's catalog, the root of the object tree.
# File 'lib/hexapdf/document.rb', line 514
def catalog
  trailer.catalog
end
#clear_cache(pdf_data = nil) ⇒ Object
Clears all cached data or, if a Object::PDFData object is given, just the cache for this one object.
It is not recommended to clear the whole cache! Better clear the cache for individual PDF objects!
Also see: #cache
# File 'lib/hexapdf/document.rb', line 450
def clear_cache(pdf_data = nil)
  pdf_data ? @cache[pdf_data].clear : @cache.clear
end
#delete(ref) ⇒ Object
:call-seq:
doc.delete(ref)
doc.delete(oid)
Deletes the indirect object specified by an exact reference or by an object number from the document.
See: Revisions#delete_object
# File 'lib/hexapdf/document.rb', line 247
def delete(ref)
  @revisions.delete_object(ref)
end
#deref(obj) ⇒ Object
Dereferences the given object.
Returns the object itself if it is not a reference, or the indirect object specified by the reference.
# File 'lib/hexapdf/document.rb', line 214
def deref(obj)
  obj.kind_of?(Reference) ? object(obj) : obj
end
#destinations ⇒ Object
Returns the Destinations object that provides convenience methods for working with destination objects.
# File 'lib/hexapdf/document.rb', line 478
def destinations
  @destinations ||= Destinations.new(self)
end
#dispatch_message(name, *args) ⇒ Object
Dispatches the message name with the given arguments to all registered listeners.
See the main Document documentation for an overview of messages that are used by HexaPDF itself.
# File 'lib/hexapdf/document.rb', line 414
def dispatch_message(name, *args)
  @listeners[name]&.each {|obj| obj.call(*args) }
end
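The listener mechanism is small enough to run on its own. The sketch below lifts the bodies of #register_listener and #dispatch_message into a hypothetical MessageHub class; the class name and the :no_such_message symbol are assumptions made for the example.

```ruby
# Stand-alone sketch of the message dispatch system; MessageHub is a made-up name.
class MessageHub
  def initialize
    @listeners = {}
  end

  # Accepts either a callable object or a block and returns the listener.
  def register_listener(name, callable = nil, &block)
    callable ||= block
    (@listeners[name] ||= []) << callable
    callable
  end

  # Calls every listener registered for the message, in registration order.
  def dispatch_message(name, *args)
    @listeners[name]&.each {|obj| obj.call(*args) }
  end
end

hub = MessageHub.new
log = []
hub.register_listener(:before_write) { log << :block_listener }
hub.register_listener(:before_write, ->(*) { log << :lambda_listener })

hub.dispatch_message(:before_write)
unknown = hub.dispatch_message(:no_such_message) # no listeners: nothing happens
```

Because of the safe navigation operator, dispatching a message without listeners is a no-op that returns nil rather than an error.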
#each(only_current: true, only_loaded: false, &block) ⇒ Object
:call-seq:
doc.each(only_current: true, only_loaded: false) {|obj| block }
doc.each(only_current: true, only_loaded: false) {|obj, rev| block }
doc.each(only_current: true, only_loaded: false) -> Enumerator
Yields every object and the revision it is in.
If only_current is true, only the current version of each object is yielded, otherwise all objects from all revisions.
If only_loaded is true, only the already loaded objects are yielded.
For details see Revisions#each_object
# File 'lib/hexapdf/document.rb', line 395
def each(only_current: true, only_loaded: false, &block)
  @revisions.each_object(only_current: only_current, only_loaded: only_loaded, &block)
end
#encrypt(name: :Standard, **options) ⇒ Object
Encrypts the document.
This is done by setting up a security handler for this purpose and populating the trailer's Encrypt dictionary accordingly. The actual encryption, however, is only done when writing the document.
The security handler used for encrypting is selected via the name argument. All other arguments are passed on to the security handler.
If the document should not be encrypted, the name argument has to be set to nil. This removes the security handler and deletes the trailer's Encrypt dictionary.
See: HexaPDF::Encryption::SecurityHandler#set_up_encryption and HexaPDF::Encryption::StandardSecurityHandler::EncryptionOptions for possible encryption options.
# File 'lib/hexapdf/document.rb', line 557
def encrypt(name: :Standard, **options)
  if name.nil?
    trailer.delete(:Encrypt)
    @security_handler = nil
  else
    @security_handler = Encryption::SecurityHandler.set_up_encryption(self, name, **options)
  end
end
#encrypted? ⇒ Boolean
Returns true if the document is encrypted.
# File 'lib/hexapdf/document.rb', line 538
def encrypted?
  !trailer[:Encrypt].nil?
end
#files ⇒ Object
Returns the Files object that provides convenience methods for working with files.
# File 'lib/hexapdf/document.rb', line 467
def files
  @files ||= Files.new(self)
end
#fonts ⇒ Object
Returns the Fonts object that provides convenience methods for working with fonts.
# File 'lib/hexapdf/document.rb', line 472
def fonts
  @fonts ||= Fonts.new(self)
end
#images ⇒ Object
Returns the Images object that provides convenience methods for working with images.
# File 'lib/hexapdf/document.rb', line 462
def images
  @images ||= Images.new(self)
end
#import(obj) ⇒ Object
:call-seq:
doc.import(obj) -> imported_object
Imports the given PDF object, which must be associated with a different document, and returns the imported object.
If the same argument is provided in multiple invocations, the import is done only once and the previously imported object is returned.
See: Importer
# File 'lib/hexapdf/document.rb', line 261
def import(obj)
  if !obj.kind_of?(HexaPDF::Object) || !obj.document? || obj.document == self
    raise ArgumentError, "Importing only works for PDF objects associated " \
      "with another document"
  end
  HexaPDF::Importer.for(source: obj.document, destination: self).import(obj)
end
#inspect ⇒ Object
:nodoc:
# File 'lib/hexapdf/document.rb', line 676
def inspect #:nodoc:
  "<#{self.class.name}:#{object_id}>"
end
#layout ⇒ Object
Returns the Layout object that provides convenience methods for working with the HexaPDF::Layout classes for document layout.
# File 'lib/hexapdf/document.rb', line 484
def layout
  @layout ||= Layout.new(self)
end
#object(ref) ⇒ Object
:call-seq:
doc.object(ref) -> obj or nil
doc.object(oid) -> obj or nil
Returns the current version of the indirect object for the given exact reference or for the given object number.
For references to unknown objects, nil is returned but free objects are represented by a PDF Null object, not by nil!
See: Revisions#object
# File 'lib/hexapdf/document.rb', line 190
def object(ref)
  @revisions.object(ref)
end
#object?(ref) ⇒ Boolean
:call-seq:
doc.object?(ref) -> true or false
doc.object?(oid) -> true or false
Returns true if the document contains an indirect object for the given exact reference or for the given object number.
Even though this method might return true for some references, #object may return nil because this method takes all revisions into account. Also see the discussion on #each for more information.
See: Revisions#object?
# File 'lib/hexapdf/document.rb', line 206
def object?(ref)
  @revisions.object?(ref)
end
#pages ⇒ Object
Returns the Pages object that provides convenience methods for working with pages.
Also see: HexaPDF::Type::PageTreeNode
# File 'lib/hexapdf/document.rb', line 457
def pages
  @pages ||= Pages.new(self)
end
#register_listener(name, callable = nil, &block) ⇒ Object
:call-seq:
doc.register_listener(name, callable) -> callable
doc.register_listener(name) {|*args| block} -> block
Registers the given listener for the message name.
# File 'lib/hexapdf/document.rb', line 404
def register_listener(name, callable = nil, &block)
  callable ||= block
  (@listeners[name] ||= []) << callable
  callable
end
#security_handler ⇒ Object
Returns the security handler that is used for decrypting or encrypting the document, or nil if none is set.
-
If the document was created by reading an existing file and the document was automatically decrypted, then this method returns the handler for decrypting.
-
Once the #encrypt method is called, the specified security handler for encrypting is returned.
# File 'lib/hexapdf/document.rb', line 574
def security_handler
  @security_handler
end
#sign(file_or_io, handler: :default, signature: nil, write_options: {}, **handler_options) ⇒ Object
Signs the document and writes it to the given file or IO object.
For details on the arguments file_or_io, signature and write_options see HexaPDF::Document::Signatures#add.
The signing handler to be used is determined by the handler argument together with the rest of the keyword arguments (see HexaPDF::Document::Signatures#handler for details).
If not changed, the default signing handler is HexaPDF::Document::Signatures::DefaultHandler.
Note: Once signing is done the document cannot be changed anymore since it was written. If a document needs to be signed multiple times, it needs to be loaded again after writing.
# File 'lib/hexapdf/document.rb', line 600
def sign(file_or_io, handler: :default, signature: nil, write_options: {}, **handler_options)
  handler = signatures.handler(name: handler, **handler_options)
  signatures.add(file_or_io, handler, signature: signature, write_options: write_options)
end
#signatures ⇒ Object
Returns the Signatures object that provides convenience methods for working with the digital signatures of this document.
# File 'lib/hexapdf/document.rb', line 584
def signatures
  @signatures ||= Signatures.new(self)
end
#signed? ⇒ Boolean
Returns true if the document is signed, i.e. contains digital signatures.
# File 'lib/hexapdf/document.rb', line 579
def signed?
  acro_form&.signature_flag?(:signatures_exist)
end
#task(name, **opts, &block) ⇒ Object
Executes the given task and returns its result.
Tasks provide an extensible way for performing operations on a PDF document without cluttering the Document interface.
See Task for more information.
# File 'lib/hexapdf/document.rb', line 501
def task(name, **opts, &block)
  task = config.constantize('task.map', name) do
    raise HexaPDF::Error, "No task named '#{name}' is available"
  end
  task.call(self, **opts, &block)
end
#trailer ⇒ Object
Returns the trailer dictionary for the document.
# File 'lib/hexapdf/document.rb', line 509
def trailer
  @revisions.current.trailer
end
#unwrap(object, seen = {}) ⇒ Object
:call-seq:
document.unwrap(obj) -> unwrapped_obj
Recursively unwraps the object to get native Ruby objects (i.e. Hash, Array, Integer, … instead of HexaPDF::Reference and HexaPDF::Object).
# File 'lib/hexapdf/document.rb', line 360
def unwrap(object, seen = {})
  object = deref(object)
  object = object.data if object.kind_of?(HexaPDF::Object)
  if seen.key?(object)
    raise HexaPDF::Error, "Can't unwrap a recursive structure"
  end

  case object
  when Hash
    seen[object] = true
    object.transform_values {|value| unwrap(value, seen.dup) }
  when Array
    seen[object] = true
    object.map {|inner_o| unwrap(inner_o, seen.dup) }
  when HexaPDF::PDFData
    seen[object] = true
    unwrap(object.value, seen.dup)
  else
    object
  end
end
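The recursion guard in #unwrap tracks only the containers on the current path, which is why each branch gets its own copy of the seen set. The sketch below shows the same idea on plain Ruby data. It is a simplified variant, not the library code: unwrap_plain is a hypothetical helper, it does no dereferencing, and it compares ancestors by identity using an array instead of a hash.

```ruby
# Recursively copies nested Hash/Array data and raises if a container
# (directly or indirectly) contains itself.
def unwrap_plain(object, path = [])
  if path.any? {|ancestor| ancestor.equal?(object) }
    raise ArgumentError, "Can't unwrap a recursive structure"
  end

  case object
  when Hash
    object.transform_values {|value| unwrap_plain(value, path + [object]) }
  when Array
    object.map {|inner| unwrap_plain(inner, path + [object]) }
  else
    object
  end
end

shared = [1, 2]
tree = {a: shared, b: {c: shared}} # the same array twice, but no cycle
copy = unwrap_plain(tree)

loop_array = []
loop_array << loop_array           # an array that contains itself
error = begin
  unwrap_plain(loop_array)
  nil
rescue ArgumentError => e
  e.message
end
```

A single set shared across all branches would wrongly reject tree, because shared would already be marked when it is reached the second time. Per-branch tracking accepts shared objects and rejects only true cycles.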
#validate(auto_correct: true, only_loaded: false, &block) ⇒ Object
Validates all objects, or, if only_loaded is true, only loaded objects, with optional auto-correction, and returns true if everything is fine.
If a block is given, it is called on validation problems.
See HexaPDF::Object#validate for more information.
# File 'lib/hexapdf/document.rb', line 611
def validate(auto_correct: true, only_loaded: false, &block) #:yield: msg, correctable, object
  result = trailer.validate(auto_correct: auto_correct, &block)
  each(only_current: false, only_loaded: only_loaded) do |obj|
    result &&= obj.validate(auto_correct: auto_correct, &block)
  end
  result
end
#version ⇒ Object
Returns the PDF document's version as string (e.g. '1.4').
This method takes the file header version and the catalog's /Version key into account. If a version has been set manually and the catalog's /Version key refers to a later version, the later version is used.
See: PDF1.7 s7.2.2
# File 'lib/hexapdf/document.rb', line 525
def version
  catalog_version = (catalog[:Version] || '1.0').to_s
  (@version < catalog_version ? catalog_version : @version)
end
#version=(value) ⇒ Object
Sets the version of the PDF document. The argument must be a string in the format 'M.N' where M is the major version and N the minor version (e.g. '1.4' or '2.0').
# File 'lib/hexapdf/document.rb', line 532
def version=(value)
  raise ArgumentError, "PDF version must follow format M.N" unless value.to_s.match?(/\A\d\.\d\z/)
  @version = value.to_s
end
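How the getter and setter interact can be checked with a small model. VersionHolder below is a hypothetical class that reuses the two method bodies and replaces the catalog lookup with a constructor argument; it is a sketch, not the library's implementation.

```ruby
# Stand-alone model of the version logic; the catalog's /Version entry is
# simulated by the catalog_version argument.
class VersionHolder
  def initialize(catalog_version = nil)
    @version = '1.2'
    @catalog_version = catalog_version
  end

  # Returns the later of the manually set version and the catalog version.
  def version
    catalog_version = (@catalog_version || '1.0').to_s
    (@version < catalog_version ? catalog_version : @version)
  end

  # Accepts only a single digit, a dot, and a single digit.
  def version=(value)
    raise ArgumentError, "PDF version must follow format M.N" unless value.to_s.match?(/\A\d\.\d\z/)
    @version = value.to_s
  end
end

holder = VersionHolder.new('1.7')
holder.version = '1.4'
from_catalog = holder.version # the catalog version is later, so it wins

holder.version = '2.0'
from_header = holder.version  # the manually set version is later

rejected = begin
  holder.version = '1.10'     # two-digit minor version does not match M.N
  false
rescue ArgumentError
  true
end
```

The plain string comparison in the getter is safe only because the setter restricts both parts to a single digit, so lexicographic and numeric order agree.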
#wrap(obj, type: nil, subtype: nil, oid: nil, gen: nil, stream: nil) ⇒ Object
Wraps the given object inside a HexaPDF::Object class which allows one to use convenience functions to work with the object.
The obj argument can also be a HexaPDF::Object object so that it can be re-wrapped if needed.
The class of the returned object is always a subclass of HexaPDF::Object (or of HexaPDF::Stream if a stream is given). Which subclass is used, depends on the values of the type and subtype options as well as on the 'object.type_map' and 'object.subtype_map' global configuration options:
-
First type is used to try to determine the class. If it is not provided and if obj is a hash with a :Type field, the value of this field is used instead. If the resulting object is already a Class object, it is used, otherwise the type is looked up in 'object.type_map'.
-
If subtype is provided or can be determined because obj is a hash with a :Subtype or :S field, the type and subtype together are used to look up a special subtype class in 'object.subtype_map'.
Additionally, if there is no type but a subtype, all required fields of the subtype class need to have values; otherwise the subtype class is not used. This is done to better prevent invalid mappings when only partial knowledge (:Type key is missing) is available.
-
If there is no valid class after the above steps, HexaPDF::Stream is used if a stream is given, HexaPDF::Dictionary if the given object is a hash, HexaPDF::PDFArray if it is an array or else HexaPDF::Object is used.
Options:
- :type
-
(Symbol or Class) The type of a PDF object that should be used for wrapping. This could be, for example, :Pages. If a class object is provided, it is used directly instead of the type detection system.
- :subtype
-
(Symbol) The subtype of a PDF object which further qualifies a type. For example, image objects in PDF have a type of :XObject and a subtype of :Image.
- :oid
-
(Integer) The object number that should be set on the wrapped object. Defaults to 0 or the value of the given object's object number.
- :gen
-
(Integer) The generation number that should be set on the wrapped object. Defaults to 0 or the value of the given object's generation number.
- :stream
-
(String or StreamData) The stream object which should be set on the wrapped object.
# File 'lib/hexapdf/document.rb', line 313
def wrap(obj, type: nil, subtype: nil, oid: nil, gen: nil, stream: nil)
  data = if obj.kind_of?(HexaPDF::Object)
           obj.data
         else
           HexaPDF::PDFData.new(obj)
         end
  data.oid = oid if oid
  data.gen = gen if gen
  data.stream = stream if stream

  if type.kind_of?(Class)
    klass = type
    type = (klass <= HexaPDF::Dictionary ? klass.type : nil)
  else
    type ||= deref(data.value[:Type]) if data.value.kind_of?(Hash)
    klass = GlobalConfiguration.constantize('object.type_map', type) { nil } if type
  end

  if data.value.kind_of?(Hash)
    subtype ||= deref(data.value[:Subtype]) || deref(data.value[:S])
  end
  if subtype
    sub_klass = GlobalConfiguration.constantize('object.subtype_map', type, subtype) { klass }
    if type || sub_klass&.each_field&.none? {|name, field| field.required? && !data.value.key?(name) }
      klass = sub_klass
    end
  end

  klass ||= if data.stream
              HexaPDF::Stream
            elsif data.value.kind_of?(Hash)
              HexaPDF::Dictionary
            elsif data.value.kind_of?(Array)
              HexaPDF::PDFArray
            else
              HexaPDF::Object
            end

  klass.new(data, document: self)
end
#write(file_or_io, incremental: false, validate: true, update_fields: true, optimize: false) ⇒ Object
:call-seq:
doc.write(filename, incremental: false, validate: true, update_fields: true, optimize: false)
doc.write(io, incremental: false, validate: true, update_fields: true, optimize: false)
Writes the document to the given file (in case io is a String) or IO stream.
Before the document is written, it is validated using #validate and an error is raised if the document is not valid. However, this step can be skipped if needed.
Options:
- incremental
-
Use the incremental writing mode which just adds a new revision to an existing document. This is needed, for example, when modifying a signed PDF and the original signature should stay valid.
See: PDF1.7 s7.5.6
- validate
-
Validates the document and raises an error if an uncorrectable problem is found.
- update_fields
-
Updates the /ID field in the trailer dictionary as well as the /ModDate field in the trailer's /Info dictionary so that it is clear that the document has been updated.
- optimize
-
Optimize the file size by using object and cross-reference streams. This will raise the PDF version to at least 1.5.
# File 'lib/hexapdf/document.rb', line 647
def write(file_or_io, incremental: false, validate: true, update_fields: true, optimize: false)
  dispatch_message(:complete_objects)

  if update_fields
    trailer.update_id
    trailer.info[:ModDate] = Time.now
  end

  if validate
    self.validate(auto_correct: true) do |msg, correctable, obj|
      next if correctable
      raise HexaPDF::Error, "Validation error for (#{obj.oid},#{obj.gen}): #{msg}"
    end
  end

  if optimize
    task(:optimize, object_streams: :generate)
    self.version = '1.5' if version < '1.5'
  end

  dispatch_message(:before_write)

  if file_or_io.kind_of?(String)
    File.open(file_or_io, 'w+') {|file| Writer.write(self, file, incremental: incremental) }
  else
    Writer.write(self, file_or_io, incremental: incremental)
  end
end