Class: HexaPDF::Serializer
- Inherits: Object
  - Object
    - HexaPDF::Serializer
- Defined in: lib/hexapdf/serializer.rb
Overview
Knows how to serialize Ruby objects for a PDF file.
For normal serialization purposes, the #serialize or #serialize_to_io methods should be used. However, if the type of the object to be serialized is known, a specialized serialization method like #serialize_float can be used.
Additionally, an object for encrypting strings and streams while serializing can be set via the #encrypter= method. The assigned object has to respond to #encrypt_string(str, ind_obj) (where the string is part of the indirect object; returns the encrypted string) and #encrypt_stream(stream) (returns a fiber that represents the encrypted stream).
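The duck-typed encrypter interface described above can be illustrated with a toy object. This is only a sketch: Rot13Encrypter and FakeStream are hypothetical names, ROT13 is not real encryption, and both the stand-in stream (a struct holding string chunks) and the way the fiber is drained are assumptions rather than HexaPDF's actual stream API.

```ruby
# Stand-in for a stream object (assumption: the real stream class is not shown here).
FakeStream = Struct.new(:chunks)

# Toy encrypter that responds to the two methods the serializer expects.
class Rot13Encrypter
  # Returns the "encrypted" string; the indirect object is ignored in this sketch.
  def encrypt_string(str, _ind_obj)
    str.tr("A-Za-z", "N-ZA-Mn-za-m")
  end

  # Returns a fiber that yields the "encrypted" chunks one by one.
  def encrypt_stream(stream)
    Fiber.new do
      stream.chunks.each { |chunk| Fiber.yield(encrypt_string(chunk, nil)) }
      nil # signals the end of the data in this sketch
    end
  end
end

encrypter = Rot13Encrypter.new
encrypted = encrypter.encrypt_string("Hello", nil)

fiber = encrypter.encrypt_stream(FakeStream.new(["ab", "cd"]))
chunks = []
while (chunk = fiber.resume)
  chunks << chunk
end
```

An object like this would be assigned via #encrypter= before serializing.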
How This Class Works
The main public interface consists of the #serialize and #serialize_to_io methods. Both accept an object; #serialize returns its serialized form and #serialize_to_io writes that form to the given IO. During serialization, the object is accessible to the individual serialization methods via the @object instance variable (useful if the object is a composed object).
Internally, the #__serialize method is used for invoking the correct serialization method based on the class of a given object. It is also used for serializing individual parts of a composed object.
Therefore the serializer contains one serialization method for each class it needs to serialize. The naming scheme of these methods is based on the class name: The full class name is converted to lowercase, the namespace separator '::' is replaced with a single underscore and the string “serialize_” is then prepended.
Examples:
NilClass => serialize_nilclass
TrueClass => serialize_trueclass
HexaPDF::Object => serialize_hexapdf_object
If no serialization method for a specific class is found, the ancestor classes are tried.
See: PDF1.7 s7.3
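The naming scheme and the ancestor fallback can be sketched in isolation. MiniSerializer, its method names and the Demo module are hypothetical and exist only for this illustration; the sketch mirrors the rules described above, not the library's internal code.

```ruby
# Hypothetical mini serializer; only the lookup logic is of interest here.
class MiniSerializer
  # Derives the method name: lowercase class name, '::' replaced by '_', prefixed with 'serialize_'.
  def method_name_for(klass)
    "serialize_#{klass.name.to_s.downcase.gsub('::', '_')}"
  end

  # Walks the ancestor chain and returns the first method name this object responds to.
  def lookup(klass)
    klass.ancestors.each do |ancestor|
      name = method_name_for(ancestor)
      return name if respond_to?(name, true)
    end
    nil
  end

  private

  # The only serialization method defined in this sketch.
  def serialize_numeric(obj)
    obj.to_s
  end
end

module Demo
  class Thing; end
end

mini = MiniSerializer.new
mini.method_name_for(NilClass)     # "serialize_nilclass"
mini.method_name_for(Demo::Thing)  # "serialize_demo_thing"
mini.lookup(Integer)               # "serialize_numeric", found via the Numeric ancestor
mini.lookup(String)                # nil, no matching method in this sketch
```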
Constant Summary
- NAME_SUBSTS =
The regexp matches all characters that need to be escaped and the substs hash contains the mapping from these characters to their escaped form.
See PDF1.7 s7.3.5
{}
- NAME_REGEXP =
:nodoc:
/[^!-~&&[^##{Regexp.escape(Tokenizer::DELIMITER)}#{Regexp.escape(Tokenizer::WHITESPACE)}]]/
- NAME_CACHE =
:nodoc:
Utils::LRUCache.new(1000)
- BYTE_IS_DELIMITER =
:nodoc:
{40 => true, 47 => true, 60 => true, 91 => true, # :nodoc:
 41 => true, 62 => true, 93 => true}.freeze
- STRING_ESCAPE_MAP =
:nodoc:
{"(" => "\\(", ")" => "\\)", "\\" => "\\\\", "\r" => "\\r"}.freeze
Instance Attribute Summary
-
#encrypter ⇒ Object
The encrypter to use for encrypting strings and streams.
Instance Method Summary
-
#initialize ⇒ Serializer
constructor
Creates a new Serializer object.
-
#serialize(obj) ⇒ Object
Returns the serialized form of the given object.
-
#serialize_array(obj) ⇒ Object
Serializes an Array object.
-
#serialize_date(obj) ⇒ Object
See: #serialize_time.
-
#serialize_datetime(obj) ⇒ Object
See: #serialize_time.
-
#serialize_falseclass(_obj) ⇒ Object
Serializes the false value.
-
#serialize_float(obj) ⇒ Object
Serializes a Float object.
-
#serialize_hash(obj) ⇒ Object
Serializes a Hash object (i.e. a PDF dictionary object).
-
#serialize_integer(obj) ⇒ Object
Serializes an Integer object.
-
#serialize_nilclass(_obj) ⇒ Object
Serializes the nil value.
-
#serialize_numeric(obj) ⇒ Object
Serializes a Numeric object (either Integer or Float).
-
#serialize_string(obj) ⇒ Object
Serializes a String object.
-
#serialize_symbol(obj) ⇒ Object
Serializes a Symbol object (i.e. a PDF name object).
-
#serialize_time(obj) ⇒ Object
Serializes a Time object; the ISO PDF specification and Adobe's PDF specification differ with respect to the supported date format.
-
#serialize_to_io(obj, io) ⇒ Object
Serializes the given object and writes it to the IO.
-
#serialize_trueclass(_obj) ⇒ Object
Serializes the true value.
Constructor Details
#initialize ⇒ Serializer
Creates a new Serializer object.
    # File 'lib/hexapdf/serializer.rb', line 92

    def initialize
      @dispatcher = {
        Hash => 'serialize_hash',
        Array => 'serialize_array',
        Symbol => 'serialize_symbol',
        String => 'serialize_string',
        Integer => 'serialize_integer',
        Float => 'serialize_float',
        Time => 'serialize_time',
        TrueClass => 'serialize_trueclass',
        FalseClass => 'serialize_falseclass',
        NilClass => 'serialize_nilclass',
        HexaPDF::Reference => 'serialize_hexapdf_reference',
        HexaPDF::Object => 'serialize_hexapdf_object',
        HexaPDF::Stream => 'serialize_hexapdf_stream',
        HexaPDF::Dictionary => 'serialize_hexapdf_object',
        HexaPDF::PDFArray => 'serialize_hexapdf_object',
        HexaPDF::Rectangle => 'serialize_hexapdf_object',
      }
      @dispatcher.default_proc = lambda do |h, klass|
        h[klass] = if klass <= HexaPDF::Stream
                     "serialize_hexapdf_stream"
                   elsif klass <= HexaPDF::Object
                     "serialize_hexapdf_object"
                   else
                     method = nil
                     klass.ancestors.each do |ancestor_klass|
                       name = ancestor_klass.name.to_s.downcase
                       name.gsub!(/::/, '_')
                       method = "serialize_#{name}"
                       break if respond_to?(method, true)
                     end
                     method
                   end
      end
      @encrypter = false
      @io = nil
      @object = nil
      @in_object = false
    end
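The constructor relies on a Hash default proc so that a class is resolved once and then answered from the hash. A minimal sketch of that memoizing pattern follows; the "serialize_generic" name and the counter are made up for this illustration and are not part of the library.

```ruby
resolved = 0 # counts how often the fallback block runs

dispatcher = { Integer => "serialize_integer" }
dispatcher.default_proc = lambda do |hash, klass|
  resolved += 1
  # Assumption for this sketch: every unknown class maps to one generic method name.
  hash[klass] = "serialize_generic"
end

dispatcher[Integer] # explicit entry, the fallback is not used
dispatcher[Float]   # miss: the fallback runs and stores its result
dispatcher[Float]   # hit: answered from the hash, the fallback does not run again
```

Storing the result inside the default proc keeps repeated lookups for the same class as cheap as a plain hash access.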
Instance Attribute Details
#encrypter ⇒ Object
The encrypter to use for encrypting strings and streams. If nil or false, strings and streams are not encrypted.
Default: false (the value assigned by the constructor)
    # File 'lib/hexapdf/serializer.rb', line 89

    def encrypter
      @encrypter
    end
Instance Method Details
#serialize(obj) ⇒ Object
Returns the serialized form of the given object.
For developers: While the object is serialized, methods can use the instance variable @object to access it (useful if the object is a composed object).
    # File 'lib/hexapdf/serializer.rb', line 137

    def serialize(obj)
      @object = obj
      __serialize(obj)
    ensure
      @object = nil
    end
#serialize_array(obj) ⇒ Object
Serializes an Array object.
See: PDF1.7 s7.3.6
    # File 'lib/hexapdf/serializer.rb', line 234

    def serialize_array(obj)
      str = +"["
      index = 0
      while index < obj.size
        tmp = __serialize(obj[index])
        str << " " unless BYTE_IS_DELIMITER[tmp.getbyte(0)] ||
          BYTE_IS_DELIMITER[str.getbyte(-1)]
        str << tmp
        index += 1
      end
      str << "]"
    end
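The spacing rule used here (and in #serialize_hash) inserts a space only when neither the end of the output so far nor the start of the next token is a delimiter byte. A standalone sketch, assuming the tokens are already serialized strings; DELIMITER_BYTES and join_pdf_array are hypothetical names, with the table copied from BYTE_IS_DELIMITER:

```ruby
# Byte values of the delimiters ( / < [ ) > ], copied from BYTE_IS_DELIMITER.
DELIMITER_BYTES = { 40 => true, 47 => true, 60 => true, 91 => true,
                    41 => true, 62 => true, 93 => true }.freeze

# Joins already-serialized tokens into a PDF array, adding a space only where needed.
def join_pdf_array(tokens)
  str = +"["
  tokens.each do |token|
    touches_delimiter = DELIMITER_BYTES[token.getbyte(0)] || DELIMITER_BYTES[str.getbyte(-1)]
    str << " " unless touches_delimiter
    str << token
  end
  str << "]"
end

plain = join_pdf_array(["1", "2", "true"])           # "[1 2 true]"
mixed = join_pdf_array(["1", "/Name", "(str)", "2"]) # "[1/Name(str)2]"
```

Dropping the optional whitespace keeps the output compact while it remains unambiguous to a PDF tokenizer.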
#serialize_date(obj) ⇒ Object
See: #serialize_time
    # File 'lib/hexapdf/serializer.rb', line 299

    def serialize_date(obj)
      serialize_time(obj.to_time)
    end
#serialize_datetime(obj) ⇒ Object
See: #serialize_time
    # File 'lib/hexapdf/serializer.rb', line 304

    def serialize_datetime(obj)
      serialize_time(obj.to_time)
    end
#serialize_falseclass(_obj) ⇒ Object
Serializes the false value.
See: PDF1.7 s7.3.2
    # File 'lib/hexapdf/serializer.rb', line 171

    def serialize_falseclass(_obj)
      "false"
    end
#serialize_float(obj) ⇒ Object
Serializes a Float object.
See: PDF1.7 s7.3.3
    # File 'lib/hexapdf/serializer.rb', line 195

    def serialize_float(obj)
      if -0.0001 < obj && obj < 0.0001 && obj != 0
        sprintf("%.6f", obj)
      elsif obj.finite?
        obj.round(6).to_s
      else
        raise HexaPDF::Error, "Can't serialize special floating point number #{obj}"
      end
    end
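The special case for very small values exists because Ruby's Float#to_s switches to exponent notation below 0.0001, and PDF real numbers do not allow an exponent. The following standalone sketch mirrors the branches above; pdf_float is a hypothetical name and it raises ArgumentError instead of HexaPDF::Error so that it runs without the library.

```ruby
# Standalone sketch of the float formatting rules shown above.
def pdf_float(value)
  if -0.0001 < value && value < 0.0001 && value != 0
    format("%.6f", value)  # fixed notation avoids "1.0e-05"
  elsif value.finite?
    value.round(6).to_s    # at most six decimal places
  else
    # Assumption: ArgumentError stands in for the library's own error class.
    raise ArgumentError, "Can't serialize special floating point number #{value}"
  end
end

tiny    = pdf_float(0.00001)     # "0.000010", whereas 0.00001.to_s is "1.0e-05"
rounded = pdf_float(1.23456789)  # "1.234568"
simple  = pdf_float(1.5)         # "1.5"
```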
#serialize_hash(obj) ⇒ Object
Serializes a Hash object (i.e. a PDF dictionary object).
See: PDF1.7 s7.3.7
    # File 'lib/hexapdf/serializer.rb', line 250

    def serialize_hash(obj)
      str = +"<<"
      obj.each do |k, v|
        next if v.nil? || (v.respond_to?(:null?) && v.null?)
        str << serialize_symbol(k)
        tmp = __serialize(v)
        str << " " unless BYTE_IS_DELIMITER[tmp.getbyte(0)] ||
          BYTE_IS_DELIMITER[str.getbyte(-1)]
        str << tmp
      end
      str << ">>"
    end
#serialize_integer(obj) ⇒ Object
Serializes an Integer object.
See: PDF1.7 s7.3.3
    # File 'lib/hexapdf/serializer.rb', line 188

    def serialize_integer(obj)
      obj.to_s
    end
#serialize_nilclass(_obj) ⇒ Object
Serializes the nil value.
See: PDF1.7 s7.3.9
    # File 'lib/hexapdf/serializer.rb', line 157

    def serialize_nilclass(_obj)
      "null"
    end
#serialize_numeric(obj) ⇒ Object
Serializes a Numeric object (either Integer or Float).
This method should be used for cases where it is known that the object is either an Integer or a Float.
See: PDF1.7 s7.3.3
    # File 'lib/hexapdf/serializer.rb', line 181

    def serialize_numeric(obj)
      obj.kind_of?(Integer) ? obj.to_s : serialize_float(obj)
    end
#serialize_string(obj) ⇒ Object
Serializes a String object.
See: PDF1.7 s7.3.4
    # File 'lib/hexapdf/serializer.rb', line 268

    def serialize_string(obj)
      obj = if @encrypter && @object.kind_of?(HexaPDF::Object) && @object.indirect?
              encrypter.encrypt_string(obj, @object)
            elsif obj.encoding != Encoding::BINARY
              if obj.match?(/[^ -~\t\r\n]/)
                "\xFE\xFF".b << obj.encode(Encoding::UTF_16BE).force_encoding(Encoding::BINARY)
              else
                obj.b
              end
            else
              obj.dup
            end
      obj.gsub!(/[()\\\r]/n, STRING_ESCAPE_MAP)
      "(#{obj})"
    end
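Leaving encryption aside, the method above picks an encoding and then escapes the bytes that would otherwise end or corrupt a literal string. A standalone sketch of those two steps; pdf_literal_string and ESCAPES are hypothetical names, with the map copied from STRING_ESCAPE_MAP:

```ruby
# Copied from STRING_ESCAPE_MAP above.
ESCAPES = { "(" => "\\(", ")" => "\\)", "\\" => "\\\\", "\r" => "\\r" }.freeze

# Sketch of the unencrypted path: choose the byte representation, then escape.
def pdf_literal_string(text)
  data = if text.encoding == Encoding::BINARY
           text.dup # binary data is written as is
         elsif text.match?(/[^ -~\t\r\n]/)
           # Text beyond printable ASCII is written as UTF-16BE with a byte order mark.
           "\xFE\xFF".b << text.encode(Encoding::UTF_16BE).force_encoding(Encoding::BINARY)
         else
           text.b # printable ASCII text is written byte for byte
         end
  data.gsub!(/[()\\\r]/n, ESCAPES)
  "(#{data})"
end

ascii   = pdf_literal_string("a(b)") # parentheses are escaped with a backslash
unicode = pdf_literal_string("é")    # BOM plus UTF-16BE bytes, wrapped in parentheses
```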
#serialize_symbol(obj) ⇒ Object
Serializes a Symbol object (i.e. a PDF name object).
See: PDF1.7 s7.3.5
    # File 'lib/hexapdf/serializer.rb', line 219

    def serialize_symbol(obj)
      NAME_CACHE[obj] ||=
        begin
          str = obj.to_s.dup.force_encoding(Encoding::BINARY)
          str.gsub!(NAME_REGEXP, NAME_SUBSTS)
          str.empty? ? "/ " : "/#{str}"
        end
    end
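PDF name objects write characters outside the printable range, delimiters and the number sign itself as "#" followed by two hexadecimal digits. The sketch below shows that escaping on its own; pdf_name and its regular expression are assumptions made for this example and are not the library's NAME_REGEXP or NAME_SUBSTS, and the cache is left out.

```ruby
# Sketch of PDF name escaping (hypothetical helper, not the library's implementation).
def pdf_name(symbol)
  str = symbol.to_s.b
  # Escape every byte outside "!".."~" as well as "#" and the PDF delimiters.
  escaped = str.gsub(/[^!-~]|[#()<>\[\]{}\/%]/n) { |char| format("#%02X", char.ord) }
  escaped.empty? ? "/ " : "/#{escaped}"
end

plain  = pdf_name(:Type)    # "/Type"
spaced = pdf_name(:"A B")   # "/A#20B"
hashed = pdf_name(:"a#b")   # "/a#23b"
```

Because the string is treated as binary, a multi-byte character is escaped byte by byte.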
#serialize_time(obj) ⇒ Object
Serializes a Time object. The ISO PDF specification and Adobe's PDF specification differ with respect to the supported date format. When converting to a date string, a format suitable for both is output.
See: PDF1.7 s7.9.4, ADB1.7 3.8.3
    # File 'lib/hexapdf/serializer.rb', line 288

    def serialize_time(obj)
      zone = obj.strftime("%z'")
      if zone == "+0000'"
        zone = ''
      else
        zone[3, 0] = "'"
      end
      serialize_string(obj.strftime("D:%Y%m%d%H%M%S#{zone}"))
    end
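The date string built above can be reproduced without the serializer. In this sketch pdf_date is a hypothetical helper that returns the bare date text, that is, without the surrounding parentheses that #serialize_string adds.

```ruby
# Sketch of the date format: D:YYYYMMDDHHmmSS followed by an optional zone offset.
def pdf_date(time)
  zone = time.strftime("%z'") # for example "+0100'"
  if zone == "+0000'"
    zone = ""                 # UTC is written without an offset
  else
    zone[3, 0] = "'"          # insert an apostrophe after the hours: "+01'00'"
  end
  time.strftime("D:%Y%m%d%H%M%S") + zone
end

with_offset = pdf_date(Time.new(2024, 1, 2, 3, 4, 5, "+01:00")) # "D:20240102030405+01'00'"
in_utc      = pdf_date(Time.utc(2024, 1, 2, 3, 4, 5))           # "D:20240102030405"
```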
#serialize_to_io(obj, io) ⇒ Object
Serializes the given object and writes it to the IO.
Also see: #serialize
    # File 'lib/hexapdf/serializer.rb', line 147

    def serialize_to_io(obj, io)
      @io = io
      @io << serialize(obj).freeze
    ensure
      @io = nil
    end
#serialize_trueclass(_obj) ⇒ Object
Serializes the true value.
See: PDF1.7 s7.3.2
    # File 'lib/hexapdf/serializer.rb', line 164

    def serialize_trueclass(_obj)
      "true"
    end