Sereal::Performance - Getting the most out of the Perl-Sereal implementation

Acknowledgment

       This module was originally developed for Booking.com.  With approval from Booking.com,  this  module  was
       generalized and published on CPAN, for which the authors would like to express their gratitude.

Authors And Contributors

       Yves Orton <demerphq@gmail.com>

       Damian Gryski

       Steffen Mueller <smueller@cpan.org>

       Rafaël Garcia-Suarez

       Ævar Arnfjörð Bjarmason <avar@cpan.org>

       Tim Bunce

       Daniel Dragan <bulkdd@cpan.org> (Windows support and bugfixes)

       Zefram

       Some  inspiration and code was taken from Marc Lehmann's excellent JSON::XS module due to obvious overlap
       in problem domain.

Bugs, Contact And Support

       For reporting bugs, please use the github bug tracker at <http://github.com/Sereal/Sereal/issues>.

       For support and discussion of Sereal, there are two Google Groups:

       Announcements           around            Sereal            (extremely            low            volume):
       <https://groups.google.com/forum/?fromgroups#!forum/sereal-announce>

       Sereal development list: <https://groups.google.com/forum/?fromgroups#!forum/sereal-dev>

Copyright And License

       Copyright (C) 2012, 2013, 2014 by Steffen Mueller Copyright (C) 2012, 2013, 2014 by Yves Orton

       The license for the code in this distribution is the following, with the exceptions listed below:

       This  library  is  free  software;  you can redistribute it and/or modify it under the same terms as Perl
       itself.

       Except portions taken from Marc Lehmann's code for the JSON::XS module, which is licensed under the  same
       terms as this module.  (Many thanks to Marc for inspiration, and code.)

       Also  except the code for Snappy compression library, whose license is reproduced below and which, to the
       best of our knowledge, is compatible with this module's license. The license for the enclosed Snappy code
       is:

         Copyright 2011, Google Inc.
         All rights reserved.

         Redistribution and use in source and binary forms, with or without
         modification, are permitted provided that the following conditions are
         met:

           * Redistributions of source code must retain the above copyright
         notice, this list of conditions and the following disclaimer.
           * Redistributions in binary form must reproduce the above
         copyright notice, this list of conditions and the following disclaimer
         in the documentation and/or other materials provided with the
         distribution.
           * Neither the name of Google Inc. nor the names of its
         contributors may be used to endorse or promote products derived from
         this software without specific prior written permission.

         THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
         "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
         LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
         A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
         OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
         SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
         LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
         DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
         THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
         (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
         OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

perl v5.40.0                                       2024-10-20                           Sereal::Performance(3pm)

Description

Using Sereal in the way that is optimally performant for your use case can make quite a significant
difference in performance. Broadly speaking, there are two classes of tweaks you can do: choosing the
right options during encoding (sometimes incurring trade-offs in output size) and calling the Sereal
encode/decode functions in the most efficient way.

If you are not yet using re-usable Sereal::Encoder and Sereal::Decoder objects, then read no further. By
switching from the "encode_sereal" and "decode_sereal" functions to either the OO interface or the
advanced functional interface, you will get a noticeable speed boost as encoder and decoder structures
can be reused. This is particularly significant for the encoder, which can re-use its output buffer. In
some cases, such a warmed-up encoder can avoid most memory allocations.

Irepeat,ifyoucareaboutperformance,thendonotusethe"encode_sereal"and"decode_sereal"interface.

The exact performance in time and space depends heavily on the data structure to be (de-)serialized.
Often there is a trade-off between space and time. If in doubt, do your own testing and most importantly
ALWAYSTESTWITHREALDATA. If you care purely about speed at the expense of output size, you can use the
"no_shared_hashkeys" option for a small speed-up, see below. If you need smaller output at the cost of
higher CPU load and more memory used during encoding/decoding, try the "dedupe_strings" option and enable
Snappy compression.

For ready-made comparison scripts, see the author_tools/bench.pl and author_tools/dbench.pl programs that
are part of this distribution. Suffice to say that this library is easily competitive in both time and
space efficiency with the best alternatives.

If switching to the OO interface is not enough, you may consider switching to the advanced functional
interface that avoids method lookup overhead, and by inlining as custom Perl OPs, may also avoid some of
the Perl function call overhead (Perl 5.14 and up). This additional speed-up is only a constant-offset,
avoiding said method/function call, rather than speeding up encoding itself and so will be most
significant if you are working with very small data sets.

"sereal_encode_with_object" and "sereal_decode_with_object" are optionally exported from the Sereal
module (or "Sereal::Encoder" and "Sereal::Decoder" respectively). They work the same as the object-
oriented interface except that they are invoked differently:

$srl_doc = $encoder->encode($data);

becomes

$srl_doc = sereal_encode_with_object($encoder, $data);

and

$data = $decoder->decode($srl_doc);

becomes

$data = sereal_decode_with_object($decoder, $srl_doc);

On Perl versions before 5.14, this will be marginally faster than the OO interface as it avoids method
lookup. This should rarely matter. On Perl versions starting from 5.14, the function call to
"sereal_encode_with_object" or "sereal_decode_with_object" will also be replaced with a custom Perl OP,
thus avoiding most of the function call overhead as well.

Tuningthe"Sereal::Encoder"
Several of the "Sereal::Encoder" options add or remove useful behaviour and some of them come at a
runtime performance cost.

"no_shared_hashkeys"
By default, Sereal will emit a "repetition" marker for hash keys that were already previously
encountered. Depending on your data structure, this can save quite a bit of space in the generated
document. Consider, for example, encoding an array of many objects of the same class. But it may not
save anything if you don't have a lot of repeated hash keys or don't even encode any hashes to begin
with.

In those cases, you can turn this feature off with the "no_shared_hashkeys" option for a small but
measurable speed-up.

"dedupe_strings"
If set, this option will apply the de-duplication logic to all strings that is only applied to hash
keys by default. This can be quite expensive in both memory and performance. The same is true for
"aliased_dedupe_strings".

"snappy" and "snappy_incr"
Enabling Snappy compression can (but doesn't have to) make your Sereal documents significantly smaller.
How effective this compression is for you depends entirely on the nature of your data. Snappy
compression is designed to be very fast. The additional space savings are very often worth the small
overhead.

"freeze_callbacks"
Using custom Perl "FREEZE" callbacks is very expensive. If enabled, the encoder has to do a method
lookup at least once per class of an object being serialized. If a "FREEZE" hook actually exists,
calling it will be even more expensive. If you care about ultimate performance, use with care.

"sort_keys"
This option forces the encoder to always "sort" the entries in a hash by its keys before writing them
to the Sereal document. This can be somewhat expensive for large hashes.

GeneralConsiderations
Perl variables (scalars specifically) can, at the same time, hold multiple representations of the same
data. If you create and integer and use it as a string, it will be cached in its string form. Sereal
attempts to detect the most compact of these representations for encoding, but can not always succeed.
For example, if a data structure was previously also traversed by certain other serialization modules
(such as Storable), then the scalars in the structure may have been irrevocably upgraded to a more
complex (and bigger) type. This is only an issue in crude benchmarks. So if you plan to benchmark
serialization, take care not to re-use the test data structure between serializers for results that do
not depend on the order of operations.

Name

       Sereal::Performance - Getting the most out of the Perl-Sereal implementation

Synopsis

         # This is different from the standard module synopsis in
         # that it chooses performance over ease-of-use.
         # Think twice before micro-optimizing your Sereal usage.
         # Usually, Sereal is a lot faster than most of one's code,
         # so unless you are doing bulk encoding/decoding, you are
         # better off optimizing for maintainability.

         use Sereal qw(sereal_encode_with_object
                       sereal_decode_with_object);
         my $enc = Sereal::Encoder->new();
         my $dec = Sereal::Decoder->new();

         my $big_data_structure = {...};

         my $srldoc = sereal_encode_with_object($enc, $big_data_structure);

         my $and_back = sereal_decode_with_object($dec, $srldoc);