BitMagic-C++
bm::serializer< BV > Class Template Reference

Bit-vector serialization class.

#include <bmserial.h>

Public Types

typedef BV bvector_type
 
typedef bvector_type::allocator_type allocator_type
 
typedef bvector_type::blocks_manager_type blocks_manager_type
 
typedef bvector_type::statistics statistics_type
 
typedef bvector_type::block_idx_type block_idx_type
 
typedef bvector_type::size_type size_type
 
typedef byte_buffer< allocator_type > buffer
 

Public Member Functions

 serializer (const allocator_type &alloc=allocator_type(), bm::word_t *temp_block=0)
 Constructor.
 
 serializer (bm::word_t *temp_block)
 
 ~serializer ()
 
void set_compression_level (unsigned clevel)
 Set compression level.
 
unsigned get_compression_level () const
 Get compression level (0-5). Default: 5 (recommended). 0 - take as is; 1, 2 - apply lightweight RLE/GAP encodings, limited-depth hierarchical compression, intervals encoding; 3 - variant of 2 with different cut-offs; 4 - delta transforms plus Elias Gamma encoding where possible (legacy); 5 - binary interpolated encoding (Moffat, et al).
 
const size_type * get_compression_stat () const
 Return serialization counter vector.
 
void gap_length_serialization (bool value)
 Set GAP length serialization (serializes GAP levels of the original vector).
 
void byte_order_serialization (bool value)
 Set byte-order serialization (for cross-platform compatibility).
 
Serialization Methods
size_type serialize (const BV &bv, unsigned char *buf, size_t buf_size)
 Bitvector serialization into memory block.
 
void serialize (const BV &bv, typename serializer< BV >::buffer &buf, const statistics_type *bv_stat=0)
 Bitvector serialization into buffer object (resized automatically).
 
void optimize_serialize_destroy (BV &bv, typename serializer< BV >::buffer &buf)
 Bitvector serialization into buffer object (resized automatically). The input bit-vector gets optimized and then destroyed; its content is NOT guaranteed after this operation.
 

Protected Member Functions

void encode_header (const BV &bv, bm::encoder &enc)
 Encode serialization header information.
 
void encode_gap_block (const bm::gap_word_t *gap_block, bm::encoder &enc)
 
void gamma_gap_block (const bm::gap_word_t *gap_block, bm::encoder &enc)
 
void gamma_gap_array (const bm::gap_word_t *gap_block, unsigned arr_len, bm::encoder &enc, bool inverted=false)
 Encode GAP block as delta-array with Elias Gamma coder.
 
void encode_bit_array (const bm::word_t *block, bm::encoder &enc, bool inverted)
 Encode bit-block as an array of bits.
 
void gamma_gap_bit_block (const bm::word_t *block, bm::encoder &enc)
 
void gamma_arr_bit_block (const bm::word_t *block, bm::encoder &enc, bool inverted)
 
void bienc_arr_bit_block (const bm::word_t *block, bm::encoder &enc, bool inverted)
 
void bienc_gap_bit_block (const bm::word_t *block, bm::encoder &enc)
 Encode bit-block as interpolated bit block of gaps.
 
void interpolated_arr_bit_block (const bm::word_t *block, bm::encoder &enc, bool inverted)
 
void interpolated_gap_bit_block (const bm::word_t *block, bm::encoder &enc)
 Encode bit-block as interpolated gap block.
 
void interpolated_gap_array (const bm::gap_word_t *gap_block, unsigned arr_len, bm::encoder &enc, bool inverted)
 Encode GAP block as an array with binary interpolated coder.
 
void interpolated_encode_gap_block (const bm::gap_word_t *gap_block, bm::encoder &enc)
 
void encode_bit_interval (const bm::word_t *blk, bm::encoder &enc, unsigned size_control)
 Encode BIT block with repeatable runs of zeroes.
 
void encode_bit_digest (const bm::word_t *blk, bm::encoder &enc, bm::id64_t d0)
 Encode bit-block using digest (hierarchical compression).
 
unsigned char find_gap_best_encoding (const bm::gap_word_t *gap_block)
 Determine best representation for GAP block based on current set compression level.
 
unsigned char find_bit_best_encoding (const bm::word_t *block)
 Determine best representation for a bit-block.
 
void reset_compression_stats ()
 Reset all accumulated compression statistics.
 
void reset_models ()
 
void add_model (unsigned char mod, unsigned score)
 

Detailed Description

template<class BV>
class bm::serializer< BV >

Bit-vector serialization class.

This class converts sparse bit-vectors into a single block of memory ready for file storage, database storage, or network transfer.

An instance can be reused for multiple serializations (but must not be shared across threads). Reuse offers some performance advantage because it avoids reallocating temporary memory.

Examples:
bvsetalgebra.cpp, sample14.cpp, sample4.cpp, and xsample01.cpp.

Definition at line 77 of file bmserial.h.

Member Typedef Documentation

◆ allocator_type

template<class BV>
typedef bvector_type::allocator_type bm::serializer< BV >::allocator_type

Definition at line 81 of file bmserial.h.

◆ block_idx_type

template<class BV>
typedef bvector_type::block_idx_type bm::serializer< BV >::block_idx_type

Definition at line 84 of file bmserial.h.

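◆ blocks_manager_type

template<class BV>
typedef bvector_type::blocks_manager_type bm::serializer< BV >::blocks_manager_type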

Definition at line 82 of file bmserial.h.

◆ buffer

template<class BV>
typedef byte_buffer<allocator_type> bm::serializer< BV >::buffer

Definition at line 87 of file bmserial.h.

◆ bvector_type

template<class BV>
typedef BV bm::serializer< BV >::bvector_type

Definition at line 80 of file bmserial.h.

◆ size_type

template<class BV>
typedef bvector_type::size_type bm::serializer< BV >::size_type

Definition at line 85 of file bmserial.h.

◆ statistics_type

template<class BV>
typedef bvector_type::statistics bm::serializer< BV >::statistics_type

Definition at line 83 of file bmserial.h.

Constructor & Destructor Documentation

◆ serializer() [1/2]

template<class BV >
bm::serializer< BV >::serializer ( const allocator_type & alloc = allocator_type(),
bm::word_t * temp_block = 0
)

Constructor.

Parameters
alloc - memory allocator
temp_block - temporary block for various operations (if NULL, it will be allocated and managed by the serializer class). The temp block is used as scratch memory during serialization; supplying an external temp block avoids unnecessary re-allocations.

An attached temp block is not owned by the class and is NOT deallocated on destruction.

Definition at line 725 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::reset_models().
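
The ownership rule for the temp block can be illustrated with a minimal sketch. This is not the BitMagic implementation: the class name ScratchUser, the constant kBlockWords, and the use of std::unique_ptr are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

constexpr unsigned kBlockWords = 2048;  // assumed scratch size in 32-bit words

// Hypothetical class: borrows an external scratch block or allocates its own.
class ScratchUser {
public:
    explicit ScratchUser(std::uint32_t* external = nullptr)
        : owned_(external ? nullptr : new std::uint32_t[kBlockWords]),
          temp_(external ? external : owned_.get()) {
        assert(temp_ != nullptr);
    }
    bool owns_temp() const { return owned_ != nullptr; }
    std::uint32_t* temp() { return temp_; }

private:
    std::unique_ptr<std::uint32_t[]> owned_;  // freed on destruction only when allocated here
    std::uint32_t* temp_;                     // borrowed or owned scratch pointer
};
```

When an external pointer is passed, the destructor frees nothing, so the caller must keep that memory alive for the lifetime of the object and release it afterwards.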

◆ serializer() [2/2]

template<class BV >
bm::serializer< BV >::serializer ( bm::word_t * temp_block)

Definition at line 749 of file bmserial.h.

◆ ~serializer()

template<class BV >
bm::serializer< BV >::~serializer ( )

Definition at line 772 of file bmserial.h.

Member Function Documentation

◆ add_model()

template<class BV >
void bm::serializer< BV >::add_model ( unsigned char  mod,
unsigned  score 
)
protected

Definition at line 1025 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::reset_models().

◆ bienc_arr_bit_block()

template<class BV >
void bm::serializer< BV >::bienc_arr_bit_block ( const bm::word_t * block,
bm::encoder & enc,
bool  inverted 
)
protected

Definition at line 1449 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ bienc_gap_bit_block()

template<class BV >
void bm::serializer< BV >::bienc_gap_bit_block ( const bm::word_t * block,
bm::encoder & enc
)
protected

Encode bit-block as interpolated bit block of gaps.

Definition at line 1476 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ byte_order_serialization()

template<class BV >
void bm::serializer< BV >::byte_order_serialization ( bool  value)

Set byte-order serialization (for cross platform compatibility)

Parameters
value - when TRUE, the serialization format includes a byte-order marker

Definition at line 803 of file bmserial.h.

Referenced by convert_bv2bvs(), bm::serializer< bvector_type >::get_compression_stat(), main(), and bm::serialize().
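
A byte-order marker lets a reader on a different platform detect whether stored multi-byte words need swapping. The sketch below shows the general idea only; the marker values, the one-byte header, and all function names are hypothetical and do not describe the actual BitMagic stream layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical marker values; the real header layout may differ.
enum ByteOrderMark : unsigned char { kLittleEndian = 0, kBigEndian = 1 };

// Detect the host byte order by inspecting the first byte of a known word.
ByteOrderMark host_byte_order() {
    const std::uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1 ? kLittleEndian : kBigEndian;
}

std::uint32_t swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Write a one-byte marker followed by the word in host byte order.
std::vector<unsigned char> write_word(std::uint32_t w) {
    std::vector<unsigned char> out(1 + sizeof w);
    out[0] = host_byte_order();
    std::memcpy(&out[1], &w, sizeof w);
    return out;
}

// Read the word back, swapping bytes when the writer's order differs from ours.
std::uint32_t read_word(const std::vector<unsigned char>& in) {
    assert(in.size() == 1 + sizeof(std::uint32_t));
    std::uint32_t w;
    std::memcpy(&w, &in[1], sizeof w);
    return in[0] == host_byte_order() ? w : swap32(w);
}
```

Omitting the marker saves space but ties the serialized data to platforms with the same byte order.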

◆ encode_bit_array()

template<class BV >
void bm::serializer< BV >::encode_bit_array ( const bm::word_t * block,
bm::encoder & enc,
bool  inverted 
)
protected

Encode bit-block as an array of bits.

Definition at line 1397 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().
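
For a sparse block, listing the positions of the set bits can be smaller than storing the whole block, and the inverted flag plausibly covers the opposite case of a nearly full block. The following is a minimal sketch of that idea, assuming a block of 2048 32-bit words (65,536 bits) so that every index fits in 16 bits. The function names are illustrative and the actual on-disk format is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr unsigned kBlockWords = 2048;  // 2048 x 32 = 65,536 bits (assumed block size)

// Collect positions of bits equal to 1 (or equal to 0 when `inverted` is true).
std::vector<std::uint16_t> bit_block_to_array(const std::uint32_t* block, bool inverted) {
    assert(block != nullptr);
    std::vector<std::uint16_t> idx;
    for (unsigned w = 0; w < kBlockWords; ++w) {
        std::uint32_t word = inverted ? ~block[w] : block[w];
        for (unsigned b = 0; word; ++b, word >>= 1)
            if (word & 1u) idx.push_back(static_cast<std::uint16_t>(w * 32 + b));
    }
    return idx;
}

// Rebuild the block from the index array.
void array_to_bit_block(const std::vector<std::uint16_t>& idx, bool inverted,
                        std::uint32_t* block) {
    assert(block != nullptr);
    for (unsigned w = 0; w < kBlockWords; ++w) block[w] = inverted ? ~0u : 0u;
    for (std::uint16_t i : idx) {
        const std::uint32_t mask = 1u << (i & 31u);
        if (inverted) block[i >> 5] &= ~mask;
        else          block[i >> 5] |= mask;
    }
}
```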

◆ encode_bit_digest()

template<class BV >
void bm::serializer< BV >::encode_bit_digest ( const bm::word_t * blk,
bm::encoder & enc,
bm::id64_t  d0 
)
protected

Encode bit-block using digest (hierarchical compression)

Definition at line 1302 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().
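
Digest-based compression can be pictured as a two-level scheme: a 64-bit digest marks which sub-blocks contain any set bits, and only those sub-blocks are written. The sketch below assumes a 2048-word block split into 64 sub-blocks of 32 words each. The function names are hypothetical, and the real encoder may lay out the stream differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned kBlockWords = 2048;              // assumed block size in 32-bit words
constexpr unsigned kWaveWords  = kBlockWords / 64;  // 32 words covered by each digest bit

// Compute a 64-bit digest: bit i is set when sub-block i has any non-zero word.
std::uint64_t calc_digest(const std::uint32_t* block) {
    std::uint64_t d = 0;
    for (unsigned i = 0; i < 64; ++i) {
        std::uint32_t acc = 0;
        for (unsigned j = 0; j < kWaveWords; ++j) acc |= block[i * kWaveWords + j];
        if (acc) d |= (std::uint64_t{1} << i);
    }
    return d;
}

// Emit only the non-empty sub-blocks; the digest is stored alongside them.
std::vector<std::uint32_t> encode_with_digest(const std::uint32_t* block, std::uint64_t d0) {
    std::vector<std::uint32_t> out;
    for (unsigned i = 0; i < 64; ++i)
        if (d0 & (std::uint64_t{1} << i))
            out.insert(out.end(), block + i * kWaveWords, block + (i + 1) * kWaveWords);
    return out;
}

// Restore the block: copy stored sub-blocks, zero-fill the rest.
void decode_with_digest(std::uint64_t d0, const std::vector<std::uint32_t>& data,
                        std::uint32_t* block) {
    std::size_t pos = 0;
    for (unsigned i = 0; i < 64; ++i)
        for (unsigned j = 0; j < kWaveWords; ++j)
            block[i * kWaveWords + j] = (d0 & (std::uint64_t{1} << i)) ? data[pos++] : 0u;
    assert(pos == data.size());
}
```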

◆ encode_bit_interval()

template<class BV >
void bm::serializer< BV >::encode_bit_interval ( const bm::word_t * blk,
bm::encoder & enc,
unsigned  size_control 
)
protected

Encode BIT block with repeatable runs of zeroes.

Definition at line 1250 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().
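
One way to exploit repeated runs of zero words is to store only the non-zero intervals, each prefixed with its start offset and length. The sketch below shows that general technique under an invented stream layout of [start, length, words...]. It does not model the size_control parameter, whose meaning is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: [start, len, len words...] repeated for each non-zero run.
std::vector<std::uint32_t> encode_intervals(const std::uint32_t* block, unsigned n) {
    std::vector<std::uint32_t> out;
    unsigned i = 0;
    while (i < n) {
        while (i < n && block[i] == 0) ++i;  // skip a run of zero words
        if (i == n) break;
        unsigned j = i;
        while (j < n && block[j] != 0) ++j;  // find the end of the non-zero run
        out.push_back(i);
        out.push_back(j - i);
        out.insert(out.end(), block + i, block + j);
        i = j;
    }
    return out;
}

// Rebuild an n-word block; words outside the stored intervals are zero.
std::vector<std::uint32_t> decode_intervals(const std::vector<std::uint32_t>& enc, unsigned n) {
    std::vector<std::uint32_t> block(n, 0u);
    std::size_t p = 0;
    while (p < enc.size()) {
        assert(p + 2 <= enc.size());
        const std::uint32_t start = enc[p++];
        const std::uint32_t len = enc[p++];
        assert(start + len <= n && p + len <= enc.size());
        for (std::uint32_t k = 0; k < len; ++k) block[start + k] = enc[p++];
    }
    return block;
}
```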

◆ encode_gap_block()

template<class BV >
void bm::serializer< BV >::encode_gap_block ( const bm::gap_word_t * gap_block,
bm::encoder & enc
)
protected

Encode GAP block

Definition at line 1194 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().
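
A GAP block is a run-length style representation: instead of individual bits it records where runs of equal bits end. The sketch below uses a simplified, hypothetical structure (GapBlock with a first-bit flag and a vector of run ends) rather than the packed header word of the real format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified GAP representation (illustrative, not the packed library format).
struct GapBlock {
    bool first_bit;                       // value of bit 0
    std::vector<std::uint16_t> run_ends;  // last index of each run; run values alternate
};

// Convert a bit sequence (at most 65,536 bits) into run ends.
GapBlock to_gap(const std::vector<bool>& bits) {
    assert(!bits.empty() && bits.size() <= 65536);
    GapBlock g{bits[0], {}};
    for (std::size_t i = 1; i < bits.size(); ++i)
        if (bits[i] != bits[i - 1])
            g.run_ends.push_back(static_cast<std::uint16_t>(i - 1));
    g.run_ends.push_back(static_cast<std::uint16_t>(bits.size() - 1));
    return g;
}

// Expand run ends back into individual bits.
std::vector<bool> from_gap(const GapBlock& g) {
    std::vector<bool> bits;
    bool v = g.first_bit;
    for (std::uint16_t end : g.run_ends) {
        bits.resize(static_cast<std::size_t>(end) + 1, v);
        v = !v;
    }
    return bits;
}
```

A block with a single run of 100 set bits needs only three run ends here, compared with 8 KB for the plain bit-block.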

◆ encode_header()

template<class BV>
void bm::serializer< BV >::encode_header ( const BV &  bv,
bm::encoder & enc
)
protected

Encode serialization header information.

Definition at line 809 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ find_bit_best_encoding()

template<class BV >
unsigned char bm::serializer< BV >::find_bit_best_encoding ( const bm::word_t * block)
protected

Determine best representation for a bit-block.

Definition at line 1033 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ find_gap_best_encoding()

template<class BV >
unsigned char bm::serializer< BV >::find_gap_best_encoding ( const bm::gap_word_t * gap_block)
protected

Determine best representation for GAP block based on current set compression level.

Returns
set_block_gap, set_block_bit_1bit, set_block_arrgap, set_block_arrgap_egamma, set_block_arrgap_bienc, set_block_arrgap_inv, set_block_arrgap_egamma_inv, set_block_arrgap_bienc_inv, set_block_gap_egamma, set_block_gap_bienc

Definition at line 1152 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().
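
Choosing a representation can be framed as scoring each candidate by its estimated size and keeping the cheapest. The sketch below compares only two candidates, plain 16-bit storage and Elias Gamma coded deltas. The enum Encoding and both function names are hypothetical and are not the set_block_* constants listed above; the real selection logic also depends on the compression level.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Encoding { PlainGap, GammaDelta };  // illustrative labels only

// Bits used by an Elias Gamma code for v >= 1: 2 * floor(log2 v) + 1.
unsigned gamma_bits(std::uint32_t v) {
    assert(v >= 1);
    unsigned lg = 0;
    while (v >>= 1) ++lg;
    return 2 * lg + 1;
}

// Score both candidates by estimated bit size and return the cheaper one.
// `run_ends` must be strictly increasing.
Encoding pick_encoding(const std::vector<std::uint16_t>& run_ends) {
    const std::size_t plain = run_ends.size() * 16;
    std::size_t gamma = 0;
    std::uint32_t prev = 0;
    for (std::size_t i = 0; i < run_ends.size(); ++i) {
        // The first value is stored as value + 1 so every coded number is >= 1.
        const std::uint32_t d = (i == 0) ? run_ends[0] + 1u : run_ends[i] - prev;
        gamma += gamma_bits(d);
        prev = run_ends[i];
    }
    return gamma < plain ? Encoding::GammaDelta : Encoding::PlainGap;
}
```

Small deltas favour the gamma code, while a few widely spaced values are cheaper to store as plain 16-bit words.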

◆ gamma_arr_bit_block()

template<class BV >
void bm::serializer< BV >::gamma_arr_bit_block ( const bm::word_t * block,
bm::encoder & enc,
bool  inverted 
)
protected

Definition at line 1430 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ gamma_gap_array()

template<class BV >
void bm::serializer< BV >::gamma_gap_array ( const bm::gap_word_t * gap_block,
unsigned arr_len,
bm::encoder & enc,
bool  inverted = false 
)
protected

Encode GAP block as delta-array with Elias Gamma coder.

Definition at line 935 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().
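
Delta coding followed by Elias Gamma works well on sorted arrays because consecutive differences are small and gamma codes small numbers in few bits. The following is a minimal sketch of the technique on a strictly increasing 16-bit array. It stores bits in a std::vector<bool> for clarity rather than a packed stream, and all function names are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append the Elias Gamma code of v (v >= 1): N zero bits, then the N+1 bits of v, MSB first.
void gamma_put(std::vector<bool>& bits, std::uint32_t v) {
    assert(v >= 1);
    unsigned n = 0;
    while ((v >> n) > 1) ++n;  // n = floor(log2 v)
    bits.insert(bits.end(), n, false);
    for (int b = static_cast<int>(n); b >= 0; --b) bits.push_back((v >> b) & 1u);
}

// Read one Elias Gamma code starting at `pos`, advancing `pos` past it.
std::uint32_t gamma_get(const std::vector<bool>& bits, std::size_t& pos) {
    unsigned n = 0;
    while (!bits[pos]) { ++n; ++pos; }
    std::uint32_t v = 0;
    for (unsigned i = 0; i <= n; ++i) v = (v << 1) | (bits[pos++] ? 1u : 0u);
    return v;
}

// Encode a strictly increasing array as gamma-coded deltas.
std::vector<bool> encode_delta_gamma(const std::vector<std::uint16_t>& arr) {
    std::vector<bool> bits;
    std::uint32_t prev = 0;
    for (std::size_t i = 0; i < arr.size(); ++i) {
        // First element is shifted by one so that every coded value is >= 1.
        const std::uint32_t d = (i == 0) ? arr[0] + 1u : arr[i] - prev;
        gamma_put(bits, d);
        prev = arr[i];
    }
    return bits;
}

// Decode `count` values written by encode_delta_gamma.
std::vector<std::uint16_t> decode_delta_gamma(const std::vector<bool>& bits, std::size_t count) {
    std::vector<std::uint16_t> arr;
    std::size_t pos = 0;
    std::uint32_t prev = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint32_t d = gamma_get(bits, pos);
        prev = (i == 0) ? d - 1u : prev + d;
        arr.push_back(static_cast<std::uint16_t>(prev));
    }
    return arr;
}
```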

◆ gamma_gap_bit_block()

template<class BV >
void bm::serializer< BV >::gamma_gap_bit_block ( const bm::word_t * block,
bm::encoder & enc
)
protected

Definition at line 1421 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ gamma_gap_block()

template<class BV >
void bm::serializer< BV >::gamma_gap_block ( const bm::gap_word_t * gap_block,
bm::encoder & enc
)
protected

Encode GAP block with Elias Gamma coder

Definition at line 897 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ gap_length_serialization()

template<class BV >
void bm::serializer< BV >::gap_length_serialization ( bool  value)

Set GAP length serialization (serializes GAP levels of the original vector)

Parameters
value - when TRUE, the serialized vector includes GAP level parameters

Definition at line 797 of file bmserial.h.

Referenced by convert_bv2bvs(), bm::serializer< bvector_type >::get_compression_stat(), main(), and bm::compressed_collection_serializer< CBC >::serialize().

◆ get_compression_level()

template<class BV>
unsigned bm::serializer< BV >::get_compression_level ( ) const
inline

Get compression level (0-5). Default: 5 (recommended).
0 - take as is
1, 2 - apply lightweight RLE/GAP encodings, limited-depth hierarchical compression, intervals encoding
3 - variant of 2 with different cut-offs
4 - delta transforms plus Elias Gamma encoding where possible (legacy)
5 - binary interpolated encoding (Moffat, et al)

Recommended: use 3 or 5

Definition at line 125 of file bmserial.h.

◆ get_compression_stat()

template<class BV>
const size_type* bm::serializer< BV >::get_compression_stat ( ) const
inline

Return serialization counter vector.

Definition at line 183 of file bmserial.h.

◆ interpolated_arr_bit_block()

template<class BV >
void bm::serializer< BV >::interpolated_arr_bit_block ( const bm::word_t * block,
bm::encoder & enc,
bool  inverted 
)
protected

Definition at line 1519 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ interpolated_encode_gap_block()

template<class BV >
void bm::serializer< BV >::interpolated_encode_gap_block ( const bm::gap_word_t * gap_block,
bm::encoder & enc
)
protected

Encode GAP block using binary interpolated encoder

Definition at line 855 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ interpolated_gap_array()

template<class BV >
void bm::serializer< BV >::interpolated_gap_array ( const bm::gap_word_t * gap_block,
unsigned arr_len,
bm::encoder & enc,
bool  inverted 
)
protected

Encode GAP block as an array with binary interpolated coder.

Definition at line 979 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().
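
Binary interpolative coding encodes the middle element of a sorted array within the range its neighbours allow, then recurses into both halves with narrowed bounds. The sketch below is a simplified illustration of that technique: it uses plain fixed-width codes instead of a minimal (centered) binary code, stores bits in a std::vector<bool>, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bits needed to represent any value in [0, range].
unsigned width(std::uint32_t range) {
    unsigned w = 0;
    while (range) { ++w; range >>= 1; }
    return w;
}

void put_bits(std::vector<bool>& out, std::uint32_t v, unsigned w) {
    for (unsigned b = w; b-- > 0;) out.push_back((v >> b) & 1u);
}

std::uint32_t get_bits(const std::vector<bool>& in, std::size_t& pos, unsigned w) {
    std::uint32_t v = 0;
    for (unsigned i = 0; i < w; ++i) v = (v << 1) | (in[pos++] ? 1u : 0u);
    return v;
}

// Encode arr[l..r] (strictly increasing), all values known to lie in [lo, hi].
void bic_encode(const std::vector<std::uint32_t>& arr, long l, long r,
                std::uint32_t lo, std::uint32_t hi, std::vector<bool>& out) {
    if (l > r) return;
    const long m = l + (r - l) / 2;
    // (m - l) smaller and (r - m) larger elements must also fit in [lo, hi],
    // which narrows the middle element to [lo + (m - l), hi - (r - m)].
    const std::uint32_t min_v = lo + static_cast<std::uint32_t>(m - l);
    const std::uint32_t max_v = hi - static_cast<std::uint32_t>(r - m);
    assert(arr[m] >= min_v && arr[m] <= max_v);
    put_bits(out, arr[m] - min_v, width(max_v - min_v));
    bic_encode(arr, l, m - 1, lo, arr[m] - 1, out);
    bic_encode(arr, m + 1, r, arr[m] + 1, hi, out);
}

// Decode into arr[l..r]; arr must already have the right size.
void bic_decode(const std::vector<bool>& in, std::size_t& pos,
                std::vector<std::uint32_t>& arr, long l, long r,
                std::uint32_t lo, std::uint32_t hi) {
    if (l > r) return;
    const long m = l + (r - l) / 2;
    const std::uint32_t min_v = lo + static_cast<std::uint32_t>(m - l);
    const std::uint32_t max_v = hi - static_cast<std::uint32_t>(r - m);
    arr[m] = min_v + get_bits(in, pos, width(max_v - min_v));
    bic_decode(in, pos, arr, l, m - 1, lo, arr[m] - 1);
    bic_decode(in, pos, arr, m + 1, r, arr[m] + 1, hi);
}
```

When the bounds leave only one possible value, zero bits are written, which is why dense runs compress especially well under this scheme.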

◆ interpolated_gap_bit_block()

template<class BV >
void bm::serializer< BV >::interpolated_gap_bit_block ( const bm::word_t * block,
bm::encoder & enc
)
protected

Encode bit-block as interpolated gap block.

Definition at line 1467 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ optimize_serialize_destroy()

template<class BV>
void bm::serializer< BV >::optimize_serialize_destroy ( BV &  bv,
typename serializer< BV >::buffer & buf
)

Bitvector serialization into buffer object (resized automatically). The input bit-vector gets optimized and then destroyed; its content is NOT guaranteed after this operation.

Effectively, it moves data into the buffer.

This operation exists because it is faster to do all three operations (optimize, serialize, destroy) in a single pass. This is a destructive serialization!

Parameters
bv - input/output bitvector
buf - output buffer object

Definition at line 1381 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_level(), and main().

◆ reset_compression_stats()

template<class BV >
void bm::serializer< BV >::reset_compression_stats ( )
protected

Reset all accumulated compression statistics.

Definition at line 782 of file bmserial.h.

Referenced by bm::serializer< bvector_type >::get_compression_stat().

◆ reset_models()

template<class BV>
void bm::serializer< BV >::reset_models ( )
inline protected

Definition at line 288 of file bmserial.h.

◆ serialize() [1/2]

template<class BV>
serializer< BV >::size_type bm::serializer< BV >::serialize ( const BV &  bv,
unsigned char *  buf,
size_t  buf_size 
)

Bitvector serialization into memory block.

Parameters
bv - input bitvector
buf - output buffer (pre-allocated). No range checking is done in this method; it is the caller's responsibility to allocate a sufficient amount of memory using information from the calc_stat() function.
buf_size - size of the output buffer
Returns
Size of serialization block.
See also
calc_stat

Definition at line 1608 of file bmserial.h.

Referenced by convert_bv2bvs(), bm::serializer< bvector_type >::get_compression_level(), main(), make_BLOB(), and bm::compressed_collection_serializer< CBC >::serialize().

◆ serialize() [2/2]

template<class BV>
void bm::serializer< BV >::serialize ( const BV &  bv,
typename serializer< BV >::buffer & buf,
const statistics_type * bv_stat = 0
)

Bitvector serialization into buffer object (resized automatically)

Parameters
bv - input bitvector
buf - output buffer object
bv_stat - (optional) input bit-vector statistics object; if NULL, serialize will compute the statistics

Definition at line 1359 of file bmserial.h.

◆ set_compression_level()

template<class BV >
void bm::serializer< BV >::set_compression_level ( unsigned  clevel)

Set compression level.

Higher compression takes more time to process.

Parameters
clevel - compression level (0-5)

Definition at line 790 of file bmserial.h.

Referenced by convert_bv2bvs(), main(), and make_BLOB().


The documentation for this class was generated from the following file: