BitMagic C++ Library: APIs

Jan, 2019 (last updated for BitMagic v.3.19.0)

BitMagic C++ is a header only library

Header Description Link(s)
bm.h Root of BitMagic APIs, implements bm::bvector<>. Iterators, methods and functions implementing container and main set algebraic operations. Very often, this is just enough to start working with bit sets. Main building block for inverted indexes, matrices of relationships, bloom filters, high dimentional binary classifications. bm.h
bmserial.h Classes and methods to transform bvector<> into a compact BLOB, ready for disk I/O, network transfer or database storage. Utilities to restore/deserialize bit-vector. Methods to perform logical operations between bit-vector and its compact, serialized BLOB. bmserial.h
bmalgo.h bmalgo_impl.h Algorithms with bit vectors to compute similarity/distance metrics and components of complex metrics efficiently, without producing intermediate temporary bit vectors. Logical operations to efficiently combine bit vectors and C arrays (or STL iterators). Template algorithms to visit bvector<> set elements. bmalgo.h bmalgo_impl.h
bmaggregator.h Fast logical operations on groups of bit-vectors. OR and AND aggregators. bmaggregator.h
bmrandom.h bm::random_subset<> - template utility to do random sampling of bit-vectors. bmrandom.h
bmsparsevec.h bm::sparse_vector<> - container for bit-transposed (integer) arrays. Each bit plain represented as compressed bit-vector, implements a form of in-memory compression for scalar integer types. Sparse vector supports NULL (unassigned) values semantics. Can be used for memory efficient time series vectors, events, histograms, pre-calculated relationship operations, elements of matrix storage, graph connect lists, bloom filters, b-bit indexing. bmsparsevec.h
bmsparsevec_compr.h bm::rsc_sparse_vector<> - rank-select compacted container for bit-transposed (integer) arrays. Just like bm::sparse_vector<> - each bit plain represented as compressed bit-vector, implements a form of in-memory compression for scalar integer types. Sparse vector supports NULL (unassigned) values semantics. Ranl-select succinct vector use this (not) NULL vector to drop missing values (bit columns in bit-transposed format) to improve memory efficiency. bmsparsevec_compr.h
bmsparsevec_algo.h Algorithms based on bit-transposed sparse vector properties. Dynamic range clipping, set-algebra on plains, efficient searches in unsorted vectors. bmsparsevec.h
bmsparsevec_serial.h Serialization and de-serialization of sparse vector into compact and portable BLOB. bmsparsevec_serial.h

API examples and technical notes


Bit-vector container, iterators, serialization and algorithms.

Algebra of Sets

Name Description Links Examples
Algebra of Sets Algorithms implementing various operations of Algebra of Sets. Tutorial bvsetalgebra.cpp


Name Description Links Examples
bm::bvector<> container Main container for inverted lists representation. Depending on settings it can store lists of integers or bit-vectors. More sample1.cpp sample2.cpp sample3.cpp
bm::bvector<>::reference Accessor to use regular C++/STL operator conventions to assign container content More sample1.cpp sample9.cpp
bm::combine_*() operations Set Alebra algorithms for interoprability between arrays and bit vectors More sample7.cpp
bm::bvector<>::enumerator Constant iterator to traverse inverted list without explicit conversion into unpacked list of integers More sample5.cpp sample8.cpp
bm::bvector<>::counted_enumerator Constant iterator to traverse inverted list without explicit conversion into unpacked list of integers. It offers extra functinality of keeping a running sum of all ON bits it traversed from position 0 to current. More sample11.cpp
bm::bvector<>::insert_iterator insert iterator to feed inverted list into bit-vector More sample8.cpp
Serialization Utilities and algorithms to compress and serialize bit-vector for database storage or network transfer More sample4.cpp
Allocator API to re-define standard bvector<> alocator type to create special pool allocation strategy, thread local mamory pools, collect bit-vector related memory activity statistics, etc. More sample6.cpp
Bit counting Various methods and techniques to do bit counting for the whole bit-vector, ranges, via accelerated structures or iterations. Performance benchmarking for different methods. Details sample11.cpp
Setting bits Various methods and techniques to set bits in bit-vectors, ranges of bits, bits flipping/swapping, load bits from a set of ints, extraction of bits. Performance benchmarking for different methods. sample12.cpp
Logical operations on serialized bit-vectors Example on how to perform logical operations on bit-vecor BLOBs without necessarily materializing them into a random-access bit-vector. sample14.cpp
Finding first and last elements of bit-vectors Example on how to determine bit-vector start and end bits (dynamic range). sample15.cpp
bm::aggregator<> - algorithms for fast AND, OR aggregation of groups of vectors. Example on how to initialize, setup and use bm::aggregator<> Tech.notes sample16.cpp
bm::bvector<> - methods for rank and select Example on how to build and use rank-select index Tech.notes sample17.cpp
bm::bvector<>::bulk_insert_iterator Example on how to set bits fast using STL compatible bulk insert iterator. sample18.cpp
bm::bvector<>::merge() Merge operation is an equivalent of logical OR, but faster because it moves memory blocks from the agrument vector. sample19.cpp
bm::bvector<>::shift_right(), bm::bvector<>::insert() Bit-shift and insert operations. sample20.cpp

Algorithms for bm::bvector<>

Name Description Links Examples
Distances Algorithms optimized for computing binary distances for clusterization of multi-dimentional binary spaces. More sample9.cpp
Random sub-set Algorithm binary subset randomization. More sample10.cpp
Miscelleneous Algorithms for combining(OR, XOR, SUB, AND) bvector<> with STL-compatible iterators. Count number of intervals in bit-vector, etc. More

Bit-transposed sparse vector bm::sparse_vector<>

Name Description Links Examples
bm::sparse_vector<> container Store array of integers using compact bit-transposed implementation and the arsenal of compression methods provided by bm::bvector<> More svsample01.cpp svsample02.cpp svsample03.cpp
Serialization Algorithms to compress and serialize bm::sparse_vector<> for storage or network transfer More svsample02.cpp
Algorithms. Dynamic range clipping. Algorithms on bit-transposed sparse vector. Dynamic range clipping. More
sparse vector with NULL support How to configure and use sparse vector with unassigned (NULL) elements svsample04.cpp
Algorithms. Set to set image function. Theory of groups image function using sparse vector as an association container defining set-to-set translation. Tech.notes svsample05.cpp
Algorithms. Search in unordered set. Search in sparse vector, using scanner class. How to bm::sparse_vector<> iterators. Benchmarks. Tech.notes svsample06.cpp

Rank-Select Compacted bit-transposed sparse vector. bm::rsc_sparse_vector<>

Name Description Links Examples
bm::rsc_sparse_vector<> container Introduction to rsc_sparse_vector<>. Example shows how to load data, serialize and deserialize it. rscsample01.cpp

Advanced Examples (mixed techniques)

Name Description Links Examples
Memory consumption / sparse sets Comparison of different methods of handling of sparse sets. Memory and performance benchmarks. Described techniques can be used for in-memory acceleration of ER JOIN operations. Tech.notes xsample01.cpp
Counting Sort, construction of histograms Example, shows a how to make fast and memory efficient integer counting sort / histogram construction using BitMagic sparse_vector<>. Speed benchmark of various techniques based on QuickSort, R-B tree (stl::map<>), parallel version of counting sort. Tech.notes xsample02.cpp
SNP search with succinct sparse vector Example, shows a how to build and search memory compact vector of mutations (SNPs), using real data from NCBI. Memory and performance benchmark comparisons. Tech.notes xsample03.cpp
Substring search with fingerprints and SHIFT-AND Example, shows a how to build DNA fingerprints and use different algorithms to search for a k-mer subsequence. Benchmarking comparisons with more straight-forward approaches. Tech.notes xsample04.cpp
Fast bit-vector index construction Example, shows a how to build DNA fingerprints. This is related to xsample04.cpp but focuses of fast index construction, using SIMD and multi-threading. Tech.notes xsample04a.cpp
Memory efficient dictionary Example, shows a how to use bm::str_sparse_vector<>. Sample uses data from NASA NED extragalactic database to build an efficient dictionary using bit-transposed sparse matrix and characted remappings. Sample illustrates binary search and scan search techniques. Tech.notes xsample05.cpp