Powerful Tool for Big Data Problems
BitMagic Library helps you develop high-throughput search systems. It combines hardware optimization with
on-the-fly compression to fit inverted indexes and binary fingerprints into memory and to minimize disk and network footprint.
- compressed bit-vector container with random access methods,
a range of set-algebraic functions, and rank, find and traverse methods.
- set algebraic operations: AND, OR, XOR, MINUS for bit-vectors and integer sets.
Interoperable with low level C arrays and STL compatible containers (via iterators).
- serialization/hibernation of containers into compressed BLOBs for database persistence
or in-memory compression.
- memory management focused on avoiding allocations/de-allocations,
minimizing heap fragmentation, and supporting custom allocators.
- set algebraic operations on compressed bit-vector BLOBs
- statistical engine to efficiently construct binary similarity and distance
metrics (Tversky, Hamming, Tanimoto or your own).
- containers for sparse vectors and collections for native integer types.
These work through bit-transposition and compression of each separate bit-plane.
NULL semantics are supported. Can be used for memory-compressed vector/columnar
search systems with a focus on memory efficiency.
- algorithms on sparse vectors: dynamic range clipping (work in progress!)
- functional operations on integer sets (theory of groups): translations between sets,
mathematical images (work in progress!).
- binary compressed matrices for ER-operations, one-to-many and many-to-many
relationships, materialized RDBMS joins, graphs, etc.
(work in progress!)
- portable C-library layer as a bridge to Python, Java, .Net (work in progress!)
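
To make the set-algebraic operations and similarity metrics in the list above more concrete, here is a minimal sketch in standard C++ only. It does not use BitMagic's containers or API: the `Words` type and the helpers `set_bit`, `combine`, `cardinality`, `hamming` and `tanimoto` are hypothetical names chosen for illustration, and the bit-vector is an uncompressed array of 64-bit words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical uncompressed bit-vector: bit i lives in word i / 64.
// This is an illustration only, not a BitMagic container.
using Words = std::vector<std::uint64_t>;

enum class SetOp { And, Or, Xor, Minus };

// Set bit i; the caller must have sized the vector to cover it.
inline void set_bit(Words& v, std::size_t i) {
    assert(i / 64 < v.size());
    v[i / 64] |= std::uint64_t{1} << (i % 64);
}

// Apply a set-algebraic operation word by word.
inline Words combine(const Words& a, const Words& b, SetOp op) {
    assert(a.size() == b.size());
    Words r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        switch (op) {
            case SetOp::And:   r[i] = a[i] & b[i];  break;
            case SetOp::Or:    r[i] = a[i] | b[i];  break;
            case SetOp::Xor:   r[i] = a[i] ^ b[i];  break;
            case SetOp::Minus: r[i] = a[i] & ~b[i]; break;
        }
    }
    return r;
}

// Number of set bits (portable Kernighan loop, one iteration per set bit).
inline std::size_t cardinality(const Words& v) {
    std::size_t n = 0;
    for (std::uint64_t w : v)
        for (; w != 0; w &= w - 1) ++n;
    return n;
}

// Hamming distance: number of positions where the two vectors differ.
inline std::size_t hamming(const Words& a, const Words& b) {
    return cardinality(combine(a, b, SetOp::Xor));
}

// Tanimoto similarity: |A AND B| / |A OR B|.
// Returning 1.0 for two empty sets is a convention chosen for this sketch.
inline double tanimoto(const Words& a, const Words& b) {
    const std::size_t uni = cardinality(combine(a, b, SetOp::Or));
    if (uni == 0) return 1.0;
    return static_cast<double>(cardinality(combine(a, b, SetOp::And))) / uni;
}
```

For two 128-bit vectors holding {1, 5, 70} and {5, 70, 100}, this sketch gives an intersection of 2 bits, a union of 4 bits, a Hamming distance of 2 and a Tanimoto similarity of 0.5. A compressed container would aim to compute the same results without materializing every word.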
C and C++
BitMagic C++ Templates library offers STL-friendly containers and iterators that are portable yet invest in low-level optimizations.
Our templates are header-only, designed for easy integration into your big project. We provide a lean (no RTTI, no STL, no exceptions)
mapping into the C language (JNI into Java and Scala - work in progress).
Storage and communications
Efficient serialization algorithms for saving containers. Serialization tools are provided for all containers; you can use
them with embedded databases (like Berkeley DB), large-scale RDBMS systems (Oracle, MS SQL, MySQL) or NoSQL stores (memcached).
Bit-vectors can be serialized and sent over the network for cross-platform data exchange and streaming,
or used to build network middleware, appliances and micro-services.
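
As a rough illustration of how a sorted integer set can shrink into a byte BLOB suitable for storage or transport, the sketch below uses delta encoding followed by variable-length integers. This is a generic inverted-list technique substituted here for illustration; it is not BitMagic's serialization format, and `Blob`, `put_varint`, `encode_sorted` and `decode_sorted` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Blob = std::vector<std::uint8_t>;

// Append v as a variable-length integer: 7 payload bits per byte,
// with the high bit set on every byte except the last one.
inline void put_varint(Blob& out, std::uint32_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<std::uint8_t>(v | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(v));
}

// Encode an ascending list of ids as the gaps between neighbours.
// Small gaps fit in one byte, so dense lists compress well.
inline Blob encode_sorted(const std::vector<std::uint32_t>& ids) {
    Blob out;
    std::uint32_t prev = 0;
    for (std::uint32_t id : ids) {
        assert(id >= prev);  // input must be sorted in ascending order
        put_varint(out, id - prev);
        prev = id;
    }
    return out;
}

// Reverse the encoding: rebuild each gap, then accumulate into ids.
// Assumes the blob was produced by encode_sorted (no validation).
inline std::vector<std::uint32_t> decode_sorted(const Blob& blob) {
    std::vector<std::uint32_t> ids;
    std::uint32_t prev = 0, gap = 0;
    unsigned shift = 0;
    for (std::uint8_t byte : blob) {
        gap |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
        if (byte & 0x80) {   // continuation bit: more bytes follow
            shift += 7;
            continue;
        }
        prev += gap;
        ids.push_back(prev);
        gap = 0;
        shift = 0;
    }
    return ids;
}
```

With this sketch, the ids {3, 10, 11, 300, 100000} encode into 8 bytes instead of the 20 bytes needed for five raw 32-bit integers, and decoding returns the original list. Because the output is a plain byte sequence with a fixed byte order, the same BLOB can be written to a database column or sent to a machine with a different endianness.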
The mission of our project is to share tools, expertise, use cases and know-how
about search systems, bit-vectors, inverted lists, compression techniques, libraries, programming language bindings, etc.
BitMagic C++ Library implements an easy, header-only programming model.
Public code repository
BitMagic Library is hosted on GitHub and SourceForge.
Use cases for various applications of BitMagic
Articles about design and performance optimizations.