BitMagic C++ Library: Getting Started
Build model
BitMagic C++ Library uses simple build model - it is all headers.
All library sources are in the src directory. You can copy all files to your project, make include path reachable to your compiler and you are ready to go.
However if you want to use our makefiles you need to follow the next simple instructions:
Unix/Linux:
- Apply environment variables by runing bmenv.sh :
$ . bmenv.sh
- use GNU make (gmake) to build installation.
$gmake rebuild
or (DEBUG version)
$gmake DEBUG=YES rebuild
The default compiler on Unix and CygWin is g++. If you want to change the default you can do that in makefile.in (should be pretty easy to do). We regularly test this library with GCC/G++, XCODE, Intel C++, MSVC.
Basic usage
BitMagic uses STL fiendly concept of containers, iterators and algorithms to implement inverted lists and bit-transposed sparse vectors.
Sample 1: bm::bvector<>
This example shows how to start with bm::bvector<>
- container and how to iterate over the list
of values with bm::bvector<>::enumerator iterator
#include <iostream>
#include <algorithm>
#include "bm.h"
using namespace std;
void Print(unsigned n)
{
cout << n << endl;;
}
int main(void)
{
bm::bvector<> bv;
bv[10] = true;
bv[100] = true;
bv[10000] = true;
bm::bvector<>::enumerator en = bv.first();
bm::bvector<>::enumerator en_end = bv.end();
while (en < en_end)
{
cout << *en << endl;
++en; // pre-increment - fastest way to increment enumerator
}
en = bv.first();
// This is not the fastest way to do the job, because for_each
// often will try to calculate difference between iterators,
// which is expensive for enumerators.
// But it can be useful for some STL loyal applications.
std::for_each(en, en_end, Print);
return 0;
}
STL compatibility note
STL is designed around algorithms and we employ std::for_each
.
There is a problem though. Enumerator is not a truly random access iterator.
Some STL algorithms can try to calculate difference between iterators.
Sometimes it is not obvious if algorithm needs "-" or not.
Often it is done as a part of loop unrolling for optimization.
For enumerator difference is very expensive and "optimization" like this we can end up with performance
degradation. Be extra careful when you use enumerator with the standard algorithms.
Sample 2. Search for next bit
BM bvector template class provides get_first()
and get_next()
functions.
bm::bvector<> bv;
bv.set_bit(10);
bv.set_bit(100);
bv.set_bit(1000000);
unsigned value = bv.get_first();
do
{
cout << value << endl;
value = bv.get_next(value);
if (!value)
{
break;
}
} while(1);
Possible advantage of get_next()
is that it can start its scan forward from any
position in bvector.
Sample 3: bm::sparse_vector<>
Next example illustrates use of bit-transposed array, which offers on the fly
compression of integer values. It is based on bm::bvector<>
#include <iostream>
#include "bmsparsevec.h"
using namespace std;
int main(void)
{
bm::sparse_vector > sv1;
unsigned arr[3] = {1,2,3};
sv1.import(arr, 3); // import from a C-style array (fastest way to populate)
// optimize memory allocation of sparse vector
{
BM_DECLARE_TEMP_BLOCK(tb)
sv1.optimize(tb);
}
cout << "sv1.size() = " << sv1.size() << endl;
cout << "sv[]:";
// print the vector elements using direct access operator
for (unsigned i = 0; i < sv1.size(); ++i)
{
cout << sv1.at(i) << ",";
}
cout << endl;
return 0;
}