BitMagic-C++
xsample03.cpp File Reference

Example: SNP search in human genome.

#include <iostream>
#include <sstream>
#include <chrono>
#include <regex>
#include <time.h>
#include <stdio.h>
#include <vector>
#include <map>
#include <utility>
#include "bm.h"
#include "bmalgo.h"
#include "bmserial.h"
#include "bmrandom.h"
#include "bmsparsevec.h"
#include "bmsparsevec_compr.h"
#include "bmsparsevec_algo.h"
#include "bmsparsevec_serial.h"
#include "bmalgo_similarity.h"
#include "bmsparsevec_util.h"
#include "bmdbg.h"
#include "bmtimer.h"
#include "bmundef.h"

Typedefs

typedef bm::sparse_vector< unsigned, bm::bvector<> > sparse_vector_u32
 
typedef bm::rsc_sparse_vector< unsigned, sparse_vector_u32 > rsc_sparse_vector_u32
 
typedef std::vector< std::pair< unsigned, unsigned > > vector_pairs
 

Functions

static void show_help ()
 
static int parse_args (int argc, char *argv[])
 
static int load_snp_report (const std::string &fname, sparse_vector_u32 &sv)
 
static void generate_random_subset (const sparse_vector_u32 &sv, std::vector< unsigned > &vect, unsigned count)
 
static void build_vector_pairs (const sparse_vector_u32 &sv, vector_pairs &vp)
 
static bool search_vector_pairs (const vector_pairs &vp, unsigned rs_id, unsigned &pos)
 
static void run_benchmark (const sparse_vector_u32 &sv, const rsc_sparse_vector_u32 &csv)
 
int main (int argc, char *argv[])
 

Variables

std::string sv_out_name
 
std::string rsc_out_name
 
std::string sv_in_name
 
std::string rsc_in_name
 
std::string isnp_name
 
bool is_diag = false
 
bool is_timing = false
 
bool is_bench = false
 
bm::chrono_taker::duration_map_type timing_map
 

Detailed Description

Example: SNP search in human genome.

Brief description of the method used:

  1. Parse the SNP chromosome report and extract each SNP's number and its location in the chromosome.
  2. Use this information to build a bit-transposed sparse_vector<>, where the vector position matches the chromosome position and the SNP ids (aka rsid) are kept as a bit-transposed matrix.
  3. Build a rank-select compressed sparse vector, dropping all NULL columns. This data format is quite sparse, since the number of SNPs is significantly smaller than the number of chromosome bases (1:5 or less). Use the memory report to understand the memory footprint of each form of storage.
  4. Run benchmarks searching for 500 randomly selected SNPs using
    • bm::sparse_vector<>
    • bm::rsc_sparse_vector<>
    • std::vector<std::pair<unsigned, unsigned> >
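
The idea behind steps 2 to 4 can be sketched with standard containers only. This is a conceptual illustration under stated assumptions, not the BitMagic implementation: `dense_column`, `rs_column`, `compress`, `rank_before`, `get`, `find_position`, `build_pairs` and `search_pairs` are hypothetical names, a value of 0 stands in for NULL, and rank and select are computed by plain linear scans rather than by the accelerated structures a real rank-select container would use.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the position-indexed vector:
// index = chromosome position, value = rs id, 0 = no SNP (NULL).
using dense_column = std::vector<unsigned>;

// Hypothetical rank-select style column: one presence bit per position
// plus a dense array holding only the non-NULL values.
struct rs_column
{
    std::vector<bool>     present; // true where a SNP is stored
    std::vector<unsigned> values;  // rs ids in ascending position order
};

// Drop NULL entries, remembering which positions were occupied.
rs_column compress(const dense_column& col)
{
    rs_column out;
    out.present.resize(col.size(), false);
    for (std::size_t pos = 0; pos < col.size(); ++pos)
    {
        if (col[pos] != 0)
        {
            out.present[pos] = true;
            out.values.push_back(col[pos]);
        }
    }
    return out;
}

// rank: number of occupied positions strictly before pos (linear scan).
std::size_t rank_before(const rs_column& c, std::size_t pos)
{
    std::size_t r = 0;
    for (std::size_t i = 0; i < pos && i < c.present.size(); ++i)
        r += c.present[i] ? 1u : 0u;
    return r;
}

// Read the rs id stored at a chromosome position; false means NULL.
bool get(const rs_column& c, std::size_t pos, unsigned& rs_id)
{
    if (pos >= c.present.size() || !c.present[pos])
        return false;
    std::size_t k = rank_before(c, pos);
    assert(k < c.values.size()); // presence bits and values must agree
    rs_id = c.values[k];
    return true;
}

// SNP search: find the chromosome position that holds rs_id.
bool find_position(const rs_column& c, unsigned rs_id, std::size_t& pos)
{
    // Locate the value in the dense array (values are ordered by position, not id).
    std::size_t k = 0;
    while (k < c.values.size() && c.values[k] != rs_id)
        ++k;
    if (k == c.values.size())
        return false;
    // select(k): position of the k-th (0-based) set presence bit.
    std::size_t seen = 0;
    for (std::size_t i = 0; i < c.present.size(); ++i)
    {
        if (c.present[i] && seen++ == k)
        {
            pos = i;
            return true;
        }
    }
    return false;
}

// Baseline storage: explicit (rs id, position) pairs.
using vector_pairs = std::vector<std::pair<unsigned, unsigned>>;

void build_pairs(const dense_column& col, vector_pairs& vp)
{
    for (std::size_t pos = 0; pos < col.size(); ++pos)
        if (col[pos] != 0)
            vp.push_back(std::make_pair(col[pos], static_cast<unsigned>(pos)));
}

// Linear search over the pairs for a given rs id.
bool search_pairs(const vector_pairs& vp, unsigned rs_id, unsigned& pos)
{
    for (const auto& p : vp)
    {
        if (p.first == rs_id)
        {
            pos = p.second;
            return true;
        }
    }
    return false;
}
```

In this sketch the compressed column stores one bit per chromosome position plus one value per SNP, whereas the pairs baseline stores two integers per SNP. Comparing those footprints and lookup costs is the kind of trade-off the benchmark in step 4 measures on real data.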

This example should be useful for constructing compressed columnar tables with parallel search capabilities.

Definition in file xsample03.cpp.