TLV/btree data structure + Berkeley DB

I am using the SquidGuard blacklists for content filtering.
SquidGuard uses Berkeley DB to store domains, URLs and regexes.
Right now I am using Redis + MySQL: Redis as the memory cache and MySQL to
store the data.

SquidGuard lookups are really fast, many times faster than MySQL, and I
want to try storing the domains in a Berkeley DB file as persistent
storage.
I have a domain blacklist file that contains one domain per line,
and I want to store those domains in the DB file.
I have tried to read up on how Berkeley DB works, but I don't really
understand yet how they use the DB to store domains.

The original file is 17+ MB and I want to benefit from the DB for fast
lookup.
In MySQL the size of the DB + index is about 100 MB.
A Berkeley DB of the same data that was made by SquidGuard is about
50-60 MB.

I want to benchmark Berkeley DB against MySQL or another DB.

So:

  1. Any basic suggestions on how to organize a TLV domains DB?
  2. How do I organize the domains in an “ordered key-value” DB such as
    Berkeley DB?
  3. What are good ways to benchmark key lookup in a DB?
  4. Is there another DB you can recommend for the task?
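
To make question 2 more concrete, here is a rough sketch of one possible key layout for an ordered store: reverse the labels of each domain so that a parent domain and its subdomains sort next to each other. This is only an assumption about a reasonable layout, not how SquidGuard actually does it; `domain_key` and `lookup_keys` are hypothetical helper names, and a plain sorted Array stands in for the B-tree.

```ruby
# Reverse the labels of a domain ("ads.example.com" -> "com.example.ads")
# so that a parent domain and all of its subdomains are adjacent in an
# ordered key space. Hypothetical helper, not part of SquidGuard or ruby-bdb.
def domain_key(domain)
  domain.strip.downcase.chomp(".").split(".").reverse.join(".")
end

# Candidate keys to probe for a host: each parent domain, shortest first,
# ending with the host itself. A blacklist hit on any of them blocks the host.
def lookup_keys(host)
  labels = domain_key(host).split(".")
  (1..labels.size).map { |n| labels.first(n).join(".") }
end

domains = %w[mail.example.com example.org ads.example.com example.com]

# A sorted Array stands in for the ordered store (assumption for illustration).
keys = domains.map { |d| domain_key(d) }.sort
# keys => ["com.example", "com.example.ads", "com.example.mail", "org.example"]

# Does any parent of this host appear in the blacklist?
blocked = lookup_keys("www.ads.example.com").any? { |k| keys.include?(k) }
```

With this layout, checking a host means a handful of exact-key lookups (one per label), which should map onto any key-value store, ordered or not. The ordering only matters if prefix scans over a whole domain subtree are wanted.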

The API I want to use is “add(domain)”, “exist(domain)” and “remove(domain)”.
I am looking for code snippets and examples of using Berkeley DB in
Ruby with ruby-bdb (0.2.6.5).
I have seen the example in the GitHub repo, but I am looking for more
examples of real-world usage.
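
For reference, this is roughly the shape of the API I mean, as a minimal sketch. `DomainStore` is a hypothetical class name, and the backing store here is a plain Ruby Hash. Whether a ruby-bdb handle can be passed in unchanged (that is, whether it responds to `[]=`, `key?` and `delete` the same way) is an assumption I have not verified. The timing at the end uses the standard `Benchmark` module and only measures the Hash-backed version.

```ruby
require "benchmark"

# Hypothetical wrapper exposing add/exist?/remove over any Hash-like store.
# The default store is an in-memory Hash, used here purely as a stand-in.
class DomainStore
  def initialize(store = {})
    @store = store
  end

  # Store the domain as a key; the value is a placeholder.
  def add(domain)
    @store[normalize(domain)] = "1"
    self
  end

  def exist?(domain)
    @store.key?(normalize(domain))
  end

  # Returns true if the domain was present and has been removed.
  def remove(domain)
    !@store.delete(normalize(domain)).nil?
  end

  private

  # Lowercase, trim whitespace, and drop a trailing root dot.
  def normalize(domain)
    domain.strip.downcase.chomp(".")
  end
end

store = DomainStore.new
store.add("Example.com").add("ads.example.net")

# Time repeated lookups of a key that is present.
hits = 0
elapsed = Benchmark.realtime do
  100_000.times { hits += 1 if store.exist?("example.com") }
end
puts format("%d lookups in %.3fs", hits, elapsed)
```

To compare backends, the same loop could be run with a different store object passed to `DomainStore.new`, ideally with a realistic mix of present and absent domains taken from the blacklist file.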

Thanks,
Eliezer
