I am using the SquidGuard blacklists for content filtering.
SquidGuard uses Berkeley DB to store domains, URLs and regex.
Right now I am using Redis + MySQL: Redis as the in-memory cache and MySQL to
store the data.
SquidGuard lookups are really fast, many times faster than MySQL, so I
want to try storing the domains in a Berkeley DB file as persistent
storage.
I have a domain blacklist file with one domain per line
that I want to store in the DB file.
I have tried to read about how Berkeley DB works, but I don't really
understand yet how SquidGuard uses the DB to store domains.
The original file is 17+ MB, and I want to benefit from the DB for fast
lookups.
In MySQL the size of the DB + index is about 100 MB.
A Berkeley DB of the same data that was made by SquidGuard is about
50-60 MB.
I want to benchmark Berkeley DB against MySQL and other DBs.
So:
- Any basic suggestions on how to organize a TLV domains DB?
- How do I organize the domains in an “ordered key-value” DB such as
  Berkeley DB?
- What are good ways to benchmark key lookups in a DB?
- Is there another DB you can recommend for the task?
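For the benchmark question, this is roughly the kind of measurement I have in mind. It is only a sketch using Ruby's standard Benchmark module: the domains are synthetic (not my real blacklist), and a plain Hash stands in for the DB handle, so the numbers say nothing about Berkeley DB or MySQL themselves.

```ruby
require "benchmark"

# Stand-in data: synthetic domains instead of the real blacklist file.
domains = (1..100_000).map { |i| "host#{i}.example.com" }

# Stand-in store: a Hash here; a real test would swap in the DB handle.
store = {}
domains.each { |d| store[d] = true }

# Look up a random sample so the access pattern is not sequential.
sample = domains.sample(10_000, random: Random.new(42))

hits = 0
elapsed = Benchmark.realtime do
  sample.each { |d| hits += 1 if store.key?(d) }
end

puts "lookups: #{sample.size}, hits: #{hits}"
puts format("%.1f lookups/sec", sample.size / elapsed)
```

The idea would be to run the same loop against each backend with the same sample and compare lookups per second.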
The API I want to use is “add(domain)”, “exist(domain)” and “remove(domain)”.
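To make that concrete, here is a rough sketch of the interface I mean. It does not use ruby-bdb at all: a plain Hash stands in for the Berkeley DB file, and DomainStore and key_for are just placeholder names of mine. Storing the key with the labels reversed ("com.example.ads") is only my guess at a layout that would keep subdomains of the same domain next to each other in an ordered store.

```ruby
# Sketch only: a plain Hash stands in for the Berkeley DB file.
# DomainStore and key_for are placeholder names, not part of ruby-bdb.
class DomainStore
  def initialize(backend = {})
    @db = backend # anything that responds to []=, key? and delete
  end

  # Normalize and reverse the labels: "Ads.Example.com." -> "com.example.ads"
  def key_for(domain)
    domain.strip.downcase.chomp(".").split(".").reverse.join(".")
  end

  def add(domain)
    @db[key_for(domain)] = "1"
  end

  def exist?(domain)
    @db.key?(key_for(domain))
  end

  # Returns true if the domain was present and has been removed.
  def remove(domain)
    !@db.delete(key_for(domain)).nil?
  end
end

store = DomainStore.new
store.add("Ads.Example.com")
puts store.exist?("ads.example.com") # true
store.remove("ads.example.com")
puts store.exist?("ads.example.com") # false
```

What I am missing is how to back this with a real Berkeley DB file instead of the Hash.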
I am looking for code snippets and examples of using Berkeley DB from
Ruby with ruby-bdb (0.2.6.5).
I have seen the example in the GitHub repo, but what I am looking for is
more examples of real-world usage.
Thanks,
Eliezer