Using hash as database? Secure databases with ruby?

dubstep · April 3, 2012, 7:29pm

I am new to programming and am learning Ruby as my first language. I
want to use a database to keep track of client information in a program
I am working on. I considered just using text files and folders to keep
track of data, but that makes for an ugly solution. I need to balance
having a nice program with security, though, and I think adding database
software is going to increase complexity and thus inherently decrease
the overall security of the finished program. I have thought of a
potential way to have both minimal complexity and nice organization of
data.

For example, let's say I have a multi-user program that needs to keep
track of each user's contact list, among other things. Is there
anything wrong with making a hash for user data, and keying it with
things like

database = {}
database.store(user + "_contact", contact)

and then, when it is time to get the user's contact list, I will search
the hash with a regular expression looking for the current user's chunk
of _contact entries, and then return each of them to get the full list
of contacts. But I will have multiple sorts of information stored in one
hash, which I will also write to disk and load from disk into a hash
when the program is running (or maybe only when it is time to use it?).

I think this sort of database structure will be much less complex than
trying to use something like MySQL, and since I already use ruby hashes
in other parts of the program it really isn’t even adding any more
complexity (and potential for things to go wrong) than I already had
before. I think my entire hash will never get over 100,000 entries, and
that is being generous in itself.

Hm, one issue I just noticed is that a hash maps one key to only one
value, whereas I will need to map one key to multiple values in some
cases (although not contacts). Maybe I can put an array inside the hash?

Anyway, I just want to know if I am going about this in the right way
for a secure, simple database with Ruby, or should I go with something
else? Any suggestions?

learningruby · April 3, 2012, 8:30pm

If you only need something this simple, and if you somehow save
this information to disk in the meantime (easy to do), so data doesn’t
get lost on restart, and if you are sure that you will only need
this data in this particular Ruby app and not anywhere else, then yes,
I guess this is going to work.

And yes, you can put an array into a hash. (In fact, in Ruby you can
basically put anything into anything ;).)
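
A minimal sketch of the array-inside-a-hash idea, assuming made-up user names and addresses. The default block is one common way to get an empty array the first time a key is used; it is an illustration, not the only way to do it.

```ruby
# Hypothetical sample data; the user names and addresses are invented.
# The block creates an empty array the first time a user is seen.
contacts = Hash.new { |hash, user| hash[user] = [] }

contacts["alice"] << "bob@example.com"
contacts["alice"] << "carol@example.com"
contacts["dave"]  << "erin@example.com"

# One key now maps to many values, kept in insertion order.
alice_contacts = contacts["alice"]

# fetch with a fallback does not trigger the default block,
# so looking up an unknown user leaves the hash unchanged.
unknown = contacts.fetch("zed", [])

puts alice_contacts.inspect
puts unknown.inspect
```

Reading with `fetch` rather than `[]` avoids silently creating entries for users that do not exist.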

– Matma R.

learningruby · April 3, 2012, 8:42pm

On Tue, Apr 3, 2012 at 10:30 AM, ruby rocks wrote:

… and I think adding database
software is going to increase complexity and thus inherently decrease
the over all security of the finished program.

Sorry, that’s wrong. Storing your data as plain text in a file is hardly
“secure”, and writing your own data access routines is unlikely to be
more secure than using well-used and -tested DB libraries.

I think this sort of database structure will be much less complex than
trying to use something like MySQL

You’re trying to recreate the functionality of a relational database for
no good reason. If you want something simpler to install/administer
than MySQL you can use SQLite, or even a non-SQL datastore like
Redis.

FWIW,

learningruby · April 3, 2012, 9:59pm

Hassan S. wrote in post #1054862:

On Tue, Apr 3, 2012 at 10:30 AM, ruby rocks wrote:

… and I think adding database
software is going to increase complexity and thus inherently decrease
the over all security of the finished program.

Sorry, that’s wrong. Storing your data as plain text in a file is hardly
“secure”, and writing your own data access routines is unlikely to be
more secure than using well-used and -tested DB libraries.

I think this sort of database structure will be much less complex than
trying to use something like MySQL

You’re trying to recreate the functionality of a relational database for
no good reason. If you want something simpler to install/administer
than MySQL you can use SQLite, or even a non-SQL datastore like
Redis.

FWIW,

Well, I do not plan to store it as plaintext but symmetrically encrypted
with the user's password as the secret key. The security I am talking
about is from flaws in programming. It is widely recognized that bugs in
code and code length have a correlation, and a general rule of thumb is
to use as few lines of code as possible to lower the amount of
potentially exploitable bugs. Since my program already uses ruby hashes
and arrays in other places, adding more is not going to increase the
complexity of the program, versus adding an entire database program like
MySQL to the system. I don’t seem to need all of the features of MySQL,
and using MySQL is certainly going to significantly increase program
complexity, and will inherently bring along all of the security problems
of MySQL with it.
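
For reference, the scheme described above (symmetric encryption with a key derived from the user's password) could be sketched with Ruby's bundled `openssl` library roughly as follows. The helper names, the iteration count, and the choice of AES-256-GCM with PBKDF2 are assumptions made for this illustration; none of them come from the thread, and this is not a vetted design.

```ruby
require "openssl"

# Assumed parameters for this sketch; tune the iteration count for your hardware.
ITERATIONS = 100_000
KEY_LENGTH = 32 # bytes, as required by AES-256

# Stretch the password into a key; the salt makes each stored blob unique.
def derive_key(password, salt)
  OpenSSL::PKCS5.pbkdf2_hmac(password, salt, ITERATIONS, KEY_LENGTH,
                             OpenSSL::Digest.new("SHA256"))
end

# Hypothetical helper: returns everything needed to decrypt except the password.
def encrypt_with_password(plaintext, password)
  salt   = OpenSSL::Random.random_bytes(16)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = derive_key(password, salt)
  iv = cipher.random_iv
  cipher.auth_data = ""
  data = cipher.update(plaintext) + cipher.final
  { salt: salt, iv: iv, tag: cipher.auth_tag, data: data }
end

# Hypothetical helper: raises OpenSSL::Cipher::CipherError if the password
# is wrong or the stored data was modified.
def decrypt_with_password(box, password)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = derive_key(password, box[:salt])
  cipher.iv = box[:iv]
  cipher.auth_tag = box[:tag]
  cipher.auth_data = ""
  cipher.update(box[:data]) + cipher.final
end

box = encrypt_with_password("bob@example.com", "correct horse")
puts decrypt_with_password(box, "correct horse")
```

Even with a library doing the cryptography, the code that decides what gets encrypted, when, and where the salt and IV are stored is still new code that has to be reviewed.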

I just wonder if there is anything that I am overlooking, but I am
pretty convinced that using less code is good from a security
perspective, especially as it seems like I do not need all of the
features of MySQL. Why include potential security bugs to get features
that I do not require?

I have no issues with installing or managing MySQL for what I need to
do, I just don’t want to have the added security risk from it when I can
apparently accomplish the same thing using ruby features that I am
already using in other places anyway.

I am open to criticism…but adding complexity needlessly doesn’t seem
like a smart choice to make from a security point of view and that is
all I am concerned with currently.

I think a decent analogy may be this:

OpenOffice has had a lot more auditing done on it than my simple text
editing program, but I can be pretty confident that it has more remote
code execution vulnerabilities in it than my simple text editor:

input = gets.chop
puts input

If I only need the features of mine, why go with OpenOffice?

learningruby · April 3, 2012, 10:10pm

On Apr 3, 2012, at 9:59 PM, ruby rocks wrote:

I just wonder if there is anything that I am overlooking, but I am
pretty convinced that using less code is good from a security
perspective, especially as it seems like I do not need all of the
features of MySQL. Why include potential security bugs to get features
that I do not require?

Yep, you do overlook that widely used code is also less likely to have
critical bugs, because there were a ton of people thinking about exactly
those issues before you.

learningruby · April 3, 2012, 10:14pm

Florian G. wrote in post #1054875:

On Apr 3, 2012, at 9:59 PM, ruby rocks wrote:

I just wonder if there is anything that I am overlooking, but I am
pretty convinced that using less code is good from a security
perspective, especially as it seems like I do not need all of the
features of MySQL. Why include potential security bugs to get features
that I do not require?

Yep, you do overlook that widely used code is also less likely to have
critical bugs, because there were a ton of people thinking about exactly
those issues before you.

Good point, but would it be fair enough for me to say that since I
already have to use arrays and hashes in other areas, that it does not
really produce additional complexity to use them as a sort of minimalist
database system?

I am not trying to argue with anyone; I really want what is best for
security. I just want to make sure that everyone, including and
especially myself, has considered every piece of information before
coming to conclusions.

learningruby · April 3, 2012, 11:36pm

Hassan S. wrote in post #1054884:

On Tue, Apr 3, 2012 at 1:14 PM, ruby rocks wrote:

Good point, but would it be fair enough for me to say that since I
already have to use arrays and hashes in other areas, that it does not
really produce additional complexity to use them as a sort of minimalist
database system?

No. While arrays and hashes are native objects, you are proposing
to create your own methods to read/write/search them (as well as,
apparently, decrypt/encrypt with every access) when you could just
use library methods created by people with much more experience
than you.

Regardless, good luck!

Okay it seems MySQL is the way to go then, but just want to confirm one
thing to make absolutely sure: I am talking about client side data
storage. The list of contacts and other information, is stored on the
client itself, not on a remote server. Do you still think that using a
MySQL database is appropriate for this? Last question I swear and
then I will get back to work using MySQL instead of trying to write my
own database system.

I have experience using MySQL on servers, but not for client data
storage. I have no experience doing client side data storage/management
as I am working on my first non-trivial application. So far I have been
simply writing output encrypted with OpenSSL to text files in a complex
directory hierarchy, but I want to make it less “ugly” than that by
having a single database, or maybe multiple databases one for each
username the user of the program has. It did seem to me that
MySQL is major overkill for this. However, I am smart enough to take the
advice of those more experienced than me, and I would rather have a good
piece of software than be stubborn :).

learningruby · April 3, 2012, 11:41pm

On Tue, Apr 3, 2012 at 2:37 PM, ruby rocks wrote:

Okay it seems MySQL is the way to go then, but just want to confirm one
thing to make absolutely sure: I am talking about client side data
storage. The list of contacts and other information, is stored on the
client itself, not on a remote server. Do you still think that using a
MySQL database is appropriate for this? Last question I swear and
then I will get back to work using MySQL instead of trying to write my
own database system.

Or PostgreSQL ;-p

Jos

learningruby · April 3, 2012, 11:17pm

On Tue, Apr 3, 2012 at 1:14 PM, ruby rocks wrote:

Good point, but would it be fair enough for me to say that since I
already have to use arrays and hashes in other areas, that it does not
really produce additional complexity to use them as a sort of minimalist
database system?

No. While arrays and hashes are native objects, you are proposing
to create your own methods to read/write/search them (as well as,
apparently, decrypt/encrypt with every access) when you could just
use library methods created by people with much more experience
than you.

Regardless, good luck!

learningruby · April 3, 2012, 11:45pm

On 3 April 2012 at 23:37, ruby rocks wrote:

Okay it seems MySQL is the way to go then, but just want to confirm one
thing to make absolutely sure: I am talking about client side data
storage. The list of contacts and other information, is stored on the
client itself, not on a remote server. Do you still think that using a
MySQL database is appropriate for this? Last question I swear and
then I will get back to work using MySQL instead of trying to write my
own database system.

If your data is simple and non-relational, there’s no need for SQL.
(Or for NoSQL, for that matter.)

But then, your data will inevitably become relational as your
application grows.

– Matma R.

learningruby · April 3, 2012, 11:47pm

Or a database that understands hashes, like MongoDB or Redis…

learningruby · April 3, 2012, 11:48pm

On 04/04/12 09:37, ruby rocks wrote:

I am talking about client side data storage. The list of contacts and
other information, is stored on the client itself, not on a remote
server.

Then maybe you want to use SQLite?

http://viewsourcecode.org/why/hacking/aQuickGuideToSQLite.html

Sam

learningruby · April 3, 2012, 11:50pm

On Tue, Apr 3, 2012 at 2:37 PM, ruby rocks wrote:

Okay it seems MySQL is the way to go then, but just want to confirm one
thing to make absolutely sure: I am talking about client side data
storage. The list of contacts and other information, is stored on the
client itself, not on a remote server. Do you still think that using a
MySQL database is appropriate for this?

No, I never said “use MySQL”; I said use an existing datastore like
MySQL OR SQLite OR Redis OR any number of others.

The exact solution depends on the details of your app, what kind of
“client”, how much administration you’re willing to do or want to avoid,
whether you want the DB to handle encryption natively, etc.

The main point was to not reinvent the wheel in the mistaken guise
of “security”.

learningruby · April 4, 2012, 5:23pm

ruby rocks wrote in post #1054850:

database = {}
database.store(user + “_contact”, contact)

and then when it is time to get the users contact list, I will search
the hash with a regular expression looking for the current users chunk
of _contact entries, and then return each of them to get the full list
of contacts.

That is not using the Hash properly. If you have a key value store
(which is another way to look at a Hash) you want to do lookups based on
key and not iterate the whole thing.
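
To make the difference concrete, here is a small sketch with invented data. The `_contact_N` key suffix is an assumption (the original question did not say how several contacts per user would get distinct keys), and the nested layout is just one possible alternative.

```ruby
# Flat layout from the original question: finding one user's contacts
# means testing every key in the hash against a regular expression.
flat = {
  "alice_contact_1" => "bob@example.com",
  "alice_contact_2" => "carol@example.com",
  "dave_contact_1"  => "erin@example.com"
}
scanned = flat.select { |key, _value| key =~ /\Aalice_contact/ }.values

# Nested layout: one hash lookup per level, no matter how many
# other users or other kinds of data are stored.
nested = {
  "alice" => { contacts: ["bob@example.com", "carol@example.com"] },
  "dave"  => { contacts: ["erin@example.com"] }
}
direct = nested.fetch("alice")[:contacts]

puts scanned.inspect
puts direct.inspect
```

Both lookups return the same list here, but the scan also depends on user names never containing text that collides with the key suffix, which the nested layout avoids.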

But I will have multiple sorts of information stored in one
hash, that I will also write to the disk and load from disk into a hash
when the program is running (or maybe when it is time to use it only?).

The “multiple sorts of information” bit is a tad too vague to properly
comment. Can you explain what are the keys (beyond “user” which you did
mention already) and what sort of information do you want to store?

For a simplistic database I would define a class and write down the
public interface, e.g.

UserInfo = Struct.new :name, :age, :contacts do
  def add_contact(contact)
    (self.contacts ||= []) << contact
  end
end

class DB

  def add_user(user_info)
  end

  def get_user_by_name(user_name)
  end

  …

end

Then you know what access paths you need for the various bits of
information and can decide how many Hashes you have inside a single DB
instance.
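
One way the two methods in that interface might be filled in, assuming a single Hash indexed by user name is the only access path needed. The `@users_by_name` instance variable and the sample data are hypothetical; the elided parts of the original interface are left out.

```ruby
UserInfo = Struct.new(:name, :age, :contacts) do
  def add_contact(contact)
    (self.contacts ||= []) << contact
  end
end

class DB
  def initialize
    # Assumed index for this sketch: user name => UserInfo.
    @users_by_name = {}
  end

  def add_user(user_info)
    @users_by_name[user_info.name] = user_info
  end

  # Returns nil when no user with that name has been added.
  def get_user_by_name(user_name)
    @users_by_name[user_name]
  end
end

db = DB.new
alice = UserInfo.new("alice", 30)
alice.add_contact("bob@example.com")
db.add_user(alice)

found = db.get_user_by_name("alice")
puts found.contacts.inspect
```

If a second access path were needed later (say, lookup by contact address), it would become a second Hash inside the same `DB` instance, kept in sync by `add_user`.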

Persistence can be done via Marshal which is pretty fast.

irb(main):003:0> x=Hash[100_000.times.map{|i| [i.to_s, "*"*100]}]; x.size
=> 100000
irb(main):004:0> Benchmark.measure { File.open("x","wb") {|io|Marshal.dump(x,io)} }
=> 0.500000 0.015000 0.515000 ( 0.515622)

irb(main):005:0> Benchmark.measure { File.open("x","rb") {|io|Marshal.load(io)} }
=> 0.437000 0.016000 0.453000 ( 0.453122)

If you need it to be faster and / or save space on disk you could write
custom serialization which only writes the raw info but not the indexes
(Hashes) and creates those when reading.
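
A rough sketch of that idea: only the raw records are written to disk, and the index Hash is rebuilt on load. The `Contact` struct, the `ContactStore` class, and the file name are all hypothetical names chosen for this example.

```ruby
require "tmpdir"

Contact = Struct.new(:owner, :address)

class ContactStore
  def initialize(records = [])
    @records = records
    rebuild_index
  end

  def add(owner, address)
    record = Contact.new(owner, address)
    @records << record
    @by_owner[owner] << record
  end

  def contacts_of(owner)
    @by_owner.fetch(owner, []).map(&:address)
  end

  # Only the raw records go to disk; the index is derived data.
  def save(path)
    File.open(path, "wb") { |io| Marshal.dump(@records, io) }
  end

  def self.load(path)
    new(File.open(path, "rb") { |io| Marshal.load(io) })
  end

  private

  # Recreate the owner => records index from the raw records.
  def rebuild_index
    @by_owner = Hash.new { |hash, owner| hash[owner] = [] }
    @records.each { |record| @by_owner[record.owner] << record }
  end
end

path = File.join(Dir.tmpdir, "contacts_sketch.bin")
store = ContactStore.new
store.add("alice", "bob@example.com")
store.add("alice", "carol@example.com")
store.save(path)

reloaded = ContactStore.load(path)
puts reloaded.contacts_of("alice").inspect
```

A side benefit of not dumping the index: Marshal refuses to dump a Hash that has a default block, so an index built this way could not be serialized directly anyway. Note also that `Marshal.load` should only be used on files the program wrote itself, since loading untrusted Marshal data is unsafe.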

Kind regards

robert

learningruby · April 4, 2012, 12:21am

On Wed, Apr 04, 2012 at 06:44:58AM +0900, Sam D. wrote:

On 04/04/12 09:37, ruby rocks wrote:

I am talking about client side data storage. The list of contacts
and other information, is stored on the client itself, not on a
remote server.

Then maybe you want to use SQLite?

_why's Estate - A Quick Guide to SQLite and Ruby

Yes.

MySQL is way too heavy for the problem that has been described. Use
SQLite, or even something simpler like YAML.
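
As an illustration of the YAML option, a minimal sketch using the `yaml` library that ships with Ruby. The data, the file name, and the layout (user name mapping to age and contacts) are assumptions made up for this example.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical data; sticking to plain strings, numbers, arrays and hashes
# keeps the file readable by YAML.safe_load and by other tools.
data = {
  "alice" => { "age" => 30, "contacts" => ["bob@example.com", "carol@example.com"] },
  "dave"  => { "age" => 41, "contacts" => ["erin@example.com"] }
}

path = File.join(Dir.tmpdir, "contacts_sketch.yml")
File.write(path, YAML.dump(data))

# safe_load only builds basic types, so a tampered file cannot
# instantiate arbitrary Ruby objects.
loaded = YAML.safe_load(File.read(path))
puts loaded["alice"]["contacts"].first
```

The resulting file is human-readable plain text, so on its own it provides no confidentiality; any encryption would have to be applied separately.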

learningruby · April 5, 2012, 11:01pm

On Apr 3, 2012, at 12:59 , ruby rocks wrote:

I just wonder if there is anything that I am overlooking, but I am
pretty convinced that using less code is good from a security
perspective, especially as it seems like I do not need all of the
features of MySQL. Why include potential security bugs to get features
that I do not require?

I think you have oversimplified the equation.

I would submit that [a large block of widely used aggressively tested
code created by specialists] has a very high probability of being more
reliable, and more secure, than [a small amount of brand new code tested
by a single person].

Personally, I would never try to manage storing data in an encrypted
form by coding my own I/O routines. I would be looking for a mature
library or storage system with a good interface library.

OK, personally, I would go right to PostgreSQL, but that’s because I’m
already familiar with it. If I were starting from scratch, I might pick
something else.