Undecided on how to approach storing HTML tables in a database

Hi guys
I was hoping to get some advice from you all on 3 things:

  1. Storing arbitrary hashes in a MySQL database
  2. Parsing HTML tables into well-formed text and putting them into
    hashes or arrays
  3. As above, but including stripping links and replacing embedded images
    with more useful text (for indexing purposes)

I’m trying to solve a problem where we report loads of interesting disk
stats, including health values, in a series of web pages for our servers.
I’m aiming to collect information such as unique array number, disk
position, serial number, firmware value and current health value,
probably chuck it into a hash, then store that in a more reasonable form
in a database, against a server serial number and date.
The database schema is relatively straightforward, and what I’ve managed
so far appears a bit more sane in Ruby than in Python or Perl.
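
To make the shape of the data concrete, here is a minimal sketch of one disk's stats as a hash, flattened into rows keyed by server serial and date. The field names, the `SRV-0001` serial and the `to_rows` helper are all made up for illustration; the real pages may expose different columns.

```ruby
require 'date'

# Hypothetical field names and values for a single disk.
disk = {
  'array'    => 'A1',
  'position' => 3,
  'serial'   => 'Z1X2Y3',
  'firmware' => 'FW01',
  'health'   => 8
}

# Flatten one hash into rows of [server_serial, date, object, key, value],
# so each stat could become one row in a simple key-value table.
def to_rows(server_serial, date, object, stats)
  stats.map { |key, value| [server_serial, date.to_s, object, key.to_s, value.to_s] }
end

rows = to_rows('SRV-0001', Date.new(2012, 3, 26), 'disk_array', disk)
rows.each { |row| puts row.join(' | ') }
```

Each row could then be written with an ordinary INSERT, which keeps individual stats searchable at the cost of storing everything as strings.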

Even though we stash alt text in the table for the images, there doesn’t
seem to be a clean way to access it in Ruby. The bars indicate rough
health out of ten, so I’m inclined to just use a regex, but I’m not
sure what the best approach is.
The monster checkmarks indicate pass/fail on overall health, so I’d kind
of like to keep those.
I’m also tempted to carry on using key-value pairs so I don’t have to
worry as much about the table structure changing over time. What I really
need to store is the serial, date, object (in this case the disk array)
and the hash itself.
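
A rough sketch of the regex idea, using only the standard library: images are replaced by their alt text, links and other tags are stripped, and each data row becomes a hash keyed by the header row. The sample markup and the `cell_text` and `parse_table` helpers are invented for illustration, and this only copes with simple, non-nested tables; a real HTML parser such as Nokogiri would be more robust.

```ruby
# Invented sample markup; the real pages will differ.
html = <<~HTML
  <table>
    <tr><th>Position</th><th>Serial</th><th>Health</th><th>Status</th></tr>
    <tr><td>1</td><td><a href="/disk/1">Z1X2Y3</a></td>
        <td><img src="bar8.png" alt="8/10"></td>
        <td><img src="tick.png" alt="pass"></td></tr>
    <tr><td>2</td><td><a href="/disk/2">Q9W8E7</a></td>
        <td><img src="bar3.png" alt="3/10"></td>
        <td><img src="cross.png" alt="fail"></td></tr>
  </table>
HTML

# Reduce one cell to plain text: images become their alt text,
# every other tag (links included) is dropped.
def cell_text(fragment)
  fragment
    .gsub(/<img\b[^>]*\balt="([^"]*)"[^>]*>/i, '\1')
    .gsub(/<[^>]+>/, '')
    .strip
end

# Return an array of hashes, one per data row, keyed by the header row.
def parse_table(html)
  rows = html.scan(/<tr\b[^>]*>(.*?)<\/tr>/mi).map do |(row)|
    row.scan(/<t[dh]\b[^>]*>(.*?)<\/t[dh]>/mi).map { |(cell)| cell_text(cell) }
  end
  header, *body = rows
  body.map { |cells| header.zip(cells).to_h }
end

disks = parse_table(html)
disks.each { |d| puts d.inspect }
```

Keeping the checkmark as its alt text ("pass" or "fail") means the overall health flag survives as an indexable string.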

Cheers
Scott

On 2012-03-26 11:22:30, Scott H. wrote:

I was hoping to get some advice from you all on 3 things:

  1. Storing arbitrary hashes in a MySQL database

You can store arbitrary data in text fields, but you lose many of
the benefits (schema, searching, indexing, joins etc.). There are
MySQL plugins, I believe, to deal with JSON in the database.
There are other serialization formats you could consider (say,
protobuf).
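
As a minimal sketch of the text-field approach, a hash can be serialized to a JSON string with Ruby's standard `json` library and restored later. The keys and values here are illustrative only, and the database write itself is left out.

```ruby
require 'json'

# Hypothetical stats for one disk.
stats = { 'position' => 3, 'firmware' => 'FW01', 'health' => 8 }

# Serialize to a string that would fit in a TEXT column.
payload = JSON.generate(stats)

# Reading it back restores the hash (keys come back as strings).
restored = JSON.parse(payload)
puts restored['health']
```

The trade-off is the one described above: the database sees only an opaque string, so filtering on, say, health means loading and parsing each row in Ruby.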

Have you looked into any of the NoSQL databases yet? Redis might
be a better fit for key-value, or you could go the document
database route with CouchDB or MongoDB.

I’m trying to solve a problem where we report loads of interesting disk
stats, including health values, in a series of web pages for our servers.
I’m aiming to collect information such as unique array number, disk
position, serial number, firmware value and current health value,
probably chuck it into a hash, then store that in a more reasonable form
in a database, against a server serial number and date.
The database schema is relatively straightforward, and what I’ve managed
so far appears a bit more sane in Ruby than in Python or Perl.

Can you store the data in MySQL, keep the templates elsewhere, and render
the combination either on demand or periodically?
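
One way to picture that split, as a sketch only: the data lives in plain rows and the page is produced from a template with Ruby's standard `erb` library. The rows below are hard-coded stand-ins for a database query result, and the column names are assumptions.

```ruby
require 'erb'

# Hypothetical rows, standing in for the result of a database query.
disks = [
  { 'position' => 1, 'serial' => 'Z1X2Y3', 'health' => 8 },
  { 'position' => 2, 'serial' => 'Q9W8E7', 'health' => 3 }
]

# The template is kept separate from the data and could live in its own file.
template = ERB.new(<<~TEMPLATE)
  <table>
  <% disks.each do |d| %>
    <tr><td><%= d['position'] %></td><td><%= ERB::Util.html_escape(d['serial']) %></td><td><%= d['health'] %>/10</td></tr>
  <% end %>
  </table>
TEMPLATE

# Render with the current local variables visible to the template.
page = template.result(binding)
puts page
```

The same render step could run on each request or from a periodic job that writes static pages.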

Any particular reason why you are building this from scratch? There
are a ton of systems out there already (Nagios, Munin, Cacti,
etc.). Most of these make it fairly easy to write custom plugins.
Server Fault might be a good place to cruise for ideas.

/Allan

On 2012-03-26 12:15:03, Scott H. wrote:

Munin and Cacti look really interesting, so it’s worth me spending a bit
of time looking to see what I can collect and store.

Nagios can report something called performance data. This is very
similar to what Cacti and Munin collect. There are a ton of plugins,
including ones for SNMP.
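
For a feel of what performance data looks like, here is a minimal sketch of a check that builds a Nagios-style status line, with the perfdata after the pipe in the `label=value;warn;crit;min;max` form. The `disk_health_check` name, the thresholds and the health-out-of-ten scale are assumptions for illustration, not part of any existing plugin.

```ruby
# Build a status line plus exit code for one disk.
# Thresholds use the plugin range format: "6:" means warn when the value is below 6.
def disk_health_check(position, health, warn: 6, crit: 4)
  states = { 0 => 'OK', 1 => 'WARNING', 2 => 'CRITICAL' }
  code = if health < crit
           2
         elsif health < warn
           1
         else
           0
         end
  line = "DISK #{states[code]} - disk #{position} health #{health}/10 | " \
         "health=#{health};#{warn}:;#{crit}:;0;10"
  [line, code]
end

line, code = disk_health_check(2, 3)
puts line
# A real plugin would finish with `exit code` so Nagios sees the state.
```

Nagios stores the part after the pipe separately from the human-readable text, which is what makes it usable for graphing and history.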

Good luck.

/Allan

Allan Wind wrote in post #1053267:

You can store arbitrary data in text fields, but you lose many of
the benefits (schema, searching, indexing, joins etc.). There are
MySQL plugins, I believe, to deal with JSON in the database.
There are other serialization formats you could consider (say,
protobuf).

Have you looked into any of the NoSQL databases yet? Redis might
be a better fit for key-value, or you could go the document
database route with CouchDB or MongoDB.

Hi Allan,
I’ve not looked at anything other than MySQL so far. Good idea; I’d
not considered any of the other plugins or NoSQL DBs, so that might be
interesting.

Can you store the data in MySQL, keep the templates elsewhere, and render
the combination either on demand or periodically?

Any particular reason why you are building this from scratch? There
are a ton of systems out there already (Nagios, Munin, Cacti,
etc.). Most of these make it fairly easy to write custom plugins.
Server Fault might be a good place to cruise for ideas.

I suppose the main thing I’m really looking at is how to solve a
particular business problem that we’ve got. Ideally we’d ship a simple
product that we can use to generate historical data on our products.
The end goal is to have the product collect transient logs when
certain events occur, but these tend to be non-trappable, so SNMP is out
for approx 60% of what we want to do.
Munin and Cacti look really interesting, so it’s worth me spending a bit
of time looking to see what I can collect and store.
