Detecting width overflow in serialized column with mysql


#1

So I’ve got an ActiveRecord model pointing to a MySQL db, with an
auto-serialized column (“serialize :columnName”).

Thing is, MySQL, depending on how it’s configured (like, by default),
has a bad habit of just truncating your data if it’s too wide for the
column, with no error raised. Yeah, I can probably reconfigure MySQL
and/or my AR connection to it. But I’m wanting to distribute this code
to all kinds of people who won’t think of that or won’t want to do that,
I’d really like it to work with the default setup.

But obviously, if a yaml serialization gets truncated, it’s not going to
unserialize very well.

At the point of insert/update, is there any good way for me to detect
that the serialization was too big for the column, even though MySQL
isn’t complaining?


#2

Jonathan R. wrote:

But obviously, if a yaml serialization gets truncated, it’s not going to
unserialize very well.

At the point of insert/update, is there any good way for me to detect
that the serialization was too big for the column, even though MySQL
isn’t complaining?

This may do it:

def validate
if columnName.to_yaml > column_for_attribute(:columnName).limit
errors.add(:columnName, ‘columnName is too big to be serialized’)
end
end

It’d be good if something like this was added automatically by
serialize.


Rails Wheels - Find Plugins, List & Sell Plugins -
http://railswheels.com


#3

Interesting, thanks, lots of tricks there I wouldn’thave figured out for
myself. But am I being really inefficient now by calling to_yaml an
extra time? I’ll call it once to validate, and then throw away the
results. Then serialize will call it again when it actually wants to
save. Calling it twice for no reason. But I guess to_yaml shouldn’t be
that expensive?

Now I wonder if I should abandon the automated serialization altogether,
and just handle it myself manually through callbacks. Or maybe I should
try monkey-patching the auto-serialization stuff to perform this check
itself?

Jonathan

Mark Reginald J. wrote:

Jonathan R. wrote:

But obviously, if a yaml serialization gets truncated, it’s not going to
unserialize very well.

At the point of insert/update, is there any good way for me to detect
that the serialization was too big for the column, even though MySQL
isn’t complaining?

This may do it:

def validate
if columnName.to_yaml > column_for_attribute(:columnName).limit
errors.add(:columnName, ‘columnName is too big to be serialized’)
end
end

It’d be good if something like this was added automatically by
serialize.


Rails Wheels - Find Plugins, List & Sell Plugins -
http://railswheels.com


#4

Jonathan R. wrote:

itself?
Yeah, the double yaml is a waste, but don’t worry about it
unless this is being done sufficiently often to slow down
your app in a way that generates more work or cost.

See if you can come up for a patch for Rails that automatically
adds a such a validation for serialized attributes, where this
validation method and the attributes_with_quotes method that
generates the DB request both draw from a cache of YAMLized
attribute values.


Rails Wheels - Find Plugins, List & Sell Plugins -
http://railswheels.com


#5

Here’s a different (easier :slight_smile: ) tack to monkey patching AR to solve this
problem efficiently. Turns out the actual serialization is done in
quote_value (and interestingly, done whether or not the column is
declared serialized; the serialized decleration only effects
de-serialization). Does anyone think this following approach is a good
idea?

Does anyone think it’s a good enough idea I should submit it as a patch
to AR? I’ve never submitted a patch before, not sure how it’s done. I’m
still using Rails 1.8.x, not sure if this is still a problem in Rails 2,
so that might make it hard to submit a patch.

class ActiveRecord::Base
alias :old_quote_value :quote_value
def quote_value(value, column = nil)
if column && serialized_attributes[column.name]
serialization = value.to_yaml
raise new ActiveRecord::StatementInvalid(“Can not serialize
column #{column.name}, length #{serialization.length} is greater than
column limit of #{column.limit}”) if if serialization.length >
column.limit
“#{connection.quoted_string_prefix}’#{connection.quote_string(serialization)}’”
else
old_quote_value
end
end
end

Mark Reginald J. wrote:

Jonathan R. wrote:

itself?
Yeah, the double yaml is a waste, but don’t worry about it
unless this is being done sufficiently often to slow down
your app in a way that generates more work or cost.

See if you can come up for a patch for Rails that automatically
adds a such a validation for serialized attributes, where this
validation method and the attributes_with_quotes method that
generates the DB request both draw from a cache of YAMLized
attribute values.


Rails Wheels - Find Plugins, List & Sell Plugins -
http://railswheels.com


#6

On 22 Oct 2008, at 10:23, Jonathan R. wrote:

patch
to AR? I’ve never submitted a patch before, not sure how it’s done.
I’m
still using Rails 1.8.x, not sure if this is still a problem in
Rails 2,
so that might make it hard to submit a patch.

You’re almost certainly not using Rails 1.8.x because that version has
never existed (but you are probably running ruby 1.8.x (where x is
probably 6))
Patches can be submitted at rails.lighthouseapp.com and/or discussed
on the rubyonrails-core google group mailing list

Fred


#7

Jonathan R. wrote:

so that might make it hard to submit a patch.
“#{connection.quoted_string_prefix}’#{connection.quote_string(serialization)}’”
else
old_quote_value
end
end
end

Jonathan, that’s a fine solution if a database exception
is a better way to handle it than an automatic validation.
The validation route may however be more compatible with
the current save! idiom.

Also note that ActiveRecord trunk does the YAML conversion
in the attributes_with_quotes method instead.


Rails Wheels - Find Plugins, List & Sell Plugins -
http://railswheels.com