You should also make sure that the browser is submitting the
information as UTF-8. If I recall correctly, it’s sufficient to declare
the page containing the form as UTF-8, but you might want to
double-check that.
Search and replace is going to be tough unless there’s an existing
script around for doing that.
One thing you could do is try writing a script in Java (or Ruby) that
reads a known bad row from the database, converts it in various ways,
and prints out each result until you know exactly what conversion
you’re going to need.
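Here is a rough sketch of that kind of probing script in Java. It is only an illustration: the class name EncodingProbe, the helper names, and the hard-coded sample string are all made up, and in practice you would read the bad value from your database instead. It assumes the damage came from bytes in one charset being decoded as another, so it re-encodes the string with each candidate charset and decodes the bytes with each of the others.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingProbe {
    // Candidate charsets to try; this list is an assumption, extend it as needed.
    static final String[] NAMES = {"ISO-8859-1", "windows-1252", "UTF-8"};

    // Turn the string back into bytes using `wrong`, then decode those bytes as `right`.
    static String reinterpret(String s, Charset wrong, Charset right) {
        return new String(s.getBytes(wrong), right);
    }

    // Try every ordered pair of distinct candidate charsets and collect the results.
    static Map<String, String> candidates(String bad) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String from : NAMES) {
            for (String to : NAMES) {
                if (from.equals(to)) continue;
                // Skip charsets this JVM does not provide.
                if (!Charset.isSupported(from) || !Charset.isSupported(to)) continue;
                out.put(from + " -> " + to,
                        reinterpret(bad, Charset.forName(from), Charset.forName(to)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical bad value: "café" whose UTF-8 bytes were decoded as Latin-1.
        String bad = "caf\u00C3\u00A9";
        for (Map.Entry<String, String> e : candidates(bad).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Whichever line prints the text you expected tells you which conversion to apply to the rest of the data. Make sure your terminal itself is set to UTF-8, or the output will mislead you.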
Programmatically detecting the messed-up strings seems like it would
be more difficult, though, unless there are some clear constraints on
what those strings should contain (i.e., something you can run a regex
on), which might be the case if it’s, say, a validated form field.
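As a sketch of what that regex check might look like in Java: the class and method names below are hypothetical, and both patterns are guesses you would need to adapt. The first is a heuristic for one specific kind of damage (two-byte UTF-8 sequences that were decoded as ISO-8859-1), and the second assumes a field that was validated to contain only a small set of characters.

```java
import java.util.regex.Pattern;

public class MojibakeCheck {
    // Heuristic: a UTF-8 lead byte (0xC2-0xDF) followed by a continuation byte
    // (0x80-0xBF), each misread as a single Latin-1 character, e.g. "Ã©" for "é".
    // This misses other kinds of damage and can flag legitimate text.
    static final Pattern TWO_BYTE_MOJIBAKE =
            Pattern.compile("[\\u00C2-\\u00DF][\\u0080-\\u00BF]");

    // Assumed constraint for a validated field: ASCII letters, digits, spaces,
    // and a little punctuation. Replace this with the field's real rule.
    static final Pattern ALLOWED =
            Pattern.compile("^[A-Za-z0-9 .,'-]+$");

    static boolean looksMojibake(String s) {
        return TWO_BYTE_MOJIBAKE.matcher(s).find();
    }

    static boolean violatesConstraint(String s) {
        return !ALLOWED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"caf\u00C3\u00A9", "caf\u00E9", "plain text"};
        for (String s : samples) {
            System.out.println(s + " mojibake=" + looksMojibake(s)
                    + " violates=" + violatesConstraint(s));
        }
    }
}
```

Rows flagged by either check are only candidates; I would still eyeball a sample before converting anything in bulk.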