Cleaning up HTML markup from external sources

I’m using acts_as_amazon_product to pull in data about books I own,
which saves me from having to enter all of it myself.

It even includes a description of sorts. The problem is that, while the
description is some form of HTML markup, it is not XHTML. For instance,
I see things like

  • … and

Perhaps I’m trying too hard, but I’d love this to be valid XHTML.

Can anyone recommend a fairly reliable way to clean this up?
I don’t mind doing this at display time.