Forum: Ruby encoding problem?

Jesus Roncero (Guest)
on 2006-02-05 17:41
(Received via mailing list)
Hi,

Trying to learn Ruby, I am writing a script to migrate from PyBlosxom
to WordPress. As you may know, PyBlosxom stores all entries and comments
in text files under a directory hierarchy. My idea is to read all those
files (the subdirectories store the categories) and insert them into the
MySQL database WordPress uses.

So far, I have been able to read all the posts and comments, but I am
having some problems inserting them into MySQL (BTW, I am using the mysql
module). The problem, I guess, is some sort of encoding issue with the
text.
Basically, I have two problems:

- Accented characters. For example, if I have an accented vowel like "í",
it is not inserted properly into the MySQL table and shows up as weird
characters. I guess that writing a function that substitutes each of
these characters with its HTML entity (i.e. &iacute;) would work, but I
guess there must be a more appropriate way to do it, right? Does it have
anything to do with the encoding?

- Also, I have the problem that WordPress interprets \n characters (I
guess). For example, if I have a post like the following:
*****************
This is an example of an <img
src="image.jpg"> image.
*****************

it would turn into:
*****************
This is an example of an <img <br />
src="image.jpg"> image.
*****************

WordPress interprets the \n character right after <img and inserts a br
tag, which breaks the HTML. I thought that if I deleted all the \n
characters it would be fine, but some posts contain pre tags where the
\n characters are required.
Any ideas on this?

Anyway, thanks in advance! :-)
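
One possible way around the second problem, offered as a minimal sketch
rather than a tested migration step: instead of deleting every \n, join
only the line breaks that fall inside a tag (between < and >), so that
newlines in ordinary text and within pre blocks are kept. The method name
join_broken_tags is made up for illustration, and the regex assumes that
no literal < or > appears inside attribute values or unescaped in the
post text.

```ruby
# Collapse line breaks that fall inside an HTML tag (between "<" and ">"),
# leaving newlines in ordinary text and inside <pre> content untouched.
# Assumption: "<" and ">" only ever appear as tag delimiters.
def join_broken_tags(html)
  html.gsub(/<[^<>]*>/) do |tag|
    # Replace a newline (plus any surrounding whitespace) with one space.
    tag.gsub(/\s*\n\s*/, " ")
  end
end

post = "This is an example of an <img\nsrc=\"image.jpg\"> image.\n" \
       "<pre>line 1\nline 2</pre>\n"

puts join_broken_tags(post)
# This is an example of an <img src="image.jpg"> image.
# <pre>line 1
# line 2</pre>
```

Running each post body through such a filter before the INSERT should
leave nothing inside a tag for the automatic br conversion to split.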
James Britt (Guest)
on 2006-02-05 17:57
(Received via mailing list)
Jesus Roncero wrote:
> Hi,
>
> ...

>
> - Accented characters. For example, if I have an accented vowel like "í",
> it is not inserted properly into the MySQL table and shows up as weird
> characters. I guess that writing a function that substitutes each of
> these characters with its HTML entity (i.e. &iacute;) would work, but I
> guess there must be a more appropriate way to do it, right? Does it have
> anything to do with the encoding?

It may be that you need to tell MySQL to use a particular character set.

http://dev.mysql.com/tech-resources/articles/4.1/u...

might have some info.



--
James Britt

http://www.ruby-doc.org       - Ruby Help & Documentation
http://www.artima.com/rubycs/ - The Journal By & For Rubyists
http://www.rubystuff.com      - The Ruby Store for Ruby Stuff
http://www.30secondrule.com   - Building Better Tools
Jesus Roncero (Guest)
on 2006-02-05 18:18
(Received via mailing list)
James Britt wrote:

>
> It may be that you need to tell MySQL to use a particular character set
>
> http://dev.mysql.com/tech-resources/articles/4.1/u...
>
> might have some info.
>
Hmm, but I would like to do it from within Ruby. I mean, all the new
posts in the database (inserted through the web form) work OK, so I
guess the thing to do is handle it programmatically in my script, right?

Thanks
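
For the accented-character problem, doing it from the script usually
comes down to two things: making sure the text is in the encoding the
tables expect, and telling MySQL which encoding the connection is
sending. The sketch below is only an illustration under assumptions that
the thread does not confirm: that the blog files are ISO-8859-1, that the
WordPress tables are UTF-8, and that the connection object responds to
query(sql) the way the mysql module's connection does. The names to_utf8
and use_utf8_connection are hypothetical. It uses String#encode, which
needs Ruby 1.9 or later; on Ruby 1.8 the Iconv library played that role.

```ruby
# Assumption: source files are ISO-8859-1; adjust if they are something else.
SOURCE_ENCODING = "ISO-8859-1"

# Relabel the raw bytes with their real encoding, then transcode to UTF-8.
def to_utf8(raw, source_encoding = SOURCE_ENCODING)
  raw.dup.force_encoding(source_encoding).encode("UTF-8")
end

# Tell MySQL that this connection sends and expects UTF-8.
# `conn` is any object that responds to #query(sql); SET NAMES is a
# MySQL statement available since 4.1.
def use_utf8_connection(conn)
  conn.query("SET NAMES utf8")
end

latin1_bytes = "\xED".b          # "í" as it would be stored in a Latin-1 file
text = to_utf8(latin1_bytes)

puts text.encoding               # UTF-8
puts text.bytes.inspect          # [195, 173]
```

In the migration script, use_utf8_connection would be called once right
after connecting, and to_utf8 on each post and comment body before it is
escaped and inserted.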