WordPress converter


#1

[I sent this a couple of days ago, but it doesn’t seem to have made it to the archive, which makes me think it didn’t go out to the list. Apologies if you get it twice. The update is, I’ve fixed the first two issues, and some others, in my fork of the code (http://github.com/eostrom/typo/tree/master). Haven’t pursued the BlueCloth performance problem.]

Hi. I’m checking out Typo, with an eye to switching my WordPress blog
away from it. It’s neat that there’s a WordPress converter, but it
doesn’t seem to go as far as it could. The first three issues I’ve found:

  • It didn’t honor the ‘more’ line in my WordPress posts, although it
    did copy the ‘more’ line over. If I go to each post’s edit page and
    then save it without making any edits, the ‘more’ line takes effect.
  • It copied all my spam comments over from WordPress, but didn’t mark
    them as spam in Typo. That’s how I got 304 comments on a pretty
    obscure post.
  • When I went to look at the post with 304 comments, it took long
    enough that I opted to shut down the server instead. It’s not
    surprising that 304 comments would take a long time, but it turns
    out generating HTML for just one comment takes roughly 30 seconds.
    It was a long comment, but that’s excessive.
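
A minimal sketch of how a converter might honor the ‘more’ line at import time, rather than waiting for a re-save. It assumes the WordPress marker is the HTML comment `<!--more-->` (optionally with custom link text), and the helper name `split_on_more` is hypothetical; it is not taken from the Typo converter. Which Typo fields the two halves map to is not shown here.

```ruby
# Hypothetical helper, not taken from the Typo converter.
# WordPress marks the excerpt boundary with an HTML comment such as
# <!--more--> or <!--more Read on-->.
MORE_TAG = /<!--more.*?-->/

# Returns [body, extended]; extended is nil when the post has no 'more' line.
def split_on_more(content)
  body, extended = content.to_s.split(MORE_TAG, 2)
  [body.to_s.rstrip, extended && extended.lstrip]
end

body, extended = split_on_more("Intro paragraph.\n<!--more-->\nThe rest of the post.")
puts body      # the part shown on index pages
puts extended  # the part shown only on the full post
```

Doing the split during conversion would avoid the "open and re-save every post" workaround described above.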

I’d be happy to try to help fix any or all of these - I think I have a
grip on the first two problems already. I wanted to get some guidance on
a few things:

  • It seems the slow part of rendering those comments is BlueCloth.
    First, I don’t think these comments are even in Markdown (the legit
    ones or the spam). It looks like the default in WordPress is “HTML
    plus blank-lines-indicate-paragraphs”. Should I just set the filter
    for these comments to something else? (If so, what?)
  • Is BlueCloth really this slow? I noticed that last week marked the
    release of BlueCloth 2, which speeds things up dramatically by using
    Discount (written in C). Is there any interest in updating Typo to
    include this new version?
  • There don’t seem to be any tests for the converters, and I intend
    to uphold that tradition. I also don’t plan to get my hands on a
    WordPress 2.5 database; I’m using 2.6. Should I just copy the wp25
    converter to wp26, make it work for me, and leave wp25 alone?

Thanks
–Erik Ostrom
removed_email_address@domain.invalid


#2

Hi Erik,

Erik Ostrom wrote:

[I sent this a couple of days ago, but it doesn’t seem to have made it to the archive, which makes me think it didn’t go out to the list. Apologies if you get it twice. The update is, I’ve fixed the first two issues, and some others, in my fork of the code (http://github.com/eostrom/typo/tree/master). Haven’t pursued the BlueCloth performance problem.]

I did get this twice, but I don’t see the original or this version in
the archive either. How strange. Another message is showing up.

Anyway, thanks for working on the converter!

Hi. I’m checking out Typo, with an eye to switching my WordPress blog
away from it.

Yay! :)

[…]
* When I went to look at the post with 304 comments, it took long
enough that I opted to shut down the server instead. It’s not
surprising that 304 comments would take a long time, but it turns
out generating HTML for just one comment takes roughly 30 seconds.
It was a long comment, but that’s excessive.

That seems ridiculously long. Can you isolate this to a test case, as
in, make a little ruby program that just generates HTML for that one
comment?
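
One possible shape for such a test program, as a sketch only. The harness name `time_render` and the placeholder comment text are made up for illustration; `BlueCloth.new(text).to_html` is the usual BlueCloth call, and the script assumes the bluecloth gem is installed (it falls back to a pass-through renderer if it is not, so the harness itself still runs).

```ruby
require 'benchmark'

# Times a single rendering call. The renderer is passed in as a block so the
# same harness works for BlueCloth 1, BlueCloth 2, or any other filter.
def time_render(text)
  html = nil
  seconds = Benchmark.realtime { html = yield(text) }
  [html, seconds]
end

# Placeholder input: replace with the problematic comment, e.g. File.read(path).
comment = "A *long* comment with a [link](http://example.com/).\n\n" * 50

# Assumption: the bluecloth gem is available. Without it, a pass-through
# renderer is used so the script does not fail outright.
begin
  require 'bluecloth'
  renderer = lambda { |t| BlueCloth.new(t).to_html }
rescue LoadError
  renderer = lambda { |t| t }
end

html, seconds = time_render(comment, &renderer)
puts format("rendered %d bytes in %.3f seconds", html.bytesize, seconds)
```

Running the same script against two BlueCloth versions would give a like-for-like comparison on one comment.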

I’d be happy to try to help fix any or all of these - I think I have a
grip on the first two problems already. I wanted to get some guidance on
a few things:

* It seems the slow part of rendering those comments is BlueCloth.
  First, I don't think these comments are even in Markdown (the
  legit ones or the spam). It looks like the default in WordPress is
  "HTML plus blank-lines-indicate-paragraphs". Should I just set the
  filter for these comments to something else? (If so, what?)

You may have to write a little filter for that.
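
A rough sketch of what such a filter could do, under the assumption that "HTML plus blank-lines-indicate-paragraphs" means: pass HTML through untouched, wrap blank-line-separated chunks in paragraph tags, and turn single newlines into line breaks. The method name is hypothetical and this is not wired into Typo's text filter system.

```ruby
# Hypothetical filter for WordPress-style comment text: existing HTML is
# left alone and blank lines separate paragraphs.
def blank_lines_to_paragraphs(text)
  text.to_s
      .gsub(/\r\n?/, "\n")   # normalise line endings
      .split(/\n\s*\n/)      # one or more blank lines end a paragraph
      .map(&:strip)
      .reject(&:empty?)
      .map { |para| "<p>#{para.gsub("\n", "<br />\n")}</p>" }
      .join("\n\n")
end

puts blank_lines_to_paragraphs("First <b>line</b>\nsecond line\n\nNext paragraph")
```

A filter like this does no Markdown parsing at all, so it would sidestep the BlueCloth cost for imported comments.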

* Is BlueCloth really this slow? I noticed that last week marked the
  release of BlueCloth 2, which speeds things up dramatically by
  using Discount (written in C). Is there any interest in updating
  Typo to include this new version?

If you can show us that the current version is slow, then yes, of course.

* There don't seem to be any tests for the converters, and I intend
  to uphold that tradition. I also don't plan to get my hands on a
  WordPress 2.5 database; I'm using 2.6. Should I just copy the wp25
  converter to wp26, make it work for me, and leave wp25 alone?

I have no idea what the differences between 2.5 and 2.6 are, but if it’s
called wp25 … Anyway, I have no opinion on this. Maybe one of the other
typo team members can say something about this? Mephisto seems to just
have a generic WordPress converter.

Too bad there aren’t any tests. If you can add some, that would be even
better, since I don’t have any WordPress databases lying around.

Regards,
Matijs.


#3

On 16 Apr 2009, at 06:50, Erik Ostrom wrote:

[…]

Hi Erik,

My apologies. I had your message and had seen your patch, but I honestly
had no time to have a look at it. I know Matijs did, and since he
has the commit bit, I think I’ll leave this part to him. He already
replied earlier this morning.

Cheers,
Frédéric


Frédéric de Villamil
“What’s mine is mine. What’s yours is still unsetteled” – Go player
proverb
removed_email_address@domain.invalid tel: +33 (0)6 62 19 1337
http://t37.net Typo : http://typosphere.org


#4

* When I went to look at the post with 304 comments, it took long
  enough that I opted to shut down the server instead. It's not
  surprising that 304 comments would take a long time, but it turns
  out generating HTML for just one comment takes roughly 30 seconds.
  It was a long comment, but that's excessive.

That seems ridiculously long. Can you isolate this to a test case, as in,
make a little ruby program that just generates HTML for that one comment?

I’m sure I can. I decided BlueCloth was to blame by running code in the
console - it was definitely the BlueCloth HTML generation method.

Unfortunately I’m about to get busier, so I may not get back to Typo for
a little while.

I have no idea what the differences between 2.5 and 2.6 are, but if it’s
called wp25 …

After writing, I investigated. WordPress had no schema changes between
2.5 and 2.6, so I integrated my code into wp25.

There were schema changes for 2.7, and I don’t think any of them were
relevant to the converter, but I’d have to check more thoroughly to be
certain. Maybe when I need to convert my other, newer blog…

–Erik Ostrom
removed_email_address@domain.invalid


#5

On 16 Apr 2009, at 17:34, Erik Ostrom wrote:

[…]
I have no idea what the differences between 2.5 and 2.6 are, but if
it’s called wp25 …

After writing, I investigated. WordPress had no schema changes
between 2.5 and 2.6, so I integrated my code into wp25.

There were schema changes for 2.7, and I don’t think any of them
were relevant to the converter, but I’d have to check more
thoroughly to be certain. Maybe when I need to convert my other,
newer blog…

I’ve backported your commits into master today.
Thank you for contributing.

Fred




#6

Erik Ostrom wrote:

That seems ridiculously long. Can you isolate this to a test case,
as in, make a little ruby program that just generates HTML for that
one comment?

I’m sure I can. I decided it was BlueCloth to blame by running code in
the console - it was definitely the BlueCloth HTML generation method.

Apparently, BlueCloth is indeed very slow, and uses a lot of memory:

http://eigenclass.org/R2/writings/fast-extensible-simplified-markdown-in-ocaml


#7

On 29 Apr 2009, at 08:32, Erik Ostrom wrote:

Yes. Although the recently released BlueCloth 2 is built on
Discount, which should bring it into the “fast and slim” category.

http://www.deveiate.org/bluecloth2-announcement.html

Of course there are drawbacks to depending on C code, which Discount
is. This might lead back to the question in that other thread about
including unpacked gems in typo…

Including unpacked gems in Typo was a way for us to make the install
process easier by bundling lots of dependencies with the core
application. It sounded like a good idea, until it crashed somewhere.

For BlueCloth 2, I’m going to check whether we can integrate it without
breaking everything. It may help speed up Typo, which is always
good. I’ve opened a ticket at
https://fdv.lighthouseapp.com/projects/11171-typo-blog/tickets/97-replace-bluecloth-with-bluecloth-2
which you can follow if you’re interested in this topic.

Cheers,
Frédéric



#8

de Villamil Frédéric wrote:

Including unpacked gems in Typo was a way for us to make the install
process easier by bundling lots of dependencies with the core
application. It sounded like a good idea, until it crashed somewhere.

This may count as “vaguest suggestion ever”, but didn’t I see Something
Somewhere Recently about application templates, development vs.
production gem dependencies, rails app-installers, etc. - maybe as part
of Rails 2.3 - that would create a better compromise between “we can’t
possibly ensure that typo works on an N*M matrix of gem combinations”
and “I want to take advantage of the latest gems automatically”?

I just went through a similar problem importing Movable Type into a
first-time Typo blog; typo includes RedCloth 3 (3.0.3 or 3.0.4 I think),
and the newest RedCloth 4 fixes a number of long-standing bugs. Since
MT uses a non-Ruby version of Textile, all my entries were formatted for
“proper” Textile, and RedCloth 3 made them look really ugly… took a
while to figure out why.

Jay


#9

Jay L. wrote:

[…] possibly ensure that typo works on an N*M matrix of gem combinations”
and “I want to take advantage of the latest gems automatically”?

I think you are referring to the fact that in Rails 2.3 you can put gem
dependencies in the common environment.rb, and also in
environments/test.rb etc. This way you can specify that we need rspec,
but only in testing. Very useful.

But that won’t help solve the problem of “we know typo works with gem
such-and-such version 3.0.3, but someone wants to use 10.0.1”. What
will help is putting a minimum version or a range of versions in the
gem dependencies (you would put something like “>=3.0.3”).
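
To illustrate how a minimum version and a version range behave, here is a small sketch using RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The version numbers are examples only, and the upper bound `< 4.0` is an assumption chosen for illustration, not something Typo is known to declare. In a Rails 2.x environment file the same requirement string would go in a line like `config.gem "RedCloth", :version => ">= 3.0.3"`.

```ruby
require 'rubygems'

# Requirement strings use the same syntax as a gem dependency declaration.
minimum = Gem::Requirement.new(">= 3.0.3")           # anything from 3.0.3 up
range   = Gem::Requirement.new(">= 3.0.3", "< 4.0")  # 3.x only (illustrative)

%w[3.0.2 3.0.4 4.1.9].each do |v|
  version = Gem::Version.new(v)
  puts "#{v}: minimum=#{minimum.satisfied_by?(version)} " \
       "range=#{range.satisfied_by?(version)}"
end
```

A bare minimum accepts a future major release as well, which is exactly the "someone wants to use 10.0.1" case; a range rules it out at the cost of blocking upgrades until the range is widened.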

I just went through a similar problem importing Movable Type into a
first-time Typo blog; typo includes RedCloth 3 (3.0.3 or 3.0.4 I think),
and the newest RedCloth 4 fixes a number of long-standing bugs. Since
MT uses a non-Ruby version of Textile, all my entries were formatted for
“proper” Textile, and RedCloth 3 made them look really ugly… took a
while to figure out why.

Ouch, that’s nasty.

Regards,
Matijs.


#10

On 29 Apr 2009, at 09:22, de Villamil Frédéric wrote:

[…]

I’ve replaced BlueCloth 1.0 with BlueCloth 2.0 on master tonight. Seems
to work smoothly. I’ve also removed it from vendor/ and added it as a gem
dependency instead. Tell me whether performance is better and whether
anything goes wrong for you.

Frédéric




#11

Tried this on one of the spam comments I mentioned earlier, selected at
random. I didn’t do a scientific test, but generating HTML for that
comment with BlueCloth 1 took about 20 seconds on my development
machine; with BlueCloth 2, about 0.14 seconds. I like it!

–Erik



#12

Yes. Although the recently released BlueCloth 2 is built on Discount,
which should bring it into the “fast and slim” category.
http://www.deveiate.org/bluecloth2-announcement.html

Of course there are drawbacks to depending on C code, which Discount is.
This might lead back to the question in that other thread about
including unpacked gems in typo…