Atom Export Proposal

Alastair_R · August 15, 2006, 2:19pm

Typographers,

As promised, here is a proposal for the use of Atom as a blog export
format:

Note that there is nothing specific to Typo here (*), but I believe
it should be implementable in Typo with minimal effort.

Comments appreciated.

(*) apart from shamelessly stealing the typosphere.org domain for the
namespace, hopefully only temporarily.

Alastair_R · August 15, 2006, 2:19pm

I haven’t had time to look over the whole document yet, but I like
what I’ve seen so far. I have a few suggested changes :-).

First, I think the content body model is going to cause problems.
Here’s how you’re doing it now:

<export:contentsyntax
    scheme="http://daringfireball.net/projects/markdown/"
    label="Markdown"/>
I just wrote this brilliant Java code. Have a look!

public String getPaula() {
    return paula;
}</content>

Can we change this to something like this:

<export:body type="http://daringfireball.net/projects/markdown/"
    label="Markdown">I just wrote this brilliant Java code. Have a look!

public String getPaula() {
    return paula;
}</export:body>
<content type="html">I just wrote this <em>brilliant</em> Java code. Have a look!

public String getPaula() {
    return paula;
}</content>

Basically, provide both the original format (for importers that
understand Markdown) and the processed format (for importers that
don’t understand the original format).
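
To make the two-representation idea concrete, here is a rough sketch of what an importer's choice could look like. It is only an illustration: the export namespace URI is a placeholder (the real one is not given in this thread), the `export:body` element follows the suggestion above rather than any agreed format, and `pick_body` and the sample entry are made-up names.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
EXPORT = "http://example.org/ns/export"  # placeholder, not the proposal's real namespace
MARKDOWN = "http://daringfireball.net/projects/markdown/"

# Sample entry carrying both the source markup and the processed HTML (assumed layout).
ENTRY = f"""<entry xmlns="{ATOM}" xmlns:export="{EXPORT}">
  <export:body type="{MARKDOWN}" label="Markdown">I just wrote this *brilliant* Java code.</export:body>
  <content type="html">I just wrote this &lt;em&gt;brilliant&lt;/em&gt; Java code.</content>
</entry>"""


def pick_body(entry, supported_formats):
    """Return (format, text): the source markup if supported, else the processed content."""
    body = entry.find(f"{{{EXPORT}}}body")
    if body is not None and body.get("type") in supported_formats:
        return body.get("type"), body.text
    content = entry.find(f"{{{ATOM}}}content")
    return content.get("type"), content.text


entry = ET.fromstring(ENTRY)
markdown_aware = pick_body(entry, {MARKDOWN})  # importer that understands Markdown
html_only = pick_body(entry, set())            # importer that only takes HTML
```

An importer that lists Markdown among its supported formats gets the original text back; any other importer falls through to the HTML.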

Next, can you rename export:page to export:static or something
similar? I’d rather not be too Typo-specific :-).

Also, how does this represent things like:

per-article trackback/comment open/closed?
extended entries/excerpts/keywords

Scott

Alastair_R · August 15, 2006, 2:21pm

Hey Scott,

Thanks for the comments.

On 14/08/2006, at 11:32 PM, Scott L. wrote:

> public String getPaula() {
>     return paula;
> }</content>
>
> Basically, provide both the original format (for importers that
> understand Markdown) and the processed format (for importers that
> don’t understand the original format).

OK I’ve thought about this a bit, so let me outline some of the
reasoning here, and why I proposed a single representation for the
content.

I’m assuming that most blog systems are like Typo in that they store
a single, canonical representation of each entry. This is then
transformed to HTML (or XHTML, or … ?) by the blogging engine at
run time. Or not: maybe the single representation is already XHTML.
In any case, wanting a single representation seems to be just a
consequence of the DRY principle.

But, if we have two alternative representations for an entry in the
export file, it forces the importer code to choose which
representation is the best match for the target blogging engine. It
is not obvious to me that the code will always be able to make this
decision in the best manner. Also, it’s not obvious whether or not
the redundant (“cooked”) HTML representation will be “good enough”.
Lastly, it complicates the job of the importer.

I thought it might be simpler for the exporter to choose a content
format - perhaps with the help of the user - which is deemed to be
the most interoperable, and yet closest to the original format.

For example, a Typo implementation would probably export in the
source Textile/Markdown/whatever, and yet expand all the typo:xxx
macros because these are by definition not interoperable. And yet if
the user is exporting from one Typo installation to another, they
probably don’t want to expand these macros, so the decision probably
needs to be under user control.
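
As a rough sketch of that exporter-side switch, something like the following could work. The macro syntax matched by the regular expression, the `render_macro` stand-in, and the `expand_macros` flag are all assumptions for illustration, not Typo's actual code.

```python
import re

# Matches a self-closing macro such as <typo:flickr img="123"/>.
# The real macro syntax may differ; this pattern is an assumption.
MACRO = re.compile(r"<typo:(\w+)([^>]*)/>")


def render_macro(name, attrs):
    """Hypothetical stand-in for the blog engine's own macro renderer."""
    return f"<!-- expanded typo:{name} -->"


def export_body(source, expand_macros=True):
    """Keep the source markup, optionally expanding engine-specific macros."""
    if not expand_macros:
        # Typo-to-Typo migration: leave macros for the target to interpret.
        return source
    return MACRO.sub(lambda m: render_macro(m.group(1), m.group(2).strip()), source)


post = 'Holiday photo: <typo:flickr img="123"/> taken in *July*.'
portable = export_body(post, expand_macros=True)
typo_to_typo = export_body(post, expand_macros=False)
```

Either way the Textile/Markdown source is left untouched; only the engine-specific macros are affected by the user's choice.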

I take the view that the user pretty much always knows the right
level of translation to apply when moving their data from one system
to another.

Even in the worst case, where the user has authored their data in a
format that is just not supported at all on the target platform,
should it necessarily be the exporter that resolves this problem? It
could be an intermediate tool that converts the content from one
format to another, on behalf of the target platform. For example, a
tool that renders all Markdown-formatted content into HTML would be
very simple to write and would also achieve the required goal.
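
A minimal sketch of such an intermediate tool is below. It assumes the source format is flagged by an `export:contentsyntax` child inside `content`, as in the example quoted earlier; the namespace URI is a placeholder, and `toy_render` only stands in for a real Markdown renderer.

```python
import html
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
EXPORT = "http://example.org/ns/export"  # placeholder, not the proposal's real namespace
MARKDOWN = "http://daringfireball.net/projects/markdown/"


def convert_export(xml_text, scheme, render):
    """Rewrite every <content> flagged with the given scheme as HTML."""
    root = ET.fromstring(xml_text)
    # Collect first, so the tree is not modified while it is being walked.
    for content in list(root.iter(f"{{{ATOM}}}content")):
        syntax = content.find(f"{{{EXPORT}}}contentsyntax")
        if syntax is not None and syntax.get("scheme") == scheme:
            source = syntax.tail or ""  # the body text follows the marker element
            content.remove(syntax)
            content.text = render(source)
            content.set("type", "html")
    return ET.tostring(root, encoding="unicode")


def toy_render(text):
    """Stand-in renderer: escapes text and wraps paragraphs in <p> tags."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


FEED = f"""<feed xmlns="{ATOM}" xmlns:export="{EXPORT}">
  <entry>
    <content><export:contentsyntax scheme="{MARKDOWN}" label="Markdown"/>First paragraph.

Second paragraph.</content>
  </entry>
</feed>"""

converted = convert_export(FEED, MARKDOWN, toy_render)
```

Swapping `toy_render` for a call into a real Markdown library would be the only engine-specific part; the exporter and the importer stay unaware of the conversion.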

So despite all the above I’m not totally against the idea of
supplying multiple representations in the export file, but I do think
there are alternatives which should be considered.

[On a separate issue, I think it’s probably a good idea to include an
xml:space="preserve" attribute when using a format where whitespace
is significant.]
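
For what it's worth, a small sketch of how an exporter might set that attribute with Python's standard library. The bare `content` element and the sample text are illustrative only. Note that ElementTree keeps whitespace in text nodes regardless; the attribute is a hint to other tools that might otherwise reindent or normalise the content.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the reserved xml: prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"

content = ET.Element("content")
content.set(f"{{{XML_NS}}}space", "preserve")
# Markdown code blocks depend on their leading indentation.
content.text = "Have a look!\n\n    public String getPaula() {\n        return paula;\n    }"

serialized = ET.tostring(content, encoding="unicode")
round_tripped = ET.fromstring(serialized)
```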

> Next, can you rename export:page to export:static or something
> similar? I’d rather not be too Typo-specific :-).

They’re called pages in WordPress too, so I figured it was universal.

But yes, a good suggestion.

> Also, how does this represent things like:
>
> per-article trackback/comment open/closed?

Good point, will need to add an element for this. It should probably
include the trackback URL, also.

> extended entries/excerpts/keywords

Keywords are handled by using an atom:category which is not in
the set of categories defined in the atom:feed. This seems to match
Typo’s implementation of tags, but I don’t know if this is a good
idea in general… ?
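
To illustrate the rule, here is a sketch of how an importer could tell the two apart. The sample feed and the `split_categories` helper are made up for the example, and matching on the `term` attribute alone is an assumption.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Feed-level categories are the blog's category set; the entry uses one
# of them plus two terms that are not declared at feed level.
FEED = f"""<feed xmlns="{ATOM}">
  <category term="programming"/>
  <category term="travel"/>
  <entry>
    <category term="programming"/>
    <category term="java"/>
    <category term="paula"/>
  </entry>
</feed>"""


def split_categories(feed, entry):
    """Split an entry's terms into declared categories and free-form keywords."""
    declared = {c.get("term") for c in feed.findall(f"{{{ATOM}}}category")}
    terms = [c.get("term") for c in entry.findall(f"{{{ATOM}}}category")]
    categories = [t for t in terms if t in declared]
    keywords = [t for t in terms if t not in declared]
    return categories, keywords


feed = ET.fromstring(FEED)
entry = feed.find(f"{{{ATOM}}}entry")
categories, keywords = split_categories(feed, entry)
```

One consequence of this rule is that a keyword which happens to share a term with a feed-level category cannot be told apart from that category.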

Excerpts are handled using the existing atom:summary element.

Extended entries are a bit trickier. Off the top of my head, we could
possibly add an <export:extended/> tag into the content to indicate
the position of the start of the extended text. This corresponds to
the <!--more--> tag that WordPress uses.
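
A rough sketch of how an importer might split an entry on such a marker. It assumes plain-text content with at most one `export:extended` child; the namespace URI is a placeholder and `split_extended` is a made-up helper.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
EXPORT = "http://example.org/ns/export"  # placeholder, not the proposal's real namespace

ENTRY = f"""<entry xmlns="{ATOM}" xmlns:export="{EXPORT}">
  <content>Intro shown on the index page.<export:extended/>The rest, shown only on the article page.</content>
</entry>"""


def split_extended(content):
    """Return (body, extended); extended is None when there is no marker."""
    marker = content.find(f"{{{EXPORT}}}extended")
    if marker is None:
        return content.text or "", None
    # Text before the marker is the body; text after it is the extended part.
    return content.text or "", marker.tail or ""


entry = ET.fromstring(ENTRY)
body, extended = split_extended(entry.find(f"{{{ATOM}}}content"))
```

Content that itself contains markup (XHTML, say) would need a tree-aware split rather than this text-only version.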