Perl and the culture of libraries

Martin_DeMello · August 5, 2008, 7:20pm

On Tue, Aug 5, 2008 at 1:07 PM, ara.t.howard wrote:

isn’t a great writer - at least he doesn’t spend his time doing great
writing. i don’t think it detracts one bit from the ruby language. it’s
ok to be good at one thing and let others be good at other things - it’s
the beauty of open source to a certain degree that we can all row together
to get the project done, docs included.

Fair enough, I definitely didn’t mean to overgeneralize. But I still
stand by the point that programmers should have decent technical
writing skills. You’re absolutely right that when software is
interesting enough, people will contribute documentation. However,
the farther away from the author of the software you go, the more
difficult it is for someone to write about the topic. I’m not talking
about tutorials and walk-throughs here, but API documentation. Eric
Hodel kicks ass for a reason, and that’s because he’s willing to read
code that he didn’t write and document it so others don’t have to.
This is something we can appreciate, but not something that should be
expected. (Again, I use ‘should’ in the sense of ‘would be best’, not
in the sense of ‘absolutely must’)

i don’t think we disagree on this point - rather i think that it’s important
to distinguish between ‘libraries’ and ‘applications’ without lumping both
into the category of ‘software’. in other words to recognize that
documentation needs are different for different projects. this is part of
what makes a one-shoe-fits-all approach hard i think. it’s also why the
documentation needs of a rails plugin are quite different for a developer
installing the plugin vs a graphic designer doing the same. it’s
unrealistic to expect the plugin developer to cater to both.

This is a good point. I definitely overgeneralized by lumping
everything into ‘software’. I’m talking about the things that
actually do need documentation, such as complicated libraries or
things at that level of complexity.

ps. also worth noting (not to you greg), but for posterity, is that this
thread will consume more energy and words than contributing documentation to
one or two medium sized ruby projects currently just sitting there in svn or
git in all their undocumented glory

Or continuing to document Ruby itself, as much of the standard library
still leaves something to be desired. (Despite the hard work of
those who’ve made great progress on this over the last couple years)

-greg

Martin_DeMello · August 5, 2008, 10:35pm

I agree with Advi and Martin that the landing page of a project is
important. The CPAN authors’ documentation quality improved because of
CPAN, not the other way around. Rubyforge would lead to better Ruby
libraries if it looked less like Sourceforge and more like the module
pages at http://search.cpan.org. This is not to knock Rubyforge; it
filled a community need. But I think the community deserves more now.

On that note, let’s look at a sample distribution page
(http://search.cpan.org/~abw/Template-Toolkit-2.19/). For those unfamiliar
with CPAN, a distribution is sort of the “top level”; it can contain
many modules. CPAN is a lot more than well-documented modules. There
are ratings and reviews, which can help when your searches bring up
multiple modules that do the same thing. I find it useful for
identifying abandon-ware: sometimes you’ll see in a review that the
module has been superseded by another. There is also a gaggle of
volunteers who test Perl modules on various platforms. This is
invaluable if you are using a non-mainstream environment (e.g. cygwin)
or if the module has C extensions. And it’s fairly automated; there’s
a CPAN module called Test::Reporter and a special version of CPAN
(CPANPLUS) that can automatically download a module, run its test
suite, and report results. You can also find dependencies and prior
versions. Like Rubyforge, there are discussions and bug reporting that
are sometimes used, sometimes not.

Module authors tend to use Module::Starter which is just like hoe, in
that it generates a nice module skeleton, ready to submit to CPAN.
Another thing I noticed about module authors is that they tend to use
more pragmatic, namespaced module names, i.e. they would tend to use a
name like “PDF::Simple” instead of “prawn”. Though this may be due to
the fact that 14,000 modules require a bit more organization and
findability.

If anything could make CPAN better it is more user-generated content.
It would be neat to let users add additional usage examples to the
synopses, perhaps comment on (or even update) the docs (like DocBox,
sorta).

Anyway I think the Ruby community would benefit from taking a page or
two from the CPAN playbook.

– Mark.

Martin_DeMello · August 5, 2008, 10:46pm

On Tue, Aug 5, 2008 at 4:33 PM, Mark T. wrote:

Module authors tend to use Module::Starter which is just like hoe, in
that it generates a nice module skeleton, ready to submit to CPAN.
Another thing I noticed about module authors is that they tend to use
more pragmatic, namespaced module names, i.e. they would tend to use a
name like “PDF::Simple” instead of “prawn”. Though this may be due to
the fact that 14,000 modules require a bit more organization and
findability.

There is a (harebrained) reason why I chose the name. It is to keep
Prawn out of the PDF namespace until it is ready to replace the
current de facto standard PDF::Writer library.

Details at:
http://groups.google.com/group/prawn-ruby/msg/f6aee1d625600146

Martin_DeMello · August 5, 2008, 11:20pm

There is a (harebrained) reason why I chose the name. It is to keep
Prawn out of the PDF namespace until it is ready to replace the
current de facto standard PDF::Writer library.

hee hee, I didn’t mean to snipe at your name in particular, it’s just
the first example that came to mind (I had just looked at your latest
release). Nice work, by the way.

Martin_DeMello · August 5, 2008, 10:49pm

On Aug 5, 6:56 am, “Martin DeMello” wrote:

http://blog.jrock.us/articles/You%20are%20missing%20the%20point%20of%…

Great article on perl and cpan. I was all ready to say “yeah, ruby has
libraries too - maybe not the tens of thousands cpan has, but i can
usually find what i want” until I read this paragraph:

There are about 14,000 libraries on CPAN, not “tens of thousands”.
There are about 8,000 Ruby libraries between RubyForge and the RAA.

The important thing about Perl is that we have a culture of writing
good libraries. No Perl programmer would write a few lines of code,
post it to a blog, and call it a “library”. Everyone feels obligated
to create a CPAN distribution, with documentation (sometimes a bit on
the minimal side, but not everyone is a writer), a test suite, a
Makefile, etc. I’m not sure why, but this always happens. I think it’s
because there is a strong convention, and tools that make following
the convention easy.

Between RubyForge, RDoc and Rubygems I think we’ve got a pretty good
culture going. He mostly seems to be complaining about the fact that
some Rails users have a habit of posting code online instead of
packaging it.

He’s right

He’s wrong. He’s also a Perl programmer deeply steeped in Perl (has
many modules on CPAN), is not a polyglot programmer as far as I can
tell, and has a book on Catalyst to sell you.

Plus, 212 modules? Oof.

Regards,

Dan, former Perl programmer

Martin_DeMello · August 6, 2008, 12:03am

On Tue, Aug 5, 2008 at 5:18 PM, Mark T. wrote:

There is a (harebrained) reason why I chose the name. It is to keep
Prawn out of the PDF namespace until it is ready to replace the
current de facto standard PDF::Writer library.

hee hee, I didn’t mean to snipe at your name in particular, it’s just
the first example that came to mind (I had just looked at your latest
release). Nice work, by the way.

No worries, that question came up a few times in the last couple days
and I felt like I needed to retrofit an answer to it that sounded
good. Truly I just like the word.

Martin_DeMello · August 6, 2008, 2:46am

On Aug 5, 2008, at 9:04 AM, Peter F. wrote:

I’m on board.
Where do we start?

Re-face RubyForge to apply the changes you suggest? Who’s the
maintainer for RubyForge?

I’m the current maintainer for RubyForge; the site/hardware/domain/etc
are all owned by Ruby Central.

Yours,

Tom

Martin_DeMello · August 6, 2008, 12:43am

On Tue, Aug 5, 2008 at 1:46 PM, Daniel B. wrote:

He’s right

He’s wrong. He’s also a Perl programmer deeply steeped in Perl (has
many modules on CPAN), is not a polyglot programmer as far as I can
tell, and has a book on Catalyst to sell you.

No, he really is right. I’m not coming at this from the perspective of
someone who bought into his entire article, just the one paragraph I
quoted. After which, I went and picked five random packages from
CPAN’s category tree, and five from rubyforge’s. Of the five from
rubyforge, four had no documentation, and one had a link to an
external site which was either down or no longer available. I’m not
pointing fingers; my own projects are in no better shape. I’m just
saying that we as a community don’t have the documentation reflex, and
that the perlers do.

martin

Martin_DeMello · August 6, 2008, 11:09am

I’ve seen some things posted in this thread (overnight for me) that I
mildly disagree with. So, pardon me while I insert my opinion briefly
without quoting:

I’m not convinced that packages and applications need to be treated
differently in this discussion. The size of the documentation depends
on the complexity of the thing being documented, obviously; but the
way the documentation looks and behaves – at least to the user –
should probably be the same.

I’m not convinced that CPAN does documentation any better than
rubyforge/gem does. This isn’t a matter of having standard fields or
of rewriting our packaging system. We actually have to get our hands
dirty and write in … you know … English.

The fact that ls and grep are used by most people without reading
the documentation (assuming that it’s true) is not an excuse for ls
and grep not to have documentation. Go ahead and read the man pages
for ls and grep. Notice how well they explain how to use the
commands.

Martin_DeMello · August 6, 2008, 4:50am

On Tuesday 05 August 2008 07:56:22 Martin DeMello wrote:
(excerpt quoted from the linked article)

The important thing about Perl is that we have a culture of writing
good libraries. No Perl programmer would write a few lines of code,
post it to a blog, and call it a “library”. Everyone feels obligated
to create a CPAN distribution, with documentation (sometimes a bit on
the minimal side, but not everyone is a writer), a test suite, a
Makefile, etc. I’m not sure why, but this always happens. I think it’s
because there is a strong convention, and tools that make following
the convention easy.

This is somewhat true, and somewhat not. I think the best way to
encourage it would be to build a public, web-facing index of gems, and
evolve some community pressure. It could be as simple as a rating
system – actually embarrass people for putting up poorly-documented,
poorly-tested code.

Some automated tools could help with that, but this is really a social
project. As you said:

Hoe and friends
[http://nubyonrails.com/articles/tutorial-publishing-rubygems-with-hoe]
are a great step forward, but their use doesn’t seem to be widespread
yet.

Before I start my rant, I’m going to provide my suggestion. I think
this is more important than the rant, and I realize some people will
get bored.

I suggest that we work on improving the gem system itself.
Specifically, we need to make it as flexible, powerful, and easy as
possible, so that people stop developing things like Rails Plugins.

Right away, the biggest feature I miss is reverse dependencies. (Not
that CPAN had them…) I want to be able to safely install a gem, test it
out, and remove it, and not have to go hunting in my gems directory for
other things I’d like to remove.
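
A minimal sketch of what a reverse-dependency lookup could look like, assuming the installed gems are described by a plain hash of name to dependency list. The `INSTALLED` data and the `reverse_dependencies` and `safe_to_remove?` helpers are hypothetical illustrations, not part of RubyGems:

```ruby
# Hypothetical snapshot of installed gems: name => names it depends on.
INSTALLED = {
  "rails"         => ["actionpack", "activerecord", "activesupport"],
  "actionpack"    => ["activesupport"],
  "activerecord"  => ["activesupport"],
  "activesupport" => [],
  "metaid"        => []
}.freeze

# Invert the dependency graph: name => names that depend on it.
def reverse_dependencies(installed)
  reverse = Hash.new { |hash, key| hash[key] = [] }
  installed.each do |gem_name, deps|
    deps.each { |dep| reverse[dep] << gem_name }
  end
  reverse
end

# A gem is safe to remove when nothing else installed depends on it.
def safe_to_remove?(installed, gem_name)
  !reverse_dependencies(installed).key?(gem_name)
end

puts reverse_dependencies(INSTALLED)["activesupport"].sort.inspect
puts safe_to_remove?(INSTALLED, "metaid")
```

With something like this, trying out a gem and then cleaning up becomes a matter of asking which of its dependencies nothing else still needs.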

Second biggest would be virtual dependencies – only second biggest
because I don’t know if they exist or not. If not, they should. I
should be able to depend on “any Rack engine”, not just Rack itself,
and certainly not specific things like Mongrel or Thin.
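
To make the idea concrete, here is a rough sketch, assuming each gem could declare the capabilities it provides. The `PROVIDES` table, the `"rack-server"` capability name, and the `satisfy` helper are all made up for illustration and do not correspond to any RubyGems API:

```ruby
# Hypothetical "provides" metadata: gem name => capabilities it offers.
PROVIDES = {
  "mongrel" => ["rack-server"],
  "thin"    => ["rack-server"],
  "rack"    => ["rack"]
}.freeze

# Return the installed gems that satisfy a virtual dependency.
def satisfy(capability, installed, provides = PROVIDES)
  installed.select do |gem_name|
    provides.fetch(gem_name, []).include?(capability)
  end
end

# A gem depending on "rack-server" would be happy with either server.
puts satisfy("rack-server", ["thin", "rack"]).inspect
```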

Now, there is another disturbing thing about Ruby, at least, coming
from a formerly-Perl perspective:

Perl encourages people to modularize. CPAN packages are (at least, in
my limited experience) consistently named, and the actual namespace
within the language generally matches the name of the package on CPAN.

In Ruby, I’ve mostly seen polar opposites: Either tiny little gems with
few external dependencies, which will play nice with almost anything –
metaid is an extreme example – or HUGE gems like Rails.

Yes, I realize Rails is broken up into several gems. Well, three of
them, really – actionpack, activerecord, and activesupport.

I’m going to argue that activesupport should be at least 10, maybe 20
different packages – there are so many little pieces of it that I wish
I could use for a one-off script, without pulling in the whole library.
Things like symbol-to-proc, 3.days.ago, etc. But it’s going to add
significantly to the startup time of my script even to require
activesupport – that’s bloat.
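
To give a feel for how small such a piece could be on its own, here is a sketch of a stand-alone `3.days.ago` helper. The `TinyDuration` module and its `Span` struct are hypothetical names, and this is not ActiveSupport’s implementation (later Ruby versions ship Symbol#to_proc in core, so only the date helper is shown):

```ruby
# Hypothetical stand-alone helper, loosely modelled on the 3.days.ago
# idiom; this is not ActiveSupport's implementation.
module TinyDuration
  SECONDS_PER_DAY = 24 * 60 * 60

  # Wraps a number of seconds and resolves it relative to a point in time.
  Span = Struct.new(:seconds) do
    def ago(now = Time.now)
      now - seconds
    end
  end
end

# Reopening Integer keeps the familiar 3.days.ago call syntax.
class Integer
  def days
    TinyDuration::Span.new(self * TinyDuration::SECONDS_PER_DAY)
  end
end

# Passing an explicit reference time keeps the example deterministic.
reference = Time.utc(2008, 8, 6, 12, 0, 0)
puts 3.days.ago(reference)
```

A one-off script could require a file like this without loading anything else.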

And ActiveRecord – why, exactly, is validation included? Why not make
it a mixin? That way, with duck typing, I could pull at least some of
the default Rails validations into another ORM.
(validates_uniqueness_of might even work, if the #find methods are
similar.)
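
A rough sketch of what validation-as-a-mixin might look like. `SimpleValidations` and `Person` are hypothetical; the macro name echoes Rails, but nothing here is taken from ActiveRecord:

```ruby
# Hypothetical mixin; names echo the Rails macros but nothing here is
# taken from ActiveRecord.
module SimpleValidations
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Record which attributes must be present.
    def validates_presence_of(*attributes)
      required_attributes.concat(attributes)
    end

    def required_attributes
      @required_attributes ||= []
    end
  end

  # Duck typing: only assumes the object responds to each attribute name.
  def errors
    self.class.required_attributes.select do |attribute|
      value = public_send(attribute)
      value.nil? || (value.respond_to?(:empty?) && value.empty?)
    end
  end

  def valid?
    errors.empty?
  end
end

# Any plain class (or another ORM's model) can opt in.
class Person
  include SimpleValidations
  attr_accessor :name, :email
  validates_presence_of :name, :email
end

person = Person.new
person.name = "Matz"
puts person.valid?
puts person.errors.inspect
```

Because the mixin only calls attribute readers, it does not care which persistence layer, if any, sits underneath.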

Things like Rack are a refreshing step forward – but there are very few
things like Rack.

In Perl, there’s an XML namespace, and most XML things are done
through XML::Parser, which provides a common frontend to several actual
XML parsers. The actual interfaces to those parsers – that is, the pure
Perl one, the LibXML one, and the expat one – are all separate modules.

Because of this, new things like XML::SAX can be written to fairly
easily plug into all of those libraries.

In Ruby, well, rfeedparser has its own interfaces to the expat gem, the
libxml gem (whatever their names are), and its own internal (sloppy)
parser, in case the others can’t be found. Anyone else who just wants
to parse XML is going to have to do the same thing – or worse, just
pick a library.
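
The “try each backend, fall back to the next” logic is the part that could live in one shared place. A minimal sketch, assuming the caller supplies the candidate library names; `first_available` is a hypothetical helper and `no_such_xml_backend` is a placeholder name that is expected not to load:

```ruby
# Try each candidate library in order and return the name of the first
# one that loads, or nil when none of them can be required.
def first_available(candidates)
  candidates.find do |library|
    begin
      require library
      true
    rescue LoadError
      false
    end
  end
end

# Hypothetical frontend: callers ask for "a parser" and never name a
# specific backend themselves.
backend = first_available(["no_such_xml_backend", "rexml/document"])
puts backend.inspect
```

A shared frontend built on this would let each library state its preference order once instead of re-implementing the fallback.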

As long as I’m ranting, may as well talk about plugins. I know about
gem_plugin – why isn’t it used more? Why, instead, are plugins
distributed as things tightly-bound to Rails? Suppose I like
ActiveRecord, and I need has_many_polymorphs – if I’m in Merb, there’s
no clear indication of whether that will work or not.

Well, that, and plugins are a whole separate (and inferior!) package
management system that I have to learn, in addition to gems. The one
advantage I can buy is that you can then distribute your entire,
working app to a web host who simply has Ruby, no external dependencies
required – but you’d still have to freeze Rails to your app to
accomplish that. Why not freeze other gems, too?

Here’s a crazy idea (not sarcasm; I realize it’s quite possibly
insane): When you build a framework like Rails, designed to aid in
rapid development of an app, design it around building components, not
whole applications. That is, design it around building gems – possibly
lots of tiny gems (would one per model be too much?) – so that when it
comes time to “extract functionality” from your app, you’ll already
know what depends on what, and what you’ll need to disentangle.

If you made it this far, thanks for reading. Hope I made some kind of
sense.

Martin_DeMello · August 6, 2008, 12:06pm

A critical thing for me is that when programming Perl the POD usually
contains sufficient information and EXAMPLES OF USAGE to throw together
the basics of an application by simply cutting and pasting from the POD. The
Ruby documentation is wonderful when you know what you are doing and
just want to brush up on the details but practically useless if you are
new to it. This discourages people who might have an interest in the
project but don’t have the time or skills to work out what is going on
from the source.

Just as an example, ANTFARM was just announced; the synopsis from the
post was:

NAME

ANTFARM

DESCRIPTION

Passive network mapping tool.

URI

http://antfarm.rubyforge.org
http://rubyforge.org/projects/antfarm

INSTALL

gem install ANTFARM

HISTORY

0.2.0

Initial release as open source

SYNOPSIS

$> antfarm help

Not discouraged, I went to the project page, which was just as helpful.
Then I walked away. Now ANTFARM may be the coolest project there is (it
recalled to me the use of ALife ‘ants’ to optimise network routing, which
was why it piqued my interest) but I don’t have a fucking clue about how
to do anything with this project, other than that you initialise it with
“antfarm db --initialize”. Sometimes people shoot themselves in the foot;
others seem to go for the head.

I’m not picking on the ANTFARM team in particular; there are plenty of
projects, both Ruby and non-Ruby, that assume that you are psychic or
like wading through the source code as some sort of rite of passage.
This project seems to be completely free of vapour, judging by the amount
of code that has been developed, but it is almost as if they are actively
trying to drive potential users away.

EXAMPLES OF USAGE, people; we need more of them!

Martin_DeMello · August 6, 2008, 1:22pm

A critical thing for me is that when programming Perl the POD usually
contains sufficient information and EXAMPLES OF USAGE to throw the
basics of an application by simply cutting and pasting from the POD.

I think we should talk to the rubyforge and RAA guys to encourage
“better” behaviour. I am not stating it should be mandatory whatsoever,
but rubyforge really is becoming the “ruby cpan” and it has a huge
influence on various libraries and their shape, and thus the overall
ecosystem.

I still think that CPAN as such is overrated. On the other hand I
acknowledge that there will be various bindings or libraries (read
“functionality”) which ruby simply does not have yet, and this leads me
to the second proposal.
Rubyforge could make a “porting project”. I.e. it focuses on the “most
popular” projects which should be ported. This would be a SIMPLE voting
system and it serves NO OTHER PURPOSE than to see which projects would
be important.

Martin_DeMello · August 6, 2008, 12:57pm

Peter H. wrote:

A critical thing for me is that when programming Perl the POD usually
contains sufficient information and EXAMPLES OF USAGE to throw the
basics of an application by simply cutting and pasting from the POD.

Hear, hear. Coming from a Perl background myself, I put a synopsis-like
comment block at the top of every Ruby class or module I write.
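
For readers who have not seen the habit, here is one possible shape for such a block, written as an RDoc comment. `TinyStack` is a hypothetical class used only to show where a POD-style synopsis can live; it is not necessarily how the poster lays out his own comments:

```ruby
# == Synopsis
#
#   stack = TinyStack.new
#   stack.push(1).push(2)
#   stack.pop    # removes and returns the most recent item
#   stack.size   # number of items left
#
# == Description
#
# TinyStack is a hypothetical last-in, first-out container used only to
# show where a POD-style synopsis can live in an RDoc comment.
class TinyStack
  def initialize
    @items = []
  end

  # Adds an item and returns self so calls can be chained.
  def push(item)
    @items.push(item)
    self
  end

  # Removes and returns the most recently pushed item (nil when empty).
  def pop
    @items.pop
  end

  # Number of items currently on the stack.
  def size
    @items.size
  end
end
```

The point is that the first thing a reader sees is code that can be pasted and run, before any method-by-method detail.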

I’ve been surprised at how poor the Ruby language documentation in
general is. Or maybe I haven’t found it yet.

As someone who has no problem writing technical stuff in English, I
wouldn’t mind contributing to some sort of open documentation project.
Does such a thing exist for Ruby?

Dave

Martin_DeMello · August 6, 2008, 1:41pm

Hi,

Shadowfirebird wrote:

I’m not convinced that CPAN does documentation any better than
rubyforge/gem does. This isn’t a matter of having standard fields or of
rewriting our packaging system. We actually have to get our hands dirty
and write in … you know … English.

You are somewhat right. Of course, little or no documentation is a
general problem, not one specific to Ruby or rubyforge/gem. But the main
difference is how CPAN encourages the developer to provide
documentation too: the documentation is the first thing you see, and
searching on CPAN is searching the documentation. And by reading the
documentation top-down you very quickly (often after a couple of lines)
find out what the lib/package is about.

Every developer on CPAN uses CPAN himself and sees the benefits it
brings him as a user of other people’s libs.

When I try to find something on rubyforge, the first thing that drives
me nuts is that I find a name and a more, but often less, informative
description, and very often no code. The latter never happens on CPAN;
at least I have not seen anything like that so far.

So after finding out which projects really exist, you cannot browse the
documentation, and only very few of the projects provide a more or less
informative project homepage. And if they provide a homepage, each looks
different, and for each one you have to find out where the information
you are looking for is. (There are even people who use blog entries as
homepages.)

So to decide whether the project could be useful to me, I have to
install it and read the generated documentation. So my CPAN for ruby is
‘gem server’.

But even here the documentation varies widely in quality. You can be
very happy if most of the methods are documented, but even then it is
very often hard to find out the big picture.

So in the end you spend a lot of time and you are not quite sure
whether you found the right package, since this depends strongly on the
quality of the descriptions and whether they then show up in the search
you did.

On the other hand, browsing CPAN is real fun, even if you are not
searching for something. And this is because of (I think) the way CPAN
presents the packages. If you publish something there, you do not want
to be ashamed of not providing at least minimally useful documentation.
Big projects even provide tutorials etc., which offer even more than
just brief API documentation.

But as I write all this, it becomes more and more clear to me that it
makes no sense to blame rubyforge for this, since, as the name
indicates, it is meant to be something more like sourceforge than CPAN.
So what Ruby is missing is something that might be called CRbAN.

Regards,
Thomas

Martin_DeMello · August 6, 2008, 2:00pm

On Aug 6, 2008, at 7:38 AM, Thomas Volkmar W. wrote:

So to decide whether the project could be useful to me, I have to
install it and read the generated documentation. So my CPAN for ruby
is ‘gem server’.

It would be nice to have browseable docs for each project on
RubyForge. The gems are there; so it’d be a matter of generating the
rdoc in a safe way.

Didn’t someone write a rather nice AJAXy gem browser/searcher web app
a while back? I can’t recall enough about it to Google it up… but
it’d be a start, especially if it were hooked into a tab for each
RubyForge project.

Yours,

Tom

Martin_DeMello · August 6, 2008, 3:40pm

Gregory B. wrote:

On Tue, Aug 5, 2008 at 4:33 PM, Mark T. wrote:

Module authors tend to use Module::Starter which is just like hoe, in
that it generates a nice module skeleton, ready to submit to CPAN.
Another thing I noticed about module authors is that they tend to use
more pragmatic, namespaced module names, i.e. they would tend to use a
name like “PDF::Simple” instead of “prawn”. Though this may be due to
the fact that 14,000 modules require a bit more organization and
findability.

There is a (harebrained) reason why I chose the name. It is to keep
Prawn out of the PDF namespace until it is ready to replace the
current de facto standard PDF::Writer library.

There’s no need to replace.

Prawn::PDF::Writer

Then a user can

include Prawn

That’s a solid practice.
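
Spelled out, the suggestion might look like the sketch below. The module layout and the `Report` class are hypothetical; the real Prawn gem is not claimed to be organised this way:

```ruby
# Hypothetical layout: the real Prawn gem is not claimed to be organised
# this way.
module Prawn
  module PDF
    class Writer
      def render
        "rendered by #{self.class.name}"
      end
    end
  end
end

# Including the outer module lets PDF::Writer resolve without the prefix.
class Report
  include Prawn

  def writer
    PDF::Writer.new
  end
end

puts Report.new.writer.render
```

The library keeps its own top-level name, and only code that opts in with `include Prawn` sees the shorter `PDF::Writer` constant.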

T.

Martin_DeMello · August 6, 2008, 4:10pm

On Aug 6, 2008, at 6:20 AM, Marc H. wrote:

I think we should talk to the rubyforge and RAA guys to encourage
“better” behaviour. I am not stating it should be mandatory whatsoever,
but rubyforge really is becoming the “ruby cpan” and it has a huge
influence on various libraries and their shape, and thus the overall
ecosystem.

I would say that github seems to have taken a huge bite out of the
Rubyforge and RAA circles of influence in a very short time. It also
displays the README on the main project master tree page.

James Edward G. II

Martin_DeMello · August 6, 2008, 4:30pm

On Wednesday 06 August 2008 04:07:48 Shadowfirebird wrote:

I’ve seen some things posted in this thread (overnight for me) that I
mildly disagree with. So, pardon me while I insert my opinion briefly
without quoting:

I’m not convinced that packages and applications need to be treated
differently in this discussion.

I’ll agree with that – but then, I was the one proposing that
applications should be split into packages.

Perl (again, in my limited experience) seems to have very few
“applications” as such – rather, everything is structured into some
sort of “library”, and the “application” itself is a script of less
than 100 lines. mod_perl only encourages this.

Ruby, on the other hand, makes it very easy to inline things – you can
have a single file with a few thousand lines of code, representing a
somewhat complex class hierarchy – and people do (Rails does, often.)

On top of that, barring plugins, Rails tends to encourage putting all
functionality into your app, as app-specific, until you can figure out
what to plugin-ize – and almost no one namespaces anything. (Example:
It may have been fixed by now, but last time I tried namespace’d
models, the default XML output was invalid.)

I’m not convinced that CPAN does documentation any better than
rubyforge/gem does.

It probably doesn’t. It does, however, encourage documentation.

Maybe I’m unusual, but the first thing I do when I want a CPAN module
is, I head over to search.cpan.org. This works surprisingly well, and
once I do find a module, the very first thing I see is a README, and
the very first thing in that README is a “Hello, World” example.

Now, other people have talked about the Synopsis. I’m also talking
about the “search” functionality. Part of making this obvious is
emphasizing how people find out about gems.

This isn’t a matter of having standard fields or
of rewriting our packaging system. We actually have to get our hands
dirty and write in … you know … English.

Hopefully, nothing I’ve suggested needs a full rewrite…

And I agree. The trick is, how do we embarrass people into writing that
English? We can’t force them to, and we can’t simply ignore their
library (how would they know?) – my favorite idea so far is to drag
them into the cold, harsh light of crowd review, and allow ratings and
reviews on Rubyforge.

Martin_DeMello · August 6, 2008, 4:45pm

Hi –

On Wed, 6 Aug 2008, James G. wrote:

README on the main project master tree page.

It has indeed. I think that at this point, it’s unlikely to impossible
that a system or standard is going to emerge that everyone follows, at
least at the granularity of what the section headings in documentation
are called and all that, and probably even at much coarser
granularity. To some extent, having seen a million discussions about
this, I have the feeling that if it were going to happen, it would
have happened by now. Also, since something like github can arise and
change the landscape, simply by being there, I believe it’s
fundamentally wrong to look at the whole thing as based on consensus.
We can whip up a bunch of “standards” here and on various wikis, and
that will have nothing to do with how things actually play out.

So the real question, I would say, is: how can the various sites,
philosophies, etc., interact with and connect to each other? I don’t
think that cherry-picking CPAN features is all that useful, since it
doesn’t address the bigger questions. (To which I have no answer, by
the way. I think it’s very possible that there is no set of steps A,
B, …, n, such that at the end of those steps, there will be any kind
of uniformity or long-term interoperation. I don’t even know what step
A would be.)

David

Martin_DeMello · August 6, 2008, 5:34pm

And I agree. The trick is, how do we embarrass people into writing that
English? We can’t force them to, and we can’t simply ignore their library
(how would they know?) – my favorite idea so far is to drag them into the
cold, harsh light of crowd review, and allow ratings and reviews on
Rubyforge.

Well, you can’t force them; this is the internet.

But we can provide a good example to follow, and promote other good
examples when we see them.

My 10 cents? A good first step would be to Fix Ruby Garden.