is scarce currently though, so if you beat me to a solution mail it
to me, along with a count of how many hours it took to build…
James Edward G. II
Parsing English is what I wrote my thesis on at university. I wonder
if a general purpose English tokenizing and parsing library would be
of any use to anyone. I’m thinking of taking it up again.
I would LOVE to see a pure Ruby library for something like this. I
can’t imagine it wouldn’t get used.
Unfortunately I think I’d have to do it in C. My linguistic skills
aren’t good enough to take a purely grammatical approach. I’d need to
take a learning approach. The parser I wrote at university took 72
hours to process the input corpus which it learned all the rules from.
I’d hate to think how long this would take in pure Ruby. Writing this
as a Ruby extension would be the way to go, for me at least.
of solving it myself, to prove it’s not too much work. My free time
The word "to" would need to be understood to get the
direct object and indirect object right:
"feed bird to lion" vs. "feed bird lion".
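As a rough illustration of why "to" matters, here is a minimal Ruby sketch. The `parse_command` helper is hypothetical, not something from this thread. It assumes one-word objects when "to" is absent, and it assumes the usual English double-object order ("feed X Y" gives Y to X), which may not be what every command grammar wants.

```ruby
# Hypothetical helper: split a command into verb, direct and indirect object.
# Assumption: without "to", the first noun is the recipient (indirect object).
def parse_command(text)
  words = text.downcase.split
  verb = words.shift
  to_index = words.index("to")

  if to_index
    # "feed bird to lion": the direct object comes before "to",
    # the indirect object after it.
    direct = words[0...to_index].join(" ")
    indirect = words[(to_index + 1)..-1].join(" ")
  elsif words.length == 2
    # "feed bird lion": read as "feed (the) bird (the) lion",
    # so the bird is the recipient and the lion is what gets fed to it.
    indirect, direct = words
  else
    # Anything else is treated as a single direct object.
    direct = words.join(" ")
    indirect = nil
  end

  { verb: verb, direct: direct, indirect: indirect }
end

p parse_command("feed bird to lion")
p parse_command("feed bird lion")
```

The two calls return the same verb but swap the roles of the bird and the lion, which is exactly the ambiguity a parser has to resolve.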
I’ll start on such a solution and let you know how it goes…
In my research I happened on a part-of-speech tagger by Mark W.:
That would certainly make Mark a bit of an expert in this area! He
posted here in the forum about 10 days ago, but I see his site says he
is on holiday.
Anyway, the tagger uses a 92,000-line lexicon and some rules to transform
the results of the lexicon lookup. It’s very interesting indeed.
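To make the "lexicon lookup, then transformation rules" idea concrete, here is a toy Ruby sketch. It is not Mark's tagger and does not use its lexicon or rule format. The `LEXICON`, `RULES`, and `tag` names, the tag labels, and the single rule are all made up for illustration, and unknown words are simply assumed to be nouns.

```ruby
# Hypothetical miniature lexicon: each word maps to its most likely tag.
LEXICON = {
  "the" => "DET", "to" => "TO", "feed" => "VERB",
  "bird" => "NOUN", "lion" => "NOUN", "can" => "MODAL"
}.freeze

# Hypothetical transformation rules: change tag :from to :to
# when the previous word carries the tag :prev.
RULES = [
  { from: "MODAL", to: "NOUN", prev: "DET" } # "the can" is a noun
].freeze

def tag(sentence)
  words = sentence.downcase.scan(/[a-z']+/)

  # Step 1: lexicon lookup (unknown words default to NOUN, an assumption).
  tags = words.map { |word| LEXICON.fetch(word, "NOUN") }

  # Step 2: apply each rule across the sentence to correct the first guess.
  RULES.each do |rule|
    tags.each_index do |i|
      next if i.zero?
      tags[i] = rule[:to] if tags[i] == rule[:from] && tags[i - 1] == rule[:prev]
    end
  end

  words.zip(tags)
end

p tag("feed the can to the lion")
```

Here "can" starts out as MODAL from the lexicon and is retagged as NOUN because it follows a determiner. A real tagger would have far more rules and a much larger lexicon, but the two-pass shape is the same.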
I studied some linguistics at university but not enough to produce
something like this. It seems there may be a standard way to accomplish
such things if one does study the field further.