Ruby + Apache Lucene using XMLRPC?


#1

Please excuse me if this has been answered before.

Is there a "how-to" guide or a walkthrough on integrating Lucene
with ROR via XMLRPC?

I got Ferret to work but I am worried that it's not going to be as
scalable.

I highly appreciate your assistance.

Thanks
Frank

#2

On Feb 19, 2006, at 9:44 PM, softwareengineer 99 wrote:

Please excuse me if this has been answered before.

Is there a “how-to” guide or a walkthrough on integrating Lucene
with ROR via XMLRPC?

I got Ferret to work but I am worried that it’s not going to be as
scalable.

I highly appreciate your assistance.

A week or so ago I posted the details of my custom XML-RPC server
(check the archives for it). There isn’t one built into Lucene,
though there are certainly a number of projects that do wrap Lucene
behind web services of one sort or another including the now
incubating Solr project.

Ferret may be suitable - only testing (or real world use!) will tell.

Erik

#3

Check out my little write up here:
http://blog.nicholasstuart.com/articles/2006/02/11/a-little-bit-of-both

If you have any more specific questions I would be happy to help out.

-Nick


#4

Hey Nick,
Thanks for your reply.

Though I have studied Java quite a bit, it’s been a terribly long time
since I programmed in that language (back in the university days).

I have a few questions:
What version of Apache are you using?
How long did it take you to implement Lucene?
Do I need to install Tomcat to have it running?
Why are you using Apache’s XML-RPC server? Isn’t there XML-RPC
functionality with Rails? Just curious.

I will eventually buy Erik H.'s book but want to weigh my options
and get my test project working with Lucene.

Thanks for your help

Frank



#5

On 2/20/06, softwareengineer 99 wrote:

Thanks Erik for your reply.

Do I need Tomcat running for this to work? Do I have any option of not
using Tomcat?

You don’t need Tomcat. Just a JDK to compile and run. As you can see, the
code creates its own webserver on port 8080.

Where do I put the XML-RPC server code?

Anywhere you want your Java code to live.

Must I install the Apache XML-RPC server? Is the installation
reversible?

At most you probably need the Apache XMLRPC Jar file(s), and probably the
Lucene Jars to run the server code. It’s just Java, it won’t bite :)
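
Because the server code opens its own listener on port 8080, a quick sanity check before wiring up Rails is to probe that port from Ruby. This is only a sketch: the host and port are taken from the discussion above, and `port_open?` is a hypothetical helper name, not part of any library.

```ruby
require "socket"

# Hypothetical helper: true if something accepts TCP connections on host:port.
# It says nothing about whether that listener actually speaks XML-RPC.
def port_open?(host, port, timeout_seconds = 2)
  Socket.tcp(host, port, connect_timeout: timeout_seconds) { true }
rescue SystemCallError, SocketError
  false
end

# Example, assuming the server runs locally on the port mentioned above:
# puts port_open?("localhost", 8080) ? "server is up" : "nothing on 8080"
```

If this returns false, the Java process is most likely not running or is bound to a different port, and no amount of Rails-side debugging will help.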

Tony


#6

Thanks Erik for your reply.

Do I need Tomcat running for this to work? Do I have any option of
not using Tomcat?
Where do I put the XML-RPC server code?
Must I install the Apache XML-RPC server? Is the installation
reversible?

Thanks
Frank


Rails mailing list
removed_email_address@domain.invalid
http://lists.rubyonrails.org/mailman/listinfo/rails

Rails Blog: http://railsruby.blogspot.com
MySQL Blog: http://mysqldatabaseadministration.blogspot.com
Linux / Security Blog: http://frankmash.blogspot.com


#7

If you go with a straight Java solution for this, all your server code
is going to be just your main program, and that’s it (see my blog; what you
see there is all the server code I wrote).

And you don’t really need to ‘install’ the server. It’s just a simple JAR
file that you include/reference in your own program.


#8

On 2/20/06, softwareengineer 99 wrote:

Hey Nick,
Thanks for your reply.

Though I have studied Java quite a bit, it’s been a terribly long time since
I programmed in that language (back in the university days).

I have a few questions:
What version of Apache are you using?

For the web server I’m running Apache 2.5.x (can’t remember the specific
version).

How long did it take you to implement Lucene?

Not long at all, maybe a day or two. But I’ve also been actively
programming in Java for a while, and still am, so that helps me out a
bit.

Do I need to install Tomcat to have it running?

No, I wanted to avoid just this situation. I didn’t want a whole
servlet container just for searching.

Why are you using Apache’s XML-RPC server? Isn’t there XML-RPC
functionality with Rails? Just curious.

I’m actually using both. The reason for Apache’s XML-RPC server is so
that I can directly interact with Lucene through Java and not have to
worry about any cross-language issues.

And actually, Rails itself doesn’t have any XML-RPC, but Ruby comes
bundled with both a client and a server, of which only the client is
needed in this case. Incredibly handy, I must say.
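
To make the cross-language point concrete: an XML-RPC call is just a small XML document POSTed over HTTP, so the Ruby side never needs to know the server is written in Java. The sketch below builds the request by hand with Net::HTTP purely to show the wire format; in practice Ruby's own XML-RPC client library (`xmlrpc/client`, which newer Ruby versions may require installing as a gem) does this for you. The URL, the `searcher.search` method name, and both helper names are assumptions for illustration and must match whatever handler your Java server actually registers.

```ruby
require "net/http"
require "uri"

# Build an XML-RPC <methodCall> body by hand. Only string arguments are
# handled, which is enough for passing a search query.
def xmlrpc_request(method_name, *string_args)
  params = string_args.map do |arg|
    escaped = arg.to_s.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
    "<param><value><string>#{escaped}</string></value></param>"
  end
  %(<?xml version="1.0"?><methodCall><methodName>#{method_name}</methodName>) +
    "<params>#{params.join}</params></methodCall>"
end

# POST the call to the Java server and return the raw XML response body.
# The default URL and the "searcher.search" handler name are hypothetical.
def call_search_server(query, url = "http://localhost:8080/")
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, open_timeout: 2, read_timeout: 5) do |http|
    http.post(uri.request_uri, xmlrpc_request("searcher.search", query),
              "Content-Type" => "text/xml")
  end
  response.body
end

# Example (needs the Java server to be running):
# puts call_search_server("ruby AND lucene")
```

The response comes back as a `<methodResponse>` XML document; a real client would parse it into Ruby values rather than return the raw string, which is exactly the work the bundled client library takes off your hands.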

Hope this helps!
-Nick


#9

Thank you Nick.
It seems like I will get started on this tonight and will ask if I
have any questions.

Thank you very much



#10

Thanks Tony for your reply.

So should I install JDK with Netbeans from the following location?
http://java.sun.com/j2se/1.5.0/download-netbeans-50.html

Any tips or gotchas I need to be aware of?

Must the installation be done in a directory accessible over the
Internet if I want Lucene to work?

Thanks
Frank



#11

Sorry for asking a basic question but which JDK should I install? With
or without netbeans?

Thanks
Frank



#12

On 2/20/06, softwareengineer 99 wrote:

Thanks Tony for your reply.

So should I install JDK with Netbeans from the following location?
http://java.sun.com/j2se/1.5.0/download-netbeans-50.html

Sure, that seems reasonable.

Any tips or gotchas I need to be aware of?

Must the installation be done in a directory accessible over the Internet
if I want Lucene to work?

I’m not sure about the internals of Lucene; you’ll probably want to check
the docs.

Tony


#13

Thank you very much.

Frank



#14

You don’t need NetBeans… that’s an IDE… like Eclipse. Skipping it should
make your download much smaller. Just the latest version of the JDK
should do ya.

Sun’s site can be extremely confusing, so go here:

http://java.sun.com/j2se/1.5.0/download.jsp

and click on the “Download JDK 5.0 Update 6” link below the NetBeans and
J2EE sections.

Then, click on the “accept” radio button and then click on one of the
Linux downloads. Being on Red Hat, the RPM would probably do ya just fine.

b


#15

I keep getting the following errors (I have tried re-downloading three
times):

    Initializing InstallShield Wizard........
    Extracting Bundled JRE.

    Bundled JRE is not binary compatible with host OS/Arch or it is
    corrupt. Testing bundled JRE failed.

I downloaded the version for Linux (jdk-1_5_0_06-nb-5_0-linux.bin)

I am using RHEL 3
Any ideas?

Thanks
Frank


#16

Almost forgot my favorite hidden feature:

  • You can get Atom or RSS feeds for search results from the master UI.

Now that’s slick.

Zed A. Shaw


#17

Just use Hyper Estraier. http://hyperestraier.sourceforge.net/

  • Much easier to set up and access via a simple REST API from Ruby, Java,
    or C.
  • Easily clusterable, with the ability to segregate nodes or join them for
    combined results.
  • Fast as hell. We use it at the NYC Dept. of Correction to find stuff, and
    not only does it blow Lucene out of the water even though it’s a server,
    it’s easier to set up, even on Windows.
  • Gives good results with attributes, vectors, and other goodies
    included.
  • Has a REST interface but you can just use the pure Ruby, Java, or C
    API to access it.
    • I’m mentioning this again because it’s the best feature.
  • Runs on Windows right out of the box.
  • Takes 10 minutes to set up, less on Windows. That’s right, less. No
    JVM. All fast binary goodness baby.
  • You can embed it as a library via Ruby or Java if you want the
    absolute fastest.
    • Yes, this means that HE out of the box does local/embedded, remote
      distributed, and linkable P2P clustered search infrastructure with
      minimal setup for three languages.
  • Has a decent web UI for managing an estraier master (that’s the server
    that controls the thing).
  • Don’t need to buy an O’Reilly book to figure it out. Plenty of code
    examples and decent docs considering the author is Japanese (better than
    most English speakers’ docs really).
  • Supports phrase search, regular expressions, attribute search, and
    similarity search with Unicode support.

The only thing I’ve run into is that OSX people will find that the
DarwinPorts version is super old (man those guys need a kick in the ass).
I’m currently trying to build it on OSX and should have some instructions
for it soon.

Otherwise HE is the freaking bomb for Ruby projects simply because it’s so
damn easy to access from any language via the REST interface. And when I
say access I don’t mean just doing searches but also administering the
master servers.

I figure if it’s good enough for half of rails-core and nearly every other
Rails shop I’ve shown it to then it’s good enough for you folks to at least
try.

Zed A. Shaw


#18

On 20 Feb 2006, at 23:30, softwareengineer 99 wrote:

I keep getting the following errors (I have tried re-downloading
three times)

    Initializing InstallShield Wizard........
    Extracting Bundled JRE.

    Bundled JRE is not binary compatible with host OS/Arch or it is
    corrupt. Testing bundled JRE failed.

Download the one without NetBeans (NetBeans is just an IDE. Do you
need an IDE for what you’re trying to do? If you do, you can always
download it later). It comes as a self-extracting executable that you
run anywhere you like; then you simply set the JAVA_HOME environment
variable to the directory where you unpacked it and add $JAVA_HOME/bin to
your PATH. If you then type “java -version” it should tell you what
version you have installed.
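
Since the rest of the stack in this thread is Ruby, the same check can be scripted. A minimal sketch, assuming a Unix-style layout where the launcher lives at $JAVA_HOME/bin/java; both helper names are made up for this example.

```ruby
# Returns the path to the java launcher under JAVA_HOME, or nil when
# JAVA_HOME is unset or does not contain an executable bin/java.
def java_from_home(env = ENV)
  home = env["JAVA_HOME"]
  return nil if home.nil? || home.empty?

  candidate = File.join(home, "bin", "java")
  File.executable?(candidate) ? candidate : nil
end

# True when $JAVA_HOME/bin appears as one of the PATH entries.
def java_home_on_path?(env = ENV)
  home = env["JAVA_HOME"]
  return false if home.nil? || home.empty?

  entries = env.fetch("PATH", "").split(File::PATH_SEPARATOR)
  entries.include?(File.join(home, "bin"))
end

# Example:
# puts java_from_home || "JAVA_HOME is not pointing at a JDK"
# puts "PATH is missing $JAVA_HOME/bin" unless java_home_on_path?
```

Both helpers accept any hash-like object in place of ENV, which makes it easy to try them against made-up values before touching the real shell profile.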

Ugo


Ugo C.
Blog: http://agylen.com/
Open Source Zone: http://oszone.org/
Evil or Not?: http://evilornot.info/
Company: http://www.sourcesense.com/


#19

Hi Ugo,

Thanks for your reply.
I have been trying to find the download location to download JDK
without Netbeans but so far I am unsuccessful.

I will keep searching for it.
I would appreciate it if you could provide any pointers.

Thanks
Frank



#20

On Feb 21, 2006, at 3:07 AM, Zed S. wrote:

Only thing I’ve run into is that OSX people will find that darwin port’s
version is super old (man those guys need a kick in the ass). I’m currently
trying to build it on OSX and should have some instructions for it soon.

It should build cleanly on OSX, just remember to use ‘make mac’ (or
something like that) instead of the regular ‘make all’. I think that
applies to the QDBM package, too.

Rails shop I’ve shown it to then it’s good enough for you folks to at least
try.

Definitely. Use it and love it.

-Scott