Really, I am expecting it to be rather slow with how much data I am
throwing at it. My worry is most of the tools I have seen store every
“word” in memory. The dataset I am loading up is an experiment with
using text clouds on data sets that are not really “words” so to speak.
Think of running strings on a binary, then loading that output in. That
is similar to what I am doing, just on a larger scale.
If I were working with real words, I would totally agree with you that
there are only so many words out there. In this case, however, the number
of distinct tokens could be much larger.
As for a fancy word/phrase frequency table, yeah, pretty much that just
with a fancier output.
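
To make the frequency-table idea concrete, here is a minimal sketch in Ruby. It reads the input one line at a time, so memory grows with the number of distinct tokens rather than with the size of the input. That still may not be enough if nearly every token is unique. The method names (token_frequencies, cloud_sizes), the whitespace tokenizing, the min_length cutoff, and the font-size range are all assumptions made up for illustration, not taken from any existing tool.

```ruby
require "stringio"

# Count how often each whitespace-separated token appears.
# Reads line by line, so only the distinct tokens are kept in memory.
# min_length is a hypothetical cutoff, similar in spirit to `strings -n`.
def token_frequencies(io, min_length: 4)
  counts = Hash.new(0)
  io.each_line do |line|
    line.split.each do |token|
      counts[token] += 1 if token.length >= min_length
    end
  end
  counts
end

# Map the most frequent tokens to font sizes by linear scaling.
# The size range (10..40) is an arbitrary choice for this sketch.
def cloud_sizes(counts, top: 50, min_size: 10, max_size: 40)
  top_items = counts.sort_by { |token, n| [-n, token] }.first(top)
  return {} if top_items.empty?

  lo, hi = top_items.map { |_, n| n }.minmax
  span = [hi - lo, 1].max # avoid dividing by zero when all counts match
  top_items.to_h do |token, n|
    [token, min_size + (n - lo) * (max_size - min_size) / span]
  end
end

# Usage with a small in-memory sample; a File opened on a
# multi-gigabyte dump can be passed in exactly the same way.
sample = StringIO.new("GLIBC_2.4 libc.so.6\nlibc.so.6 abc\nlibc.so.6 GLIBC_2.4\n")
counts = token_frequencies(sample)
p cloud_sizes(counts)
```

The "fancier output" would then just be a matter of rendering each token at its computed size.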
Thanks
Jim
----- Original Message ----
From: Jano S. [email protected]
To: ruby-talk ML [email protected]
Sent: Friday, October 12, 2007 10:24:25 AM
Subject: Re: Text/Tag Cloud generation
On 10/12/07, Jim Og [email protected] wrote:
I need a tool which can process multiple Gig of data, without crashing,
and output a text cloud.
Isn’t a text cloud just a fancy word/phrase frequency table, or am I
missing something here? If it indeed is, how fast do you need it to
be?
I wouldn’t worry about stability, but speed instead… There can be
only so many words there.
J.