Cost of using many fields

WatersS_Chris · March 5, 2007, 6:23pm

Hi,

In ferret, and especially when using acts_as_ferret, it is easy to
specify many fields. What is the cost of using a lot of fields from a
performance perspective? Is each field searched separately, or are they
combined in the inverted index?

As an extreme example, if I made every word in my documents a separate
field (so the first word in each document was field 1 and the second
word was field 2, etc) would this be significantly less efficient than
treating the entire document as a single field?

I am not doing something quite as bad as this hypothetical example, but
I am investigating different ways to organize some data.

Thanks,

Chris.

WatersS_Chris · March 6, 2007, 4:23am

On 3/6/07, Waters, Chris wrote:

In ferret, and especially when using acts_as_ferret, it is easy to specify
many fields. What is the cost of using a lot of fields from a performance
perspective? Is each field searched separately, or are they combined
in the inverted index?

Hi Chris,

Each field is searched separately, so the more fields you search, the
longer the search will take. Note that the time to search a single
field shouldn’t change whether the index has 1 field or 1 million.
A search will only take longer if it actually covers all 1 million
fields.
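
To make the cost model concrete, here is a minimal pure-Ruby sketch of an
index whose postings are keyed by field and term. It is a toy model only,
not Ferret's actual implementation; the ToyIndex class and its lookups
counter are hypothetical names made up for this illustration.

```ruby
# Toy inverted index: postings are keyed by [field, term], so a lookup
# in one field never touches the postings of any other field.
class ToyIndex
  attr_reader :lookups

  def initialize
    @postings = Hash.new { |hash, key| hash[key] = [] }
    @doc_count = 0
    @lookups = 0
  end

  # doc is a hash of field name => text, e.g. { title: "ruby search" }
  def add(doc)
    doc_id = @doc_count
    @doc_count += 1
    doc.each do |field, text|
      text.downcase.split.uniq.each { |term| @postings[[field, term]] << doc_id }
    end
    doc_id
  end

  # Search the given fields for one term: one postings lookup per field.
  def search(term, fields)
    hits = fields.flat_map do |field|
      @lookups += 1
      @postings.fetch([field, term.downcase], [])
    end
    hits.uniq.sort
  end
end

index = ToyIndex.new
index.add({ title: "Ferret search", body: "fields in ferret" })
index.add({ title: "Other topic", body: "ferret again", notes: "misc" })

index.search("ferret", [:title])                 # one postings lookup
index.search("ferret", [:title, :body, :notes])  # three postings lookups
```

In this model the work per query grows with the number of fields named in
the query, not with the number of fields stored in the index.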

As an extreme example, if I made every word in my documents a separate field
(so the first word in each document was field 1 and the second word was
field 2, etc) would this be significantly less efficient than treating the
entire document as a single field?

I am not doing something quite as bad as this hypothetical example, but I am
investigating different ways to organize some data.

I’m not sure exactly what you want to do, but you may want to look at
span queries. These queries let you search based on the positions of
the terms within a document. But perhaps your hypothetical is
misleading me.
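
As a rough illustration of position-based matching, the sketch below
stores term positions for a single field and answers two positional
questions. It is plain Ruby and does not use Ferret's span query classes;
ToyPositionalIndex, within_first and near are hypothetical names chosen
for this example.

```ruby
# Toy positional index for one field: term => { doc_id => [positions] }.
class ToyPositionalIndex
  def initialize
    @positions = Hash.new do |terms, term|
      terms[term] = Hash.new { |docs, id| docs[id] = [] }
    end
    @doc_count = 0
  end

  # Record the 0-based position of every word in the text.
  def add(text)
    doc_id = @doc_count
    @doc_count += 1
    text.downcase.split.each_with_index do |term, pos|
      @positions[term][doc_id] << pos
    end
    doc_id
  end

  # Docs where term occurs at a position below limit.
  def within_first(term, limit)
    docs = @positions.fetch(term.downcase, {})
    docs.select { |_id, list| list.any? { |pos| pos < limit } }.keys.sort
  end

  # Docs where first is followed by second within max_gap positions.
  def near(first, second, max_gap)
    a = @positions.fetch(first.downcase, {})
    b = @positions.fetch(second.downcase, {})
    (a.keys & b.keys).select do |id|
      a[id].any? { |pa| b[id].any? { |pb| pb > pa && pb - pa <= max_gap } }
    end.sort
  end
end

words = ToyPositionalIndex.new
words.add("quick brown fox jumps")
words.add("the lazy fox sleeps")
words.add("brown bears and a quick fox")

words.within_first("quick", 1)  # docs whose first word is "quick"
words.near("quick", "fox", 2)   # "quick" then "fox" at most 2 words later
```

A query like within_first covers the "first word of each document" case
from the hypothetical without needing one field per word position.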

Cheers,
Dave

WatersS_Chris · March 6, 2007, 5:49am

Thanks, that answers my question. My example was purely hypothetical,
but I really am contemplating having hundreds of fields.

Regards,

Chris