Properly escaping special characters in AAF?

For most cases, I’ve got search working in Rails as follows:

controller:

term = params[:search][:term]
@results = MyModel.find_by_contents "#{term}*"

The ‘*’ character is appended to the search term so that searches match
anything that begins with ‘term’. For the most part, this is great, but
let’s say term is equal to “Title: Some subtitle”. This will match
anything that has a ‘title’ attribute equal to “some subtitle”, instead
of any attribute equal to “Title: Some subtitle”, which is what I’m
hoping for.

If I wrap the whole search in double quotes, like
MyModel.find_by_contents "\"#{term}*\"", then it looks like I can get
matches for “Title: Some subtitle”, but I can’t get matches if I search
for “Titl” without the ‘e’, presumably because the ‘*’ is escaped as
well? I’m not quite sure.

I want something that works in all cases, where I can include a search
term that has a special character, but still get matches when my search
term isn’t equal to an entire word. I’m hoping that my situation is a
typical one, and that someone out there has already dealt with this?
Thanks very much for any advice.

Liam M.

Hey,

these are two separate problems in fact. Let me try to explain.

If you throw a query like “title: foo bar” straight into Ferret, it
gets parsed as FQL and basically becomes “look for ‘foo bar’ in the
‘title’ field”, as you figured. Quoting the whole lot sends the query
to the index as a single phrase, and the default_field value is used to
decide which fields should be queried for “title: foo bar”. Since it
defaults to “*”, it’ll query all fields in every document.
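
If you want to keep the trailing wildcard rather than quote the whole query, one option is to backslash-escape the special characters in the user's input before appending the “*”. Below is a minimal sketch in plain Ruby. The helper names escape_fql and prefix_query are made up for illustration, and the character list is my assumption of what the query parser treats as special, so check it against the FQL docs for your Ferret version. How the escaped string is then tokenized still depends on your analyzer.

```ruby
# Characters assumed to be special in Ferret's query language (FQL).
# This list is an assumption; verify it for the Ferret version in use.
FQL_SPECIAL = /[:()\[\]{}!+"~^\-|<>=*?\\]/

# Hypothetical helper: put a backslash in front of every special character.
def escape_fql(term)
  term.gsub(FQL_SPECIAL) { |char| "\\#{char}" }
end

# Hypothetical helper: escaped user input followed by a trailing wildcard.
def prefix_query(term)
  "#{escape_fql(term.strip)}*"
end

puts prefix_query("Title: Some subtitle")
```

In the controller this would be used along the lines of MyModel.find_by_contents(prefix_query(term)), so the colon no longer reads as a field separator.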

Now the reason why a query like “Title: foo bar” won’t match any
results with “title” in it is, simply put, the analyzer you’re using.
If you’re using the StandardAnalyzer (if you didn’t specify otherwise,
then that’s what you’re using), you can expect it to match whole words
only, separated by spaces, minus stop words (or, and, by, etc.). So
“titl” will never match “title”.
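
To make the whole-word point concrete, here is a toy model in plain Ruby. It is not Ferret's analyzer, just a rough stand-in I'm assuming for illustration (lowercase, split on non-alphanumerics, drop a few stop words). The name toy_tokens and the stop word list are made up.

```ruby
# Toy stand-in for an analyzer; NOT Ferret's actual implementation.
STOP_WORDS = %w[or and by the a an of].freeze

# Lowercase the text, split it into alphanumeric runs, drop stop words.
def toy_tokens(text)
  text.downcase.scan(/[a-z0-9]+/) - STOP_WORDS
end

tokens = toy_tokens("Title: Some subtitle")

# Whole-term lookup: "titl" is not one of the indexed terms.
exact_match = tokens.include?("titl")

# Prefix lookup: roughly what a trailing * asks the index to do.
prefix_match = tokens.any? { |token| token.start_with?("titl") }

puts tokens.inspect
puts "exact: #{exact_match}, prefix: #{prefix_match}"
```

The index only knows the terms the analyzer produced, so a partial word can only hit through something like a wildcard, never as a plain term.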

If you’re looking for something that gives you half-string matches,
I’d go for a RegExpAnalyzer and use a regex like /./, which would turn
every character into a token. This is a bit nightmarish because you’ll
get an insane number of matches for everything, but right now I can’t
think of a better way (maybe declare a minimum number of chars for a
query and filter out results with very low score?).
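
The minimum-length and low-score idea could look something like the sketch below. It's plain Ruby with no Ferret involved: Hit, usable_hits and both thresholds are hypothetical, and you'd have to map your real search results and their scores onto something like this yourself.

```ruby
# Hypothetical result record: a document id plus its relevance score.
Hit = Struct.new(:id, :score)

# Reject queries that are too short, then drop hits below a score cutoff.
# Both thresholds are arbitrary placeholders and would need tuning.
def usable_hits(query, hits, min_chars: 3, min_score: 0.2)
  return [] if query.strip.length < min_chars

  hits.select { |hit| hit.score >= min_score }
end

hits = [Hit.new(1, 0.9), Hit.new(2, 0.05), Hit.new(3, 0.4)]
puts usable_hits("titl", hits).map(&:id).inspect
puts usable_hits("ti", hits).map(&:id).inspect
```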

Or if you’re looking for stemming (query for “titles”, “titling”
returning results with “title”), have a look at

http://rubyforge.org/pipermail/ferret-talk/2007-March/002782.html

Hope that helps.