Searching/Sorting an Array of Hashes


#1

I have an array of hashes that contains several fields, including
first_name and last_name. Unfortunately, since it’s the result of an API
call, I have no other ways to work with it. Regardless, I’m trying to
build a basic search function where a user can enter a name and it will
display the results from a newly created array.

I’m guessing that sort_by will be the best route to go, but I’ve been
unsuccessful in finding out how to use it with multiple fields. Any
guesses?

The second part to the question is how you structure the sort_by, if
that is the best way, to find objects that are similar to the requested
query. It’s not so much that a user would misspell a name (although that
would be helpful) but if they put in a firstname + lastname pair, it
wouldn’t technically match with either field on its own.

Thanks in advance. :slight_smile:


#2

Your question needs a bit of clarification–please give examples of
the query, the format of the data set to be searched, and the expected
result. And then repeat for the case that you are having trouble
with.

But I’ll throw out some thoughts even though I am confused about what
you are trying to do…

Here is an example of how to sort_by last_name on an array of hashes
with first_name and last_name keys:
array.sort_by{|hash| hash[:last_name]}

irb

test = [{:first_name=>'tim', :last_name=>'rand'}, {:first_name=>'jim', :last_name=>'band'}, {:first_name=>'him', :last_name=>'crand'}]
=> [{:first_name=>"tim", :last_name=>"rand"},
{:first_name=>"jim", :last_name=>"band"},
{:first_name=>"him", :last_name=>"crand"}]

test
=> [{:first_name=>"tim", :last_name=>"rand"},
{:first_name=>"jim", :last_name=>"band"},
{:first_name=>"him", :last_name=>"crand"}]

test.sort_by{|hash| hash[:first_name]}
=> [{:first_name=>"him", :last_name=>"crand"},
{:first_name=>"jim", :last_name=>"band"},
{:first_name=>"tim", :last_name=>"rand"}]

test.sort_by{|hash| hash[:last_name]}
=> [{:first_name=>"jim", :last_name=>"band"},
{:first_name=>"him", :last_name=>"crand"},
{:first_name=>"tim", :last_name=>"rand"}]
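On the original question of using sort_by with multiple fields: one common approach is to have the block return an array, since arrays compare element by element. This is only a sketch; the people variable and the extra 'al rand' entry are made up for illustration so that there is a tie on last_name.

```ruby
# Sample data modelled on the examples in this thread; the last entry is
# hypothetical and only there to create a tie on :last_name.
people = [
  { :first_name => 'tim', :last_name => 'rand' },
  { :first_name => 'jim', :last_name => 'band' },
  { :first_name => 'him', :last_name => 'crand' },
  { :first_name => 'al',  :last_name => 'rand' }
]

# Arrays compare element by element, so ties on last_name
# fall back to first_name.
sorted = people.sort_by { |hash| [hash[:last_name], hash[:first_name]] }

sorted_names = sorted.map { |hash| "#{hash[:last_name]}, #{hash[:first_name]}" }
puts sorted_names
```

Note that this only orders the array; it does not filter it, so on its own it is not a search.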

To find misspelled names is a bit trickier–I would probably use the
text gem, as it can calculate the Levenshtein distance (basically the
number of substitutions, deletions and insertions required to turn the
query into the target). You would have to compare the query to all
names, sort based on the Levenshtein distance, and then pull the
closest match. I have used that strategy in the past and it works.
Here is a quick demo of the syntax for the Levenshtein distance:

irb

require 'text'
=> true

Text::Levenshtein.distance('this', 'that')
=> 2

Text::Levenshtein.distance('query', 'queen')
=> 2
=> 2
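The "compare the query to all names and pull the closest match" strategy described above could look roughly like the following. To keep the sketch runnable without installing anything, edit_distance is a hand-written stand-in for Text::Levenshtein.distance, and closest_contact and the sample contacts are hypothetical names, not something from the gem.

```ruby
# Hypothetical stand-in for Text::Levenshtein.distance so this runs
# without the text gem. Classic row-by-row dynamic programming.
def edit_distance(a, b)
  # previous[j] is the distance between the prefix of a handled so far
  # and the first j characters of b.
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1,         # deletion
                  current[j - 1] + 1,      # insertion
                  previous[j - 1] + cost].min # substitution (or match)
    end
    previous = current
  end
  previous.last
end

# Returns the contact whose first or last name is closest to the query.
def closest_contact(contacts, query)
  contacts.min_by do |contact|
    [contact[:first_name], contact[:last_name]].map do |name|
      edit_distance(query, name)
    end.min
  end
end

contacts = [
  { :first_name => 'tim', :last_name => 'rand',  :id => 1 },
  { :first_name => 'jim', :last_name => 'band',  :id => 2 },
  { :first_name => 'him', :last_name => 'crand', :id => 3 }
]

closest = closest_contact(contacts, 'cramd') # misspelling of "crand"
puts closest[:id]
```

In practice you would probably also want a cutoff, so that a query that is far from every name returns nothing rather than the least bad match.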

To the extent that I think I understand your question, I bet having
some verification is going to be unavoidable. Something like the
following to catch cases when people type in a space separated first
and last name.

if query.match(" ") # query is something like "first last"
  query_first, query_last = query.split(/ /)
else
  query_first = query_last = query
end

Hope that helps,
Tim

On May 23, 4:49 pm, Robert S. removed_email_address@domain.invalid


#3

Tim,

I really appreciate the time and thoughtfulness you put in your reply.
To clarify further from my original question, building from your
example.

contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},
{:first_name=>'jim', :last_name=>'band', :id => 2},
{:first_name=>'him', :last_name=>'crand', :id => 3}]

Using a search form, the user will submit a string, looking for a
particular contact in the array. Unfortunately, this might be just “tim”
or “rand” or “tim rand”. If a match is found in the array, I need to
return the id number associated with the match.

Now, if I was accessing the information from a database table directly
instead of an array, something like this would probably suffice.

@contacts = Contact.find(:all, :conditions => ['LOWER(lastname) LIKE ? OR LOWER(firstname) LIKE ?',
  '%' + value.downcase + '%', '%' + value.downcase + '%'])

Unfortunately, I’m not sure how to build the equivalent query for an
existing array. Sort_by helps, but I haven’t found a way to allow it to
search both :first_name and :last_name - only one at a time.
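For comparison, a rough in-memory equivalent of that LIKE query is a select (a filter) rather than a sort. This is only a sketch under the assumption that every hash has string :first_name and :last_name values; find_contact_ids is a made-up name.

```ruby
contacts = [
  { :first_name => 'tim', :last_name => 'rand',  :id => 1 },
  { :first_name => 'jim', :last_name => 'band',  :id => 2 },
  { :first_name => 'him', :last_name => 'crand', :id => 3 }
]

# Case-insensitive substring match on either name field, roughly what
# "LOWER(lastname) LIKE '%value%' OR LOWER(firstname) LIKE '%value%'" does.
def find_contact_ids(contacts, value)
  needle = value.downcase
  matches = contacts.select do |contact|
    contact[:first_name].downcase.include?(needle) ||
      contact[:last_name].downcase.include?(needle)
  end
  matches.map { |contact| contact[:id] }
end

p find_contact_ids(contacts, 'RAND')     # => [1, 3]
p find_contact_ids(contacts, 'tim')      # => [1]
p find_contact_ids(contacts, 'tim rand') # => []
```

Like the SQL version, this finds nothing for "tim rand", because the full string is not contained in either field on its own.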


#4

On 25/05/2009, at 6:23 PM, Robert S. wrote:

item is a hash and contacts is an array, so you can use the Array#find
or Array#detect method (they’re synonymous).

Assuming search_string contains the string you want to find (to_s is
needed because the :id values are integers):

found_item = contacts.detect{|item| item.values.any?{|value| value.to_s.include?(search_string)}}

or

found_item = contacts.detect{|item| item.values.join.include?(search_string)}

then, it’s simply a matter of getting the id value from the
found_item. As this is a hash, just found_item[:id] should suffice.

Julian.


Learn: http://sensei.zenunit.com/
Last updated 20-May-09 (Rails, Basic Unix)
Blog: http://random8.zenunit.com/
Twitter: http://twitter.com/random8r



#5

Hi again Robert,
There might be methods built into Rails for doing this, but when you
have a very specific case, you might just roll your own methods to
get exactly what you want:

=begin
given a data structure like @contacts =
[{:first_name=>'tim', :last_name=>'rand', :id => 1},
 {:first_name=>'jim', :last_name=>'band', :id => 2},
 {:first_name=>'him', :last_name=>'crand', :id => 3}]
and given a query may be first, last, or both names
return id number for matches
=end

# here is our search array
@contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},
             {:first_name=>'jim', :last_name=>'band', :id => 2},
             {:first_name=>'him', :last_name=>'crand', :id => 3},
             {:first_name=>'shim', :last_name=>'crand', :id => 4}]

# method to separate names if more than one is given
def parse_query(query)
  # split on whitespace; a single name yields a one-element array
  # (String#to_a no longer exists as of Ruby 1.9)
  query.split
end

# find any name among the hash values and return the ids
def search_array_with_hashes(array_with_name_or_names)
  @hits = []
  # check every name against every value of every contact
  array_with_name_or_names.each do |name|
    @contacts.each do |hash|
      @hits << hash[:id] if hash.values.include?(name)
    end
  end
  @hits.uniq
end

# usage/test case examples
p search_array_with_hashes(parse_query("band"))
p search_array_with_hashes(parse_query("tim rand"))
p search_array_with_hashes(parse_query("crand"))

>> [2]
>> [1]
>> [3, 4]

Will that do the trick?
Tim

On May 25, 1:23 am, Robert S. removed_email_address@domain.invalid


#6

Tim,

Thanks again for the detailed thoughts. It looks like your approach is
structured in a way that should handle what I’m looking for, and provide
a framework for future expansion. I went ahead and implemented it, and
am now trying to resolve an issue (NoMethodError (undefined method
`values_at') for search_array_with_hashes), but I’ll let you know once
it gets working.

Cheers!



#7

Sorry, Tim. I should have clarified. This is the line that is having
issues:

@hits << hash[:id] if hash.values.include?(name)

Specifically, the .values part.



#8

It looks like a typo got introduced as you were moving the method into
your rails app. There is no values_at call in my method, perhaps you
accidentally tab completed and inadvertently introduced the _at.
Good luck.
Tim

On May 26, 2:07 am, Robert S. removed_email_address@domain.invalid


#9

Sorry, but that code is really unidiomatic Ruby.

Given the following:

given a data structure like @contacts =
[{:first_name=>'tim', :last_name=>'rand', :id => 1},
 {:first_name=>'jim', :last_name=>'band', :id => 2},
 {:first_name=>'him', :last_name=>'crand', :id => 3}]
and given a query may be first, last, or both names
return id number for matches

@contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},
             {:first_name=>'jim', :last_name=>'band', :id => 2},
             {:first_name=>'him', :last_name=>'crand', :id => 3}]

keywords = "ran"
@contacts.select{|hash| keywords.split.any?{|keyword| hash.values.join.include?(keyword)}}.map{|hash| hash[:id]}

or more prettily:

# note the period after the select block's "end": it indicates we still
# want to send the result of select another message (the map below)
@contacts.select do |hash|
  keywords.split.any? do |keyword|
    hash.values.join.include?(keyword)
  end
end.map do |hash|
  hash[:id]
end

=> [1, 3]

if you really need to make a method of it (tho I don’t know why you
would), you can do so thusly:

class ArrayOfHashes < Array
  def search_array_with_hashes(keywords)
    found_hashes = self.select{|hash| keywords.split.any?{|keyword| hash.values.join.include?(keyword)}}
    found_hashes.map{|hash| hash[:id]}
  end
end

@contacts = ArrayOfHashes.new(@contacts)

@contacts.search_array_with_hashes("ran")
=> [1, 3]
@contacts.search_array_with_hashes("band")
=> [2]
@contacts.search_array_with_hashes("tim rand")
=> [1, 3]
@contacts.search_array_with_hashes("crand")
=> [3]
@contacts.search_array_with_hashes("jam")
=> []



#10

Julian’s solution is more elegant. I like it. It has a functional
difference in that it catches partial names–for instance, tim would
match timothy (probably good in this case)–but rand would also match
crand.

That being the case, searching tim matches [1], but tim rand matches
[1, 3]. More information leads to less specificity. When a perfect
match is available, that should be the only item returned–I would
think.
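One possible way to get that behaviour, as a sketch only: require every keyword to match (all? instead of any?), and prefer exact name matches, falling back to partial ones when there is no exact hit. The method names search_contact_ids and contact_names are made up here, and the matching is limited to the two name fields so the :id never takes part in the search.

```ruby
contacts = [
  { :first_name => 'tim', :last_name => 'rand',  :id => 1 },
  { :first_name => 'jim', :last_name => 'band',  :id => 2 },
  { :first_name => 'him', :last_name => 'crand', :id => 3 }
]

# Lowercased name fields of one contact (hypothetical helper).
def contact_names(contact)
  [contact[:first_name], contact[:last_name]].map { |name| name.downcase }
end

# Every keyword must match some name field. Exact matches win;
# partial (substring) matches are used only when there is no exact hit.
def search_contact_ids(contacts, query)
  keywords = query.downcase.split
  return [] if keywords.empty?

  exact = contacts.select do |contact|
    keywords.all? { |keyword| contact_names(contact).include?(keyword) }
  end

  partial = contacts.select do |contact|
    keywords.all? do |keyword|
      contact_names(contact).any? { |name| name.include?(keyword) }
    end
  end

  (exact.empty? ? partial : exact).map { |contact| contact[:id] }
end

p search_contact_ids(contacts, 'tim')      # => [1]
p search_contact_ids(contacts, 'tim rand') # => [1]
p search_contact_ids(contacts, 'rand')     # => [1]
p search_contact_ids(contacts, 'ran')      # => [1, 3]
p search_contact_ids(contacts, 'jam')      # => []
```

With all?, adding a second name narrows the result instead of widening it, and rand no longer drags in crand unless nothing matches exactly.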