How to group similar elements in an array


#1

Hi all,

Just wondering if there is a way to group similar (not identical)
elements in an array?

For example, changing old_array into new_array as follows:

old_array = ["John", "Mike1", "Bob1", "Mike2", "Bob2"]

new_array = ["John", ["Mike1", "Mike2"], ["Bob1", "Bob2"]]

Thanks,


#2

I don't have a ready-made solution, but this sounds as if it could
be solved via the Levenshtein distance.

Example:

require 'rubygems/text'
include Gem::Text
levenshtein_distance 'shevy', 'chevy' # => 1

I think with that, you can build your criteria.

In your above example, “Mike1” and “Mike2”
would have a distance of 1; so perhaps you can
use this as a criterion.

I don't know which one would be ideal, but
try .group_by and .select perhaps.

Perhaps you may have to create intermediary arrays
via different methods and then merge them back
in again, to have an expanded Array.
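
For what it's worth, here is a minimal sketch of that idea. The module name NameGrouper, its group method, and the max_distance threshold are made up for illustration; only levenshtein_distance comes from Gem::Text. Each name joins the first group whose first member is within max_distance edits, otherwise it starts a new group, and single-element groups are unwrapped at the end.

```ruby
require 'rubygems/text'

# Hypothetical helper, not part of RubyGems: groups names by edit distance.
module NameGrouper
  extend Gem::Text # makes levenshtein_distance callable inside the module

  def self.group(names, max_distance = 1)
    groups = []
    names.each do |name|
      # Compare against the first member of each existing group.
      match = groups.find { |g| levenshtein_distance(g.first, name) <= max_distance }
      if match
        match << name
      else
        groups << [name]
      end
    end
    # Unwrap groups that contain only one element.
    groups.map { |g| g.size == 1 ? g.first : g }
  end
end

p NameGrouper.group(["John", "Mike1", "Bob1", "Mike2", "Bob2"])
# => ["John", ["Mike1", "Mike2"], ["Bob1", "Bob2"]]
```

Note that a threshold of 1 would also put "Bob1" and "Rob1" together, so the threshold probably needs tuning for real data.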


#3

#Actually, I want to group all elements containing 'Mike' into one group.

#I wrote the following script for my own purposes as a proof of principle.

#Since I could not find an easy way to group similar elements in one array,

#I used two arrays to achieve the purpose.

#Any comments?

#Thanks,

require 'pp'

def read_files
  array1 = ['1_s', '2_a', '3_', '4_', '5_']
  array2 = ['11', '22', '33', '111', '11111']
  return array1, array2
end

def group_element(a1, a2)
  array1 = a1
  array2 = a2

  array3 = []
  temp = []

  array1.each do |e1|
    # pattern to match: the part before the first underscore
    m = e1.split('_')[0]
    array2.each do |e2|
      if e2.match("#{m}")
        temp << [e1, e2]
      end
    end

    temp = temp.flatten.uniq
    array3 << temp if temp.size > 0

    # single element: nothing matched, so keep it as it is
    if temp.size < 1
      array3 << e1
    end

    # empty the temp array
    temp = []
  end # array1.each

  pp array3
end # def

###########main################

array1, array2 = read_files()
group_element(array1, array2)
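
A shorter variant of the same pairing, as a rough sketch only: the method name group_by_prefix is made up, and it uses include? instead of match, which gives the same result for plain prefixes like the digits used here. Elements of the first array that match nothing are kept as bare strings, as in the script above.

```ruby
# Hypothetical rewrite of the script above using map and select.
def group_by_prefix(keys, candidates)
  keys.map do |key|
    # The part before the first underscore is the pattern to look for.
    prefix = key.split('_')[0].to_s
    hits = candidates.select { |c| c.include?(prefix) }
    # No match: keep the element alone; otherwise group it with its matches.
    hits.empty? ? key : [key, *hits].uniq
  end
end

p group_by_prefix(['1_s', '2_a', '3_', '4_', '5_'],
                  ['11', '22', '33', '111', '11111'])
# => [["1_s", "11", "111", "11111"], ["2_a", "22"], ["3_", "33"], "4_", "5_"]
```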


#4

#After reading about the 'group_by' method in Ruby and googling how Python does it,

#I came up with my own way of grouping similar elements in one array, as follows:

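require 'pp'

def groupby_similar_elements # with some criteria
  array1 = ['1_s', '2_a', '3_', '4_', '5_']
  array2 = ['11', '22', '33', '111', '11111']

  array3 = array1 + array2

  # group by the first character of each element
  array3 = array3.group_by { |e| e[0] }.values
  pp array3
end

groupby_similar_elements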

#The point is to add a user-specific condition.

#output

ruby group2.rb

[["1_s", "11", "111", "11111"], ["2_a", "22"], ["3_", "33"], ["4_"], ["5_"]]

Exit code: 0
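
Applied to the array from the first post, the same group_by idea could look like the sketch below. The rule that names are "similar" when they are equal after removing trailing digits is only an assumption for this example, and group_similar is a made-up name. Swap the block for whatever condition fits your data.

```ruby
# Hypothetical helper: groups names that are equal once trailing digits are removed.
def group_similar(names)
  names.group_by { |name| name.sub(/\d+\z/, '') } # "Mike1" and "Mike2" share the key "Mike"
       .values
       .map { |group| group.size == 1 ? group.first : group } # unwrap single elements
end

old_array = ["John", "Mike1", "Bob1", "Mike2", "Bob2"]
p group_similar(old_array)
# => ["John", ["Mike1", "Mike2"], ["Bob1", "Bob2"]]
```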