Comparing String-portions Within Arrays

Hey guys,

Is there a simple way to compare two arrays of strings and delete the
elements that don’t match? I only want to compare the first five
characters of each element.

Example:

a = ["55555 anything at all ", "22222any thing at all"]
b = ["22222 some other something"]

I want to only delete a[0]. Any simple, clean way of doing this?

Thanks,
Jake

& is what you are looking for

ruby-1.9.2-p180 :001 > [1,2,3] & [3]
=> [3]

Luis Landeiro R.

Sorry, I skimmed through your email.
A quick way would be to create a hash keyed on the numbers, like

hash = b.inject({}){|h,v| h[v[0..4]]=v ; h}

and then select only the values of a present in the hash

a.select {|x| hash[x[0..4]]}
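
Put together, the two steps above might look like the sketch below. It assumes the goal is to keep every element of a whose first five characters also begin some element of b. The names prefix, b_prefixes and kept are made up for illustration, and a Set stands in for the hash because only the keys are needed.

```ruby
require 'set'

a = ["55555 anything at all ", "22222any thing at all"]
b = ["22222 some other something"]

# Hypothetical helper: the first five characters of a string.
prefix = lambda { |s| s[0, 5] }

# Collect the prefixes of b once, then filter a against them.
b_prefixes = b.map(&prefix).to_set
kept = a.select { |s| b_prefixes.include?(prefix.call(s)) }

p kept
```

Building the lookup once keeps the filter to a single pass over each array, which matters only when the arrays get large.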

Thanks. That’s great when the elements you are looking for are exactly
the same. The problem is that I am looking for similar elements, with
just the first five characters of the string equal.
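
To make the distinction concrete, here is a small sketch using the arrays from the first message. It assumes nothing beyond core Ruby; whole_match and prefix_match are illustrative names.

```ruby
a = ["55555 anything at all ", "22222any thing at all"]
b = ["22222 some other something"]

# Array#& compares whole elements, so nothing is shared here.
whole_match = a & b

# Comparing only the first five characters does find the overlap.
prefix_match = a.map { |s| s[0, 5] } & b.map { |s| s[0, 5] }

p whole_match
p prefix_match
```

The second intersection yields the shared prefixes rather than the original strings, so a further select over a is still needed to get the elements back.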

2011/6/23 Luis Landeiro R. [email protected]

Thanks! Will try ASAP.

I wanted to reason it out, so I attempted a brute-force method. I am
just about to go crazy, I think. Been studying this method all day to
see why I am getting incorrect results. Here is the method:

def strip_mismatches parsed_old, parsed_new
  # create two arrays to hold operations
  old, new1 = [], []
  # create temp array for storing pieces of op
  temp = []
  if !parsed_old.empty? && !parsed_new.empty?
    # the following takes an array of lines and using
    # a delimiter, groups certain elements together
    parsed_old.each_index do |i|
      temp << parsed_old[i]
      # if next element is op num
      if !parsed_old[i+1].to_s.scan(@@reg_exp).empty? ||
         i == parsed_old.length - 1
        # store array as single-element text in "old" array
        old << temp.to_s
        temp = []
      end
    end
    # same as above but for second parameter
    parsed_new.each_index do |i|
      temp << parsed_new[i]
      # if next element is op num
      if !parsed_new[i+1].to_s.scan(@@reg_exp).empty? ||
         i == parsed_new.length - 1
        # store array as single-element text in "old" array
        new1 << temp.to_s
        temp = []
      end
    end
    # get rid of whitespace
    old.each_index {|i| old[i].to_s.gsub! /\s/, ""}
    new1.each_index {|i| new1[i].to_s.gsub! /\s/, ""}
    # create arrays holding first five characters of each element
    first_five_old = []
    first_five_new = []
    old.each_index do |i|
      first_five_old << old[i][0...5]
    end
    new1.each_index do |i|
      first_five_new << new1[i][0...5]
    end
    # create an array holding all elements
    all = first_five_old + first_five_new
    # create array to hold similar elements
    same = first_five_old & first_five_new
    # create array holding difference
    diff = all - same
    diff = diff & diff # in case of multiple occurrences of item
    # for all differences
    diff.each_index do |i|
      # if difference exists in old operation
      if first_five_old.index(diff[i].to_s) != nil
        # delete element containing that operation
        old.delete_at(first_five_old.index(diff[i].to_s))
      end
      # same for other file
      if first_five_new.index(diff[i].to_s) != nil
        new1.delete_at(first_five_new.index(diff[i].to_s))
      end
    end
  end
  return old, new1 # should be same length, but aren't always
end

I just can’t seem to find the bug(s) in this code. I am going blind.
Here are example input parameters:

parsed_old:

["0001 [",
 "blah blah blah",
 "0015 [",
 "blah blah blah",
 "blah blah blah",
 "0017 [",
 "blah blah blah blah blah blah",
 "0018 [",
 "blah blah blah",
 "blah blah blah",
 "blah blah blah",
 "0019 [",
 "blah blah",
 "0057 [",
 "blah"]

parsed_new:

["0001 [",
 "blah blah blah",
 "0015 [",
 "blah blah blah",
 "blah blah blah",
 "0057 [",
 "blah"]

Can anyone help me with this? Let me know if I left anything out… very
tired.
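
One thing that may explain the differing lengths, assuming the grouping step itself works as intended: delete_at is called on old with an index looked up in first_five_old, but first_five_old never shrinks, so after the first deletion every later index is stale. The sketch below reproduces that with simplified, made-up data (no @@reg_exp and no grouping step) and contrasts it with a select-based version that avoids indexes altogether. All names here are illustrative.

```ruby
# Hypothetical, simplified stand-ins for the grouped operations.
old  = ["0001[a", "0015[b", "0017[c", "0018[d", "0019[e", "0057[f"]
new1 = ["0001[x", "0015[y", "0057[z"]

prefixes_old = old.map  { |s| s[0...5] }
prefixes_new = new1.map { |s| s[0...5] }
diff = (prefixes_old + prefixes_new) - (prefixes_old & prefixes_new)

# Stale-index version: prefixes_old is never updated, so after the first
# delete_at the remaining indexes no longer line up with the array.
buggy = old.dup
diff.each do |prefix|
  idx = prefixes_old.index(prefix)
  buggy.delete_at(idx) if idx
end

# Index-free version: keep only elements whose prefix occurs on both sides.
common    = prefixes_old & prefixes_new
fixed_old = old.select  { |s| common.include?(s[0...5]) }
fixed_new = new1.select { |s| common.include?(s[0...5]) }

p buggy
p fixed_old
p fixed_new
```

In the stale-index run the element starting with 0018 survives and the one starting with 0019 is removed in its place, which leaves the two results with different lengths.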

I want to only delete a[0]. Any simple, clean way of doing this?

With no thought to efficiency, here is my effort:

class Array
  # Returns new array with unmatched elements removed.
  # "matchiness" is determined by comparing elements
  # returned by the block or the elements themselves if
  # no block is provided
  def strip_unmatched
    classified = inject(Hash.new{|h,k| h[k] = []}) do |collector, item|
      key = block_given? ? yield(item) : item
      collector[key] << item
      collector
    end
    classified.reject {|k,v| v.size == 1}.values.flatten
  end

  # Modifies self by removing elements that don't match
  # other elements. see #strip_unmatched.
  def strip_unmatched!(&blk)
    replace(strip_unmatched(&blk))
  end
end

a = ['12345 keep', '12345 keep 2', '11111 i should be deleted',
     '99393 delete me too', '22233 keep', '22233 keep me as well',
     '78444 getting deleted']

new_a = a.strip_unmatched { |i| i[0..4] }
puts "STRIPPED: #{new_a.inspect}"
puts "ORIGINAL: #{a.inspect}"

# try version that modifies receiver
a.strip_unmatched!{|i| i[0..4] }
puts "ORIGINAL: #{a.inspect}"

$ jruby strip_mismatches.rb
STRIPPED: ["12345 keep", "12345 keep 2", "22233 keep", "22233 keep me as well"]
ORIGINAL: ["12345 keep", "12345 keep 2", "11111 i should be deleted", "99393 delete me too", "22233 keep", "22233 keep me as well", "78444 getting deleted"]
ORIGINAL: ["12345 keep", "12345 keep 2", "22233 keep", "22233 keep me as well"]
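
For comparison, the same single-array classification can be sketched with Enumerable#group_by instead of inject. This is only an alternative spelling of the idea above, not a change to it; matched is an illustrative name.

```ruby
a = ['12345 keep', '12345 keep 2', '11111 i should be deleted',
     '99393 delete me too', '22233 keep', '22233 keep me as well',
     '78444 getting deleted']

# Group by the first five characters, then drop groups with one member.
matched = a.group_by { |s| s[0..4] }
           .values
           .reject { |group| group.size == 1 }
           .flatten

p matched
```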

Hi Jake,

What was wrong with Luis’s solution?

a = ["55555 anything at all ", "22222any thing at all"]
b = ["22222 some other something"]

b_hashed = b.inject({}) {|h, e| h[e[0..[4, e.size].min]] = e ; h}
puts a.select {|x| b_hashed[x[0..4]] }.inspect

For the original example you gave it removes "55555 anything at all "

Good luck.
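
A side note on the [4, e.size].min guard, as a small sketch: a range slice on a String already stops at the end of the string, so for short strings the clamped and unclamped forms appear to give the same result. The variable names are illustrative.

```ruby
short = "abc"

plain   = short[0..4]                    # range runs past the end
clamped = short[0..[4, short.size].min]  # range clamped as in the post
empty   = ""[0..4]                       # slicing an empty string

p plain
p clamped
p empty
```

Both forms are safe, so the guard is a matter of taste rather than correctness here.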

For the record: Luis’s solution meets the requirements, mine does not.
I misread the original email completely, not realizing it dealt with
two arrays.

-Doug

Thanks, guys. There is nothing wrong with it. My solution was a lot
sloppier. To be honest, I saw it and it scared me off. Now I see how
simple it is. Sorry about that. I appreciate it. That will work great.

On Fri, Jun 24, 2011 at 8:00 AM, Jake J. [email protected] wrote:

Can anyone help me with this? Let me know if I left anything out… very
tired.

Did you take a look at Array#reject [0]?

[0] class Array - RDoc Documentation
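
A minimal sketch of how Array#reject could be applied to the original two-array example, assuming the goal is to drop elements of a whose first five characters match nothing in b. kept is an illustrative name.

```ruby
a = ["55555 anything at all ", "22222any thing at all"]
b = ["22222 some other something"]

# Drop every element of a whose first five characters match no element of b.
kept = a.reject { |x| b.none? { |y| y[0, 5] == x[0, 5] } }

p kept
```

reject returns a new array and leaves a untouched; reject! or delete_if would modify a in place instead. Scanning b for every element of a is fine for small arrays but grows with the product of their sizes.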


Phillip G.

A method of solution is perfect if we can foresee from the start,
and even prove, that following that method we shall attain our aim.
– Leibnitz