.each skipping elements

I have an array of links from a webpage. I need to clean up the links
so it only has city links in it. So I do a .each and test for regex.

page.links.each{
  if link.text =~ /(Blog|About Us|Status|Help|TOS|Privacy|Are we missing an area?)/
    page.links.delete(link)
  end
end

For some reason, deleting a link/element causes .each to skip the
next link/element.

In the array, the last 7 links are:

About us
Blog
Status
Help
TOS
Privacy
Are we missing an area?

but after running the .each on the array I still end up with

blog
help
privacy

if I run it again

help

lol, so why can’t I do this in just one run through with .each?
Or why would deleting an element cause it to skip the next one?

I rebuilt it with an ugly while loop with a counter… same problem.

Hi –

On Sun, 12 Sep 2010, Cameron V. wrote:

> I have an array of links from a webpage. I need to clean up the links
> so it only has city links in it. So I do a .each and test for regex.
>
> page.links.each{
>
> if link.text =~ /(Blog|About Us|Status|Help|TOS|Privacy|Are we missing
> an area?)/
> page.links.delete(link)
> end
> end

That can’t be the code you’re actually running; it doesn’t assign
anything to the link variable.

> Privacy
> help
>
> lol, so why can’t I do this on just one run through with .each?
> or why would deleting an element cause it to skip the next one.
>
> I rebuilt it with an ugly while loop with a counter …same problem

You’re doing a destructive operation on the array while you’re iterating
over it, which is going to give odd results. Ruby’s internal counter is
going to be pointing to the wrong array entry if one of them disappears.

You’re also doing too much work. Try this:

page.links.delete_if {|link| link =~ /…/ }

David


David A. Black, Senior Developer, Cyrus Innovation Inc.

The Compleat Rubyist: Ruby training with Black/Brown/McAnally
Philadelphia, PA, October 1-2, 2010
http://www.compleatrubyist.com
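
The skipping described above can be reproduced without Mechanize. What follows is a minimal sketch, assuming plain strings in place of the link objects; the UNWANTED pattern and the sample list are made up for illustration and do not come from the thread. It relies on MRI's behaviour of walking the array by index and re-checking the length on each step.

```ruby
# Hypothetical pattern and data, standing in for the real link texts.
UNWANTED = /Blog|About us|Status|Help|TOS|Privacy|Are we missing an area\?/

links = ["Austin", "About us", "Blog", "Status", "Help", "TOS",
         "Privacy", "Are we missing an area?"]

# Deleting inside each: every deletion shifts the following elements
# down by one, so the element that slides into the current slot is
# never visited.
skipping = links.dup
skipping.each { |text| skipping.delete(text) if text =~ UNWANTED }

# delete_if handles the bookkeeping itself and visits every element.
clean = links.dup
clean.delete_if { |text| text =~ UNWANTED }

p skipping  # ["Austin", "Blog", "Help", "Privacy"]
p clean     # ["Austin"]
```

The leftovers from the each version are exactly the entries that sat right after a deleted one, which matches the "blog, help, privacy" result reported in the first post.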

Thanks for the reply, and I think I get it now:

if we delete element 50,
element 51 gets slotted into 50,
then the pointer moves to 51, never addressing the original 51… ok

page.links.delete_if{|link| link =~ /(Blog|About Us|Status|Help|TOS|Privacy|Are we missing an area?)/ }

I tried it… it runs… no errors… It loops through all the link
elements… but never does anything… nothing gets deleted.

I see how it should work… but it doesn’t.

Thanks for the help though.

a = ["About us", "Blog", "Status", "Help", "TOS", "Privacy", "Are we missing an area?"]

a.delete_if{|link| link =~ /(Blog|About Us|Status|Help|TOS|Privacy|Are we missing an area?)/ }

p a # => ["About us"]

It works. Maybe you need to try {|link| link.text =~ /…/}

Guten Tag. Linux Ruby
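
Why the first delete_if attempt ran without errors yet deleted nothing can be illustrated with a stand-in object. This is only a sketch: Link here is a hypothetical Struct, not Mechanize's link class, and PATTERN is shortened for the example. On the Rubies of that era, calling =~ on an object that is not a String fell back to Object#=~, which returns nil; on Ruby 3.2 and later that fallback no longer exists and the call raises NoMethodError, which the sketch rescues so it runs on either.

```ruby
Link = Struct.new(:text)  # hypothetical stand-in for a link object

PATTERN = /Blog|Help/     # shortened pattern, for illustration only

links = ["Austin", "Blog", "Help"].map { |t| Link.new(t) }

# Matching the object itself: it is not a String, so no regex match
# ever happens and nothing is rejected.
by_object = links.reject do |link|
  begin
    link =~ PATTERN
  rescue NoMethodError
    nil  # Ruby 3.2+ has no Object#=~ at all
  end
end

# Matching the link's text compares an actual String with the regex.
by_text = links.reject { |link| link.text =~ PATTERN }

p by_object.map(&:text)  # ["Austin", "Blog", "Help"]
p by_text.map(&:text)    # ["Austin"]
```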


On Sun, Sep 12, 2010 at 1:28 AM, Cameron V. wrote:

> I tried it … it runs… no errors… It loops through all the link
> elements… but never does anything… nothing gets deleted
>
> I see how it should work …but it doesn’t

Maybe page.links returns a new array every time you call it instead of
giving access to an internal structure. This would mean you modify a copy
and not the original data. You could verify by doing

3.times do
  puts page.links.object_id
end

If you see different object ids chances are that you get a copy and
are not modifying the original structure.

Kind regards

robert
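
The object_id check suggested above can be tried on toy classes. The sketch below is an assumption-laden illustration: CopyingPage and SharingPage are hypothetical and say nothing about how Mechanize actually implements page.links. It only shows what the two possible outcomes of the check would look like and why a copy would make delete_if appear to do nothing.

```ruby
# Hypothetical page classes, for illustration only; not Mechanize's API.
class CopyingPage
  def initialize(links)
    @links = links
  end

  # Hands out a fresh copy on every call.
  def links
    @links.dup
  end
end

class SharingPage
  def initialize(links)
    @links = links
  end

  # Hands out the internal array itself.
  def links
    @links
  end
end

copying = CopyingPage.new(["Austin", "Blog"])
sharing = SharingPage.new(["Austin", "Blog"])

# Keep the returned arrays alive while their object ids are compared.
copy_ids  = 3.times.map { copying.links }.map(&:object_id).uniq.size
share_ids = 3.times.map { sharing.links }.map(&:object_id).uniq.size

copying.links.delete_if { |text| text == "Blog" }  # modifies a throwaway copy
sharing.links.delete_if { |text| text == "Blog" }  # modifies the real array

p copy_ids       # 3 (a different array each time)
p share_ids      # 1 (always the same array)
p copying.links  # ["Austin", "Blog"]
p sharing.links  # ["Austin"]
```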

Hi –

On Sun, 12 Sep 2010, Cameron V. wrote:

> Thanks for the reply and I think I get it now
>
> if we delete element 50
> element 51 gets slotted into 50
> then the pointer moves to 51 never addressing the original 51…ok
>
> page.links.delete_if{|link| link =~ /(Blog|About
> Us|Status|Help|TOS|Privacy|Are we missing an area?)/

(Note that the ? in that regex is a special character and will not match
an actual question mark. It’s a zero-or-one quantifier, operating on the
“a” character before it.)

> }
>
> I tried it … it runs… no errors… It loops through all the link
> elements… but never does anything… nothing gets deleted
>
> I see how it should work …but it doesn’t

Do you need to make it case insensitive? It definitely works:

$ cat del.rb
array = ["Keep1", "Blog", "Status", "Keep2", "TOS", "Help", "Keep3"]
array.delete_if {|word| word =~ /Blog|Status|TOS|Help/ }
p array

$ ruby del.rb
["Keep1", "Keep2", "Keep3"]

so something else must be going on.

David
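
The two regex points raised in this reply, the unescaped ? and case sensitivity, can be checked in isolation. This is a small sketch with shortened, made-up patterns; it uses Regexp#match?, which needs Ruby 2.4 or later.

```ruby
unescaped = /an area?/   # ? makes the final "a" optional
escaped   = /an area\?/  # \? matches a literal question mark

# The unescaped form matches even when the question mark (and the
# final "a") are missing.
p unescaped.match?("Are we missing an are")    # true
p escaped.match?("Are we missing an are")      # false
p escaped.match?("Are we missing an area?")    # true

# Matching is case-sensitive unless the i flag is given.
p(/About Us/.match?("About us"))   # false
p(/About Us/i.match?("About us"))  # true
```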



Yep yep!

I needed to add the .text at the end…

Thanks, you guys are great…

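require 'mechanize'

def city_update
  city_name = []
  agent = Mechanize.new
  page = agent.get('craigslist > sites')
  # Drop the first 8 links on the page.
  8.times { page.links.delete_at(0) }
  # Remove the non-city links; \? matches a literal question mark.
  page.links.delete_if { |link|
    link.text =~ /(Blog|About Us|Status|Help|TOS|Privacy|Are we missing an area\?)/
  }
  page.links.each { |link|
    city_name << link.text
  }
  city_name # return the collected names rather than the result of each
end
puts city_update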

That’s the whole method… basically you want to make sure you have a
current list of available Craigslist cities… and you want to cut out
all the unneeded links… thanks again
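
For comparison, the same filtering can be written without modifying the links array at all, which sidesteps both the skipping problem and any question of whether the array is a copy. This is only a sketch under stated assumptions: Link is a hypothetical Struct standing in for the real link objects, city_names and its skip parameter are invented for the example, and the anchored, case-insensitive NON_CITY pattern is a suggestion rather than what the method above uses.

```ruby
Link = Struct.new(:text)  # hypothetical stand-in for a link object

# Anchored so that a city name merely containing "TOS" or "Help"
# is not removed; the i flag ignores case.
NON_CITY = /\A(?:Blog|About Us|Status|Help|TOS|Privacy|Are we missing an area\?)\z/i

# Returns the texts of the links that remain after skipping the first
# `skip` entries and rejecting the non-city ones. The input array is
# left unchanged.
def city_names(links, skip)
  links.drop(skip)
       .reject { |link| link.text =~ NON_CITY }
       .map(&:text)
end

sample = ["Nav 1", "Nav 2", "Austin", "Boston", "About us", "Blog", "Help"]
links  = sample.map { |text| Link.new(text) }

p city_names(links, 2)  # ["Austin", "Boston"]
p links.size            # 7, the original array is untouched
```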