String scanning woes :(

Hello all,

I am trying to get a file into an array using .scan and I can’t seem
to get anything to work properly.

I am reading in a file of email addresses (1 per line) and it all
seems to come in as 1 long string somehow. I am trying to use scan
to break it up into an array of emails so that I can do some uniq
checks and validation with other arrays. But I just don’t seem to get
it right.

My code right now is as follows:

emails = File.open("/users/lem/desktop/test/POCs_DNB.txt","r").readlines.map! {|x| x.chomp} # Read in the list of emails
email.scan(/\S+/) # To match on spaces (I assume). I thought I would be matching on new lines
puts email # To verify

When I did an inspect on the email variable, the addresses appeared as
such

[email protected]\[email protected]\[email protected]…”

This is my absolute first time working with .scan and regular
expressions so I have a little bit of a learning curve with this one.

Any help is greatly appreciated.

On Feb 4, 7:50 pm, Vell [email protected] wrote:

such

[email protected]\[email protected]\[email protected]…”

This is my absolute first time working with .scan and regular
expressions so I have a little bit of a learning curve with this one.

Any help is greatly appreciated.

You could try this to make it a bit easier:

File.open("/users/lem/desktop/test/POCs_DNB.txt", "r").each_line do |line|
  line.chomp!
  # now you have a single line (sans newline) from your file
end

no need to close the file either :)

On Feb 5, 2008 8:54 AM, Vell [email protected] wrote:

I am trying to get a file into an array using .scan and I can’t seem
to get anything to work properly.

to avoid doubt, try slowly.

this is a first mod/run of your posted code, eg,

botp@pc4all:~$ cat test.txt
[email protected]
[email protected]
[email protected]

botp@pc4all:~$ cat test.rb
p File.readlines("test.txt").map{|x| x.chomp}

botp@pc4all:~$ ruby test.rb
["[email protected]", "[email protected]", "[email protected]"]

that is just one way. there are many ways if using ruby.

kind regards -botp

Lovell Mcilwain wrote:

Hello all,

I am trying to get a file into an array using .scan and I can’t seem
to get anything to work properly.

I am reading in a file of email addresses (1 per line) and it all
seems to come in as 1 long string somehow. I am trying to use scan
to break it up into an array of emails so that I can do some uniq
checks and validation with other arrays. But I just don’t seem to get
it right.

My code right now is as follows:

emails = File.open("/users/lem/desktop/test/POCs_DNB.txt","r").readlines.map! {|x| x.chomp}

email.scan(/\S+/)

scan() returns an array. You don’t assign the array to any variable, so
it is discarded.
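
To illustrate the point about the discarded return value, here is a minimal sketch. The sample addresses and the names `text` and `found` are made up for the example and are not from the original post. Note also that `\S+` matches runs of non-whitespace characters, so it picks out the addresses themselves rather than the spaces or newlines between them.

```ruby
# Hypothetical sample data standing in for the contents of the file.
text = "a@example.com\nb@example.com\nc@example.com\n"

text.scan(/\S+/)          # the returned array is thrown away here
found = text.scan(/\S+/)  # the returned array is kept in a variable

p found  # => ["a@example.com", "b@example.com", "c@example.com"]
p text   # the original string is unchanged by scan
```

Since `scan` does not modify its receiver, printing the original string afterwards shows the same single long string as before.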

When I did an inspect on the email variable, the addresses appeared as
such

[email protected]\[email protected]\[email protected]…”

Nowhere in the code you posted does a variable named email exist.

This is my absolute first time working with .scan and regular
expressions so I have a little bit of a learning curve with this one.

Any help is greatly appreciated.

If you expect to get relevant help, you should post a short example
program that demonstrates your problem, i.e. an example program that
anyone can run and get the same results you do.

Lovell Mcilwain wrote:

The code I posted is exactly what I ran aside from giving you
hundreds of lines of email. The example is exactly what I ran to
get the results I posted.

emails = File.open("data.txt").readlines.map! {|x| x.chomp}
email.scan(/\S+/)

--output:--
r1test.rb:2: undefined local variable or method `email' for main:Object (NameError)

On Feb 4, 9:07 pm, 7stud – [email protected] wrote:

it right.

When I did an inspect on the email variable, the addresses appeared as
such

[email protected]\[email protected]\[email protected]…”

Nowhere in the code you posted does a variable named email exist.

Very first line of my code is what I thought to be a variable…

This is my absolute first time working with .scan and regular
expressions so I have a little bit of a learning curve with this one.

Any help is greatly appreciated.

If you expect to get relevant help, you should post a short example
program that demonstrates your problem, i.e. an example program that
anyone can run and get the same results you do.

The code I posted is exactly what I ran aside from giving you
hundreds of lines of email. The example is exactly what I ran to
get the results I posted.

On Feb 4, 6:50 pm, Vell [email protected] wrote:

emails = File.open("/users/lem/desktop/test/POCs_DNB.txt","r").readlines.map! {|x| x.chomp}
email.scan(/\S+/)

First you say “emails”; then you say “email”.
This is not code that will run.

Didn’t you copy and paste? Don’t tell us that you
retyped the code because you were eager for the chance
to introduce errors.

p IO.readlines("data").map{|x| x.strip }
p IO.read("data").split
p IO.read("data").scan(/\S+/)
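
As a rough check that these three approaches give the same array for a file with one address per line, here is a self-contained sketch. The temporary file and its contents are assumptions made for the example, and `File.read` / `File.readlines` are used in place of the `IO` class methods, which behave the same way for a plain path.

```ruby
require 'tempfile'

# Hypothetical sample file; the addresses are made up.
sample = Tempfile.new('emails')
sample.write("a@example.com\nb@example.com\nc@example.com\n")
sample.close  # flush to disk; the path stays valid until unlink

by_lines = File.readlines(sample.path).map { |x| x.strip }
by_split = File.read(sample.path).split
by_scan  = File.read(sample.path).scan(/\S+/)
sample.unlink

p by_lines                                        # => ["a@example.com", "b@example.com", "c@example.com"]
p [by_split, by_scan].all? { |a| a == by_lines }  # => true
```

The results would differ if a line contained spaces in the middle: `readlines` with `strip` keeps such a line as one element, while `split` and `scan(/\S+/)` break it at every run of whitespace.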

On Feb 5, 3:52 am, Robert K. [email protected] wrote:

require 'set'
addresses = Set.new

File.foreach "data.txt" do |line|
  line.chomp!
  line.downcase!

  puts "Duplicate: #{line}" unless addresses.add? line
end

h = {}
File.foreach("data"){|e|
  e = e.strip.upcase
  puts "Duplicate: #{ e }" if h.include? e
  h[ e ] = true
}

2008/2/5, 7stud – [email protected]:

Lovell Mcilwain wrote:

The code I posted is exactly what I ran aside from giving you
hundreds of lines of email. The example is exactly what I ran to
get the results I posted.

emails = File.open("data.txt").readlines.map! {|x| x.chomp}
email.scan(/\S+/)

Thou shalt use the block form of File.open to ensure proper cleanup!
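
A minimal sketch of what the block form does, assuming a throwaway sample file (the file, its contents and the `handle` variable are only there for the demonstration): `File.open` with a block closes the file when the block exits and returns the block's value.

```ruby
require 'tempfile'

# Hypothetical sample file; the addresses are made up.
sample = Tempfile.new('emails')
sample.write("a@example.com\nb@example.com\n")
sample.close

handle = nil
emails = File.open(sample.path, "r") do |f|
  handle = f                       # kept only so we can inspect it afterwards
  f.readlines.map { |x| x.chomp }  # the block's value becomes the return value
end
sample.unlink

p emails          # => ["a@example.com", "b@example.com"]
p handle.closed?  # => true
```

Without a block, `File.open` hands back an open File object that stays open until it is closed explicitly or garbage collected.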

Apart from that there is another way:

require 'set'
addresses = Set.new

File.foreach "data.txt" do |line|
  line.chomp!
  line.downcase!

  puts "Duplicate: #{line}" unless addresses.add? line
end
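
The duplicate check above relies on `Set#add?`, which returns nil when the element is already in the set and the set itself otherwise. Here is a small sketch with an in-memory list in place of the file; the addresses and the `duplicates` array are assumptions for the example.

```ruby
require 'set'

addresses = Set.new
duplicates = []

# Hypothetical input lines; the third differs from the first only by case.
["A@example.com", "b@example.com", "a@example.com"].each do |line|
  key = line.downcase
  # add? returns nil for an element that is already present
  duplicates << key unless addresses.add?(key)
end

p duplicates      # => ["a@example.com"]
p addresses.size  # => 2
```

Downcasing first means addresses that differ only in letter case count as the same entry, which matches the `downcase!` call in the snippet above.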

Cheers

robert

On Feb 5, 7:20 am, William J. [email protected] wrote:

end

h = {}
File.foreach("data"){|e|
  e = e.strip.upcase
  puts "Duplicate: #{ e }" if h.include? e
  h[ e ] = true
}

Thanks guys for all the helpful hints.

On Feb 5, 6:40 am, William J. [email protected] wrote:

retyped the code because you were eager for the chance
to introduce errors.

I’m a beginner, lighten up James.