Forum: Ruby pop3 body email

Erika (Guest)
on 2008-12-09 11:28
(Received via mailing list)
Hi,

I have to check the body of an email, which is HTML code like the
following:
<tr><td>Text1: </td>

<td>Text2</td>

I have to check which "Text2" value is shown for "Text1"; it can vary
for different reasons.

I managed to fetch the emails and to check the subject in order to
identify the correct email, but I can't work out how to solve the
problem described above.

Thank you,
Erika

    require 'net/pop'

    pop = Net::POP3.new('pop3 server name')
    pop.start('user', 'password')
    if pop.mails.empty?
      puts 'No mail.'
    else
      pop.each_mail do |m|
        # Pick the Subject: line out of the raw header
        sbj = m.header.split("\r\n").grep(/^Subject:/)
        body = m.pop                      # download the message only once
        rate = body.lines.grep(/^Text1/)  # lines that start with "Text1"
        puts sbj
        puts body
      end
      puts "#{pop.mails.size} mails popped."
    end
    pop.finish
Brian C. (Guest)
on 2008-12-09 22:36
Erika wrote:
> Hi,
>
> I have to check the body of the email which is a html code, like the
> following:
> <tr><td>Text1: </td>
>
> <td>Text2</td>
>
> I have to check that for "Text1" what "Text2" is shown, which can vary
> for different reasons.

If you are already able to read the E-mail via POP3, and the body
consists of a single text/html part, then this just becomes a question
about parsing HTML. Hpricot is usually cited as the best library for
doing that. I'd say it's not worth digging about with regexps when you
can do the job properly.

If your E-mail is a multipart/alternative then you may need a bit more
work to extract the text/html part first. The rubymail library may help
here.
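
For the multipart case, here is a rough idea of what "extracting the text/html part" involves, sketched in plain Ruby without any library. `html_part` is a hypothetical helper and the sample message is invented. It assumes a single-level multipart message whose parts are not transfer-encoded, so a real MIME library is the safer choice for anything beyond a quick check.

```ruby
# Hypothetical helper: return the body of the text/html part of a raw
# multipart message, or nil if there is none.
# Assumes one level of nesting and no transfer encoding (a simplification).
def html_part(raw)
  header, body = raw.split(/\r?\n\r?\n/, 2)
  boundary = header[/boundary="?([^";\r\n]+)"?/i, 1]
  return nil unless boundary && body

  body.split("--#{boundary}").each do |part|
    part_header, part_body = part.sub(/\A\r?\n/, '').split(/\r?\n\r?\n/, 2)
    next unless part_body
    return part_body.strip if part_header =~ /^Content-Type:\s*text\/html/i
  end
  nil
end

# Invented sample message, standing in for the string returned by m.pop
raw = [
  'Subject: Report',
  'Content-Type: multipart/alternative; boundary="XYZ"',
  '',
  '--XYZ',
  'Content-Type: text/plain',
  '',
  'Text1: Text2',
  '--XYZ',
  'Content-Type: text/html',
  '',
  '<tr><td>Text1: </td><td>Text2</td></tr>',
  '--XYZ--',
  ''
].join("\r\n")

puts html_part(raw)
```

The returned string can then be handed to whichever HTML parser you prefer.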
Robert D. (Guest)
on 2008-12-09 23:12
(Received via mailing list)
On Tue, Dec 9, 2008 at 9:29 PM, Brian C. <removed_email_address@domain.invalid>
wrote:
>> for different reasons.
>
> If you are already able to read the E-mail via POP3, and the body
> consists of a single text/html part, then this just becomes a question
> about parsing HTML. Hpricot is usually cited as the best library for
> doing that. I'd say it's not worth digging about with regexps when you
> can do the job properly.
Thanks a lot, but please do not underestimate regexen, especially
as they will become more powerful in 1.9.

Are you aware of Hpricot's dependencies? That said, it is a
wonderful tool.
Being an old Unix guy, however, I feel that you do not need a
full-fledged library plus its dependencies if a three-liner
can do the job.
However, if this was only an example and the OP needs more parsing, Hpricot
is a very sensible way to go.
Cheers
R
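
For the simple case in the original question, the regexp route might look like the sketch below. The sample string and the variable names are made up for illustration, and it assumes the label cell and the value cell are adjacent and contain no nested tags.

```ruby
# Sample body, invented for illustration; in practice this would be m.pop
body = "<tr><td>Text1: </td>\n\n<td>Text2</td></tr>"

# Capture the cell that follows the "Text1" label cell (whitespace-tolerant)
value = body[%r{<td>\s*Text1:?\s*</td>\s*<td>(.*?)</td>}m, 1]
puts value
```

If the label is missing, `value` is simply `nil`, which is easy to test for.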
Erika (Guest)
on 2008-12-10 10:22
(Received via mailing list)
Hi,

I tried using Hpricot in the following way:
.....
Connect with POP3 and get the email
.....
email = TMail::Mail.parse(m.pop)
body2 = Hpricot(email.body)
# Collect the text of every <td> that contains no nested markup
elements = body2.search("/html/body/table//td").collect { |k|
  k.inner_html.split(',') unless k.inner_html =~ /</
}.flatten.compact
puts elements

....

Is there any better way to extract the info?

My HTML email looks something like this:
<table>
<tr><td>Info1</td><td>Info2</td></tr>
</table>
<table>
<tr><td>Info1</td><td>Info2</td></tr>
</table>
<table>
<tr><td>Info1</td><td>Info2</td></tr>
</table>
<table>
<tr><td>Info1</td><td>Info2</td></tr>
<tr><td>Info1</td><td>Info2</td></tr>
<tr><td>Info1</td><td>Info2</td></tr>
</table>

So the general rule is that I have 3 tables and I need to check that
the correct Info2 is shown for each Info1. Every Info1 / Info2 can vary.

Is there a better way to get, for example, Info1 and Info2 into two arrays?
My solution works, but it pulls all the info out of the HTML code,
and I then need to parse that one more time.

Thanks,
Erika
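
One way to end up with two arrays, sketched in plain Ruby on a sample shaped like the HTML above. It uses a regexp instead of Hpricot only to keep the example dependency-free, and it assumes every row has exactly two cells with no nested tags. The sample values and the names `pairs`, `labels`, `values` and `lookup` are made up for illustration.

```ruby
# Invented sample with the same table/row/cell layout as the email
html = <<HTML
<table>
<tr><td>Name</td><td>Alice</td></tr>
</table>
<table>
<tr><td>City</td><td>Paris</td></tr>
<tr><td>Rate</td><td>1.5</td></tr>
</table>
HTML

# Each match is one row: [first cell, second cell]
pairs = html.scan(%r{<tr>\s*<td>(.*?)</td>\s*<td>(.*?)</td>\s*</tr>}m)

# transpose turns the list of rows into two columns
labels, values = pairs.transpose

p labels
p values

# A hash makes it easy to check that a given label shows the expected value
lookup = Hash[pairs]
puts lookup['Rate']
```

If you already have the flat `elements` array from the Hpricot code, `elements.each_slice(2).to_a` should give the same kind of pairs, assuming the cells strictly alternate between Info1 and Info2.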


Poornima D. (Guest)
on 2009-02-17 10:00
Hi,

   I tried to fetch the inbox messages from my Gmail account via POP, but I
got an "execution expired" error. Please help me. My code is:

require 'net/pop'

puts "Running Mail Importer..."
begin
  # Open a single session; the block form closes it automatically
  Net::POP3.start("pop.gmail.com", nil, "gmailaccount", "gmailpassword") do |pop|
    puts "After....."
    if pop.mails.empty?
      puts "NO MAIL"
    else
      pop.mails.each do |email|
        begin
          puts "receiving mail..."
          # Notifier is defined elsewhere in the application (not shown here)
          Notifier.receive(email.pop)
          email.delete
        rescue Exception => e
          puts "Error receiving email at " + Time.now.to_s + "::: " + e.message
        end
      end
    end
  end
rescue Exception => e
  # Connection failures (including timeouts) end up here
  puts "Not Connecting..."
  puts e
end

puts "Finished Mail Importer."
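
One likely cause of the "execution expired" timeout: Gmail's POP3 service only accepts SSL connections on port 995, while the code above connects without SSL on the default port. A minimal sketch of a client set up that way follows. `gmail_pop` is a hypothetical helper, the `GMAIL_USER` and `GMAIL_PASSWORD` environment variables are assumptions made for this example, and POP access also has to be enabled in the Gmail account settings.

```ruby
require 'net/pop'

# Hypothetical helper: build a POP3 client for Gmail (SSL on port 995)
def gmail_pop
  pop = Net::POP3.new('pop.gmail.com', 995)
  pop.enable_ssl          # turn on SSL/TLS with the default settings
  pop.open_timeout = 30   # seconds to wait for the connection
  pop
end

# Only connects when credentials are supplied via the assumed env variables
if ENV['GMAIL_USER'] && ENV['GMAIL_PASSWORD']
  gmail_pop.start(ENV['GMAIL_USER'], ENV['GMAIL_PASSWORD']) do |pop|
    puts "#{pop.mails.size} mails waiting."
  end
end
```

The rest of the importer loop can stay as it is once the session is opened this way.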