Forum: Ruby Question regarding automating some Outlook/IMAP and pdf parsing functions w/ ruby?

Posted by Ed Zimmerman (eztech)
on 2013-03-07 05:44
Hello,

I am looking to automate a software process. I would like to know if
it's possible to create a Ruby script that looks at an email inbox and
filters the current day's messages down to one sender's domain. Any of
those emails with PDF attachments would have certain information parsed
out for use in the next steps. I would like a directory created on our
NAS and named from parts of the parsed info, and the original PDF copied
to that directory and renamed using parts of the parsed info. Finally, I
would like Ruby to create a calendar event using parts of the parsed
info. The finished script can run on Linux or Windows.

Thanks in advance
Posted by Carlo E. Prelz (Guest)
on 2013-03-07 07:11
(Received via mailing list)
Subject: Question regarding automating some Outlook/IMAP and pdf parsing 
functions w/ ruby?
  Date: Thu 07 Mar 13 01:44:18 +0900

Quoting Ed Zimmerman (lists@ruby-forum.com):

> I am looking to automate a software process. I would like to know if
> it's possible to create a Ruby script that looks at an email inbox and
> filters the current day's messages down to one sender's domain. Any of
> those emails with PDF attachments would have certain information parsed
> out for use in the next steps. I would like a directory created on our
> NAS and named from parts of the parsed info, and the original PDF copied
> to that directory and renamed using parts of the parsed info. Finally, I
> would like Ruby to create a calendar event using parts of the parsed
> info. The finished script can run on Linux or Windows.

Sounds feasible (I do not know about calendar events, though). What's
your question? Did you start coding?

Carlo
Posted by Ed Zimmerman (eztech)
on 2013-03-07 07:30
I have not started coding, I am not a ruby programmer. I am just 
starting the process of seeking advice on how these tasks could be 
automated. Ruby seems like a viable solution, I have read a write up 
describing the calendar creation so I believe both Outlook tasks could 
be accomplished with the jruby-win32ole gem but I am as NOOB as they 
come when talking about Ruby.

Thanks
Posted by Carlo E. Prelz (Guest)
on 2013-03-07 08:02
(Received via mailing list)
Subject: Re: Question regarding automating some Outlook/IMAP and pdf 
parsing functions w/ ruby?
  Date: Thu 07 Mar 13 03:30:39 +0900

Quoting Ed Zimmerman (lists@ruby-forum.com):

> I have not started coding, I am not a ruby programmer. I am just
> starting the process of seeking advice on how these tasks could be
> automated. Ruby seems like a viable solution, I have read a write up
> describing the calendar creation so I believe both Outlook tasks could
> be accomplished with the jruby-win32ole gem but I am as NOOB as they
> come when talking about Ruby.

Ruby could be a pretty good choice (although I'd have problems in
using win32ole under Linux).

As far as I know, there is no ready-made solution to your
needs. Either you learn to program, or you find someone to code for
you.

Carlo
Posted by Joel Pearson (virtuoso)
on 2013-03-07 17:40
Here's a starting point for you. Once you have the object it's just a 
matter of looking up the APIs and documentation.
It helps to do a bit of VBA because the IDE will suggest methods to you 
when you're working with certain objects.

irb(main):001:0> require 'win32ole'
=> true
irb(main):002:0> ol = WIN32OLE.connect 'outlook.application'
=> #<WIN32OLE:0x2a46db8>
irb(main):003:0> ol.application.activeexplorer.selection.each do |item|
irb(main):004:1*   puts item.subject
irb(main):005:1> end
test
test
test
=> nil
Posted by Damián M. González (igorjorobus)
on 2013-03-08 03:44
I bet that is possible. You will have to use the pdf-reader gem to 
parse a PDF. Visit this page about working with Outlook:

http://rubyonwindows.blogspot.com.ar/search/label/outlook

 Kind regards. Damián.
Posted by Joel Pearson (virtuoso)
on 2013-03-08 12:52
Attachment: outlook.rb (1,07 KB)
I'm interested in this kind of thing, so I put together a simple 
example.

The attached connects to an existing outlook session, then proceeds to 
scan the inbox for messages which match these criteria:

Sender's email ends with "@exampledomain.com"
Has attachments
At least one attachment is '.pdf'

Once it finds all the matches it creates a new directory and saves all 
the matching files into it using unique references, taken from the 
original filename, part of the email text, and the attachment's existing 
name.
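
A minimal sketch of those three criteria, not the attached script itself. To keep it runnable without Outlook, the messages are plain Ruby stand-ins (the hypothetical `Message` and `Attachment` structs); with win32ole the items would come from the Inbox folder instead, and the method name `matching_pdfs` is made up for illustration.

```ruby
# Stand-ins for Outlook objects so the logic can run anywhere.
Attachment = Struct.new(:filename)
Message    = Struct.new(:senderemailaddress, :attachments)

DOMAIN = /@exampledomain\.com\z/i

# Returns the PDF attachments of every message sent from DOMAIN.
def matching_pdfs(messages)
  messages.flat_map do |msg|
    next [] unless msg.senderemailaddress =~ DOMAIN

    msg.attachments.select do |att|
      File.extname(att.filename).casecmp('.pdf').zero?
    end
  end
end

inbox = [
  Message.new('alice@exampledomain.com',
              [Attachment.new('order.PDF'), Attachment.new('logo.png')]),
  Message.new('bob@elsewhere.org', [Attachment.new('invoice.pdf')]),
  Message.new('carol@exampledomain.com', [])
]

puts matching_pdfs(inbox).map(&:filename).inspect
```

A message with no attachments simply contributes nothing, so the "has attachments" check falls out of the PDF filter.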
Posted by Ed Zimmerman (eztech)
on 2013-03-09 03:32
What sections of the script can be populated with specific info?
Posted by Ed Zimmerman (eztech)
on 2013-03-09 03:34
I changed exampledomain.com to the required domain and I get an 
error (method_missing), but it does access the Outlook inbox and runs 
through the messages.
Posted by Joel Pearson (virtuoso)
on 2013-03-09 09:09
I ran this on Outlook 2010 (Office 2010). What version are you using, 
and what exactly was the error message?

These are the bits you'd probably want to customise:

/@exampledomain.com$/i
This is a Regular Expression. The "i" means ignore case. The "$" means 
at the end of a line ("\z" would be end of the string and would also 
work).
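
A small sketch of how the two anchors differ, plus one detail worth knowing: an unescaped "." matches any character, so escaping it makes the domain check stricter. The sample addresses and the variable names `loose` and `strict` are made up for illustration.

```ruby
loose  = /@exampledomain.com$/i     # pattern from the script
strict = /@exampledomain\.com\z/i   # escaped dot, anchored to end of string

samples = {
  'someone@ExampleDomain.com'        => 'plain address',
  "someone@exampledomain.com\nextra" => 'address followed by another line',
  'someone@exampledomainXcom'        => 'look-alike with X instead of a dot'
}

samples.each do |text, label|
  puts "#{label}: loose=#{loose.match?(text)} strict=#{strict.match?(text)}"
end
```

Both patterns accept the plain address; only the loose one accepts the other two samples.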

folder = "#{ ENV['USERPROFILE'] }\\example"
This is the destination folder. You could add another "\\subfolder" for 
each group of attachments if you want as well.

special_text = msg.body.scan( /^[a-z]{10}/i )[0]
This Regular Expression extracts text from the email matching a pattern: 
in this case the first 10 letters (a-z, either case) at the start of a 
line, with no spaces or other characters among them. The result will be 
nil if it doesn't find anything (which shows up as blank in the 
filename), but you could change it to whatever text pattern you want to 
extract. See http://www.rubular.com/ to experiment with Regex.
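
A quick sketch of what that scan returns, using made-up message bodies. It also shows that `String#[]` with a regex gives the same first match, which is an alternative and not what the script uses.

```ruby
pattern = /^[a-z]{10}/i

body  = "Hello,\nORDERABCDE123 is attached\nRegards"
found = body.scan(pattern)[0]     # first run of 10 letters at a line start

other   = "Hi\n12345"
missing = other.scan(pattern)[0]  # nothing matches, so this is nil

puts found.inspect
puts missing.inspect
puts "prefix-#{missing}-suffix"   # nil interpolates as an empty string
puts body[pattern] == found       # String#[] returns the same first match
```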

filename = "#{ folder }\\#{ special_text }#{ index }#{ att.filename }"
This is the one where you determine what your file is called and where it 
will go. I've decided to give them each a unique name by adding the main 
folder, the content we pulled from the email, an index so it's a unique 
name, and the original filename as well. You can customise this however 
you like.
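
A sketch of the same naming scheme as a small helper, with the folder created first. The helper name `destination_for` is hypothetical, and a temporary directory stands in for the real destination. `File.join` produces forward slashes, which Ruby accepts on Windows; whether Outlook's save call accepts them is an assumption that would need checking, so the script's backslash form may still be the safer choice there.

```ruby
require 'fileutils'
require 'tmpdir'

# Builds a unique destination path for one attachment.
def destination_for(folder, special_text, index, original_name)
  File.join(folder, "#{special_text}#{index}#{original_name}")
end

Dir.mktmpdir do |base|
  folder = File.join(base, 'example')
  FileUtils.mkdir_p(folder) # creates the directory if it is missing

  path = destination_for(folder, 'ORDERABCDE', 0, 'invoice.pdf')
  puts File.basename(path)
  puts Dir.exist?(folder)
end
```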

filenames << ( filename = ...
At this point I'm collecting all the filenames together in case I want 
to report them back or do something else with them. It isn't used in the 
example but you could then do something else with this array.

Let me know if there's any further clarification you need. I haven't 
looked into calendar appointments, but I think there are a few examples 
of doing this online, and you now have an interface with Outlook to play 
with.
Posted by Ed Zimmerman (eztech)
on 2013-03-10 17:21
I'm running Outlook 2010.

This is the error I see

271NoMethodError: unknown property or method: `Senderemailaddress'
    HRESULT error code:0x80020006
      Unknown name.
        from (irb):19:in `method_missing'
        from (irb):19:in `block in irb_binding'
        from (irb):17:in `each'
        from (irb):17:in `with_index'
        from (irb):17
        from C:/Ruby200-x64/bin/irb:12:in `<main>'
Posted by Joel Pearson (virtuoso)
on 2013-03-10 19:16
In that case the 271st email in your inbox is an object which doesn't 
support SenderEmailAddress. Presumably some sort of out of office or 
server notification or something like that.
I can't duplicate the error with my inbox but this addition might do it:

if ( msg.Senderemailaddress =~ /@exampledomain.com$/i rescue false ) &&

Basically rescue should trigger if there's an error within that line, 
and returning false from the expression would cause it to skip the "if" 
block.
Posted by unknown (Guest)
on 2013-03-10 19:26
(Received via mailing list)
On 10.03.2013 19:17, Joel Pearson wrote:
> In that case the 271st email in your inbox is an object which doesn't
> support SenderEmailAddress. Presumably some sort of out of office or
> server notification or something like that.
> I can't duplicate the error with my inbox but this addition might do it:
>
> if ( msg.Senderemailaddress =~ /@exampledomain.com$/i rescue false ) &&
>
> Basically rescue should trigger if there's an error within that line,
> and returning false from the expression would cause it to skip the "if"
> block.

`rescue false' will rescue nearly all exceptions that might occur, which
is a potentially dangerous thing. Better use a begin/rescue/end block.
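
A small sketch of the difference, using stand-in classes rather than Outlook objects (`Report`, `Broken`, `Mail` and `sender_matches?` are made-up names). The modifier form rescues any StandardError, so an unrelated failure is hidden along with the missing method.

```ruby
# Stand-in for an item that lacks the property, like a non-mail object.
class Report; end

# Stand-in whose accessor fails for an unrelated reason.
class Broken
  def senderemailaddress
    raise ArgumentError, 'unrelated failure'
  end
end

Mail = Struct.new(:senderemailaddress)

pattern = /@exampledomain\.com\z/i

# Modifier form: both failures silently become false.
modifier_results = [Report.new, Broken.new].map do |item|
  (item.senderemailaddress =~ pattern rescue false)
end
puts modifier_results.inspect

# Block form: only NoMethodError is swallowed, anything else propagates.
def sender_matches?(item, pattern)
  !!(item.senderemailaddress =~ pattern)
rescue NoMethodError
  false
end

puts sender_matches?(Mail.new('a@exampledomain.com'), pattern)
puts sender_matches?(Report.new, pattern)
```

Calling `sender_matches?` with the `Broken` stand-in raises the ArgumentError instead of hiding it.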
Posted by unknown (Guest)
on 2013-03-10 19:35
(Received via mailing list)
On 10.03.2013 19:26, sto.mar@web.de wrote:
>> block.
>
> `rescue false' will rescue nearly all exceptions that might occur, which
> is a potentially dangerous thing. Better use a begin/rescue/end block.

I did not follow the start of the thread, but probably even better
in this case would be to use `respond_to?' instead of relying on
exceptions.
Posted by Ed Zimmerman (eztech)
on 2013-03-10 20:24
Is there a way to select a specific Inbox when the Outlook instance 
is connected to multiple accounts, i.e. something other than

inbox = ol.GetNameSpace('MAPI').GetDefaultFolder(6)
Posted by Joel Pearson (virtuoso)
on 2013-03-10 22:59
unknown wrote in post #1100964:
> On 10.03.2013 19:26, sto.mar@web.de wrote:
>>> block.
>>
>> `rescue false' will rescue nearly all exceptions that might occur, which
>> is a potentially dangerous thing. Better use a begin/rescue/end block.
>
> I did not follow the start of the thread, but probably even better
> in this case would be to use `respond_to?' instead of relying on
> exceptions.

These are valid points, specific error trapping is good, and avoiding 
errors even better. Unfortunately I haven't been able to get 
"respond_to?" working with ole methods, so it'll have to be rescue.

Can't do a single line rescue with a specific class...

if (
  begin
    msg.Senderemailaddress =~ /@exampledomain.com$/i
  rescue NoMethodError
    false
  end
) &&
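
On the respond_to? point: OLE members are dispatched through method_missing, which would explain why plain respond_to? does not see them. WIN32OLE objects do have their own `ole_respond_to?` method for this. The sketch below is untested against Outlook; it uses a hypothetical `FakeOleItem` stand-in that imitates that dispatch so the idea can run anywhere, and `sender_of` is a made-up helper name.

```ruby
# Stand-in that mimics the WIN32OLE behaviour: members are dispatched via
# method_missing, so plain respond_to? reports false for them.
class FakeOleItem
  def initialize(props)
    @props = props # keys are downcased property names
  end

  # Imitates WIN32OLE#ole_respond_to? for this stand-in.
  def ole_respond_to?(name)
    @props.key?(name.to_s.downcase)
  end

  def method_missing(name, *args)
    key = name.to_s.downcase
    @props.key?(key) ? @props[key] : super
  end
end

# Returns the sender address, or nil when the item has no such property.
def sender_of(item)
  return nil unless item.ole_respond_to?('SenderEmailAddress')

  item.SenderEmailAddress
end

mail   = FakeOleItem.new('senderemailaddress' => 'a@exampledomain.com')
notice = FakeOleItem.new({})

puts mail.respond_to?(:SenderEmailAddress) # plain respond_to? misses it
puts sender_of(mail).inspect
puts sender_of(notice).inspect
```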


This link shows how to get specific folders (users) in outlook.
http://rubyonwindows.blogspot.com.ar/2008/01/rubyg...
Posted by Matthew Kerwin (mattyk)
on 2013-03-11 05:27
Joel Pearson wrote in post #1100986:
> Can't do a single line rescue with a specific class...
>
> if (
> begin
> msg.Senderemailaddress =~ /@exampledomain.com$/i
> rescue NoMethodError
> false
> end
> ) &&

Oh, thanks for reminding me!  Many years ago I made a thing which I 
thought could be potentially useful.  Recently I even modified it to fit 
with my more mature understanding of Ruby.  Inspired by this message, 
I've just gemified it.  https://rubygems.org/gems/try

  require 'try'
  if (Try.trap(NoMethodError=>false){msg.Senderemailaddress =~ /@exampledomain.com$/i} &&
Posted by Joel Pearson (virtuoso)
on 2013-03-11 10:02
Nice! It looks readable and easy to use. I've installed the gem to play 
with it a bit as well.
Posted by unknown (Guest)
on 2013-03-11 22:34
(Received via mailing list)
On 10.03.2013 22:59, Joel Pearson wrote:
>
> false
> end
> ) &&

I would rather turn this around:

begin
   sender = msg.Senderemailaddress
rescue NoMethodError
   # whatever
else
   if sender =~ ...
      ...
   end
end
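
A runnable sketch of that shape, with stand-ins instead of Outlook items (`Mail`, `Notice` and `classify` are made-up names, and the returned symbols are only for illustration). Only the property lookup sits inside the rescued region; the matching logic lives in the else branch, so a NoMethodError raised there would not be swallowed.

```ruby
Mail   = Struct.new(:senderemailaddress)
Notice = Class.new # has no senderemailaddress method

def classify(item)
  begin
    sender = item.senderemailaddress
  rescue NoMethodError
    :skipped # item has no sender property
  else
    sender =~ /@exampledomain\.com\z/i ? :match : :other
  end
end

puts classify(Mail.new('a@exampledomain.com'))
puts classify(Mail.new('b@elsewhere.org'))
puts classify(Notice.new)
```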
Posted by Joel Pearson (virtuoso)
on 2013-03-11 22:37
TMTOWTDI