Question regarding automating some Outlook/IMAP and pdf parsing functions w/ ruby?

Hello,

I am looking to automate a software process and would like to know
whether it is possible to write a Ruby script that does the following.
It would look at an email inbox and filter the current day’s messages
down to one sender’s domain. Any of those emails with PDF attachments
would have certain information parsed out for use in the next steps. A
directory would be created on our NAS and named from parts of the parsed
info, and the original PDF would be copied to that directory and renamed
using parts of the parsed info. Finally, the script would create a
calendar event using parts of the parsed info. The finished script can
run on Linux or Windows.

Thanks in advance

Subject: Question regarding automating some Outlook/IMAP and pdf parsing
functions w/ ruby?
Date: Thu 07 Mar 13 01:44:18 +0900

Quoting Ed Zimmerman:

I would like to know if it’s possible to create an rb that could look at
an email inbox [...] The finished script can run on Linux or Windows.

Sounds feasible (I do not know about calendar events, though). What’s
your question? Did you start coding?

Carlo

I have not started coding, and I am not a Ruby programmer. I am just
starting the process of seeking advice on how these tasks could be
automated. Ruby seems like a viable solution. I have read a write-up
describing the calendar creation, so I believe both Outlook tasks could
be accomplished with the jruby-win32ole gem, but I am as much of a
newbie as they come when it comes to Ruby.

Thanks

Subject: Re: Question regarding automating some Outlook/IMAP and pdf
parsing functions w/ ruby?
Date: Thu 07 Mar 13 03:30:39 +0900

Ruby could be a pretty good choice (although I’d have problems in
using win32ole under Linux).

As far as I know, there is no ready-made solution to your
needs. Either you learn to program, or you find someone to code for
you.

Carlo

Here’s a starting point for you. Once you have the object it’s just a
matter of looking up the APIs and documentation.
It helps to do a bit of VBA because the IDE will suggest methods to you
when you’re working with certain objects.

irb(main):001:0> require 'win32ole'
=> true
irb(main):002:0> ol = WIN32OLE.connect 'outlook.application'
=> #<WIN32OLE:0x2a46db8>
irb(main):003:0> ol.application.activeexplorer.selection.each do |item|
irb(main):004:1* puts item.subject
irb(main):005:1> end
test
test
test
=> nil

I bet that is possible. You can use the pdf-reader gem to parse a
PDF. See this page about working with Outlook:

Kind regards. Damián.
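For the PDF step, a minimal sketch with the pdf-reader gem might look
like this. The labels "Job No" and "Customer" and both method names are
made-up placeholders, since the thread does not say what the PDFs
contain; only PDF::Reader.new and pages/text come from the gem.

```ruby
# Sketch only: the labels "Job No" and "Customer" are hypothetical.

# Read the text of every page. Needs `gem install pdf-reader`; the
# require is deferred so parse_fields can be tried without the gem.
def pdf_text(path)
  require 'pdf-reader'
  PDF::Reader.new(path).pages.map(&:text).join("\n")
end

# Pull labelled values out of the extracted text with regexes.
def parse_fields(text)
  {
    job_no:   text[/^Job No:\s*(\S+)/i, 1],
    customer: text[/^Customer:\s*(.+?)\s*$/i, 1]
  }
end

sample = "Job No: A1234\nCustomer: Acme Ltd\n"
p parse_fields(sample)
```

With a real file you would call parse_fields(pdf_text(path)). Text
extraction quality depends heavily on how the PDF was produced, so
expect to adjust the patterns.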

I’m interested in this kind of thing, so I put together a simple
example.

The attached script connects to an existing Outlook session, then scans
the inbox for messages which match these criteria:

Sender’s email ends with "@exampledomain.com"
Has attachments
At least one attachment is ‘.pdf’

Once it finds all the matches it creates a new directory and saves all
the matching files into it using unique references, taken from the
original filename, part of the email text, and the attachment’s existing
name.
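The attachment itself is not reproduced in this thread, so the following
is only a rough reconstruction of the approach described above, not the
original script. It assumes Windows, a running Outlook session and an
existing target folder; DOMAIN, wanted? and save_pdfs are names chosen
for this sketch.

```ruby
# Sketch only: the domain and the target folder are placeholders.
DOMAIN = /@exampledomain\.com\z/i

# True if the item is from the wanted domain and has a PDF attached.
# Works on any object with the same methods, not only OLE items.
def wanted?(msg)
  sender = begin
    msg.SenderEmailAddress
  rescue NoMethodError
    nil # not a mail item (e.g. a meeting request or report)
  end
  return false unless sender =~ DOMAIN
  msg.Attachments.each { |att| return true if att.FileName =~ /\.pdf\z/i }
  false
end

# Walk the default inbox and save every matching PDF into folder.
def save_pdfs(folder)
  require 'win32ole' # Windows only
  outlook = WIN32OLE.connect('Outlook.Application')
  inbox = outlook.GetNamespace('MAPI').GetDefaultFolder(6) # 6 = Inbox
  inbox.Items.each do |msg|
    next unless wanted?(msg)
    msg.Attachments.each do |att|
      next unless att.FileName =~ /\.pdf\z/i
      att.SaveAsFile("#{folder}\\#{att.FileName}")
    end
  end
end

# save_pdfs('C:\\example')  # uncomment on a Windows machine with Outlook
```

Keeping the matching logic in wanted? means it can be tried with plain
Ruby stand-in objects before pointing it at a real inbox.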

I changed exampledomain.com to the required domain and got an error
(method missing), but the script does access the Outlook inbox and runs
through the messages.

Which sections of the script can be populated with specific info?

I ran this on Outlook from Office 2010. What version are you using, and
what exactly was the error message?

These are the bits you’d probably want to customise:

/@exampledomain.com$/i
This is a Regular Expression. The “i” means ignore case. The “$” means
at the end of a line (“\z” would be end of the string and would also
work).
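To see the difference between "$" and "\z", here is a small experiment
you can paste into irb. The sample strings are invented; the last line
also shows that the unescaped "." in the pattern matches any character,
which is why the other patterns here use "\.".

```ruby
line_end   = /@exampledomain\.com$/i   # "$" matches before any newline
string_end = /@exampledomain\.com\z/i  # "\z" matches only at the very end

address   = 'Someone@ExampleDomain.com'
multiline = "someone@exampledomain.com\nmore text"

p address =~ line_end       # an index (truthy): the address matches
p multiline =~ line_end     # also matches, at the end of the first line
p multiline =~ string_end   # nil: the string does not end with the domain

# A bare "." is a wildcard, so this unintended address matches too.
p 'x@exampledomainXcom' =~ /@exampledomain.com$/i
```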

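folder = "#{ ENV['USERPROFILE'] }\\example"
This is the destination folder. The backslash is doubled because a
single “\e” inside a double-quoted string is an escape sequence. You
could add another “\\subfolder” for each group of attachments if you
want as well.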

special_text = msg.body.scan( /^[a-z]{10}/i )[0]
This Regular Expression extracts text from the email matching a pattern:
in this case the first 10 a-z characters at the start of a line, with no
spaces or other characters in between. It will be nil (blank once
interpolated into the filename) if it doesn’t find anything, but you
could change it to whatever text pattern you want to extract. See
http://www.rubular.com/ to experiment with Regex.
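Here is what that scan does on a made-up message body, including the
case where nothing matches:

```ruby
body = "Hello,\nREFABCDEFGH follows\nshort line\n"

# scan returns an array of all matches; [0] takes the first one.
special_text = body.scan(/^[a-z]{10}/i)[0]
p special_text                      # first 10 letters found at a line start

missing = "no match here".scan(/^[a-z]{10}/i)[0]
p missing                           # nil, because scan returned []
p "prefix_#{missing}_suffix"        # nil interpolates as an empty string
```

Note that the pattern takes the first 10 letters even when the word is
longer, as with the 11-letter word in the sample.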

filename = "#{ folder }\\#{ special_text }#{ index }#{ att.filename }"
This is the one where you determine what your file is called and where
it will go. I’ve decided to give them each a unique name by adding the
main folder, the content we pulled from the email, an index so it’s a
unique name, and the original filename as well. You can customise this
however you like.
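A quick way to check what such a filename looks like before saving
anything. All values below are placeholders, and the underscores between
the parts are my own addition to keep them readable:

```ruby
folder       = 'C:\\Users\\me\\example'  # placeholder path
special_text = 'REFABCDEFG'
index        = 0
att_name     = 'invoice.pdf'

# "\\" is one literal backslash. A lone backslash directly before an
# interpolation would switch the interpolation off, so it must be doubled.
filename = "#{folder}\\#{special_text}_#{index}_#{att_name}"
puts filename
```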

filenames << ( filename = …
At this point I’m collecting all the filenames together in case I want
to report them back or do something else with them. It isn’t used in the
example but you could then do something else with this array.

Let me know if there’s any further clarification you need. I haven’t
looked into calendar appointments, but I think there are a few examples
of doing this online, and you now have an interface with Outlook to play
with.
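For the calendar part, a sketch along these lines may be a starting
point. It is untested against a real Outlook and assumes Windows with
Outlook running; create_appointment is a name chosen for this sketch,
and passing the start time as a string is an assumption, since how
Outlook parses date strings can depend on the locale.

```ruby
# Sketch only: subject, start time and duration stand in for values
# parsed from the PDF.
def create_appointment(outlook, subject, start_time, minutes)
  appt = outlook.CreateItem(1)        # 1 = olAppointmentItem
  appt.Subject  = subject
  appt.Start    = start_time.strftime('%Y-%m-%d %H:%M')
  appt.Duration = minutes             # length in minutes
  appt.Save
  appt
end

# require 'win32ole'
# outlook = WIN32OLE.connect('Outlook.Application')
# create_appointment(outlook, 'Job A1234', Time.new(2013, 3, 12, 9, 0), 30)
```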

In that case the 271st email in your inbox is an object which doesn’t
support SenderEmailAddress. Presumably some sort of out of office or
server notification or something like that.
I can’t duplicate the error with my inbox but this addition might do it:

if ( msg.Senderemailaddress =~ /@exampledomain.com$/i rescue false ) &&

Basically rescue should trigger if there’s an error within that line,
and returning false from the expression would cause it to skip the “if”
block.

I’m running Outlook 2010.

This is the error I see:

271
NoMethodError: unknown property or method: `Senderemailaddress'
    HRESULT error code:0x80020006
      Unknown name.
        from (irb):19:in `method_missing'
        from (irb):19:in `block in irb_binding'
        from (irb):17:in `each'
        from (irb):17:in `with_index'
        from (irb):17
        from C:/Ruby200-x64/bin/irb:12:in

On 10.03.2013 19:17, Joel P. wrote:

if ( msg.Senderemailaddress =~ /@exampledomain.com$/i rescue false ) &&

Basically rescue should trigger if there’s an error within that line,
and returning false from the expression would cause it to skip the “if”
block.

`rescue false’ will rescue nearly all exceptions that might occur, which
is a potentially dangerous thing. Better use a begin/rescue/end block.
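A small plain-Ruby illustration of the point, with a stand-in class
instead of an Outlook item (Item and sender_email_address are invented
for the example):

```ruby
class Item; end   # stand-in for an object without the sender property
item = Item.new

# Modifier form: rescues any StandardError, so the ZeroDivisionError
# below is silently swallowed just like the NoMethodError.
hidden = (1 / 0 rescue false)
quiet  = (item.sender_email_address =~ /@exampledomain\.com\z/i rescue false)

# Block form: only the expected error is rescued.
matched = begin
  item.sender_email_address =~ /@exampledomain\.com\z/i
rescue NoMethodError
  false
end

p hidden, quiet, matched
```

All three values come out false, but only the block form would let an
unexpected error, such as a division by zero, surface.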

On 10.03.2013 19:26, [email protected] wrote:

`rescue false’ will rescue nearly all exceptions that might occur, which
is a potentially dangerous thing. Better use a begin/rescue/end block.

I did not follow the start of the thread, but probably even better
in this case would be to use `respond_to?’ instead of relying on
exceptions.

unknown wrote in post #1100964:

I did not follow the start of the thread, but probably even better
in this case would be to use `respond_to?’ instead of relying on
exceptions.

These are valid points, specific error trapping is good, and avoiding
errors even better. Unfortunately I haven’t been able to get
“respond_to?” working with ole methods, so it’ll have to be rescue.
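One thing that may be worth trying: WIN32OLE objects answer calls
through method_missing (see the backtrace earlier in the thread), which
is probably why plain respond_to? does not help. The win32ole library
also has an ole_respond_to? method that asks the OLE object itself. I
have not verified how it behaves with Outlook items, so treat this as a
sketch; sender_of is a name chosen for the example.

```ruby
# Sketch only: msg would be an Outlook item obtained through win32ole.
# Objects without ole_respond_to? (plain Ruby objects) count as "no".
def sender_of(msg)
  return nil unless msg.respond_to?(:ole_respond_to?)
  return nil unless msg.ole_respond_to?('SenderEmailAddress')
  msg.SenderEmailAddress
end
```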

Can’t do a single line rescue with a specific class…

if (
  begin
    msg.Senderemailaddress =~ /@exampledomain.com$/i
  rescue NoMethodError
    false
  end
) &&

This link shows how to get specific folders (users) in outlook.

Joel P. wrote in post #1100986:

Can’t do a single line rescue with a specific class…

Oh, thanks for reminding me! Many years ago I made a thing which I
thought could be potentially useful. Recently I even modified it to fit
with my more mature understanding of Ruby. Inspired by this message,
I’ve just gemified it as the “try” gem on RubyGems.org.

require 'try'
if ( Try.trap(NoMethodError => false) { msg.Senderemailaddress =~ /@exampledomain.com$/i } ) &&

Is there a way of calling on a specific Inbox if the Outlook instance
is connected to multiple accounts, i.e. other than

inbox = ol.GetNameSpace('MAPI').GetDefaultFolder(6)
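One possibility, as far as I know the Outlook object model, is to go
through the namespace’s Folders collection by display name instead of
GetDefaultFolder. The names 'Shared Mailbox' and 'Inbox' below are
placeholders (the Inbox name can differ by language), and inbox_for is a
helper invented for this sketch.

```ruby
# Sketch only: use the names shown in Outlook's folder pane.
def inbox_for(namespace, store_name, folder_name = 'Inbox')
  namespace.Folders.Item(store_name).Folders.Item(folder_name)
end

# require 'win32ole'
# ns = WIN32OLE.connect('Outlook.Application').GetNamespace('MAPI')
# inbox = inbox_for(ns, 'Shared Mailbox')
```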

Nice! It looks readable and easy to use. I’ve installed the gem to play
with it a bit as well.

TMTOWTDI

On 10.03.2013 22:59, Joel P. wrote:

false
end
) &&

I would rather turn this around:

begin
  sender = msg.Senderemailaddress
rescue NoMethodError
  # whatever
else
  if sender =~ …
    …
  end
end
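Filled in as a complete, runnable example with stand-in objects
(MailItem, Report and from_domain? are invented for the illustration;
with Outlook the property would be Senderemailaddress):

```ruby
MailItem = Struct.new(:sender_email_address)  # stand-in for a mail item
Report   = Class.new                          # stand-in without the property

def from_domain?(msg)
  begin
    sender = msg.sender_email_address
  rescue NoMethodError
    false                                     # not a mail item: skip it
  else
    # Runs only if nothing was raised, so errors in the match itself
    # are never mistaken for a missing property.
    !!(sender =~ /@exampledomain\.com\z/i)
  end
end

p from_domain?(MailItem.new('ed@exampledomain.com'))
p from_domain?(MailItem.new('ed@elsewhere.org'))
p from_domain?(Report.new)
```

The benefit of the else branch is that only the property lookup sits
inside the rescued section.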