Win32OLE + DRb - Windows = Fun

Hey,

The following is a message posted to the Boston Ruby Group mailing list
after our second meeting last Tuesday.

[ANNOUNCE]
Boston folks that weren’t at the meeting, check out
http://boston.rubygroup.org/boston/show/HomePage for more info.
[/ANNOUNCE]

===========================================

It sounds like quite a few of us have already discovered how nice DRb is
for helping less-able machines do more interesting things. I’ve used
it twice: once because I couldn’t make SSL connections with Net:HTTPS
on Solaris 5.6 and once when I wanted to be able to use Word from *nix.

My Word solution consists of a few small parts in different places: a
command-line program for human use, a little module to encapsulate
what I wanted to do with Word, and a DRb server running on the Windows
box.

Note: The following code is not intended to be a secure, elegant
solution ready for production deployment–clean this up if you want to
use it for real.

Requirements: A network-addressable Windows box with Word installed
that you can run Ruby on. The Windows box and the calling boxen should
share some drive that you know how to get to. Mine is just an
NFS-mounted drive. Some understanding of the Word object model is
extremely useful.

This example converts Word (.doc) files into WordprocessingML files
(.xml, though I call them .wml), the XML file format in Word 2003.

Command line tool to convert on *nix:

#!/usr/bin/env ruby
require 'drb/drb'
PORT = 2774 # Some open port
HOSTNAME = 'foo.bar.com' # Hostname or IP of Windows box
DRb.start_service

# Connect to the Windows box
drb = DRbObject.new(nil, "druby://#{HOSTNAME}:#{PORT}")

# Ask it to make sure Word is running
word = drb.start_word

ARGV.each {|f|
  # inelegant way of converting my *nix paths to something the
  # Windows box liked
  unix_filename = File.expand_path(f)
  win_filename = unix_filename.gsub(/\//, "\\")
  win_filename.sub!(/^\\work/, "R:")

  # Call the transformation, macro, whatever
  resp = drb.wdtowml(win_filename)
  puts "Converted to WML file: #{resp}"
}
drb.quit
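
The path conversion in the client could be pulled out into a small helper so it can be tried without a Windows box. This is only a sketch: the name to_windows_path is hypothetical, and the "/work" mount point and "R:" drive letter are just the values from the script above, so adjust both for your own setup.

```ruby
# Hypothetical helper: map a *nix path on the shared drive to the
# matching Windows path. Defaults mirror the client script above.
def to_windows_path(unix_path, mount = '/work', drive = 'R:')
  unless unix_path == mount || unix_path.start_with?(mount + '/')
    raise ArgumentError, "#{unix_path} is not under #{mount}"
  end
  rest = unix_path[mount.length..-1]
  # Block form sidesteps backslash escaping rules in replacement strings
  drive + rest.gsub('/') { '\\' }
end

puts to_windows_path('/work/docs/report.doc')
```

Raising on paths outside the shared mount is a design choice of this sketch; the original script simply passes such paths through unchanged.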

My server for the Windows box:

require 'drb'
require 'thread'
require 'drb/acl'
require 'wordhelper' # the module that does the work

PORT = 2774
HOSTNAME = 'foo.bar.com'

# Security?
acl = ACL.new(%w(deny all
                 allow localhost
                 allow zoo.bar.com
                 allow goo.bar.com)) # Some set of boxen you like
DRb.install_acl(acl)

# Let people talk to me, bind me to the Word module
DRb.start_service("druby://#{HOSTNAME}:#{PORT}", WordHelper::Word.new)

# Keep running
DRb.thread.join
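
If you want to try the client/server plumbing before touching Word, something like the following may help. It is a minimal sketch that runs both ends in one Ruby process on localhost; FakeWord is a made-up stand-in for WordHelper::Word and only does the filename swap, so no Windows box is involved.

```ruby
require 'drb/drb'

# Hypothetical stand-in for WordHelper::Word; it only renames the file.
class FakeWord
  def wdtowml(file)
    file.sub(/doc$/, 'wml')
  end
end

# Port 0 asks the OS for any free port; DRb.uri reports the one chosen.
DRb.start_service('druby://localhost:0', FakeWord.new)

# Same kind of proxy the command-line tool builds, pointed at this process
remote = DRbObject.new_with_uri(DRb.uri)
result = remote.wdtowml('R:\\docs\\report.doc')
puts result

DRb.stop_service
```

Swapping FakeWord for the real class and the localhost URI for the Windows box's address gets you back to the setup described in this post.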

The WordHelper module, where the work is done:

require 'win32ole'

module WordHelper
class Word

WORD_HTML = 8  # Ugly, don't use
WORD_XML = 11  # Much nicer, you should use this
WORD_95 = 106  # Help old programs
WORD_DOC = 0  # The regular filetype

attr_reader :wd, :wrd

def start_word
  @wd = WIN32OLE.new('Word.Application')
  # Win32OLE sometimes barfs, so try to start Word
  # in two ways
  begin
    @wrd = WIN32OLE.connect('Word.Application')
  rescue WIN32OLERuntimeError
    @wrd = WIN32OLE.new('Word.Application')
  end
 
  # Set this to 0 if you want to run invisibly
  # Be warned: you'll end up with a lot of zombie Word
  # processes if you're not careful
  @wd.Visible = 1
  return @wd, @wrd
end

# Word to WordprocessingML (xml)
def wdtowml(file)
  begin
    # Expect a proper Windows-ready filename
    doc = @wd.Documents.Open(file)
    new_filename = file.sub(/doc$/, "wml")
    doc.SaveAs(new_filename, WORD_XML)
    doc.Close()
    return new_filename
  rescue
    # Just fail blindly on errors
    @wd.Quit()
    raise "Word encountered an unknown error and crashed."
  end
end

# Almost the same method, just as an example
def wdtohtml(file)
  begin
    # Expect a proper Windows-ready filename
    doc = @wd.Documents.Open(file)
    new_filename = file.sub(/doc$/, "html")
    doc.SaveAs(new_filename, WORD_HTML)
    doc.Close()
    return new_filename
  rescue
    @wd.Quit()
    raise "Word encountered an unknown error and crashed."
  end
end

def quit
  @wd.Quit()
end

end
end # of WordHelper Module
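
Since wdtowml and wdtohtml differ only in the extension and the SaveAs code, the shared part could be sketched as a lookup. The names FORMATS and conversion_target are hypothetical; the numeric codes are simply copied from the constants in the module above, and nothing here talks to Word.

```ruby
# SaveAs format codes, copied from WORD_XML and WORD_HTML above
FORMATS = { 'wml' => 11, 'html' => 8 }

# Hypothetical helper: returns [output filename, SaveAs code].
def conversion_target(file, ext)
  format = FORMATS.fetch(ext) { raise ArgumentError, "unknown format: #{ext}" }
  # Refuse non-.doc names so the output never overwrites the source
  raise ArgumentError, "not a .doc file: #{file}" unless file =~ /\.doc\z/i
  [file.sub(/\.doc\z/i, ".#{ext}"), format]
end

name, code = conversion_target('R:\\docs\\report.DOC', 'wml')
puts "#{name} (format #{code})"
```

Unlike the /doc$/ pattern in the module, this version anchors on the dot and ignores case, which is a choice of the sketch rather than something the original code does.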

Another example with the use of macros or the Ruby equivalent:

Now, if you know that your Word instance will always have a set of
macros (from a template, say), you can call them thusly:

def wdrunmacro(file, macro)
  begin
    # Expect a proper Windows-ready filename
    doc = @wd.Documents.Open(file)
    @wrd.Run("TheMacroIAlwaysRun", doc)
    @wrd.Run(macro, doc)  # the macro name passed in
    doc.Save()
    doc.Close()
    return file  # saved in place, so the filename is unchanged
  rescue
    @wd.Quit()
    raise "Word encountered an unknown error and crashed."
  end
end

===============================================

The above suffers from relying on macros being available whenever the
method is called. With a little work, you should be able to translate
your VBA macros into Ruby code, callable from anywhere.

Here’s a stupid example that checks the first character of Body
paragraphs following Heading 1 paragraphs for weirdness, deletes that
first character, removes all the character formatting from the Body
paragraph, and styles the paragraph Heading 2 (I said it was stupid…).

Note: This is written in a very VBAish way, which may or may not be good
for you, since it’s a pretty direct mapping.

def deletestupid(doc)
  doc.Paragraphs.each do |para|
    if para.Style.NameLocal.match(/Heading\s?1/)
      p = para.Next # Won't work if this is the last para
      r = p.Range() # So you can talk about characters
      if p.Style.NameLocal.match(/Body/)
        unless r.Characters.First.Text =~ /[ A-Za-z0-9]/
          # could also be r.Characters(1).Delete()
          r.Characters.First.Delete()

          # Blast away character formatting
          p.Range.Font.Reset()

          # Apply a new paragraph style
          p.Style = doc.Styles("Heading 2")
        end
      end
    end
  end
end
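
The "weirdness" test in deletestupid is just a character class, so it can be tried on plain strings without Word. A rough sketch, where weird_first_char? is a hypothetical name and treating empty text as not weird is my own assumption:

```ruby
# Same character class as the unless test in deletestupid
PLAIN_FIRST_CHAR = /[ A-Za-z0-9]/

# Hypothetical helper: true when the first character is not a space,
# letter, or digit. Empty or nil text is treated as not weird.
def weird_first_char?(text)
  first = text.to_s[0, 1]
  return false if first.empty?
  first !~ PLAIN_FIRST_CHAR
end

["*Intro", "Intro", " indented", ""].each do |s|
  puts "#{s.inspect}: #{weird_first_char?(s)}"
end
```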

I’d love to hear your thoughts, comments, and corrections on the above.

Also here: http://kfahlgren.com/blog/?p=12

HTH,
Keith

Hi!

Interesting.
Thanks.

MCI

This is interesting stuff. I have recently used similar techniques to
execute Excel workbooks in parallel on a computer grid.

On 2/11/06, Méta-MCI [email protected] wrote:

– Chiaroscuro –
Liquid Development Blog:
http://feeds.feedburner.com/blogspot/liquiddevelopment