Forum: Ruby Win32OLE + DRb - Windows = Fun

Keith Fahlgren (Guest)
on 2006-02-10 22:08
(Received via mailing list)
Hey,

The following is a message posted to the Boston Ruby Group mailing list
after our second meeting last Tuesday.

[ANNOUNCE]
Boston folks that weren't at the meeting, check out
http://boston.rubygroup.org/boston/show/HomePage for more info.
[/ANNOUNCE]

===========================================

It sounds like quite a few of us have already discovered how nice DRb is
for helping less-able machines do more interesting things.  I've used
it twice: once because I couldn't make SSL connections with net/https
on Solaris 5.6 and once when I wanted to be able to use Word from *nix.

My Word solution consists of a few small parts in different places: a
command-line program for human use, a little module to encapsulate
what I wanted to do with Word, and a DRb server running on the Windows
box.

Note: The following code is not intended to be a secure, elegant
solution ready for production deployment--clean this up if you want to
use it for real.

Requirements: A Windows box with Word that can run Ruby and is
reachable over the network. The Windows box and the calling boxen
should share some drive that both can get to. Mine is just an NFS
mounted drive. Some understanding of the Word object model is
extremely useful.


This example converts Word (.doc) files into WordprocessingML files,
the XML file format in Word 2003 (.xml, though I call them .wml).


Command line tool to convert on *nix:
===============================================
#!/usr/bin/env ruby
require 'drb/drb'
PORT = 2774   # Some open port
HOSTNAME = 'foo.bar.com'   # Hostname (or IP) of the Windows box
DRb.start_service

# Connect to the Windows box
drb = DRbObject.new(nil, "druby://#{HOSTNAME}:#{PORT}")

# Ask it to make sure Word is running
word = drb.start_word

ARGV.each {|f|
  # inelegant way of converting my *nix paths to something the
  # Windows box liked
  unix_filename = File.expand_path(f)
  win_filename = unix_filename.gsub(/\//, "\\")
  win_filename.sub!(/^\\work/, "R:")  
 
  # Call the transformation, macro, whatever
  resp = drb.wdtowml(win_filename)
  puts "Converted to WML file: #{resp}"
}
drb.quit
===============================================
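The path rewriting in the tool above is tied to one particular setup
(/work on the *nix side showing up as R: on the Windows side). As a
rough sketch only, the same idea can be pulled into a small helper. The
name to_windows_path and its mount/drive parameters are made up for
illustration and are not part of the original tool:

```ruby
# Hypothetical helper: map a *nix path under a shared mount to the
# drive letter the Windows box uses for the same share.
# The defaults ('/work' and 'R:') are taken from the example above.
def to_windows_path(unix_path, mount = '/work', drive = 'R:')
  unless unix_path == mount || unix_path.start_with?(mount + '/')
    raise ArgumentError, "#{unix_path} is not under #{mount}"
  end
  # Swap the mount prefix for the drive letter, then flip the slashes
  drive + unix_path[mount.length..-1].tr('/', '\\')
end

puts to_windows_path('/work/docs/report.doc')  # R:\docs\report.doc
```

Raising on paths outside the mount is a design choice of this sketch:
it fails on the *nix side instead of handing Word a path it cannot open.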

My server for the Windows box:
===============================================
require 'drb'
require 'thread'
require 'drb/acl'
require 'wordhelper' # the module that does the work

PORT = 2774
HOSTNAME = 'foo.bar.com'

# Security?
acl = ACL.new(%w(deny all
                 allow localhost
                 allow zoo.bar.com
                 allow goo.bar.com)) # Some set of boxen you like
DRb.install_acl(acl)

# Let people talk to me, bind me to the Word module
DRb.start_service("druby://#{HOSTNAME}:#{PORT}", WordHelper::Word.new)

# Keep running
DRb.thread.join
===============================================
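The ACL can be tried out without starting a server, since
ACL#allow_addr? takes the same four-element address array that a
socket's peeraddr returns. A small sketch follows; the addresses are
placeholders from the documentation ranges, not the boxen above. One
assumption worth checking on your own setup: hostname entries such as
zoo.bar.com are compared against the hostname field of that array, so
on a Ruby that skips reverse lookups, IP or CIDR entries may behave
more predictably.

```ruby
require 'drb/acl'

# Deny everyone, then allow the local machine and one subnet.
# These addresses are placeholders for illustration.
acl = ACL.new(%w(deny all
                 allow 127.0.0.1
                 allow 192.0.2.0/24))

# Same layout as a socket's peeraddr: [family, port, hostname, ip]
friend   = ['AF_INET', 2774, 'zoo.bar.com', '192.0.2.7']
stranger = ['AF_INET', 2774, 'evil.example', '203.0.113.9']

puts acl.allow_addr?(friend)
puts acl.allow_addr?(stranger)
```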

The WordHelper module, where the work is done:
===============================================
module WordHelper
  class Word
    require 'win32ole'

    WORD_HTML = 8  # Ugly, don't use
    WORD_XML = 11  # Much nicer, you should use this
    WORD_95 = 106  # Help old programs
    WORD_DOC = 0  # The regular filetype

    attr_reader :wd, :wrd

    def start_word
      @wd = WIN32OLE.new('Word.Application')
      # WIN32OLE sometimes barfs, so try to start Word
      # in two ways
      begin
        @wrd = WIN32OLE.connect('Word.Application')
      rescue WIN32OLERuntimeError
        @wrd = WIN32OLE.new('Word.Application')
      end
     
      # Set this to 0 if you want to run invisibly
      # Be warned: you'll end up with a lot of zombie Word
      # processes if you're not careful
      @wd.Visible = 1
      return @wd, @wrd
    end

    # Word to WordprocessingML (xml)
    def wdtowml(file)
      begin
        # Expect a proper Windows-ready filename
        doc = @wd.Documents.Open(file)
        new_filename = file.sub(/doc$/, "wml")
        doc.SaveAs(new_filename, WORD_XML)
        doc.Close()
        return new_filename
      rescue
        # Just fail blindly on errors
        @wd.Quit()
        raise "Word encountered an unknown error and crashed."
      end
    end

    # Almost the same method, just as an example
    def wdtohtml(file)
      begin
        # Expect a proper Windows-ready filename
        doc = @wd.Documents.Open(file)
        new_filename = file.sub(/doc$/, "html")
        doc.SaveAs(new_filename, WORD_HTML)
        doc.Close()
        return new_filename
      rescue
        @wd.Quit()
        raise "Word encountered an unknown error and crashed."
      end
    end

    def quit
      @wd.Quit()
    end
  end
end # of WordHelper Module
===============================================
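Since wdtowml and wdtohtml differ only in the extension and the format
number, they could be folded into one method. What follows is one
possible sketch, not part of the original module: DocConverter, FORMATS
and the Fake* classes are all invented here so the logic can be run on a
machine without Word. With real Word you would pass in the WIN32OLE
application object instead of FakeWord.

```ruby
# Hypothetical refactoring: one conversion method driven by a table.
class DocConverter
  # extension => Word SaveAs format number (values from the module above)
  FORMATS = { 'wml' => 11, 'html' => 8 }

  def initialize(app)
    @app = app  # a Word.Application object, or a stand-in for testing
  end

  def convert(file, ext)
    format = FORMATS.fetch(ext) { raise ArgumentError, "unknown format #{ext}" }
    new_filename = file.sub(/\.doc\z/i, ".#{ext}")
    raise ArgumentError, "#{file} is not a .doc file" if new_filename == file
    doc = @app.Documents.Open(file)
    begin
      doc.SaveAs(new_filename, format)
    ensure
      doc.Close()  # close the document even if SaveAs fails
    end
    new_filename
  end
end

# Minimal stand-ins for the Word objects, just enough to run the sketch
class FakeDoc
  attr_reader :saved, :closed
  def SaveAs(name, format); @saved = [name, format]; end
  def Close; @closed = true; end
end

class FakeDocuments
  attr_reader :last
  def Open(_file); @last = FakeDoc.new; end
end

class FakeWord
  def Documents; @documents ||= FakeDocuments.new; end
end

word = FakeWord.new
puts DocConverter.new(word).convert('R:\docs\report.doc', 'wml')
```

Unlike the original methods, this sketch does not quit Word on errors;
it only makes sure the document gets closed and lets the exception
travel back to the caller.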


Another example with the use of macros or the Ruby equivalent:

Now, if you know that your Word instance will always have a set of
macros (from a template, say), you can call them thusly:
===============================================
    def wdrunmacro(file, macro)
      begin
        # Expect a proper Windows-ready filename
        doc = @wd.Documents.Open(file)
        @wrd.Run("TheMacroIAlwaysRun", doc)
        @wrd.Run(macro, doc)  # the macro name passed in
        doc.Save()
        doc.Close()
        return file  # saved in place, so the filename is unchanged
      rescue
        @wd.Quit()
        raise "Word encountered an unknown error and crashed."
      end
    end
===============================================

The above suffers from relying on macros being available whenever the
method is called. With a little work, you should be able to translate
your VBA macros into Ruby code, callable from anywhere.

Here's a stupid example that checks the first character of Body
paragraphs following Heading 1 paragraphs for weirdness, deletes that
first character, removes all the character formatting from the Body
paragraph and styles the paragraph Heading 2 (I said it was stupid...).

Note: This is written in a very VBAish way, which may or may not be good
for you, since it's a pretty direct mapping.
===============================================
def deletestupid(doc)
  doc.Paragraphs.each do |para|
    if para.Style.NameLocal.match(/Heading\s?1/)
      p = para.Next  # Won't work if this is the last para
      r = p.Range()  # So you can talk about characters
      if p.Style.NameLocal.match(/Body/)
        unless r.Characters.First.Text =~ /[ A-Za-z0-9]/
          # could also be r.Characters(1).Delete()
          r.Characters.First.Delete()  
         
          # Blast away character formatting
          p.Range.Font.Reset()
         
          # Apply a new paragraph style
          p.Style = doc.Styles("Heading 2")
        end  
      end  
    end  
  end    
end  
===============================================
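The "weirdness" test in deletestupid is a plain regex on one character,
so it can be tried on ordinary strings without Word. A hedged sketch,
with weird_start? being an invented name and not something from the
code above:

```ruby
# Hypothetical helper: true when the text starts with something other
# than a space, an ASCII letter or a digit (the check used above).
def weird_start?(text)
  first = text[0, 1]
  return false if first.nil? || first.empty?  # nothing to delete
  first !~ /[ A-Za-z0-9]/
end

['Plain body text', '* bulleted by hand', '  indented', ''].each do |s|
  puts "#{s.inspect}: #{weird_start?(s)}"
end
```

Note that with this character class any non-ASCII letter also counts
as weird, which may or may not be what you want for your documents.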

I'd love to hear your thoughts, comments, and corrections on the above.

Also here: http://kfahlgren.com/blog/?p=12

HTH,
Keith
Méta-MCI (Guest)
on 2006-02-11 09:38
(Received via mailing list)
Hi!

Interesting.
Thanks.

MCI
Chiaro Scuro (chiaroscuro)
on 2006-02-11 21:30
(Received via mailing list)
This is interesting stuff.  I have recently used similar techniques to
execute excel workbooks in parallel on a computer grid.

--

-- Chiaroscuro --
Liquid Development Blog:
http://feeds.feedburner.com/blogspot/liquiddevelopment