How to add a licence section to source files - Part 1

A couple of years ago I was working on an Android anti-virus scanning
project. I enjoyed it tremendously - Android was still in its infancy so
the platform was new to us, and we were porting code over from the
Symbian platform so it was a huge exercise in unit testing legacy code
which was satisfying.

However, when nearing the end of the project the customer requested that
a copyright notice be placed at the top of each and every source file.
E.g. a source file such as:

    class HelloWorld
    {
          HelloWorld()
          {
               println("Hello World");
          }
    }

would become:

     /**********************
     * A Licence
     *
     * Copyright 2013
     **********************/
     class HelloWorld
     {
           HelloWorld()
           {
                println("Hello World");
           }
     }

This turned out to be a torturous copy-and-paste task! If I’d had the
knowledge I could have written a script which would have done the job in
seconds, but I didn’t and it didn’t feel like the right time to learn.

Since then I’ve often thought I should learn the necessary skills I need
so similar problems in the future could be solved using a script. That
time arrived recently! Using the same problem - that of needing to add a
licence section to source files - I taught myself a scripting language
and wrote the code to solve that problem.

This is to be the first in a short series covering this process. This
and the next post will cover creation of a working script, with later
posts describing the steps needed to make the project ready for GitHub.

Okay, the first step was to decide which scripting language I wanted to
learn. I was aware of the “big 3” - Ruby, Python and Perl - and believed
each one of them to be capable of solving the problem. So, how to choose
between them? I decided to list my prior experience with each one:

  • Perl - have seen existing scripts and made minor changes to them
  • Ruby - have worked on a Ruby on Rails project, but I’m not entirely
    sure what was Rails and what was pure Ruby
  • Python - absolutely no experience

Given that I wanted to build on prior knowledge, this left a decision
between Perl and Ruby. My first thoughts were that Ruby is more readable
than Perl. E.g. to print “true” if the element “x” is in an array
“stuff” in Perl:

    print "true" if grep /^x$/, @stuff

vs Ruby:

    print "true" if stuff.include?("x")

Also, while Perl is hugely capable and widely used (it’s used in
numerous build scripts at Penrillian) Ruby is a much more modern
language making use of modern idioms, and is altogether a better fit
with both my own and Penrillian’s future demands.

As a quick test, I put together a simple Ruby script which accepted
command line arguments and printed them out to ensure it could correctly
extract arguments for use later on. This was as easy as:

    puts ARGV[0]
    puts ARGV[1]

where ARGV is an array of arguments passed to the script. So executing:

    C:\ruby>ruby_testing.rb "hello" "world"

produces:

    hello
    world

Right, so what arguments would the Ruby script need? Well, what will the
script do:
• find source files
• add licence text to each found source file

It needed to know what a source file is - I decided passing the script
the extensions of files to be altered would be a suitable solution for
this - and where to find them. It also needed to know what the text of
the licence is - passing the path to a licence file would suffice for
this. This gave me:
• find source files with extensions matching the given extensions
• add text from given licence file to each found source file

The format of the command line invocation will be something like:

    C:>injectLicenceIntoSrcFiles.rb path_to_licence_file path_to_source_folder cpp java

It made no sense for the script to be invoked with any one of the
arguments missing, so I added some basic error checking:

    if ARGV.length <= 2 then
      puts "You must supply a path to the source files, a path to " \
           "the licence file and some file extensions"
      exit
    end
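
As an aside, the same check can be folded into the step that names the arguments. The following is only a sketch of that idea, not the code the script used: `parse_arguments` and the hash keys are hypothetical names introduced for illustration.

```ruby
# Hypothetical helper (not part of the original script): splits the
# argument list into its three parts, or returns nil when there are
# fewer than three arguments (the same condition as ARGV.length <= 2).
def parse_arguments(argv)
  # The splat collects everything after the first two entries.
  licence_path, source_dir, *extensions = argv
  return nil if extensions.empty?

  { licence_path: licence_path, source_dir: source_dir, extensions: extensions }
end

# Stand-in for ARGV so the sketch can be run on its own.
args = parse_arguments(%w[licence.txt src cpp java])
puts args[:extensions].inspect   # prints ["cpp", "java"]
```

Naming the arguments once means later code does not need to remember that ARGV[0] is the licence file and ARGV[1] is the source folder.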

I then got to work writing the script, googling for info on how to use
arrays and files in Ruby. The end result worked, but was not very
Ruby-like and felt a bit “hacky”. It certainly wasn’t object-orientated.
This was the script at this point:

    #! /usr/bin/env ruby

    if ARGV.length <= 2 then
      puts "You must supply a path to the source files, a path to " \
           "the licence file and some file extensions"
      exit
    end

    @count = 0

    # copy the ARGV array. Plain '=' meant extensions was pointing to
    # the same array and calling shift was also altering ARGV
    extensions = ARGV.dup

    # remove first two entries from extensions and we're left with the
    # file extensions
    extensions.shift
    extensions.shift

    # extract the licence text from the given licence file
    begin
         file = File.open(ARGV[0])
         @licence = file.read
    rescue
         puts "could not find the licence file \"#{ARGV[0]}\"."
         exit
    end

    # prepend the licence text to each file
    def prepend_file(file)
         f = File.open(file, "r+")
         src = f.read
         f.close
         src = @licence + src

         output = File.new(file, "w")
         output.write(src)
         output.close

         @count += 1
    end

    # switch the current working directory to that of the source files
    Dir.chdir ARGV[1]

    extensions.each { |extension|
         # find src files in current folder and all subfolders
         Dir.glob("**/*.#{extension}") do |file|
              prepend_file(file)
         end
    }

    puts "There were #{@count} changes"

The length of the ARGV array is checked immediately and the script
exits if there are insufficient parameters. This array is copied to a
new array, extensions, where the first two elements (path to licence and
path to source files) are shifted (Ruby’s way of removing elements from
the front of an array).
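
To illustrate why the copy matters, here is a small sketch using a stand-in array rather than the real ARGV. The helper `extensions_from` is a hypothetical name; it uses `drop`, which returns a new array and so avoids the copy-then-shift step altogether.

```ruby
# Hypothetical helper: returns everything after the first two entries
# without modifying the array it was given.
def extensions_from(argv)
  argv.drop(2)
end

args = ["licence.txt", "src", "cpp", "java"] # stand-in for ARGV

same_array = args      # plain '=' is a second name for the same array
same_array.shift       # removes "licence.txt" from args too
puts args.length       # prints 3

copy = args.dup        # dup makes a separate (shallow) copy
copy.shift             # args is unaffected this time
puts args.length       # still prints 3

puts extensions_from(["licence.txt", "src", "cpp", "java"]).inspect
# prints ["cpp", "java"]
```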

extensions will then contain only the file extensions and can be
iterated over. For each extension, files with that extension are located
in the source directory (and any sub-directories) and passed to the
prepend_file method, where the licence injection actually takes place.
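
For comparison, the find-and-prepend step might be written more compactly along the following lines. This is a sketch under a few assumptions, not the script as it stood: `inject_licence` is a hypothetical name, the licence text is passed in rather than held in an instance variable, and a temporary directory stands in for a real source tree.

```ruby
require "tmpdir"

# Hypothetical variant of the prepend step. Returns the paths of the
# files that were rewritten, so the caller can report a count.
def inject_licence(source_dir, licence_text, extensions)
  # One recursive glob per extension, as in the original script.
  paths = extensions.flat_map do |extension|
    Dir.glob(File.join(source_dir, "**", "*.#{extension}"))
  end

  paths.each do |path|
    # File.read and File.write open and close the file themselves.
    File.write(path, licence_text + File.read(path))
  end
end

# Try it out on a throwaway directory.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "HelloWorld.java"), "class HelloWorld {}\n")
  File.write(File.join(dir, "notes.txt"), "not a source file\n")

  changed = inject_licence(dir, "/* A Licence */\n", %w[cpp java])
  puts "There were #{changed.length} changes"
  puts File.read(File.join(dir, "HelloWorld.java"))
end
```

Passing the directory into the glob pattern avoids changing the working directory, which keeps relative paths such as the licence file path valid for the whole run.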

As you can see from the last line, there was a nod to user feedback,
printing out how many source files had a licence injected into their
content. I hoped to improve this aspect as work continued.

In the next post, the script is used on a live project which uncovers
its limitations. This leads to a new feature and handling of “special
case” files.

On Tue, Nov 5, 2013 at 5:05 PM, Barry D. wrote:

> This is to be the first in a short series covering this process. This
> and the next post will cover creation of a working script, with later
> posts describing the steps needed to make the project ready for GitHub.

Barry, wouldn’t this be better hosted as a blog? You typically have
more control over formatting and from the looks of it such a series
seems well suited to blogging where you can link back and forth
between articles.

Kind regards

robert