Alf 0.10.0 Released

Hi all! Alf version 0.10.0 has been released! This release brings lots
of
new
features and enhancements as well as a fresh new documentation website
available at http://blambeau.github.com/alf.

Relational Algebra at your fingertips

Alf brings the relational algebra both in Shell and in Ruby. In Shell,
because
manipulating any relation-like data source should be as straightforward
as
a
one-liner. In Ruby, because I’ve never understood why programming
languages
provide data structures like arrays, hashes, sets, trees and graphs but
not
relations… Let’s stop the segregation :wink:

Alf follows semantic versioning (http://semver.org/) and has not reached
its
first stable 1.0.0 release. Quoting semver.org: “Anything may change at
any
time. The public API should not be considered stable”. Including alf
with
the
version requirement “~> 0.10.0” shouldn’t break your code.

Changes:

0.10.0 / 2011-08-15

New recognized data sources

  • Alf now provides an Environment implementation on top of a SQL
    database.
    This means that SQL tables can now be used as data-sources. This
    feature
    relies on the sequel gem (‘gem install sequel’ is required), that
    drives
    recognized SQL servers. Then (don’t forget that ALF_OPTS also exists):

    % alf --env=postgres://user:password@host/database show table
    
  • Alf now recognizes and allows manipulating .csv files as first-class
    data
    sources. CSV output is also supported of course. Under ruby <= 1.9,
    the
    fastercsv gem is required (‘gem install fastercsv’ is required). Then:

    % alf restrict suppliers.csv -- "city == 'Paris'"    (input)
    % alf show suppliers --csv                           (output)
    
  • Alf now recognizes and allows manipulating .log files as first-class
    data
    sources. This feature relies on request-log-analyzer gem that provides
    the
    parsers that Alf uses, and the log formats it recognizes
    (‘gem install request-log-analyzer’ is required). See examples/logs.

New operators and enhancements

  • A GENERATOR operator is introduced. It allows generating a relation
    with
    one
    auto-number attribute, up to a given size.

  • A COERCE operator is introduced. It provides a quick way to obtain
    type-safe
    relations from type-unsafe sources like .csv files. For example:

    % alf coerce mydirtyfile.csv -- name String  price Float  at Time
    

    The coerce operator is of course available in Ruby as well:

    (coerce "mydirtyfile.csv", :name => String, :price => Float, :at 
    

=>
Time)

  • The DEFAULTS (non-relational) operator now accepts default values as
    tuple
    expressions. When used in shell, provided default values are now
    evaluated
    that way. This allows specifying default values as being computed on
    the
    current tuple.

  • Aggregations in the Lispy DSL must not be prefixed by Agg:: anymore.

Miscellaneous enhancements

  • Added ‘alf --input-reader’ to specify $stdin format (csv, rash, etc.)
  • Added ‘alf -Ipath’ that mimics ruby’s -I (add path to $LOAD_PATH
    before
    run)
  • Lispy#run supports command arguments to be passed as a string
  • Lispy#run supports piped commands, with ‘|’ as in shell

Hurting changes to Lispy DSL (and therefore to Relation)

  • The attribute-name syntax of aggregation operators has been removed.
    The
    Agg::
    prefix must not be specified anymore.

    Agg::sum(:qty)    # !! error !!
    Agg::sum{ qty }   # !! error !!
    sum{ qty }        # simply, and only!
    
  • The group aggregation operator has been removed. It will probably be
    replaced
    in a future version. In the meantime, the GROUP relational operator
    allows
    obtaining similar results.

  • Lispy syntax of CLIP has changed (when used with --allbut option)

    (clip :suppliers, [:name, :city], true)
    

(before)
(clip :suppliers, [:name, :city], :allbut => true)
(after)

  • Lispy syntax of DEFAULTS has changed (when used with --strict option)

    (defaults :suppliers, {:country => 'Belgium'}, true)
    

(before)
(defaults :suppliers, {:country => ‘Belgium’}, :strict => true)
(after)

  • Lispy syntax of GROUP has changed (when used with --allbut option)

    (group :supplies, [:sid], :supplying, true)
    

(before)
(group :supplies, [:sid], :supplying, :allbut => true)
(after)

  • Lispy syntax of PROJECT has changed (when used with --allbut option)

    (project :suppliers, [:name, :city], true)
    

(before)
(project :suppliers, [:name, :city], :allbut => true)
(after)

  • Lispy syntax of SUMMARIZE has changed (when used with --allbut option)

    (summarize :supplies, [:qty, :pid], {...}, true)
    

(before)
(summarize :supplies, [:qty, :pid], {…}, :allbut => true)
(after)

Hurting changes in shell

  • The attribute-name syntax of aggregation operators has been removed

    sum(:qty)   # !! error !!
    sum{ qty }  # works
    
  • Shell syntax of GROUP has changed (option separator before introduced
    name)

    % alf --text group supplies  -- pid qty supplying
    

(before)
% alf --text group supplies – pid qty – supplying
(after)

  • Shell syntax of WRAP has changed (option separator before introduced
    name)

    % alf --text wrap suppliers -- city status loc_and_status
    

(before)
% alf --text wrap suppliers – city status – loc_and_status
(after)

  • Shell syntax of QUOTA has changed (–by and --order become pure
    arguments)

    % alf quota supplies --by=sid --order=qty -- position count 
    

sum_qty
“sum{ qty }” (before)
% alf quota supplies – sid – qty – position count sum_qty “sum{
qty
}” (after)

  • Shell syntax of RANK has changed (–order becomes a pure argument)

    % alf rank parts --order=weight,desc,pid,asc -- position
    

(before)
% alf rank parts – weight desc pid asc – position
(after)

  • Shell syntax of SUMMARIZE has changed (–by becomes a pure argument)

    % alf summarize supplies --by=sid -- total_qty "sum{ qty }"
    

(before)
% alf summarize supplies – sid – total_qty “sum{ qty }”
(after)

Bug fixes

  • [In shell] Options are now correctly parsed in presence of option
    separators.
    That is, every argument after a ‘–’ separator is considered a
    non-option
    argument.