Summaryse 1.1.0 Released

summaryse version 1.1.0 has been released!

sudo gem install summaryse

Just one day after its birthday, I have already reached a second
milestone. A lot of useful features have been added for fun and profit!
Examples are given in the changelog excerpt below.

Array#summaryse

Summarize arrays with full power. As a side effect, this gem allows
merging multiple YAML configuration files with power, without
sacrificing simplicity…

Changes:

1.1.0 / 2011.07.12

  • Added a way to register user-defined aggregation functions:

    Summaryse.register(:comma_join) do |ary|
      ary.join(', ')
    end
    [1, 4, 12, 7].summaryse(:comma_join)
    # => "1, 4, 12, 7"
    
  • Added the ability to specify aggregations on hash keys that are not
    used at all. In this case, the aggregator is called on an empty
    array.

    [
      { :size => 12 },
      { :size => 17 }
    ].summaryse(:size => :max, :hobbies => lambda{|a| a})
    # => {:size => 17, :hobbies => []}
    
  • Added the ability to use objects responding to to_summaryse as
    aggregator functions:

    class Foo
      def to_summaryse; :sum; end
    end
    [1, 2, 3].summaryse(Foo.new)
    # => 6
    
  • Added the ability to explicitly bypass Hash entries as the result of
    a computation, by returning Summaryse::BYPASS:

    [
      { :hobbies => [:ruby],  :size => 12 },
      { :hobbies => [:music], :size => 17 }
    ].summaryse(:size => :max, :hobbies => lambda{|a| Summaryse::BYPASS})
    # => {:size => 17}

  • The semantics of aggregating empty arrays is now guaranteed. Due to
    duck typing, nil is returned in almost all cases, except :count so
    far. This is specified in the README.

  • Best-effort for yielding friendly hash ordering under ruby >= 1.9

  • Array#summaryse now raises an ArgumentError when it does not
    understand its aggregator argument.
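
The empty-array guarantee above presumably mirrors how plain Ruby itself behaves. The sketch below uses only core Array methods, not summaryse, so it shows the language's behaviour rather than the gem's documented semantics:

```ruby
# Plain Ruby, no gem required: how core Array methods treat an empty array.
empty = []

max_of_empty   = empty.max         # nil
min_of_empty   = empty.min         # nil
sum_of_empty   = empty.inject(:+)  # nil, because no initial value is given
first_of_empty = empty.first       # nil
count_of_empty = empty.count       # 0, the one result that is not nil

p [max_of_empty, min_of_empty, sum_of_empty, first_of_empty, count_of_empty]
# prints [nil, nil, nil, nil, 0]
```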

Enjoy!
B

On Tue, Jul 12, 2011 at 9:58 PM, Bernard L. wrote:

  • http://github.com/blambeau/summaryse

  • Added a way to register user-defined aggregation functions:

    Summaryse.register(:comma_join) do |ary|
      ary.join(', ')
    end
    [1, 4, 12, 7].summaryse(:comma_join)
    # => "1, 4, 12, 7"

Personally I’d rather directly use #join - it’s short as well and much
clearer what is happening - especially if #register and #summaryse
calls are in separate locations.

  • Added the ability to specify aggregations on hash keys that are not
    used at all. In this case, the aggregator is called on an empty
    array.

    [
      { :size => 12 },
      { :size => 17 }
    ].summaryse(:size => :max, :hobbies => lambda{|a| a})
    # => {:size => 17, :hobbies => []}

I am not sure I understand what this is good for. Can you provide a
bit more insight?

  • Added the ability to use objects responding to to_summaryse as
    aggregator functions:

    class Foo
      def to_summaryse; :sum; end
    end
    [1, 2, 3].summaryse(Foo.new)
    # => 6

Why would I want to do that? Wouldn’t it be much shorter to just use
the constant?

  • Added the ability to explicitly bypass Hash entries as the result of
    a computation, by returning Summaryse::BYPASS

    [
      { :hobbies => [:ruby],  :size => 12 },
      { :hobbies => [:music], :size => 17 }
    ].summaryse(:size => :max, :hobbies => lambda{|a| Summaryse::BYPASS})
    # => {:size => 17}

Hm…

Generally I would rather implement those directly. The reason is
simply that this will reduce component dependencies of my applications
by one. When weighing the use of an external component (e.g. a gem)
against doing it internally, the benefit must be higher than the cost
for me.
YMMV though.

Kind regards

robert

I understand your objections and must add that summaryse is somewhat
unusual. However, a changelog is not a README…

   Summaryse.register(:comma_join) do |ary|
     ary.join(', ')
   end
   [1, 4, 12, 7].summaryse(:comma_join)
   # => "1, 4, 12, 7"

Personally I’d rather directly use #join - it’s short as well and much
clearer what is happening - especially if #register and #summaryse
calls are in separate locations.

Of course, if you have the array at hand. But that array may be far out
of your control, because of the many recursive calls that summaryse
makes when summarizing complex relations. Have a look at the README.
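
To illustrate the point about arrays that are built during recursion, here is a minimal sketch in plain Ruby. The AGGREGATORS table and the summarize method are hypothetical stand-ins written for this example; they are not the summaryse implementation, and the nested-spec convention is an assumption:

```ruby
# Hypothetical mini-summarizer, NOT the summaryse implementation.
AGGREGATORS = {
  :max        => lambda { |values| values.max },
  :comma_join => lambda { |values| values.join(', ') }
}

# Summarizes an array of hashes. In the spec, each key maps either to an
# aggregator name or to a nested spec Hash describing a nested relation.
def summarize(tuples, spec)
  spec.each_with_object({}) do |(key, agg), result|
    # The caller never sees this intermediate array.
    values = tuples.select { |t| t.key?(key) }.map { |t| t[key] }
    result[key] =
      if agg.is_a?(Hash)
        # Nested relation: concatenate the sub-arrays, then recurse.
        summarize(values.flatten(1), agg)
      else
        AGGREGATORS.fetch(agg).call(values)
      end
  end
end

configs = [
  { :size => 12, :plugins => [{ :name => "a" }, { :name => "b" }] },
  { :size => 17, :plugins => [{ :name => "c" }] }
]
spec   = { :size => :max, :plugins => { :name => :comma_join } }
result = summarize(configs, spec)
# result == {:size => 17, :plugins => {:name => "a, b, c"}}
```

The array ["a", "b", "c"] only exists inside the recursive call, so the only way to join it is to name an aggregator up front.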

   [
     { :size => 12 },
     { :size => 17 }
   ].summaryse(:size => :max, :hobbies => lambda{|a| a})
   # => {:size => 17, :hobbies => []}

I am not sure I understand what this is good for. Can you provide a
bit more insight?

Summaryse tries not to make any strong assumption about the heading
of the summarized relations (relation = set of tuples ~= array of
hashes). I don’t want to throw an error in such a case (recall from the
README that one of my main goals is to merge YAML files easily), and
assuming an empty array is therefore sound.
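
The following plain-Ruby sketch shows why an empty array falls out naturally when no tuple has the key. The helper name values_for is made up for this example and is not part of summaryse:

```ruby
# Hypothetical helper: collects the values found under key, skipping
# tuples that do not have that key at all.
def values_for(tuples, key)
  tuples.select { |t| t.key?(key) }.map { |t| t[key] }
end

tuples = [{ :size => 12 }, { :size => 17 }]

sizes   = values_for(tuples, :size)     # [12, 17]
hobbies = values_for(tuples, :hobbies)  # [], no tuple has :hobbies

# No error for the missing key: the aggregation just sees an empty array.
summary = { :size => sizes.max, :hobbies => hobbies.flatten }
# summary == {:size => 17, :hobbies => []}
```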

Why would I want to do that? Wouldn’t it be much shorter to just use
the constant?

Yes, of course, because this is a simple example (extracted from a
changelog, even if I have to admit that the same example is in the
README). In certain cases it might be useful to provide complex and
reusable summarization operators as named modules or classes.
to_summaryse is there for that.
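
As a rough sketch of how such a reusable operator could be wired up, the code below coerces a Symbol, a callable, or an object responding to to_summaryse into a callable. BUILTIN_AGGREGATORS, resolve_aggregator and the Average class are hypothetical names for this example. Whether the gem accepts a lambda returned from to_summaryse is an assumption; the changelog only shows it returning :sum.

```ruby
# Hypothetical sketch of aggregator coercion; the names below are
# illustrative and not taken from the summaryse source.
BUILTIN_AGGREGATORS = {
  :sum   => lambda { |values| values.inject(:+) },
  :count => lambda { |values| values.size }
}

# Turns a Symbol, a callable, or anything responding to to_summaryse
# into a callable aggregator; raises ArgumentError otherwise.
def resolve_aggregator(agg)
  if agg.respond_to?(:to_summaryse)
    resolve_aggregator(agg.to_summaryse)
  elsif agg.is_a?(Symbol)
    BUILTIN_AGGREGATORS.fetch(agg) do
      raise ArgumentError, "unknown aggregator: #{agg.inspect}"
    end
  elsif agg.respond_to?(:call)
    agg
  else
    raise ArgumentError, "cannot use #{agg.inspect} as an aggregator"
  end
end

# A reusable, named summarization operator (assumed usage).
class Average
  def to_summaryse
    lambda { |values| values.empty? ? nil : values.inject(:+).to_f / values.size }
  end
end

average = resolve_aggregator(Average.new).call([1, 2, 3])  # 2.0
total   = resolve_aggregator(:sum).call([1, 2, 3])         # 6
```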

Hm… Same remark.

Generally I would rather implement those directly. The reason is
simply that this will reduce component dependencies of my applications
by one. When weighing the use of an external component (e.g. a gem)
against doing it internally, the benefit must be higher than the cost
for me.
YMMV though.

I even encourage you to do so, especially in the simplest cases.
Merging complex YAML files is not a simple case. I’ve already
implemented that in three projects, and would like to factor that
feature out somewhere. An announcement is just an announcement. You’re
not required to find a gem useful ;-)
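
For comparison, this is roughly what a hand-rolled merge of two small YAML documents looks like using only Ruby's standard yaml library. The documents and the merge rules (maximum for size, union for hobbies) are invented for this example:

```ruby
require 'yaml'

# Two made-up configuration documents.
base  = YAML.load("size: 12\nhobbies:\n  - ruby\n")
local = YAML.load("size: 17\nhobbies:\n  - music\n")

# Step 1: collect, per key, the values found in every document.
collected = {}
[base, local].each do |config|
  config.each { |key, value| (collected[key] ||= []) << value }
end

# Step 2: apply one hand-written rule per key.
merged = {
  "size"    => collected["size"].max,
  "hobbies" => collected["hobbies"].flatten.uniq
}
# merged == {"size" => 17, "hobbies" => ["ruby", "music"]}
```

Each new key needs its own rule in step 2, and nested documents need the same two steps applied recursively.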

Summaryse is an attempt to have a powerful summarization operator that
supports relations in relations in relations, etc. It is a bit unusual,
I confess, especially regarding special features like bypass and empty
arrays… ok.

Bernard