Using ruby hash on array

Stuart Clarke (sclarke)
on 2008-12-17 23:28
I would like to process some data from an array and use a hash to
perform a count on one aspect of the data in the array. The array holds
lines of similarly formatted data, like so:

data A      data B     data C     data D     data E     data F
data A      data B     data C     data D     data E     data F
data A      data B     data C     data D     data E     data F

data A is the identifier of each row of data. data F contains a lot
of data, so it needs a regular expression, which is passed to a method,
for example:

"#{data F[/Name:\t(.+?)\r\n/, 1]}"

I am stuck writing the hash section of the code. The algorithm is:

if data A == 100
load data F reg expression into hash
end
count data F to determine the number of each name extracted by the
regular expression
print bob 6, sue 12, tim 1

I hope this makes sense and many thanks in advance
Louis-Philippe (Guest)
on 2008-12-17 23:49
(Received via mailing list)
I don't know if I got you right...
If a is the identifier, then your data structure could be something
like:

hash = { 'data A' => ['data B', 'data C', 'data D', 'data E', 'data F'],
'data A' => ['data B', 'data C', 'data D', 'data E', 'data F'],
'data A' => ['data B', 'data C', 'data D', 'data E', 'data F'] }

where you could look up a row with hash['data A'], test for it with
hash.has_key?('data A'), and reach data F with hash['data A'][4].
(Each row needs its own distinct key, since a Hash keeps only one value
per key.)

Robert Klemme (Guest)
on 2008-12-18 09:42
(Received via mailing list)
2008/12/17 Stuart Clarke <stuart.clarke1986@gmail.com>:
> I would like to process some data from an array and using hash to
> perform a count on one aspect of the data in the array. The array holds
> lines of similarly formatted data like so
>
> data A      data B     data C     data D     data E     data F
> data A      data B     data C     data D     data E     data F
> data A      data B     data C     data D     data E     data F

You do not say whether the format is delimited or fixed width.

> end
> count data F to determine the number of each name extracted by the
> regular expression
> print bob 6, sue 12, tim 1

A framework:

Data = Struct.new :a, :b, :c, :d, :e, :f

def Data.parse(line)
  d = new(*line.strip.split(/\s+/))
  d.a = Integer(d.a)  # the ID is compared with 100 below, so convert a, not f
  d
end

count = Hash.new 0

ARGF.each do |line|
  data = Data.parse(line)
  count[data.f[/Name:\t(.+?)\r\n/, 1]] += 1 if data.a == 100
end

count.each do |a, cnt|
  printf "%20s %6d\n", a, cnt
end

Kind regards

robert
Dan Diebolt (dandiebolt)
on 2008-12-18 12:41
(Received via mailing list)
I am having a difficult time understanding what you are asking, but
perhaps this will help:

lines = <<EOF
100|data B|data C|data D|data E:Name:bob
200|data B|data C|data D|data E:Name:sue
200|data B|data C|data D|data E:Name:tim
200|data B|data C|data D|data E:Name:tim
EOF

names=Hash.new(0)
lines.each_line do |line|  # String#each was removed in Ruby 1.9; each_line works everywhere
  a,b,c,d,e=line.chomp.split("|")
  names[e[/Name:([a-z]+)/,1]] += 1 if a=="200"
end

names

=> {"tim"=>2, "sue"=>1}
Stuart Clarke (sclarke)
on 2008-12-18 15:57
Sorry for not being clear. I am actually parsing data from Windows event
logs, and as a result the data is held in structured fields. Earlier in
my program I load the contents of a number of event logs into an array
and then read the data using structs, for example event.event_id to read
the ID number or, for the more complicated descriptions,
event.description[/Name:\t(.+?)\r\n/, 1]

So what I would like to do is read my event log array and check for
specific event IDs (if event.event_id == 100).

If the if statement finds the ID 100, it reads the name from
event.description. Everything mentioned thus far is working correctly.

When this is complete, I then want to count how many times each
name occurs, e.g. bob = 2, sue = 12.
My thought was to load the event.description[/name:/] into a hash and
then do a count on each name in the hash and print it out.

Does this make more sense?
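
A minimal sketch of the counting step described above, assuming the events
already sit in an array of structs with event_id and description fields.
The Event struct and the sample descriptions are made up for illustration;
only the regular expression is taken from the post.

```ruby
# Hypothetical stand-in for the event structs described in the post.
Event = Struct.new(:event_id, :description)

# Made-up sample data; real descriptions would come from the event log.
events = [
  Event.new(100, "Name:\tbob\r\nDomain:\tX\r\n"),
  Event.new(100, "Name:\tsue\r\nDomain:\tX\r\n"),
  Event.new(200, "Name:\ttim\r\nDomain:\tX\r\n"),
  Event.new(100, "Name:\tbob\r\nDomain:\tX\r\n")
]

# Hash.new(0) returns 0 for missing keys, so += 1 works the first time a name is seen.
name_counts = Hash.new(0)

events.each do |event|
  next unless event.event_id == 100
  name = event.description[/Name:\t(.+?)\r\n/, 1]
  name_counts[name] += 1 if name  # skip descriptions without a Name field
end

name_counts.each { |name, count| puts "#{name} #{count}" }
```

With this sample data only the three events with ID 100 are counted, so the
event for tim (ID 200) does not appear in the result.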


Robert Klemme (Guest)
on 2008-12-18 16:04
(Received via mailing list)
2008/12/18 Stuart Clarke <stuart.clarke1986@gmail.com>:

> Does this make more sense??

Did you actually look at my reply?  You then would probably not have
sent the same answer three times...

robert
Stuart Clarke (sclarke)
on 2008-12-18 20:20
Sorry, I was getting around to replying to you and then got called off in a
hurry. I do not understand some of the code you have used, as I am relatively
new to Ruby. Could you briefly outline the code? It would be greatly
appreciated.

Robert Klemme wrote:
> 2008/12/18 Stuart Clarke <stuart.clarke1986@gmail.com>:
>
>> Does this make more sense??
>
> Did you actually look at my reply?  You then would probably not have
> sent the same answer three times...
>
> robert
Robert Klemme (Guest)
on 2008-12-18 23:35
(Received via mailing list)
On 18.12.2008 20:12, Stuart Clarke wrote:
> Sorry I was getting to replying to you then got called off in a hurry. I
> do not understand some of the code you have used as I am relatively new
> to Ruby. Can you briefly outline the code, it would be greatly
> appreciated.

I define a class holding the data you want to parse from the file via
Struct.  I then also define a parse method on that class that will take
a line (basically a String) and parse it into a new data structure.  You
will likely have to change these, as field names like "a" and "b" are not
very descriptive, and you might also want to do the parsing differently.

Then I initialize a Hash with default value 0.  This is the value
returned for keys that are not present in the Hash.
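
A small sketch of how that default value behaves, using made-up keys. It
only illustrates Hash.new(0) and is not part of the framework above.

```ruby
count = Hash.new(0)      # 0 is returned for any key that is not present

count["bob"] += 1        # same as count["bob"] = count["bob"] + 1, i.e. 0 + 1
count["bob"] += 1
count["sue"] += 1

p count["bob"]           # 2
p count["nobody"]        # 0, the default
p count.key?("nobody")   # false: reading a missing key does not store it
```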

Then the code reads from all input files (ARGF) named on the command
line (or stdin if there is no name), parses each line and increments the
counter for the entry.

Finally the key/value pairs in the Hash are printed (in the Hash's own
iteration order; the code as posted does not sort them).
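
As posted, count.each visits the pairs in whatever order the Hash yields
them. If sorted output is wanted, one possible sketch (with made-up counts)
is to go through Hash#sort, which returns an array of [key, value] pairs
ordered by key:

```ruby
count = { "sue" => 12, "tim" => 1, "bob" => 6 }  # made-up sample counts

# Print the pairs ordered by name.
count.sort.each do |name, cnt|
  printf "%20s %6d\n", name, cnt
end

# To order by count instead, largest first:
by_count = count.sort_by { |_name, cnt| -cnt }
```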

Cheers

  robert
Todd Benson (Guest)
on 2008-12-19 00:00
(Received via mailing list)
On Thu, Dec 18, 2008 at 1:12 PM, Stuart Clarke
<stuart.clarke1986@gmail.com> wrote:
> Sorry I was getting to replying to you then got called off in a hurry. I
> do not understand some of the code you have used as I am relatively new
> to Ruby. Can you briefly outline the code, it would be greatly
> appreciated.

> Robert Klemme wrote earlier:
> A framework:
>
> Data = Struct.new :a, :b:, :c, :d, :e, :f
(leave off the extra : after b  :-)

Struct is an automatic way to define a simple class.  There, he's
defining the structure of a class that he wants to call Data with the
use of Struct.  So with this statement, we will have a class that has
attributes a, b, c, d, e, f, all with getters and setters, and (at
least in 1.8.7 AFAIK) an initialize method.  For example...

MySimpleClass = Struct.new :instance_var
m = MySimpleClass.new "hi"
puts m.instance_var
m.instance_var = "bye"
puts m.instance_var
#hi
#bye

>
> def Data.parse(line)
>   d = new(*line.strip.split(/\s+/))
>   d.f = Integer(d.f)
>   d
> end

There he is defining a class method called parse on the class Data.
This method returns a Data object filled with the info that was in the
string "line".
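
A hedged sketch of the same pattern with more descriptive names. Record, its
field names and the "|" delimiter are assumptions for illustration only; the
real input format was never stated in the thread.

```ruby
# Hypothetical record type; the field names are invented for this example.
Record = Struct.new(:id, :source, :description)

# A class method: called on Record itself, it builds and returns a new Record.
def Record.parse(line)
  r = new(*line.chomp.split("|", 3))  # at most 3 fields, so "|" may appear in the description
  r.id = Integer(r.id)                # "100" becomes 100; raises ArgumentError if not numeric
  r
end

rec = Record.parse("100|Security|Name:bob\n")
puts rec.id + 1        # 101, because id is an Integer now
puts rec.description   # Name:bob
```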

>
> count = Hash.new 0

As explained, that was a Hash initialization with a default value of 0

>
> ARGF.each do |line|
>   data = Data.parse(line)
>   count[data.f[/Name:\t(.+?)\r\n/, 1]] += 1 if data.a == 100
> end

Rack up the count if the conditions are met

>
> count.each do |a, cnt|
>   printf "%20s %6d\n", a, cnt
> end

Print the darn thing out!

hth,
Todd
Stuart Clarke (sclarke)
on 2008-12-21 17:34
Thanks for your help Todd. I have gone for a different approach which I
will post should it work.

I was wondering could you explain this code:

 count.each do |a, cnt|
   printf "%20s %6d\n", a, cnt
 end


I have no idea what the %20s ...... is doing.

Many thanks

Tim Greer (Guest)
on 2008-12-21 20:06
(Received via mailing list)
Stuart Clarke wrote:

> count.each do |a, cnt|
> printf "%20s %6d\n", a, cnt
> end
>
>
> I have no idea what the %20s ...... is doing.

It's just formatting/padding.  %20s right-justifies the string in a field
20 characters wide, and %6d does the same for the integer in a field 6
characters wide.
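
A short sketch of what that format string does, written with format (which
builds the same string that printf would print) and a made-up name and
count:

```ruby
line = format("%20s %6d", "bob", 6)

# The same result spelled out with String#rjust:
same = "bob".rjust(20) + " " + 6.to_s.rjust(6)

p line == same   # true
p line.length    # 27 (20 + 1 space + 6)

# A minus flag left-justifies within the field instead:
left = format("%-20s|", "bob")
p left.length    # 21
```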
Robert Heiler (shevegen)
on 2008-12-22 11:29
> I have no idea what the %20s ...... is doing.

Format specifiers like these are quite common throughout the "programming
world", e.g. in C's printf and so on.

For example:

  "%-9s"  % "12251"  # => "12251    "
  "%9s"   % "12251"  # => "    12251"
  "%09d"  % 12251    # => "000012251"
  "%015d" % "123456" # => "000000000123456"