Forum: Ruby Hash counting

Stuart Clarke (sclarke)
on 2009-02-02 21:44
I am trying to load some data into a hash and then count how many times
each value occurs. If a value occurs 5 or more times, I add some data to
an array. My code is below, followed by an explanation:

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]
counts = Hash.new(0)
  if eventdateID.find {|d| (counts[d] +=1) >= 5}
    @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}")
  end

The first line builds an ID from a date and time value (using gsub to
strip whitespace) and pushes it onto an array. We then process the array,
and if an entry (a date/time ID) occurs 5 or more times, we add some data
to another array.
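
A rough sketch of what that ID-building line does, for readers following along. The sample string is an assumption: it mimics the Time#to_s layout of Ruby 1.8, which was current when this thread was written. The helper names date_id and date_id_strftime are hypothetical and not part of the posted code.

```ruby
# Hypothetical helper mirroring the slicing from the post:
# first 8 non-whitespace characters plus characters 26..30 of the original.
# The .to_s guards against nil when the string is shorter than expected.
def date_id(time_string)
  time_string.gsub(/\s/, '')[0..7] + time_string[26..30].to_s
end

# Alternative that does not depend on character positions.
def date_id_strftime(time)
  time.strftime('%a%b%d%Y')
end

sample = 'Mon Feb 02 21:44:00 +0000 2009' # assumed Ruby 1.8 Time#to_s layout
puts date_id(sample)
puts date_id_strftime(Time.utc(2009, 2, 2, 21, 44, 0))
```

The positional slice only works while the string layout stays exactly the same, so the strftime variant may be the safer choice if event.time_written is a Time object.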

My testing with this code is not picking up any such occurrences,
which I know exist; see below:

MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009

Does anyone have any ideas why my code is not working?

I do not get errors, it just does not return any data.

Thanks in advance
Brian Candler (candlerb)
on 2009-02-02 21:50
STDERR.puts is your friend.

> eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
> eventsbydate[26..30]

STDERR.puts "A: #{eventdateID.inspect}"

> counts = Hash.new(0)
>   if eventdateID.find {|d| (counts[d] +=1) >= 5}
>     @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
> #{@tab}#{event.event_type} #{@tab} #{type}")

STDERR.puts "B: #{@alerts.inspect}"

>   end

Then you can see if the data is what you expect before you go into the
loop.

Note that 'find' will abort after one successful match. Is that what you
want?
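
To illustrate the point about find, here is a small sketch with made-up data. The helper name frequent_ids is hypothetical; it only shows one way to count everything first and filter afterwards.

```ruby
# Hypothetical helper: ids that occur at least `threshold` times.
def frequent_ids(ids, threshold)
  counts = Hash.new(0)
  ids.each { |id| counts[id] += 1 } # count every element first
  counts.select { |_id, n| n >= threshold }.keys
end

ids = %w[a a a b b b c]

# find stops at the first element for which the block is true,
# so the counting inside the block stops there as well.
seen = Hash.new(0)
first = ids.find { |id| (seen[id] += 1) >= 3 }
puts first.inspect
puts seen.inspect

puts frequent_ids(ids, 3).inspect
```

With find, only the first ID that reaches the threshold is returned and later elements are never counted, whereas the two-step version reports every ID that qualifies.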
Ilan Berci (iberci)
on 2009-02-02 21:59
Stuart,

I believe this will get you closer to what you want..

[1,2,2,3,3,3].inject({}) do |hash, val|
  hash[val] ||= 0
  hash[val] +=1
  hash
end

=> {1=>1,2=>2,3=>3}

hth

ilan


Robert Klemme (Guest)
on 2009-02-02 22:25
(Received via mailing list)
On 02.02.2009 21:43, Stuart Clarke wrote:
>   end
You probably rather want

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]
counts = Hash.new(0)
eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
   @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
end

Cheers

  robert
Stuart Clarke (sclarke)
on 2009-02-02 23:37
Thanks Robert.

This is closer to what I need. However, I am still getting no result;
everything works until we get to this section:

> counts = Hash.new(0)
> eventdateID.each {|d| counts[d] +=1}
> counts.each do |d,cnt|
>    @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
> #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
> end

Does it make any difference that the data being read into
eventdateID is alphanumeric, e.g.:

MonFeb022009
MonFeb022009
MonFeb022009

Many thanks.


Stuart Clarke (sclarke)
on 2009-02-03 00:37
I have worked out the problem but I am a little unsure how to solve it.

We have counts, which holds all of the event IDs. However, |d, cnt| is
not counting the number of matching IDs; it just assigns each
ID the number 1.

So given this example, we would expect cnt to find the ID of ?? as
occurring more than 5 times and ignore the rest:

MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
MonFeb022009
TueAug052008
TueAug052008
WedAug062008

However, cnt is instead just giving the number 1 for each ID, for example:

1
1
1
1
1
1
1
1
1
1

Can anyone help me with a fix? Many thanks

Robert Klemme (Guest)
on 2009-02-03 10:56
(Received via mailing list)
2009/2/3 Stuart Clarke <stuart.clarke1986@gmail.com>:
> I have worked out the problem but I am a little unsure how to solve it.
>
> We have counts which holds all of the event ID's, however |d, cnt| is
> not counting the number of matching ID numbers and it just assigns each
> ID the number 1.

What does that mean? What's in the Hash?

> TueAug052008
> 1
> 1
> 1
> 1
> 1
>
> Can anyone help me with a fix? Many thanks

Frankly, you lost me there.  Please do this:

require 'pp'

File.open('/tmp/log', 'w') {|io| io.write(counts.pretty_inspect)}

And look at the output and / or post it here.

robert
Stuart Clarke (sclarke)
on 2009-02-03 17:15
Thanks for replying and sorry for the confusion.

My hash (counts) contains date and time IDs like so: TueAug052008

When I do a puts on counts I get a list of these as per their date and
time values, which is what I want. However, counting to see if there are
more than 5 occurrences of one of the ID values fails and doesn't find
anything in my data set.

I have done as you asked and the output is as follows:

{"WedAug062008"=>1}

This suggests there is a problem. Just for your information doing an
output on counts (the hash) gives this:

MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008

Thanks for your help


Robert Klemme (Guest)
on 2009-02-03 17:49
(Received via mailing list)
2009/2/3 Stuart Clarke <stuart.clarke1986@gmail.com>:
> Thanks for replying and sorry for the confusion.
>
> My hash (counts) contains date and time ID's like so TueAug052008

Obviously not as the output below demonstrates that there is just a
single entry in the Hash.

> When I do a puts on counts I get a list of these as per their date and
> time values, which is what I want. However, counting to see if there are
> more than 5 occurrences of one of the ID values fails and doesn't find
> anything in my data set.
>
> I have done as you asked and the output is as follows:
>
> {"WedAug062008"=>1}

Looks like there is a lot missing.

> This suggests there is a problem. Just for your information doing an
> output on counts (the hash) gives this:

What does "doing an output" mean? Please be more specific (e.g. by
posting complete code, ideally a test case that someone else can
execute), otherwise nobody can help you.

> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> WedAug062008
> WedAug062008

Cheers

robert
Stuart Clarke (sclarke)
on 2009-02-03 18:17
OK, I will get straight to the code causing the problem. First off, you
need to know that 'eventdateID' is an array full of values taken from log
files. A sample of the values in this array:

> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> MonFeb0220091
> WedAug062008
> WedAug062008


Then I have the following code:

>> counts = Hash.new(0)
>> eventdateID.each {|d| counts[d] +=1}
>> counts.each do |d,cnt|
>>    @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
>> #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
>> end


The @alerts.push data is again specific to the logs I am parsing.
Basically each record in the log is given an ID number based on the time
and date values which goes into eventdateID. The purpose of the code
above is to check if any of the ID numbers occur more than 5 times in
eventdateID.


counts = Hash.new(0) - empty hash called counts
eventdateID.each {|d| counts[d] +=1} - process each ID value in
eventdateID and load into the hash counts
counts.each do |d,cnt| - process counts and see how many of each ID
value exist
@alerts.push ............. if cnt >=5 - If there are more than 5 of an
ID push some of the log data to an array which matches the eventdateID


I have done some checking

eventdateID.each {|d| counts[d] +=1}
@alerts.push

gives

MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008

At this stage we are on the right lines: we have the hash counts with
some date IDs in it.

Another test was:

eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
@alerts.push cnt

This gives

1
1
1
1
1
1
1

This is where the problem is. I want it to identify that

MonFeb0220091 occurs 6 times in the counts hash
WedAug062008 occurs twice  in the counts hash

As a result of this I am expecting my code to output the log data to the
@alerts array based on the eventdateID MonFeb0220091, as it occurs more
than 5 times. Below is my code again to summarise. The restriction
is that you do not have the logs, but I can assure you the data in
eventdateID are values like MonFeb0220091.

Code block:

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]
counts = Hash.new(0)
eventdateID.each {|d| counts[d] +=1}
counts.each do |d,cnt|
   @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
end


Thanks again.




Jesús Gabriel y Galán (Guest)
on 2009-02-03 18:32
(Received via mailing list)
On Tue, Feb 3, 2009 at 6:16 PM, Stuart Clarke
<stuart.clarke1986@gmail.com> wrote:

>> WedAug062008
>>> end
> 1
> 1
> 1
> 1
> 1
>

Sorry Stuart, can you show the exact code that produces that output
(including the puts that you are using to print those values)? Because
this works for me as it is:


irb(main):009:0> eventDateID = %w{MonFeb0220091 MonFeb0220091
MonFeb0220091 MonFeb0220091 MonFeb0220091 MonFeb0220091 WedAug062008
WedAug062008}
=> ["MonFeb0220091", "MonFeb0220091", "MonFeb0220091",
"MonFeb0220091", "MonFeb0220091", "MonFeb0220091", "WedAug062008",
"WedAug062008"]
irb(main):010:0> counts = Hash.new(0)
=> {}
irb(main):011:0> eventDateID.each {|d| counts[d] += 1}
=> ["MonFeb0220091", "MonFeb0220091", "MonFeb0220091",
"MonFeb0220091", "MonFeb0220091", "MonFeb0220091", "WedAug062008",
"WedAug062008"]
irb(main):012:0> counts
=> {"MonFeb0220091"=>6, "WedAug062008"=>2}
irb(main):013:0> @alerts = []
=> []
irb(main):014:0> counts.each do |id, cnt|
irb(main):015:1* @alerts.push(id) if cnt >= 5
irb(main):016:1> end
=> {"MonFeb0220091"=>6, "WedAug062008"=>2}
irb(main):017:0> @alerts
=> ["MonFeb0220091"]


If each element in the array eventDateID is stored in the hash as a
different key (which is what seems to be happening), maybe what is
inside the array are not strings, but another class that has a
different implementation of eql?.
Can you inspect the eventDateID array to check that?
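
A minimal sketch of that hypothesis, using two made-up wrapper classes (PlainId and ValueId are hypothetical and do not come from the poster's code). A Hash compares keys with hash and eql?, so objects that do not define them by value each count as a separate key, while plain strings with equal content share one key.

```ruby
# Hypothetical wrapper without value-based equality.
class PlainId
  attr_reader :text

  def initialize(text)
    @text = text
  end
end

# Same wrapper, but with eql? and hash defined in terms of the text.
class ValueId < PlainId
  def eql?(other)
    other.is_a?(ValueId) && text == other.text
  end

  def hash
    text.hash
  end
end

def count_keys(keys)
  counts = Hash.new(0)
  keys.each { |k| counts[k] += 1 }
  counts
end

plain = count_keys(Array.new(3) { PlainId.new('MonFeb022009') })
value = count_keys(Array.new(3) { ValueId.new('MonFeb022009') })
puts plain.values.inspect # one entry per object
puts value.values.inspect # a single entry holding the total
puts count_keys(%w[MonFeb022009 MonFeb022009]).inspect
```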

Jesus.
Stuart Clarke (sclarke)
on 2009-02-03 19:35
Thanks for getting back to me.

I did something similar to you in Fxri and got those results earlier. It
seems you may be correct, and eventdateID and id, cnt do not like
each other so much. After doing an inspect on the eventdateID array I only
get the following:

["WedAug062008"]

This is strange, as it seems to be missing all the other data. For your
information, in my actual code I do @alerts.push(counts)

        counts = Hash.new(0)
        eventdateID.each {|d| counts[d] += 1}
        @alerts.push(counts)
        counts.each do |id,cnt|

and get what is expected:

MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
MonFeb0220091
WedAug062008
WedAug062008

It's the next step, counts.each do |id,cnt|, which is the problem.

> If each element in the array eventDateID is stored in the hash as a
> different key (which is what seems to be happening), maybe what is
> inside the array are not strings, but another class that has a
> different implementation of eql?.

Not sure what you mean by this. This is how I make eventdateID, just
gsub and slicing on a string from a struct:

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]

Many thanks

Jesús Gabriel y Galán (Guest)
on 2009-02-03 21:56
(Received via mailing list)
On Tue, Feb 3, 2009 at 7:34 PM, Stuart Clarke
<stuart.clarke1986@gmail.com> wrote:
> Thanks for getting back to me.

>
> Its the next step counts.each do |id,cnt| which is the problem.

Sorry, but can you post a complete executable piece of code we can use
to reproduce the problem?
You have this:

eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]

but what is eventdateID? Maybe you have an earlier line of code like
eventdateID = [].
I'd like to see the complete picture. Also, what is eventsbydate?
By the way, now I'm realizing that eventsbydate might be a string, so
how can eventdateID contain more than 1 entry at all?
If that's true, then

eventsbydate.gsub(/\s/, '')[0..7] + eventsbydate[26..30]

is also a string. So you are pushing a single string into eventdateID,
so when you later iterate you only get one iteration. Perhaps you have
a loop around the piece of code you showed? If that's the case, then
it makes sense that you never get more than 1 count per entry, because
you are creating the hash every time. So, I think it would be easier
if you pasted the complete program.
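
The suspected failure mode can be sketched as follows. The Event struct and both method names are hypothetical stand-ins, since the real surrounding loop has not been posted; the sketch only contrasts rebuilding the array and hash per event with creating the hash once.

```ruby
Event = Struct.new(:event_id, :day) # hypothetical stand-in for a log record

# Mirrors the suspected bug: the array and the hash are rebuilt for
# every event, so each hash only ever sees a single ID.
def counts_reset_each_time(events)
  last = nil
  events.each do |event|
    ids = []
    ids.push(event.day)
    counts = Hash.new(0)
    ids.each { |d| counts[d] += 1 }
    last = counts
  end
  last
end

# The hash is created once, before the loop, so counts accumulate.
def counts_accumulated(events)
  counts = Hash.new(0)
  events.each { |event| counts[event.day] += 1 }
  counts
end

events = Array.new(6) { Event.new(529, 'MonFeb022009') } +
         Array.new(2) { Event.new(529, 'WedAug062008') }
puts counts_reset_each_time(events).inspect
puts counts_accumulated(events).inspect
```

The first method ends up with a single entry whose count is 1, which resembles the {"WedAug062008"=>1} output reported earlier in the thread.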


>> If each element in the array eventDateID is stored in the hash as a
>> different key (which is what seems to be happening), maybe what is
>> inside the array are not strings, but another class that has a
>> different implementation of eql?.
>
> Not sure what you mean by this.

It was another hypothesis, but I think you can forget about it, since
I'm pretty sure now that with the piece of code you showed you are
only ever pushing one string into eventdateID.

Jesus.
Stuart Clarke (sclarke)
on 2009-02-03 23:12
Thanks for your response. It makes a lot more sense and you are on the
right lines I think. There is other code around this but it does not
bear much relevance:

def scanEVTWithSource(file, source)
@alerts = []
@evtLogArray = []
    begin
    #read the contents of the event logs files
    evtLog = EventLog.open_backup(file, source)

    #put data into an array
    @evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
b.event_id).nonzero? || (a.time_written <=> b.time_written)}

    #event log data collected
    evtLog.close

    if evtLogArray.length == 0
      return
    end

    #failed logons where more than 10 have occurred in a day
    if event.event_id == 529
      eventdateID = []
      #assign all time written values to the eventsbydate array
      eventsbydate = "#{event.time_written}"
      eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
eventsbydate[26..30]
      counts = Hash.new(0)
      eventdateID.each {|d| counts[d] += 1}
      counts.each do |id,cnt|
        @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
#{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
      end
    end
end


I will explain this.

scanEVTWithSource(file, source) takes data and arguments from two
other methods which assist with the reading of the log files.

@evtLogArray - an array full of log data which is inspected in structs

The rest we know about, but for example event.event_id is a struct
member used to inspect the ID field.

Hope this helps and thank you very much for your help. You are right,
eventsbydate is a string based on data from the event.time_written
struct member, using gsub etc. to trim it down into the values you have
already seen.

Regards

David A. Black (Guest)
on 2009-02-03 23:24
(Received via mailing list)
Hi --

On Wed, 4 Feb 2009, Stuart Clarke wrote:

>
>    #put data into an array
>    @evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
> b.event_id).nonzero? || (a.time_written <=> b.time_written)}

I haven't really been following this thread but this caught my eye and
I thought I'd mention this other technique:

   array.sort_by {|e| [e.event_id, e.time_written] }
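
A small sketch comparing the two sorting styles. The Entry struct and the integer timestamps are assumptions made for illustration; real event log records would carry Time values, which compare the same way.

```ruby
Entry = Struct.new(:event_id, :time_written) # hypothetical record type

# Block form from the original method: compare IDs first,
# fall back to the timestamp when the IDs are equal.
def sort_with_block(entries)
  entries.sort do |a, b|
    (a.event_id <=> b.event_id).nonzero? || (a.time_written <=> b.time_written)
  end
end

# sort_by form: arrays compare element by element, giving the same order.
def sort_with_keys(entries)
  entries.sort_by { |e| [e.event_id, e.time_written] }
end

entries = [Entry.new(529, 30), Entry.new(528, 20), Entry.new(529, 10)]
puts sort_with_block(entries) == sort_with_keys(entries)
```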


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Coming in 2009: The Well-Grounded Rubyist (http://manning.com/black2)

http://www.wishsight.com => Independent, social wishlist management!
Jesús Gabriel y Galán (Guest)
on 2009-02-03 23:52
(Received via mailing list)
On Tue, Feb 3, 2009 at 11:12 PM, Stuart Clarke
<stuart.clarke1986@gmail.com> wrote:
> Thanks for your response. It makes a lot more sense and you are on the
> right lines I think. There is other code around this but it does not
> bare much relevance:
>
> def scanEVTWithSource(file, source)
> @alerts = []
> @evtLogArray = []

This is unneeded, since you later assign another array to this
variable without using this one.

>    begin
>    #read the contents of the event logs files
>    evtLog = EventLog.open_backup(file, source)
>
>    #put data into an array
>    @evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
> b.event_id).nonzero? || (a.time_written <=> b.time_written)}

Are you sure you want to put this in an instance variable?

>    #event log data collected
>    evtLog.close
>    if evtLogArray.length == 0

Shouldn't this be checking the @evtLogArray?

>      return
>    end
>
>    #failed logons where more than 10 have occurred in a day
>    if event.event_id == 529

Here we are reaching the culprit, I think. What is event? It's not
defined in this method...

>      end
>    end
> end
>

Let me try to write what I think you want, because I still think the
above code is not what you are actually running: the above as is
will give a NameError at the evtLogArray.length method call. The
following is untested:


def scanEVTWithSource(file, source)
  @alerts = []
  #read the contents of the event logs files
  evtLog = EventLog.open_backup(file, source)
  #put data into an array; sort it using David's advice
  evtLogArray = evtLog.read.sort_by { |e| [e.event_id, e.time_written] }

  #event log data collected
  evtLog.close
  return if evtLogArray.length == 0

   # Important part here: create the hash outside the loop
   # and, actually, do a loop on evtLogArray
   counts = Hash.new(0)
   # select relevant events, mapping them to the modified string
   events = evtLogArray.select {|event| event.event_id == 529}
   events.each do |event|
      event_time = event.time_written.to_s
      eventsbydate = event_time.gsub(/\s/, '')[0..7] +
event_time[26..30]
      counts[eventsbydate] += 1
   end
   counts.each do |id,cnt|
        # Now I have a problem here: what we are putting in the hash
        # is a string, not an event object
        # @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
        #   #{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
        @alerts.push(id) if cnt >= 5
   end
end

I hope this helps. I don't have time now to solve the issue about you
wanting to push the event object to the alerts array, instead of just
the calculated string, but I hope you find a way to do that easily.
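
One possible way to keep the event objects, sketched under assumptions: LogEvent is a hypothetical stand-in for the records returned by the event log reader, alerts_for is a made-up method name, and time_written is assumed to be a Time. Grouping with group_by keeps the events for each day together, so the ones from busy days can be returned directly.

```ruby
LogEvent = Struct.new(:event_id, :time_written, :event_type) # hypothetical

# Returns the failed-logon events (ID 529) that fall on days with at
# least `threshold` such events.
def alerts_for(events, threshold = 5)
  failed = events.select { |e| e.event_id == 529 }
  by_day = failed.group_by { |e| e.time_written.strftime('%a%b%d%Y') }
  by_day.select { |_day, group| group.size >= threshold }.values.flatten(1)
end

events = Array.new(6) { |i| LogEvent.new(529, Time.utc(2009, 2, 2, 10, i), 16) } +
         Array.new(2) { |i| LogEvent.new(529, Time.utc(2008, 8, 6, 9, i), 16) } +
         [LogEvent.new(540, Time.utc(2009, 2, 2, 11, 0), 8)]

alerts_for(events).each do |e|
  puts "#{e.event_id}\t#{e.time_written}\t#{e.event_type}"
end
```

Each returned element is a full event, so the alert string can be built from its fields in whatever format is needed.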

Let me know if this helped.

Jesus.
Robert Klemme (Guest)
on 2009-02-04 10:07
(Received via mailing list)
2009/2/3 Jesús Gabriel y Galán <jgabrielygalan@gmail.com>:
> This is unneeded, since you later assign another array to this
> variable without using this one.

Also, when reinitializing these variables on each method call then
chances are that they can be local variables and not instance
variables - unless, of course, some other method in the class (which
class?) uses the leftovers of scanEVTWithSource in those instance
variables.

I suspect the issue is somewhere above the method. For example,
you might have a loop calling scanEVTWithSource and expecting that
counts are aggregated throughout all calls but they aren't since you
reinitialize the Hash on each call.

>>    #event log data collected
>
>>      counts.each do |id,cnt|
>>        @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
>> #{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
>>      end
>>    end
>> end
>>

Absolutely agree with your other comments.  I still think we haven't
seen all the code. Also, the whole problem is not very clear to me
either.

Cheers

robert
Martin DeMello (Guest)
on 2009-02-04 15:28
(Received via mailing list)
On Wed, Feb 4, 2009 at 12:04 AM, Stuart Clarke
<stuart.clarke1986@gmail.com> wrote:
>
>        counts = Hash.new(0)
>        eventdateID.each {|d| counts[d] += 1}

Here is your problem. Hash.new(0) means "when I query the hash, and
the key I request is not in there, return 0". It does not actually add
{key => 0} to the hash itself. To do that, you need the block form of
Hash.new, whose block receives the hash itself and the key:

counts = Hash.new {|h, k| h[k] = 0}

irb(main):001:0> a = Hash.new(0)
=> {}
irb(main):002:0> b = Hash.new {|h,k| h[k] = 0}
=> {}
irb(main):003:0> a['hello']
=> 0
irb(main):004:0> b['hello']
=> 0
irb(main):005:0> a
=> {}
irb(main):006:0> b
=> {"hello"=>0}

martin
Jesús Gabriel y Galán (Guest)
on 2009-02-04 16:21
(Received via mailing list)
On Wed, Feb 4, 2009 at 3:27 PM, Martin DeMello <martindemello@gmail.com>
wrote:
> On Wed, Feb 4, 2009 at 12:04 AM, Stuart Clarke
> <stuart.clarke1986@gmail.com> wrote:
>>
>>        counts = Hash.new(0)
>>        eventdateID.each {|d| counts[d] += 1}
>
> Here is your problem. Hash.new(0) means "when I query the hash, and
> the key I request is not in there, return 0". It does not actually add
> {key => 0} to the hash itself.

This is true, but counts[d] += 1 is actually counts[d] = counts[d] + 1
so the RHS will evaluate to 1 the first time, assigning it to the hash:

irb(main):001:0> h = Hash.new(0)
=> {}
irb(main):002:0> h["a"] += 1
=> 1
irb(main):003:0> h
=> {"a"=>1}

So the above snippet is correct for generating a histogram.

Jesus.
Martin DeMello (Guest)
on 2009-02-04 16:25
(Received via mailing list)
On Wed, Feb 4, 2009 at 8:48 PM, Jesús Gabriel y
Galán<jgabrielygalan@gmail.com> wrote:
>> Here is your problem. Hash.new(0) means "when I query the hash, and
>> the key I request is not in there, return 0". It does not actually add
>> {key => 0} to the hash itself.
>
> This is true, but counts[d] += 1 is actually counts[d] = counts[d] + 1
> so the RHS will evaluate to 1 the first time, assigning it to the hash:

Oops - yes, missed that.

martin
Julian Leviston (Guest)
on 2009-02-05 03:17
(Received via mailing list)
Or the square bracket method:

Hash[1,2,3,4] # => {1 => 2, 3 => 4}

Blog: http://random8.zenunit.com/
Learn rails: http://sensei.zenunit.com/
