Forum: Ruby Troubled while trying to create list from hash

Posted by Panagiotis Atmatzidis (Guest)
on 2012-07-08 04:15
(Received via mailing list)
Hello,

I have a hash with IP addresses and numbers. Each number represents the 
number of times that the IP address is found in a log.

I want to create a list, which contains the IP address (or addresses) of 
the most frequently found IP(s).

Here is my approach; however, it doesn't work as it should: 
http://codepad.org/CNaVyJ1H

I have the strong feeling that there's an easier, clearer way of 
doing this. I'm not even sure if the enumerable method 'any' is used 
correctly.

Any hints or ideas on how to adjust/improve this piece of code are 
welcomed!

Thanks in advance for your time :-)

Panagiotis Atmatzidis
Posted by unknown (Guest)
on 2012-07-08 10:39
(Received via mailing list)
On 08.07.2012 04:14, Panagiotis Atmatzidis wrote:
> I have a hash with IP addresses and numbers. Each number represents the number
> of times that the IP address is found in a log.

You should provide more information about this hash
(are these *line* numbers?) and provide an example.

> I want to create a list, which contains the IP address (or addresses) of the
> most frequently found IP(s).
>
> Here is my approach; however, it doesn't work as it should:
> http://codepad.org/CNaVyJ1H

What does @data_table look like???
What should your method do and what does not work?

(Why do you need the 'any' part,
isn't 'sorted' already the list you are looking for?)
Posted by Brian Candler (candlerb)
on 2012-07-08 17:16
Panagiotis Atmatzidis wrote in post #1067854:
> I'm not even sure if the enumerable method 'any' is used
> correctly.

No, it's not how it's intended to be used.

The idea behind 'any?' is to return true if the block returns true for 
any of the elements. So the block is expected to return a "truthy" 
value.

puts [1,3,7,4].any? { |x| x > 10 }    # false
puts [1,3,7,4].any? { |x| x > 5 }    # true

However, because your block ends with a "p" statement, which always 
returns nil, this will be treated as false always.

So in this case, 'any?' is really behaving just as 'each', just 
iterating over every single element. Also, you're ignoring the return 
value from 'any?'
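
To make the difference concrete, here is a small sketch (the array and the "visited" variables are invented for illustration). It shows that 'any?' stops at the first truthy block result, but walks every element when the block always evaluates to nil:

```ruby
values = [1, 3, 7, 4]

# Block always evaluates to nil, so any? visits every element and returns false.
visited_all = []
result_nil = values.any? { |x| visited_all << x; nil }

# Block returns a real condition, so any? stops at the first match (7).
visited_some = []
result_cond = values.any? { |x| visited_some << x; x > 5 }

puts result_nil.inspect     # false
puts visited_all.inspect    # [1, 3, 7, 4]
puts result_cond.inspect    # true
puts visited_some.inspect   # [1, 3, 7]
```

If the goal is only to visit each element, 'each' says so directly; 'any?' is meant for asking a yes/no question about the collection.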

> Any hints or ideas on how to adjust/improve this piece of code are
> welcomed!

Why are you not just taking 'sorted.first' as the most frequently found 
element?

If you are concerned about getting all the equal top values, then I'd do 
something like this:

best_count = sorted[0][1]
return sorted.select { |data,no| no == best_count }
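
Put together as a self-contained sketch: the "counts" hash and its values are invented sample data, and it assumes 'sorted' is an array of [ip, count] pairs with the highest count first, which may not match how the original 'sorted' is built.

```ruby
# Hypothetical sample data: IP address => number of occurrences.
counts = {
  "222.177.23.129" => 7,
  "10.0.0.1"       => 3,
  "10.0.0.2"       => 7
}

# Assumed shape of 'sorted': [ip, count] pairs, highest count first.
sorted = counts.sort_by { |ip, no| -no }

# The first pair carries the best count; keep every pair that ties with it.
best_count = sorted[0][1]
top = sorted.select { |ip, no| no == best_count }
top_ips = top.map { |ip, no| ip }

p best_count     # 7
p top_ips.sort   # ["10.0.0.2", "222.177.23.129"]
```

Note that sorted[0] is nil for an empty hash, so real code would need a guard before reading sorted[0][1].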

Regards,

Brian.
Posted by Panagiotis Atmatzidis (Guest)
on 2012-07-08 22:53
(Received via mailing list)
Hello,


On 8 Jul 2012, at 11:35, sto.mar@web.de wrote:

> On 08.07.2012 04:14, Panagiotis Atmatzidis wrote:
>> I have a hash with IP addresses and numbers. Each number represents the number
>> of times that the IP address is found in a log.
>
> You should provide more information about this hash
> (are these *line* numbers?) and provide an example.

puts @data_table prints lines like this: "2012-05-21 09:51:21 
[ssh-iptables] 222.177.23.129 CN China"

So the IP is referred to in the source code as "[entry.split(' ')[3]]", 
which returns the 4th field (counting from 1; index 3 when counting 
from 0, the Ruby way).
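
A minimal sketch of that extraction plus the counting step. Only the first sample line comes from the real log; the other lines and the names "data_table" and "counts" are made up for illustration:

```ruby
# Sample log lines; only the first is real, the others are invented.
data_table = [
  "2012-05-21 09:51:21 [ssh-iptables] 222.177.23.129 CN China",
  "2012-05-21 10:02:10 [ssh-iptables] 10.0.0.1 XX Example",
  "2012-05-22 11:15:42 [ssh-iptables] 222.177.23.129 CN China"
]

counts = Hash.new(0)            # missing keys start at 0
data_table.each do |entry|
  ip = entry.split(' ')[3]      # index 3 is the 4th whitespace-separated field
  counts[ip] += 1
end

p counts
```

This only works as long as every line has exactly three whitespace-separated fields before the address.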

>
>> I want to create a list, which contains the IP address (or addresses) of the
>> most frequently found IP(s).
>>
>> Here is my approach; however, it doesn't work as it should:
>> http://codepad.org/CNaVyJ1H
>
> What does @data_table look like???
> What should your method do and what does not work?

@data_table is an Array containing multiple strings like the one 
above.

> (Why do you need the 'any' part,
> isn't 'sorted' already the list you are looking for?)

There are some "duplicate IPs" which I want to enumerate. There are two 
IPs that appear 7 times. I want to add these to a secondary list in 
order to display them later on in another function.

Panagiotis Atmatzidis
Posted by Panagiotis Atmatzidis (Guest)
on 2012-07-08 22:59
(Received via mailing list)
Hello,

On 8 Jul 2012, at 18:16, Brian Candler wrote:

> puts [1,3,7,4].any? { |x| x > 10 }    # false
> puts [1,3,7,4].any? { |x| x > 5 }    # true
>
> However, because your block ends with a "p" statement, which always
> returns nil, this will be treated as false always.
>
> So in this case, 'any?' is really behaving just as 'each', just
> iterating over every single element. Also, you're ignoring the return
> value from 'any?'

Thanks for the detailed explanation. I need to get much more accustomed 
to the enumerable and other elementary ruby methods apparently :-/

> return sorted.select { |data,no| no == best_count }
The reason is that I didn't think of it! All solutions that came to 
mind involved several lines of code and functions, which made the entire 
process feel utterly complicated and wrong for such an easy task. I was 
sure that there was an easy way, in 1 or 2 lines of code, to do this.

Your solution works like a charm! Thanks for the code snippet!

regards,

Panagiotis Atmatzidis
Posted by Robert Klemme (robert_k78)
on 2012-07-09 17:24
(Received via mailing list)
On Sun, Jul 8, 2012 at 5:16 PM, Brian Candler <lists@ruby-forum.com> wrote:

> However, because your block ends with a "p" statement, which always
> returns nil, this will be treated as false always.

That statement is true only for 1.8.*.  Brian, your quarrel with
encoding in 1.9.* prevents you from giving correct answers nowadays
when most people seem to use 1.9.* versions:

$ irb19
Ruby version 1.9.3
irb(main):001:0> p 123
123
=> 123

Otherwise I totally agree with your reply.
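
A quick way to check this on whatever Ruby is installed (a sketch; the variable names are only for illustration, and the comments describe 1.9 and later). 'p' hands back its argument there, while 'puts' returns nil, so a block ending in 'p x' is truthy for any element that is neither nil nor false:

```ruby
p_result    = p(123)      # prints 123
puts_result = puts(123)   # prints 123

# With p as the last expression, the block is truthy for the first element,
# so any? stops right there instead of walking the whole array.
visited = []
found = [1, 3, 7, 4].any? { |x| visited << x; p x }

p [p_result, puts_result, found, visited]
```

So on 1.9.* a block ending in 'p' makes 'any?' stop after the first element, which is a different failure from the one described for 1.8.*.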

Cheers

robert
Posted by Robert Klemme (robert_k78)
on 2012-07-09 17:42
(Received via mailing list)
On Sun, Jul 8, 2012 at 10:52 PM, Panagiotis Atmatzidis
<ml@convalesco.org> wrote:
> puts @data_table prints lines like this: "2012-05-21 09:51:21 [ssh-iptables]
> 222.177.23.129 CN China"
>
> So the IP is referred to in the source code as "[entry.split(' ')[3]]", which
> returns the 4th field (counting from 1; index 3 when counting from 0, the
> Ruby way).

split will break if there can be whitespace between [] (where you have
"ssh-iptables").  I'd rather match IP addresses.

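For illustration, a small sketch of that failure mode. The second line, with a space inside the brackets, is a made-up variant of the sample line, and IP_PATTERN is just a name picked for this example:

```ruby
IP_PATTERN = /\d{1,3}(?:\.\d{1,3}){3}/

good = "2012-05-21 09:51:21 [ssh-iptables] 222.177.23.129 CN China"
odd  = "2012-05-21 09:51:21 [ssh iptables] 222.177.23.129 CN China"

p good.split(' ')[3]   # "222.177.23.129"
p odd.split(' ')[3]    # "iptables]"  (fields shifted by the extra space)

p good[IP_PATTERN]     # "222.177.23.129"
p odd[IP_PATTERN]      # "222.177.23.129"
```

The pattern only checks the dotted shape, not that each part is at most 255, which should be enough for picking addresses out of a log.
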
> There are some "duplicate IPs" which I want to enumerate. There are two IPs
> that appear 7 times. I want to add these to a secondary list in order to display
> them later on in another function.

Here's an alternative solution - a tad more involved.

def top_ips
  count = Hash.new 0
  max = 0

  @data_table.each do |line|
    ip = line[/\d{1,3}(?:\.\d{1,3}){3}/] and
      max = [max, count[ip] += 1].max
  end

  count.select {|ip, c| c == max}.map {|ip, c| ip}
end

OR

def top_ips
  count = Hash.new 0
  max = 0

  @data_table.each do |line|
    ip = line[/\d{1,3}(?:\.\d{1,3}){3}/] and
      (count[ip] += 1).tap {|c| max = c if c > max}
  end

  count.select {|ip, c| c == max}.map {|ip, c| ip}
end

The idea is to match IP addresses properly, calculate the max along the
way and finally select only those pairs where count == max.
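
To try the idea outside the class, here is a standalone sketch. The method name "top_ips_from", its "lines" parameter and all sample lines except the first are assumptions made for this example; it takes the lines as an argument instead of reading @data_table, and the comments assume 1.9 or later:

```ruby
# Standalone variant: counts matched addresses and tracks the max as it goes.
def top_ips_from(lines)
  count = Hash.new(0)
  max = 0

  lines.each do |line|
    ip = line[/\d{1,3}(?:\.\d{1,3}){3}/]
    next unless ip                      # skip lines without an address
    count[ip] += 1
    max = count[ip] if count[ip] > max
  end

  # Keep only the addresses whose count ties with the maximum.
  count.select { |_ip, c| c == max }.map { |ip, _c| ip }
end

lines = [
  "2012-05-21 09:51:21 [ssh-iptables] 222.177.23.129 CN China",
  "2012-05-21 10:02:10 [ssh-iptables] 10.0.0.1 XX Example",
  "2012-05-22 11:15:42 [ssh-iptables] 222.177.23.129 CN China",
  "2012-05-22 12:00:00 [ssh-iptables] 10.0.0.1 XX Example",
  "2012-05-23 08:30:05 [ssh-iptables] 10.0.0.2 XX Example",
  "a line without any address"
]

p top_ips_from(lines)   # ["222.177.23.129", "10.0.0.1"]
p top_ips_from([])      # []
```

Both tied addresses come back, in the order they were first seen, and an empty input simply yields an empty list.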

Kind regards

robert