Parsing log with date time entry

The log file looks like this

01/10/2012 14:46:02:87 Some Data
01/10/2012 14:46:03:319 Some Stuff
01/10/2012 14:46:04:160 OMFG
01/10/2012 14:46:04:160 okay

I have the script to the point where it grabs all the logs, sorts
through them and spits out the lines with the specified date/time into a
separate file.

What would be the best way to get the current time formatted per the log
entries and collect a range of times say from current to 15 minutes
previously.

On Mon, Jan 16, 2012 at 11:56 PM, Christopher G.
[email protected] wrote:

What would be the best way to get the current time formatted per the log
entries and collect a range of times say from current to 15 minutes
previously.

I am not sure what you mean by “collect a range”. Parsing can be done
like this:

require ‘time’
t = Time.strptime(‘01/10/2012 14:46:02:86’, ‘%m/%d/%Y %H:%M:%S:%L’)

Cheers

robert

Robert K. wrote in post #1041261:

On Mon, Jan 16, 2012 at 11:56 PM, Christopher G.
[email protected] wrote:

What would be the best way to get the current time formatted per the log
entries and collect a range of times say from current to 15 minutes
previously.

I am not sure what you mean by “collect a range”. Parsing can be done
like this:

require ‘time’
t = Time.strptime(‘01/10/2012 14:46:02:86’, ‘%m/%d/%Y %H:%M:%S:%L’)

Cheers

robert

Thanks Robert,

By collect range, how would I say subtract 15 minutes from the current
time so that I can collect all matches from 14:31:02 through 14:46:02?

On Tue, Jan 17, 2012 at 4:37 PM, Peter V.
[email protected]wrote:

Thanks Robert,

By collect range, how would I say subtract 15 minutes from the current
time so that I can collect all matches from 14:31:02 through 14:46:02?

1.9.3p0 :006 > Time.now.to_s
=> “2012-01-17 16:36:06 +0100”
1.9.3p0 :007 > (Time.now - (15*60)).to_s
=> “2012-01-17 16:21:09 +0100”

Actually, for log files, it may be better to use uniform UTC times. So,
no confusion on summer/winter time or servers in different part of
the world etc.

1.9.3p0 :029 > Time.now.utc.to_s
=> “2012-01-17 15:43:25 UTC”
1.9.3p0 :030 > (Time.now - (15*60)).utc.to_s
=> “2012-01-17 15:28:30 UTC”

HTH,

Peter

On Tue, Jan 17, 2012 at 4:26 PM, Christopher G.
[email protected]wrote:

time so that I can collect all matches from 14:31:02 through 14:46:02?

1.9.3p0 :006 > Time.now.to_s
=> “2012-01-17 16:36:06 +0100”
1.9.3p0 :007 > (Time.now - (15*60)).to_s
=> “2012-01-17 16:21:09 +0100”

Do you want to use string comparison on the lines (and not compare
“Time”
entries) ? In that case … be careful with your date format that has
mm/dd/yyyy … When your filter goes over a midnight boundary, you may
get incorrect results.

For this kind of “technical” reading of log times, I would certainly
advise
a
format YYYY/MM/DD … so that you can sort and filter on the string
representations
(assuming that was your intention from the original posting). This is
actually
the default (ISO I think ?) format that is returned by Time#to_s.

HTH,

Peter

Peter V. wrote in post #1041349:

On Tue, Jan 17, 2012 at 4:37 PM, Peter V.
[email protected]wrote:

Thanks Robert,

By collect range, how would I say subtract 15 minutes from the current
time so that I can collect all matches from 14:31:02 through 14:46:02?

1.9.3p0 :006 > Time.now.to_s
=> “2012-01-17 16:36:06 +0100”
1.9.3p0 :007 > (Time.now - (15*60)).to_s
=> “2012-01-17 16:21:09 +0100”

Actually, for log files, it may be better to use uniform UTC times. So,
no confusion on summer/winter time or servers in different part of
the world etc.

1.9.3p0 :029 > Time.now.utc.to_s
=> “2012-01-17 15:43:25 UTC”
1.9.3p0 :030 > (Time.now - (15*60)).utc.to_s
=> “2012-01-17 15:28:30 UTC”

HTH,

Peter

Unfortunately I have no control over the format of the logs. I am trying
to compare the current time to the string value in the logs.
Any ideas on how to get past the crossing midnight issue?

I came up with is for gathering start and end times but the above looks
nicer.

s_time = Time.now
e_time = s_time - 900

Either way I now have the start time and end time so how would I make a
range out of that counting in seconds?

Being relatively newb,I still find it amazing that parts f the script I
think would be the simplest turns out to be the most difficult.

Thanks for the input so far!!!

Range#=== , need #succ method in Ruby1.9.3
under only #<=>, use Range#cover?

2012/1/18 Peter V. [email protected]:

On Tue, Jan 17, 2012 at 6:12 PM, Christopher G.
[email protected]wrote:

1.9.3p0 :029 > Time.now.utc.to_s
Any ideas on how to get past the crossing midnight issue?

I came up with is for gathering start and end times but the above looks
nicer.

s_time = Time.now
e_time = s_time - 900

Maybe s_time and e_time are swapped here ?

Either way I now have the start time and end time so how would I make a
range out of that counting in seconds?

Well, reading the Time documentation I see Time#to_f that may be a
useful
number. And then, you could make a range with

s_time.to_f…e_time.to_f

Take note that comparisons on floats have their own problems
if the delta between them is very small and they are not integer.

1.9.3-p0 :001 > e_time = Time.now.utc
=> 2012-01-17 17:51:46 UTC
1.9.3-p0 :002 > s_time = e_time - 900
=> 2012-01-17 17:36:46 UTC

1.9.3-p0 :003 > range = s_time.to_f…e_time.to_f
=> 1326821806.2174225…1326822706.2174225

1.9.3-p0 :004 > log_time = Time.now.utc-300
=> 2012-01-17 17:47:22 UTC

1.9.3-p0 :007 > range === log_time.to_f
=> true # that log_time is in the range :slight_smile:

1.9.3-p0 :008 > range === Time.now.to_f
=> false # now is obviously outside the range :slight_smile:

1.9.3-p0 :009 > # but, you could simply compare times themselves … not
!
1.9.3-p0 :010 > time_range = s_time…e_time
=> 2012-01-17 17:36:46 UTC…2012-01-17 17:51:46 UTC
1.9.3-p0 :011 > time_range === Time.now
TypeError: can’t iterate from Time
from (irb):11:in each' from (irb):11:in include?’
from (irb):11:in include?' from (irb):11:in ===’
from (irb):11
from /home/peterv/.rvm/rubies/ruby-1.9.3-p0/bin/irb:16:in `’

Strange, obviously, for Float too … (can’t iterate from Float),
but, I can determine if a value is inside a range of Floats.
Why not for Time then ? Time has a <=> operator …

1.9.3-p0 :014 > range.each{|f| puts f}
TypeError: can’t iterate from Float
from (irb):14:in each' from (irb):14 from /home/peterv/.rvm/rubies/ruby-1.9.3-p0/bin/irb:16:in

Being relatively newb, I still find it amazing that parts f the script I

think would be the simplest turns out to be the most difficult.

I may have confused you even more … (sorry for that).

I was under the impression you wanted to compare the lines
based on the string representation (in which case those issues
of midnight and mm/dd/yyyy representation etc. would play).

But if you import the date+time of each line with the code that
Robert proposed above, you end up with “clean” Time instances
that will work properly.

Sorry for the confusion …

HTH,

Peter

On Tue, Jan 17, 2012 at 7:44 PM, Kenichi K. [email protected]
wrote:

Range#=== , need #succ method in Ruby1.9.3
under only #<=>, use Range#cover?

Thank you. For Time class, I could get it to work.

This will simplify the code for the OP.

$ irb
1.9.3-p0 :001 > t2 = Time.now
=> 2012-01-17 19:47:19 0100

1.9.3-p0 :002 > t1 = t2 - 900
=> 2012-01-17 19:32:19 0100

1.9.3-p0 :003 > time_range = t1…t2
=> 2012-01-17 19:32:19 0100…2012-01-17 19:47:19 0100

1.9.3-p0 :004 > t3 = Time.now - 300
=> 2012-01-17 19:42:41 0100

1.9.3-p0 :006 > time_range === t3
TypeError: can’t iterate from Time
from (irb):6:in each' from (irb):6:in include?’
from (irb):6:in include?' from (irb):6:in ===’
from (irb):6
from /home/peterv/.rvm/rubies/ruby-1.9.3-p0/bin/irb:16:in `’

1.9.3-p0 :007 > time_range.cover? t3
=> true # :slight_smile:

1.9.3-p0 :008 > time_range.cover? Time.now
=> false # :slight_smile:

However, for Float the === operator seems to work, without Float#succ

1.9.3-p0 :009 > f1 = 13.6
=> 13.6
1.9.3-p0 :010 > f2 = 18.6
=> 18.6
1.9.3-p0 :011 > float_range = f1…f2
=> 13.6…18.6
1.9.3-p0 :012 > float_range === 15.5
=> true
1.9.3-p0 :013 > f1.succ
NoMethodError: undefined method succ' for 13.6:Float from (irb):13 from /home/peterv/.rvm/rubies/ruby-1.9.3-p0/bin/irb:16:in
1.9.3-p0 :014 > float_range.cover? 15.5
=> true

So, maybe the behavior could be made similar?

I suggest that Range#=== arg falls back to Range#cover? arg
uniformily if arg.respond_to?(:succ) is false (and probably other
conditions).

Thanks again,

Peter

Sorry, I have no skills for C programming.
But, I guess the causes from below codes by only my intuition.

https://github.com/ruby/ruby/blob/trunk/range.c


static int
discrete_object_p(VALUE obj)
{
if (rb_obj_is_kind_of(obj, rb_cTime)) return FALSE; /* until
Time#succ removed */
return rb_respond_to(obj, id_succ);
}


static VALUE
range_include(VALUE range, VALUE val)
{
VALUE beg = RANGE_BEG(range);
VALUE end = RANGE_END(range);
int nv = FIXNUM_P(beg) || FIXNUM_P(end) ||
rb_obj_is_kind_of(beg, rb_cNumeric) ||
rb_obj_is_kind_of(end, rb_cNumeric);


It looks specific rules for Time and Numeric.

2012/1/18 Peter V. [email protected]:

Thanks for pointing me in the right direction.

I’ll work on this when I get some more free time this afternoon. I’ll
get back to you guys for either some more help if I get stuck or for
some code critique.

On Tue, Jan 17, 2012 at 8:38 PM, Christopher G. [email protected]
wrote:

Thanks for pointing me in the right direction.

I’ll work on this when I get some more free time this afternoon. I’ll
get back to you guys for either some more help if I get stuck or for
some code critique.

You can easily cook yourself a range which uses comparisons for
matching:

irb(main):005:0> CompareRange = Struct.new :first, :last do
irb(main):006:1* def ===(x)
irb(main):007:2> x >= first && x < last
irb(main):008:2> end
irb(main):009:1> end
=> CompareRange
irb(main):010:0> r=CompareRange.new Time.now-10,Time.now+2
=> #<struct CompareRange first=2012-01-17 20:42:28 +0100,
last=2012-01-17 20:42:40 +0100>
irb(main):011:0> a=10.times.map {|i| r.first - 100 + i * 15}
=> [2012-01-17 20:40:48 +0100, 2012-01-17 20:41:03 +0100, 2012-01-17
20:41:18 +0100, 2012-01-17 20:41:33 +0100, 2012-01-17 20:41:48 +0100,
2012-01-17 20:42:03 +0100, 2012-01-17 20:42:18 +0100, 2012-01-17
20:42:33 +0100, 2012-01-17 20:42:48 +0100, 2012-01-17 20:43:03 +0100]
irb(main):012:0> a.grep r
=> [2012-01-17 20:42:33 +0100]

Note, this is simplistic but illustrates the idea. You probably also
want to implement =~ operator.

Kind regards

robert

I finally got a little more time to work on this. I went the route of
comparing the current time to the string value and came up with this

require ‘time’

s_time = (Time.now - (15*60))
e_time = Time.now
range = (s_time.strftime("%m/%d/%Y %H:%M")…e_time.strftime("%m/%d/%Y
%H:%M"))
time = s_time.strftime("%m/%d/%Y %H:%M")

Dir["*.log"].each |path|
File.open( path ) do |s|
s.grep( /#{range}/ ) do |line|
puts line
end
end
end

the issue is the range variable is not catching anything. When I use the
time variable it works as expected if there is a matching entry in the
log

hmm maybe next I should try to compare the entry in the log to the time
range…

hmm ok I see the issue

range = (s_time.strftime("%m/%d/%Y %H:%M")…e_time.strftime("%m/%d/%Y
%H:%M"))

does not create a range instead it is looking for s_time(two
characters)e_time hmmmm

oh well back to the drawing board

On Tue, 2012-01-17 at 07:56 +0900, Christopher G. wrote:

What would be the best way to get the current time formatted per the log
entries and collect a range of times say from current to 15 minutes
previously.

$ cat test.log
01/10/2012 14:46:02:87 Some Data
01/10/2012 14:46:03:319 Some Stuff
01/10/2012 14:46:04:160 OMFG
01/10/2012 14:46:04:160 okay

assume format is m/d/y

require ‘rio’
def test
now = Time.utc(2012, 1, 10, 14, 59, 00)
puts "Explicitely set now to " + now.to_s
puts “”

rio('/home/rthompso/test.log').lines { |line|
    dt, tm, msg = line.split
    # assume format is m/d/y
    month, day, year = dt.split('/')
    hr, min, secfrac = tm.split(':')
    ts = year +'-' + month + '-' + day  + ' ' + tm
    at = Time.utc(year, month, day, hr, min, secfrac)
    p at

    cutoff = (now - (15 * 60))
    puts "15 mins ago was " + (now - (15 * 60)).to_s
    if ( at >= Time.at(cutoff) )
        puts line
    end
}

#f = rio(’/home/rthompso/test.log’) > ?-
puts “”
puts “”
puts “”

now = Time.utc(2012, 1, 10, 15, 01, 5)
puts "Explicitely set now to " + now.to_s
puts ""

rio('/home/rthompso/test.log').lines { |line|
    dt, tm, msg = line.split
    # assume format is m/d/y
    month, day, year = dt.split('/')
    hr, min, secfrac = tm.split(':')
    ts = year +'-' + month + '-' + day  + ' ' + tm
    at = Time.utc(year, month, day, hr, min, secfrac)
    p at

    cutoff = (now - (15 * 60))
    puts "15 mins ago was " + (now - (15 * 60)).to_s
    if ( at >= Time.at(cutoff) )
        puts line
    end
}

end

irb(main):089:0> test
Explicitly set now to Tue Jan 10 14:59:00 UTC 2012

Tue Jan 10 14:46:02 UTC 2012
15 mins ago was Tue Jan 10 14:44:00 UTC 2012
01/10/2012 14:46:02:87 Some Data
Tue Jan 10 14:46:03 UTC 2012
15 mins ago was Tue Jan 10 14:44:00 UTC 2012
01/10/2012 14:46:03:319 Some Stuff
Tue Jan 10 14:46:04 UTC 2012
15 mins ago was Tue Jan 10 14:44:00 UTC 2012
01/10/2012 14:46:04:160 OMFG
Tue Jan 10 14:46:04 UTC 2012
15 mins ago was Tue Jan 10 14:44:00 UTC 2012
01/10/2012 14:46:04:160 okay

Explicitly set now to Tue Jan 10 15:01:05 UTC 2012

Tue Jan 10 14:46:02 UTC 2012
15 mins ago was Tue Jan 10 14:46:05 UTC 2012
Tue Jan 10 14:46:03 UTC 2012
15 mins ago was Tue Jan 10 14:46:05 UTC 2012
Tue Jan 10 14:46:04 UTC 2012
15 mins ago was Tue Jan 10 14:46:05 UTC 2012
Tue Jan 10 14:46:04 UTC 2012
15 mins ago was Tue Jan 10 14:46:05 UTC 2012

On Sat, 2012-01-21 at 04:21 +0900, Christopher G. wrote:

The Rio way looks interesting but even after installing the gem it is
not working right on my test log.

you can ignore rio – use any file io that will read and pass |line| to
the block

i.e.

File.open(‘haggis.txt’).each_line{ |s|
puts s
}

File.foreach(‘haggis.txt’) { |s|
puts s
}

IO.foreach(“testfile”) {|x| print "GOT ", x }

The Rio way looks interesting but even after installing the gem it is
not working right on my test log.

I did manage to get it working

require ‘time’

s_time = (Time.now - (1560))
e_time = Time.now
range =
(s_time.strftime("%m/%d/%Y\s%H:%M:%S:%L")…e_time.strftime("%m/%d/%Y\s%H:%M:%S:%L"))
Dir["
.log"].each {|path|
File.readlines( path ).each do |sort|
sort.scan(/^\d{2}/\d{2}/\d{4}\s\d{1,2}:\d{2}:\d{2}:\d{1,3}/).each
do |time|

 if range.cover? time
 puts sort
  end
end

end

}

What do you guys think?

Thanks for pointing me in the right direction. Without all of your help
this little project would ahve taken me a lot longer than it did.

On Fri, Jan 20, 2012 at 8:21 PM, Christopher G.
[email protected]wrote:

end

}

What do you guys think?

I am afraid this version will fail to select any line for the first 15
minutes of every year.
But maybe that is not an issue?

021:0> e_time = Time.new(2012,1,1,0,0,30)
=> 2012-01-01 00:00:30 +0100
022:0> s_time = e_time-900
=> 2011-12-31 23:45:30 +0100
023:0> range =
s_time.strftime(“%m/%d/%Y\s%H:%M:%S:%L”)…e_time.strftime(“%m/%d/%Y\s%H:%M:%S:%L”)
=> “12/31/2011 23:45:30:000”…“01/01/2012 00:00:30:000”
027:0> log_line = “01/01/2012 00:00:20:000 Some Data”
=> “01/01/2012 00:00:20:000”
028:0> range.cover? log_line
=> false

I wrote a version where the string comparison is executed on a
%Y/%d/%s … version and that version seems to behave correctly,
both for the simple case and on a newyear morning. The catch is
that I swapped %m/%d with %Y before calling cover? ($1, $2 and
$3 return the groups in a recent regex).

e_time = Time.new(2012,1,1,1,0,30) # the simple case

e_time = Time.new(2012,1,1,0,0,30) # the year boundary case
s_time = e_time - 900

range =
s_time.strftime(“%Y/%m/%d\s%H:%M:%S:%L”)…e_time.strftime(“%Y/%m/%d\s%H:%M:%S:%L”)

test_log_file =
[ “12/31/2011 22:00:00:000 way before”,
“12/31/2011 23:45:30:000 just inside”,
“01/01/2012 00:00:20:000 Some Data just before now”,
“01/01/2012 00:00:40:000 Some Data just after now”,
“01/01/2012 01:00:20:000 An hour later”]

[test_log_file].each {|path|
path.each do |sort|

sort.scan(/^(\d{2}/\d{2})/(\d{4})(\s\d{1,2}:\d{2}:\d{2}:\d{1,3})/).each
do
time = “#{$2}/#{$1}#{$3}”
if range.cover? time
puts sort
end
end
end
}

HTH,

Peter

On Sat, Jan 21, 2012 at 9:52 PM, Kenichi K. [email protected]
wrote:

Thank you for the correction.

Indeed, because of the variable width on the milliseconds (1 to 3
digits),
you can have

12/31/2011 23:45:30:2

seemingly later than

12/31/2011 23:45:30:10

Note

“%L” in Time#strftime means ‘microsec’, I did’t know this too.

I believe that is not correct. It seems to be “milliseconds”.

from Class: Time (Ruby 1.9.3)
the section “strftime( string ) → string”

%L - Millisecond of the second (000…999)

%N - Fractional seconds digits, default is 9 digits (nanosecond)
%3N millisecond (3 digits)
%6N microsecond (6 digits)
%9N nanosecond (9 digits)
%12N picosecond (12 digits)

HTH,

Peter

Hi

Is this log fomart below meanings?
mon/day/year hour:min:sec:millisec
And Time.parse is not better this case, I guess.
And Christopher want to millisec? then you use padding 0 or cast to Time

a sample of “cast to Time” with Peter’s test case(+)

http://ideone.com/tBz7D

Note

“%L” in Time#strftime means ‘microsec’, I did’t know this too.