awk '{if ($4~/something/) {i+=1}} END {print i}' file.txt
That means: if a line's 4th field matches "something", increase the
counter by 1.
How would I write the corresponding Ruby code?
What is a ‘field’? Whitespace delimited?
Yes, thanks.
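One detail worth checking when porting awk's default field splitting: awk ignores leading whitespace, and so does String#split when called without arguments, whereas split(/\s+/) keeps an empty first element when a line is indented. A small sketch with a made-up sample line (the variable names are only for illustration):

```ruby
# Made-up sample line with leading whitespace.
line = "  alpha beta gamma something_else\n"

# No-argument split drops leading whitespace, like awk's default splitting.
awk_like = line.split

# Splitting on /\s+/ keeps an empty string before the first field.
regex_split = line.split(/\s+/)

puts awk_like[3].inspect     # fourth field as awk would see it
puts regex_split[3].inspect  # shifted by one because of the empty first element
```

For indented input, index 3 therefore points at different fields in the two arrays; for lines without leading whitespace the two calls agree.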
Here are two ways:
Don’t read the whole file into memory, but do it one line at a time
i = 0
file = File.open( "foo.txt" )
file.each_line do |line|
  pieces = line.split( /\s+/ )
  i += 1 if pieces[ 3 ] =~ /something/
end
Just read the whole file at once, assuming it’s small enough,
and create an array of the fourth column’s values
col = File.read("foo.txt").scan(/.+/).map{ |line| line.scan(/\S+/)[3] }
i = col.count{ |val| val =~ /something/ }
I like that, thank you!
The code above does not close the file handle properly. Also, it can be
written more concisely:
File.foreach "file.txt" do |line|
…
end
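To make the File.foreach suggestion concrete, here is a minimal sketch of one way the elided body could be filled in. The method name count_fourth_field_matches and its parameters are hypothetical, not something from the posts above:

```ruby
# Hypothetical helper: counts the lines whose fourth whitespace-delimited
# field matches the given pattern.
def count_fourth_field_matches(path, pattern)
  count = 0
  # File.foreach opens the file, yields one line at a time, and closes
  # the file once the block has finished.
  File.foreach(path) do |line|
    fourth = line.split[3] # nil when the line has fewer than four fields
    count += 1 if fourth && pattern.match?(fourth)
  end
  count
end
```

It could be called as count_fourth_field_matches("file.txt", /something/), assuming file.txt exists in the current directory.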
You can even use Ruby like awk, which seems to be rarely done, but it’s
possible.
awk '{if ($4~/something/) {i+=1}} END {print i}' file.txt
can be done like this:
ruby -nae 'BEGIN {$i=0}; $i+=1 if /something/ =~ $F[3]; END {puts $i}' file.txt
ruby -nae 'BEGIN {$i=0}; /something/ =~ $F[3] and $i+=1; END {puts $i}' file.txt
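For readers unfamiliar with the switches: -n wraps the script in a loop that reads one line at a time into $_, -a splits each line into the array $F, and -e supplies the script on the command line. The sketch below spells out roughly the same loop as ordinary Ruby; the method name awk_style_count and the StringIO sample input are assumptions made for illustration:

```ruby
require "stringio"

# Hypothetical helper that mimics what the -n and -a switches do implicitly.
def awk_style_count(io, pattern)
  i = 0                          # BEGIN { $i = 0 }
  while (line = io.gets)         # -n: loop over the input, one line at a time
    fields = line.split          # -a: autosplit the line (this is $F)
    i += 1 if pattern =~ fields[3]
  end
  i                              # END { puts $i } would print this value
end

# Made-up input standing in for file.txt.
sample = StringIO.new("a b c something\n1 2 3 4\n")
puts awk_style_count(sample, /something/)
```

Putting the regular expression on the left of =~ also means a line with fewer than four fields simply fails to match, because fields[3] is nil.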
For a script, I’d probably do something similar to what Phrogz suggested
but with the difference that I’d use ARGF. That way you fetch file
names from the command line and do not need to change the script if the
file name changes:
i = 0
ARGF.each do |line|
  bit = line.split(/\s+/)[3]
  i += 1 if /something/ =~ bit
end
puts i
Or do the matching in one step, which seems more efficient:
i = 0
ARGF.each do |line|
  # \S* allows "something" anywhere in the fourth field, as awk's $4~/something/ does
  i += 1 if /^\s*(?:\S+\s+){3}\S*something/ =~ line
end
puts i
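A quick way to gain confidence in a single-regex version is to run it next to the split version over a few sample lines. This is only a sketch: the constant FOURTH_FIELD, the two helper methods, and the sample lines are made up, and the regex uses \S* before the word on the assumption that the match may start anywhere inside the fourth field, as it does in awk.

```ruby
# Assumed pattern: skip leading whitespace, skip three fields, then look for
# "something" anywhere inside the fourth field.
FOURTH_FIELD = /\A\s*(?:\S+\s+){3}\S*something/

# Split-based check: is there a match in the fourth whitespace-delimited field?
def by_split?(line)
  !!(/something/ =~ line.split[3])
end

# Single-regex check on the whole line.
def by_regex?(line)
  FOURTH_FIELD.match?(line)
end

samples = [
  "a b c something\n",      # fourth field is the word itself
  "  a b c xsomethingy\n",  # leading whitespace, word inside the field
  "a b something d\n",      # word in the third field only
  "a b c d something\n",    # word in the fifth field only
  "a b c\n"                 # fewer than four fields
]

samples.each do |line|
  puts "#{line.inspect}: split=#{by_split?(line)} regex=#{by_regex?(line)}"
end
```

If the two columns ever disagree for a line, the regex and the split logic are not equivalent for that kind of input.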
There are about 2,843 million other ways to do it in Ruby.
After that, replace "ruby" with "perl" in the above commands and they
will work too.
Ah, you want a single program to work both for Perl and Ruby. Thanks
for sharing!
I usually prefer to have the regular expression as the first argument
to =~ because for me that seems more natural (the regexp is doing the
matching) and IIRC it is a tad faster (but really only a tad).
Kind regards
robert