I'm a novice at Ruby, although I've been introduced to its syntax by virtue of working with Puppet for the past 2 1/2 years. As a sys admin, I'm most comfortable working with bash, but I recognize that there are things that a scripting language can't do - like interacting with a database. So I'm writing some Ruby code to extract text out of an HTML file and then insert it into a MySQL database. I've been using Nokogiri to get some formatted text, but also need to get some text that's easy enough to get with a scripting language like bash. Specifically, I need to do this:
grep "<strong>Date:</strong>" $1 | cut -d' ' -f3-5
Is there an easy way to embed this logic within Ruby code - or, better yet - a way to do the same thing in Ruby?
on 2012-12-14 19:15
on 2012-12-14 19:51
There are a number of ways of executing system code from Ruby. As you
are familiar with Bash, the easiest would probably be to use a pair of
backticks "``" to execute code. Ruby returns the output as a string.
date = `grep "<strong>Date:</strong>" $1 | cut -d' ' -f3-5`
You may also use the executable string notation "%x{}" which is similar
to Bash's "$()" notation.
date = %x{grep "<strong>Date:</strong>" $1 | cut -d' ' -f3-5}
You can check the exit status using "$?" as you would in Bash as well.
However, an object is returned so you will want to compare
"$?.exitstatus".
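A minimal sketch of that exit-status check, assuming a POSIX shell with "echo" and "grep" on the PATH. The helper name "run_with_status" is made up for illustration and is not part of Ruby.

```ruby
# Hypothetical helper: run a shell command and return [output, exit code].
def run_with_status(command)
  output = `#{command}`      # backticks capture the command's standard output
  [output, $?.exitstatus]    # $? is a Process::Status object, not a plain integer
end

out, code = run_with_status("echo hello")
puts "output=#{out.chomp} status=#{code}"

# grep exits with 1 when nothing matches, so a missing "Date:" line
# can be told apart from an empty but successful result.
_, code = run_with_status("echo foo | grep -q bar")
puts "grep status=#{code}"
```

For a pipeline, the status reported is that of the last command in the pipe, just as in Bash.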
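There are also methods like "system", "exec", and "spawn": "system" runs
a command and returns whether it succeeded, "exec" replaces the current
Ruby process with the command, and "spawn" starts it without waiting
for it to finish.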
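As a rough illustration of "system", here is a sketch that only asks whether a file contains a matching line. It assumes "grep" is installed; the method name "file_contains?" and the sample text are invented for the example. Passing the program and its arguments as separate strings means no shell is involved, so nothing in the arguments gets interpreted by the shell.

```ruby
require "tempfile"

# Hypothetical helper: true if any line of the file matches the pattern.
# Kernel#system returns true for exit status 0, false for a non-zero status,
# and nil if the command could not be run at all.
def file_contains?(pattern, path)
  system("grep", "-q", "-e", pattern, path) == true
end

# Try it on a throwaway file (the contents are made-up sample data).
Tempfile.create("sample") do |f|
  f.write("<strong>Date:</strong> 14 December 2012\n")
  f.flush
  puts file_contains?("<strong>Date:</strong>", f.path)
  puts file_contains?("<strong>Author:</strong>", f.path)
end
```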
Jamal Wills
on 2012-12-14 20:35
Thanks for the quick reply! I've substituted $1 with ARGV[0], as I
believe that's proper Ruby syntax for a command argument, but I'm still
having trouble getting the grep command to treat the argument as a file
name.
Here's the entirety of my code:
#!/usr/bin/env ruby
fn = ARGV[0]
puts fn
date = %x{grep "<strong>Date:</strong>" fn | cut -d' ' -f3-5}
puts date
And here's the result when I run it on the command line:
pablo@pmenaws1=> ./extract_date_written.rb myfile.html
myfile.html
grep: fn: No such file or directory
I'm hoping this is a newbie mistake that is easily remedied. Thanks
again in advance.
on 2012-12-14 20:53
Yes; use string interpolation to insert the variable's content into the
string:
foo = 'world'
puts "Hello, #{foo}!" # outputs: Hello, world!
It works the same for backticks and the "%x{}" notation: both
interpolate just like double-quoted strings do (only strings in
apostrophes don't interpolate).
So this would be (note: code untested):
date = %x{grep "<strong>Date:</strong>" #{fn} | cut -d' ' -f3-5}
Also, if the contents of fn come from the user, you should escape it
somehow to prevent them from executing arbitrary commands in your
shell environment!
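One way to do that escaping is "Shellwords.escape" from Ruby's standard library. The sketch below only builds the command string; the helper name "date_command" and the nasty file name are made up for illustration.

```ruby
require "shellwords"

# Hypothetical helper: build the grep/cut pipeline from this thread with the
# file name escaped, so spaces, semicolons, etc. reach grep as part of one
# argument instead of being interpreted by the shell.
def date_command(filename)
  %Q{grep "<strong>Date:</strong>" #{Shellwords.escape(filename)} | cut -d' ' -f3-5}
end

puts date_command("myfile.html")
puts date_command("my file; rm -rf ~.html")
```

The resulting string can then be run with backticks, e.g. date = `#{date_command(fn)}`.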
-- Matma Rex
on 2012-12-14 21:22
I knew someone would set me straight. Thank you so much! By the way, ultimately the contents of "fn" will come from a script and not from a user, so there should be no potential security holes.
on 2012-12-15 08:01
Paul Mena wrote in post #1089118:
> So I'm writing some Ruby code to extract text out of an HTML file and then
> insert it into a mySQL database. I've been using Nokogiri to get some
> formatted text, but also need to get some text that's easy enough to get
> with a scripting language like bash. Specifically, I need to do this:
>
> grep "<strong>Date:</strong>" $1 | cut -d' ' -f3-5
>
> Is there an easy way to embed this logic within Ruby code - or, better
> yet - a way to do the same thing in Ruby?

data.txt:
------
<html>
<head><title>Testing</title></head>
<body>
<strong>Hello world</strong><div>no no no no no no</div>
<strong>Date:</strong><div>1 2 3 4 5 6 7</div>
<strong>Date:</strong><div>a b c d e f g</div>
</body>
</html>

my_prog.rb:
----------
fname = ARGV[0]
start_column = 3
end_column = 5
target_range = (start_column-1)..(end_column-1)

IO.foreach(fname) do |line|
  if line.match(/<strong>Date:<\/strong>/)
    pieces = line.split(" ")
    puts pieces[target_range].join(":")
  end
end

--output:--
$ ruby my_prog.rb data.txt
3:4:5
c:d:e

> So I'm writing some Ruby code to extract text out of an HTML file and then
> insert it into a mySQL database. I've been using Nokogiri to get some
> formatted text, but also need to get some text that's easy enough to get
> with a scripting language like bash.

Nokogiri provides myriad ways to locate any text in an html file. For instance:

require 'nokogiri'

fname = 'data.txt'
@doc = Nokogiri::XML(File.open(fname))
my_xpath = "//strong[text()='Date:']/following-sibling::div[1]"

@doc.xpath(my_xpath).each do |div|
  puts div.text.split(" ")[2..4].join(":")
end

--output:--
$ ruby my_prog.rb
3:4:5
c:d:e

> I recognize that there are things that a
> scripting language can't do - like interacting with a database.

What scripting language is unable to interact with a database?
mysql> use mydb;
mysql> select * from people;
+----+-------+-------+
| id | name  | info  |
+----+-------+-------+
|  1 | Jane  | 3 4 5 |
|  2 | John  | a b c |
+----+-------+-------+

require 'mysql2'

begin
  client = Mysql2::Client.new(:host => "localhost", :username => "root")
  client.query("USE mydb")

  results = client.query("SELECT * FROM people")
  results.each do |row|
    puts "#{row['id']} #{row['name']} #{row['info']}"
  end

  client.query("INSERT INTO people(name, info) VALUES('Jeff', '7 8 9')")

  results = client.query("SELECT * FROM people")
  results.each do |row|
    puts "#{row['id']} #{row['name']} #{row['info']}"
  end
rescue Mysql2::Error => e
  puts e.errno
  puts e.error
ensure
  client.close if client
end

--output:--
1 Jane 3 4 5
2 John a b c
1 Jane 3 4 5
2 John a b c
3 Jeff 7 8 9

mysql> select * from people;
+----+-------+-------+
| id | name  | info  |
+----+-------+-------+
|  1 | Jane  | 3 4 5 |
|  2 | John  | a b c |
|  3 | Jeff  | 7 8 9 |
+----+-------+-------+
on 2012-12-15 09:38
On 14.12.2012 19:15, Paul Mena wrote:
> grep "<strong>Date:</strong>" $1 | cut -d' ' -f3-5
>
> Is there an easy way to embed this logic within Ruby code - or, better
> yet - a way to do the same thing in Ruby?

Yes, you can do this easily without using Bash:

pattern = %r{<strong>Date:</strong>}

File.readlines('testfile.txt').grep(pattern).each do |line|
  p line.split(' ')[2..4]
end

Output:

["2", "3", "4"]
["b", "c", "d"]

where:

$ cat testfile.txt
<strong>Date:</strong> 1 2 3 4 5
line
another line without Date
<strong>Date:</strong> a b c d e
yet another line
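If the value is needed in a variable (say, for a later database insert) rather than printed, the same grep-and-split idea could be wrapped in a small method. This is only a sketch: the name "extract_date" is made up, it looks at the first matching line only (the shell pipeline would print every match), and "split" collapses runs of whitespace whereas "cut -d' '" treats every single space as a separator.

```ruby
# Hypothetical helper: return fields 3-5 of the first "Date:" line as one
# space-separated string, or nil if no line matches.
def extract_date(lines)
  line = lines.grep(%r{<strong>Date:</strong>}).first
  return nil unless line

  fields = line.split(" ")
  (fields[2..4] || []).join(" ")   # guard against lines with too few fields
end

# Made-up sample input in the same shape as the test file in this thread.
sample = [
  "<strong>Date:</strong> 1 2 3 4 5\n",
  "another line without Date\n"
]
p extract_date(sample)
```

With a real file this would be called as extract_date(File.readlines(ARGV[0])).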
on 2012-12-15 15:57
Thanks again for numerous suggestions. I'm learning that Ruby is a rich and versatile language.