Splitting with a regex & keeping a ref?

I’m writing some scripts to help handle some ornery samba servers we
have: part of that is unfortunately reading the config scripts that
have built up over the years.

I was hoping to use the standard String#split method as a quick &
not-so-dirty way of parsing the files, given that Samba uses a very
simple format.

# the sample_data variable is defined below
irb(main):sample_data.split(/\[[a-z0-9]+\]/i)
=> ["", "\ncomment = shared directory for the shop\npath =
/dept/shop\nvalid u …(truncated)
Gives good results, but omits what’s between the brackets. I expected
that part.

irb(main):sample_data.split(/(\[[a-z0-9]+\])/i)
=> ["", "[shop]", "\ncomment = shared directory for the shop\npath =
/dept/sho …(truncated)
Neat, gives me the data between the brackets in an element before the
data itself.

I know quite well I can zip through that array again, but I was
wondering, hoping, that there would be a way of accessing that back
reference in a block as part of the split.

Is there any way to do that that I’m just missing?

Thanks,
Kyle

sample_data=%{[shop]
comment = shared directory for the shop
path = /dept/shop
valid users = @shop @admin
public = no
writable = yes
force group = shop
create mask = 0770
[bob]
comment = User files for bob
path = /users/bob
valid users = bob @admin
public = no
writable = yes
create mask = 0770}
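
For reference, the "zip through that array again" route mentioned above can be sketched roughly as follows. This is only an illustration, not code from the thread: the shortened `sample` string and the names `parts` and `pairs` are made up for the example.

```ruby
# Hypothetical two-section sample, shortened from sample_data above.
sample = "[shop]\npath = /dept/shop\npublic = no\n[bob]\npath = /users/bob\n"

# A capture group in the pattern makes String#split keep the delimiters.
parts = sample.split(/(\[[a-z0-9]+\])/i)

# parts[0] is the empty string before the first header, so drop it,
# then walk the rest two at a time: header, body, header, body, ...
pairs = parts.drop(1).each_slice(2).map { |header, body| [header, body.strip] }

pairs.each { |header, body| puts "#{header} -> #{body.inspect}" }
```

This gives the array-of-tuples shape in a second pass; split itself does not take a block.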

Hi –

On Thu, 1 May 2008, Kyle S. wrote:

I know quite well I can zip through that array again, but I was
wondering, hoping, that there would be a way of accessing that back
reference in a block as part of the split.

I’m afraid I can’t quite follow that sentence. What do you mean by a
back reference? Can you show some sample desired output?

David

David, back reference as in a regex back reference.
In a nutshell, it stores what was matched, and allows you to do
something with it. You just place parentheses around the part of the
match you want to save.

They work like this in Ruby’s gsub (but a little differently in sed,
if that’s the regex you grew up with).

example=%{Brian had a dog
James had a cat
Allen has a hamster}
puts example
# If you wanted to change the type of pet with gsub, you could do it like this…
puts example.gsub(/[^ ]+$/, "grue")
# but if you wanted to describe the pet, and not change the type, you'd
# need a backreference
puts example.gsub(/([^ ]+$)/) { |i| "big ugly #{i}" }
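
A small sketch of the two usual ways to reach the captured group in gsub, assuming a two-line sample string; the variable names here are only for illustration. The block form reads the group from `$1`, while the string form uses `\1` inside a single-quoted replacement.

```ruby
example = "Brian had a dog\nJames had a cat"

# Block form: the block runs once per match; $1 holds the first capture group.
with_block = example.gsub(/([^ ]+)$/) { "big ugly #{$1}" }

# String form: '\1' in a single-quoted replacement refers to the same group.
with_backref = example.gsub(/([^ ]+)$/, 'big ugly \1')

puts with_block
puts with_backref
```

Both calls should produce the same text; the block form is the one to reach for when the replacement needs real Ruby logic.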

Ohh right, desired sample output.

What I’d really like, is to split the string, and either stuff it
straight into a hash at the same time, or, more realistically since
it’s splitting, array tuples.
So…

sample_data.split() { magic happens here }
=> {"[shop]"=>"\ncomment = shared directory for the shop\npath…>"}

or
sample_data.split() { magic happens here }
=> [["[shop]","\ncomment = shared directory for the shop\npath…>"]]

David,
re-reading your sig, and that page, I’ve got to apologize,
you already knew that stuff in spades I’m sure! :)

What part doesn’t quite make sense?

yermej,
scan you say. Heh, I never even thought of that one.
Makes the whole thing rather simple!

Thanks.

On May 1, 9:46 am, Kyle S. wrote:

I know quite well I can zip through that array again, but I was
wondering, hoping, that there would be a way of accessing that back
reference in a block as part of the split.

I think you might want scan instead of split.

sample_data.scan(/(\[[a-z0-9]+\])([^\[]*)/i) do |share, opts|
  # create your hash or whatever here
end
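
To make the scan suggestion concrete, here is one possible way to fill in the block. It is a sketch under assumptions: the `sample` data is shortened, the capture group is moved inside the brackets so the keys come out as "shop" rather than "[shop]", and `config` is a made-up name.

```ruby
sample = <<~CONF
  [shop]
  path = /dept/shop
  public = no
  [bob]
  path = /users/bob
CONF

config = {}
# First group: the share name; second group: everything up to the next "[".
sample.scan(/\[([a-z0-9]+)\]([^\[]*)/i) do |share, opts|
  config[share] = {}
  opts.each_line do |line|
    key, value = line.split("=", 2)
    next if value.nil? # skip lines without an "=" (e.g. blank lines)
    config[share][key.strip] = value.strip
  end
end

p config
```

Splitting each line with a limit of 2 keeps any further "=" characters in the value rather than dropping them.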

Robbert, yermej, David,

Thanks a bunch!
Here’s what I finally came up with, in case anyone’s bored enough to
wonder.

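file="/path/to/smb/file/sample.conf"
regex=/(\[[a-z0-9]+\])([^\[]*)/i
samba_config={}
File.open(file){|f| f.read()}.scan(regex) do |title,options|
  samba_config.store(title,{})
  # each_line rather than String#each, which Ruby 1.9 and later no longer provide
  options.strip.each_line() do |l|
    # key: text before the first "="; value: text after the last "="
    samba_config[title].store(l[/^[^=]*/].strip,l[/[^=]*[^\n]$/].strip)
  end
end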

On 01.05.2008 18:08, yermej wrote:

I think you might want scan instead of split.

sample_data.scan(/(\[[a-z0-9]+\])([^\[]*)/i) do |share, opts|
  # create your hash or whatever here
end

Yes. Two suggestions:

[email protected] /cygdrive/c/Temp
$ ./smpars.rb
{"shop"=>
  {"public"=>"no",
   "writable"=>"yes",
   "create mask"=>"0770",
   "valid users"=>"@shop @admin",
   "path"=>"/dept/shop",
   "comment"=>"shared directory for the shop",
   "force group"=>"shop"},
 "bob"=>
  {"public"=>"no",
   "writable"=>"yes",
   "create mask"=>"0770",
   "valid users"=>"bob @admin",
   "path"=>"/users/bob",
   "comment"=>"User files for bob"}}
{"shop"=>
  {"public"=>"no",
   "writable"=>"yes",
   "create mask"=>"0770",
   "valid users"=>"@shop @admin",
   "path"=>"/dept/shop",
   "comment"=>"shared directory for the shop",
   "force group"=>"shop"},
 "bob"=>
  {"public"=>"no",
   "writable"=>"yes",
   "create mask"=>"0770",
   "valid users"=>"bob @admin",
   "path"=>"/users/bob",
   "comment"=>"User files for bob"}}

[email protected] /cygdrive/c/Temp
$ cat smpars.rb
#!/bin/env ruby

require 'pp'

sample_data = <<EOS
[shop]
comment = shared directory for the shop
path = /dept/shop
valid users = @shop @admin
public = no
writable = yes
force group = shop
create mask = 0770
[bob]
comment = User files for bob
path = /users/bob
valid users = bob @admin
public = no
writable = yes
create mask = 0770
EOS

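def parse1(dat)
  conf = {}
  key = nil
  # each_line also works on Ruby versions without String#each
  dat.each_line do |line|
    case line
    when /^\s*\[([^\]]+)\]\s*$/      # [section] header
      key = $1
      conf[key] ||= {}
    when /^\s*([^=]*?)\s*=\s*(.*)$/  # key = value
      conf[key][$1] = $2
    end
  end
  conf
end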

def parse2(dat)
  conf = {}
  key = nil
  # One pattern, two alternatives: a [section] header or a key = value line
  dat.scan %r{
    ^\s*\[([^\]]+)\]\s*$
    | ^\s*([^=]*?)\s*=\s*(.*)$
  }x do |m|
    if m[0]
      key = m[0]
      conf[key] ||= {}
    else
      conf[key][m[1]] = m[2]
    end
  end
  conf
end

pp parse1(sample_data)
pp parse2(sample_data)

[email protected] /cygdrive/c/Temp
$

Cheers

robert

David,
I don’t mind it at all!

Out of curiosity, agreeing that File.read().scan() is much cleaner, is
it just syntactic sugar for the same thing, or is it computationally
different?

Thanks for the Hash[*Array] syntax btw, I’ve used it way way back, but
for the life of me couldn’t remember it, thought maybe I was mistaken.

Hi –

On Fri, 2 May 2008, Kyle S. wrote:

David,
I don’t mind it at all!

Out of curiosity, agreeing that File.read().scan() is much cleaner, is
it just syntactic sugar for the same thing, or is it computationally
different?

I don’t know whether File.read is actually written in terms of
File.open (I’m afraid I’m too lazy to check right now), but I think
the underlying system calls etc. would be at least very similar. I
expect File.read is slightly cheaper, since it doesn’t involve the
whole block structure. (I’m talking, of course, specifically about
comparison with File.open {|f| f.read }, where you get the whole file
at once.)

David
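
A quick way to see that the two spellings return the same string is to run both against a throwaway file. This sketch only compares results, not cost, and the temp file name and contents are invented for the example.

```ruby
require "tempfile"

via_open = via_read = nil

Tempfile.create("smb_sample") do |tmp|
  tmp.write("[shop]\npath = /dept/shop\n")
  tmp.flush

  # The open/read form from the earlier post ...
  via_open = File.open(tmp.path) { |f| f.read }
  # ... and the shorter form suggested as a replacement.
  via_read = File.read(tmp.path)
end

puts via_open == via_read
```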

On May 1, 9:46 am, Kyle S. wrote:

I know quite well I can zip through that array again, but I was
wondering, hoping, that there would be a way of accessing that back
reference in a block as part of the split.

regex = /(\[.*?\])/
h = Hash[ *IO.read("data").strip.split( regex )[1..-1] ]
h.each{|k,v| h[k] = Hash[ *v.strip.split( / *= *|\n/ ) ] }
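
The same idea run against an in-memory string instead of a file, to show what the intermediate values look like. The `data` string is a shortened, made-up stand-in for the "data" file.

```ruby
data = "[shop]\npath = /dept/shop\npublic = no\n[bob]\npath = /users/bob\n"

regex = /(\[.*?\])/
# split keeps the captured headers; [1..-1] drops the leading empty string,
# leaving header, body, header, body, which Hash[*...] turns into pairs.
h = Hash[*data.strip.split(regex)[1..-1]]

# Each body is split on " = " and on newlines, giving key, value, key, value.
h.each { |k, v| h[k] = Hash[*v.strip.split(/ *= *|\n/)] }

p h
```

One caveat as far as I can tell: Hash[*array] needs an even number of elements, so a value that itself contains "=" would break the pairing.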

Hi –

On Fri, 2 May 2008, Kyle S. wrote:

samba_config.store(title,{})
options.strip.each_line() do
|l|
samba_config[title].store(l[/^[^=]*/].strip,l[/[^=]*[^\n]$/].strip)
end
end

I know you’re not asking for refactoring advice, but here’s some
anyway :)

If you’re just going to read a file’s contents into a string, you can
use File.read, rather than the whole open/read thing. Also, I’d
encourage you to drop the empty parentheses after method names. The
message-sending dot tells you that it’s a method; the () doesn’t add
signal, just noise.

Anyway, here’s a tweaked version, in case it’s of interest. Nothing
too radical, just a couple of possibly fun alternative techniques :)

File.read("filename").scan(regex) do |title,options|
  samba_config[title] = {}
  options.strip.each_line do |option|
    samba_config[title].update(Hash[*option.strip.split(/\s*=\s*/)])
  end
end

David
