YAML & readlines & modify text files

Hello,

Can anyone help me with the following issue:

I have a YAML file that looks like this:


---
gm: Name01
gs: N01
---
gm: Name02
gs: N02

In Ruby I’m trying to read all *.txt files in the current folder and
all sub-folders. The text files look like this:

Name-line: Name01
Link-line: blabla.something.N01&=bla

Name-line: Name02
Link-line: blabla.something.N02&=bla

My ruby code looks like this:

require 'yaml'
lnk = 'blabla.something.'
op = '&=bla'

name = File.open('name.yaml')
yp = YAML::load_documents(name) do |name|
  txt_files = Dir.glob('**/*.txt').each do |path|
    file = File.open(path).readlines.each { |line|
      if line.match(/Link-line/)
        line.gsub!(/Link-line.*/, 'Link-line: ' + lnk + name['gs'] + op)
      end
    }

    File.open(path, 'w') { |f| f.write file }
  end
end

My problem is that the code fills in the same ‘gs’ value, the last one
found, in all the *.txt files.

I want it to read the Name-line in each file and then use the matching
‘gs’ value from the YAML file when rewriting the Link-line field, i.e. in

line.gsub!(/Link-line.*/, 'Link-line: ' + lnk + name['gs'] + op)

I’ve been trying to find a way for some time now but I just can’t seem
to be able to do it and I’m starting to have headaches :) so if anyone
has any ideas or improvements or critiques please don’t hesitate to
reply.

Cheers :)

On Wednesday 19 September 2007, Dan G. wrote:


I’m not at all sure I understand correctly what you want to do. I think
you want to replace the line under

Name-line: Name01

with some text containing a string taken from the Name01 entry in the
YAML file. Is that correct? If it isn’t, then please explain what you
mean in more detail. Otherwise, read on.

In my opinion, you’re storing data in the YAML file in the wrong way:
at each iteration, name contains only one name/replacement-string pair,
which forces you to iterate over all the files for each document in the
YAML file (and also makes the replacing code more complicated). I think
your YAML file should contain a single hash, with the names as keys and
the replacement strings as values:


Name01: N01
Name02: N02
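
To illustrate the single-hash layout, here is a minimal sketch. It assumes the YAML text shown above; the `yaml_text` string is a hypothetical in-memory stand-in for name.yaml, and the variable names are made up for the example.

```ruby
require 'yaml'

# Hypothetical in-memory stand-in for name.yaml in the single-hash layout.
yaml_text = <<~YAML
  Name01: N01
  Name02: N02
YAML

# Loading a single mapping yields one Hash of name => short code.
names = YAML.load(yaml_text)

# One hash lookup per text file replaces the loop over YAML documents.
short = names['Name02']
puts "Link-line: blabla.something.#{short}&=bla"
```

With this layout each text file needs only one pass: find the name, look it up, rewrite the link.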

Then you can do the following (untested):

require 'yaml'
lnk = 'blabla.something.'
op = '&=bla'
hash = File.open('name.yaml') { |f| YAML.load f }
Dir.glob('**/*.txt').each do |path|
  lines = File.readlines(path)
  lines.each_with_index do |line, i|
    if line.match(/Link-line/)
      match = lines[i-1].match(/Name-line:\s(.*)$/)[1]
      line.gsub!(/Link-line.*/, 'Link-line: ' + lnk + hash[match[1]] + op) if match and hash.has_key?(match[1])
    end
  end
  File.open(path, 'w') { |f| f.write lines }
end

When iterating over the lines, the block is passed not only the line
but also its index. This way, when you meet a Link-line, you can access
the corresponding Name-line using that index. The code then matches the
previous line against a regexp to extract the name from it and stores
the result in the match variable. If match is not nil (i.e. the name
line had the expected format) and the name is included in hash, the
replacement is performed (of course, you can skip this test if you’re
confident enough in the format of the files and in the contents of the
YAML file).
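
The index-based lookup can be tried on its own. This is only a sketch on a hypothetical two-line array. One thing worth noting: when i is 0, lines[i-1] is lines[-1], which Ruby resolves to the last element, so the sketch guards that case explicitly.

```ruby
# Hypothetical file contents, already split into lines.
lines = [
  "Name-line: Name01\n",
  "Link-line: blabla.something.N01&=bla\n"
]

pairs = []
lines.each_with_index do |line, i|
  next unless line.match(/Link-line/)
  # For i == 0, lines[i - 1] would wrap around to the last line,
  # so treat "no previous line" as nil instead.
  previous = i > 0 ? lines[i - 1] : nil
  pairs << [i, previous]
end

p pairs
```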

I hope this helps

Stefano

On Sep 19, 5:27 pm, Stefano C. wrote:


You are correct, that is what I want!

I tried what you suggested but I get some weird errors:

new.rb:45: undefined method `[]' for nil:NilClass (NoMethodError)
from D:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `each_with_index'
from new.rb:43:in `each'
from new.rb:43:in `each_with_index'
from new.rb:43
from new.rb:41:in `each'
from new.rb:41

Line 45 is: match = lines[i-1].match(/Name-line:\s(.*)$/)[1]
Line 43 is: lines.each_with_index do |line, i|
Line 41 is: Dir.glob('**/*.jad').each do |path|

I don’t get it, as far as I can tell the code is ok…

On Sep 19, 9:22 pm, Stefano C. wrote:


Stefano thanks for your replies so far.

That did it. The code executes OK, but

match = lines[i-1].match(/Name-line:\s(.*)$/)

always returns nil (I did “puts match” after it) and I don’t understand
why. I checked my txt files and my YAML file and they are OK.

txt file has:

Name-line: New York
Link-line: etc

and YAML file has:


New York: NY

So the value from the Name is equal to the one in the YAML file. Am I
still missing something obvious?

Sorry if some things seem so obvious that I should understand them, but
I’m just a beginner. I started a couple of weeks back, and the only way
for me to learn is from examples that I try myself. It annoys me to
see that something simple gives me so much trouble, but I don’t want to
quit either :).

On Wednesday 19 September 2007, Dan G. wrote:


There’s nothing obvious in the problem you’re having (at least, not
obvious to me). If, as you say, match is nil, the trouble lies outside
the YAML file (which is only used on the following line). So either the
data doesn’t have the expected format or the regexp isn’t doing what I
think it should. Yet, trying in irb, the regexp matched the line you
posted. I’m at a loss here. The best suggestion I can give you is to put

p lines[i-1]

before the match line and see whether it gives some insight into what’s
happening.
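
As a small illustration of why `p` is suggested here: it prints the inspect form of a value, so quotes, the trailing newline and nil all become visible. The sample line below is hypothetical.

```ruby
# Hypothetical line as File.readlines would return it.
line = "CATEG: CITY=0;STATE=0;\n"

# p shows the string with its quotes and the escaped newline.
p line

# The same regexp as in the script; nil is printed when nothing matches.
result = line.match(/Name-line:\s(.*)$/)
p result
```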

Stefano

On Sep 19, 11:03 pm, Stefano C. wrote:


First of all thanks for all your help so far.

I just don’t get it… the output looks like this:

"CATEG: CITY=0;STATE=0;\n"
nil
"CATEG: CITY=0;STATE=0;\n"
nil

and so on for all files.

CATEG is another line inside the text files… and the problem might be
that the Name-line isn’t always directly above the Link-line. There
could be n lines between them, or the Name-line could be below; what
I’m trying to say is that I don’t know where the Name-line is inside
the text files.

On Wednesday 19 September 2007, Dan G. wrote:


This changes everything. I assumed (from the example lines you posted)
that each Link-line had the corresponding Name-line above it. But if
there is no relation between the positions of the two kinds of lines,
how can you know what to put in the Link-line? I mean, what is the
relation that connects a Link-line to its Name-line? Since (from what
you say now) the positions of the two kinds of lines are random (as far
as this problem is concerned, at any rate), are you able, given a single
Link-line, to tell which is the corresponding Name-line? If yes, how?
Without knowing this, I can’t help you.

Stefano

On Wednesday 19 September 2007, Dan G. wrote:


I think it’s because of a mistake in my code: the [1] part of line 45
shouldn’t be there (it’s a leftover from a previous version of the
code). If I’m right, the string doesn’t match the regexp, so
lines[i-1].match(…) returns nil, which doesn’t have a [] method, leading
to the error you get. Avoiding a call to nil.[] is the reason for the
conditional at the end of the following line (note that the conditional
checks that match is not nil before trying to extract an element from
it), but this is useless if I call [] on the line before. Removing that
[1] from line 45 should solve your problem.
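
The nil case described above can be reproduced in isolation. This is a minimal sketch with two hypothetical input lines; only the regexp is taken from the script.

```ruby
pattern = /Name-line:\s(.*)$/

# match returns a MatchData on success and nil on failure.
good = "Name-line: New York\n".match(pattern)
bad  = "CATEG: CITY=0;STATE=0;\n".match(pattern)

p good[1]   # the captured name
p bad       # nil, because the line has no Name-line prefix

# Indexing nil raises NoMethodError, which is the error in the trace.
raised = false
begin
  bad[1]
rescue NoMethodError => e
  raised = true
  puts "raised: #{e.class}"
end

# Guarding before indexing avoids the error.
name = bad ? bad[1] : nil
```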

However, if you hit this mistake, it may mean that there’s something
amiss in either your .txt files or your YAML file (see the end of my
previous post).

Stefano

On Sep 20, 12:39 am, Stefano C. wrote:


All I can say is that in each text file there will be only one
Name-line and one Link-line. The only connection between these two lines
is that the Link-line uses the shorter version of what is written in the
Name-line (e.g. if the Name-line is New York, the Link-line will use NY).

Isn’t it possible to read the Name-line, take the corresponding value
from the YAML file, store it in a variable (a string?) inside the code
and then use it in the Link-line? Then, when going on to the next text
file, read the Name-line again, take its value from the YAML file, and
so on…

On Thursday 20 September 2007, Dan G. wrote:


If each file contains only one Name-line and one instance of the
corresponding Link-line, this should work:

require 'yaml'

lnk = 'blabla.something.'
op = '&=bla'
hash = File.open('name.yaml') { |f| YAML.load f }

Dir.glob('**/*.txt').each do |f|
  lines = File.readlines f
  name = nil
  link_idx = nil
  lines.each_with_index do |l, i|
    if l.match(/Name-line:\s+(.*)$/) then name $1
    elsif l.match(/Link-line/) then link = i
    end
    break if name and link_idx
  end
  if name
    rep = hash[name]
    if rep
      lines[link_idx] = "Link-line: #{lnk}#{rep}#{op}"
      File.open(f, 'w') { |of| of.write lines }
    else puts "name.yaml doesn't contain an entry for #{name} (file #{f})"
    end
  else puts "Couldn't find a Name line in file #{f}"
  end
end

For each file, it loops over the lines looking for a Name-line or a
Link-line. When it finds the former, it stores the name in the name
variable; when it finds the latter, it stores its index in the link_idx
variable. When both are found, the loop stops (no point in examining the
remaining lines). If a name has been found and it corresponds to an
entry in the hash, the line with index link_idx is replaced with a new
one (I removed the call to gsub!, since we’re rebuilding the entire
line, but you can put it back if you need it), then the array is written
to the file. If the Name-line wasn’t found, or if the hash doesn’t
contain an entry for it, an error message is printed on screen and the
next file is processed.

I hope this helps

Stefano

On Sep 20, 1:41 am, Stefano C. wrote:

Thanks for your reply Stefano!

I had to do:
link_idx = nil.to_i
otherwise I would get this error: `[]': no implicit conversion from
nil to integer (TypeError)

And it seems to be working, but if I use this:
lines[link_idx] = "Link-line: #{lnk}#{rep}#{op}"
the Name-line is removed and replaced by the
"Link-line: #{lnk}#{rep}#{op}", but the old Link-line = 0 is kept too.

If I use
lines[link_idx].gsub!(/Link-line.*/, 'Link-line: ' + lnk + rep + op)
nothing happens: no errors, no modified files, no nothing.

I don’t see anything wrong with the gsub! so what might be the problem?
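
One plausible reading of these symptoms, sketched under two assumptions that are not confirmed in the thread: the Name-line is the first line of the file, and link_idx is never assigned inside the loop (the posted script assigns `link` instead). The sample lines are hypothetical.

```ruby
# Hypothetical file contents with the Name-line first.
lines = ["Name-line: New York\n", "CATEG: CITY=0;STATE=0;\n", "Link-line: old\n"]

# nil.to_i is 0, so link_idx silently points at the first line.
link_idx = nil.to_i

lines.each_with_index do |l, i|
  # Assigns a different variable, so link_idx stays at 0.
  link = i if l.match(/Link-line/)
end

# gsub! on line 0 finds no "Link-line" text there and returns nil,
# so nothing is changed.
result = lines[link_idx].gsub!(/Link-line.*/, 'Link-line: new')

# Plain assignment overwrites line 0 (the Name-line) and leaves the
# real Link-line untouched.
lines[link_idx] = "Link-line: new\n"
p lines
```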

On Thursday 20 September 2007, Dan G. wrote:


Another couple of mistakes on my part, I’m afraid. This should work:

require 'yaml'

lnk = 'blabla.something.'
op = '&=bla'
hash = File.open('name.yaml') { |f| YAML.load f }

Dir.glob('**/*.txt').each do |f|
  lines = File.readlines f
  name = nil
  link_idx = nil
  lines.each_with_index do |l, i|
    # added missing = between name and $1
    if l.match(/Name-line:\s+(.*)$/) then name = $1
    # changed link = i to link_idx = i
    elsif l.match(/Link-line/) then link_idx = i
    end
    break if name and link_idx
  end
  # checking that also link_idx is not nil
  if name and link_idx
    rep = hash[name]
    if rep
      lines[link_idx] = "Link-line: #{lnk}#{rep}#{op}"
      File.open(f, 'w') { |of| of.write lines }
    else puts "name.yaml doesn't contain an entry for the name #{name}"
    end
  # changed the error message
  else puts "Name or Link line are missing in file #{f}"
  end
end

Stefano
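
For experimenting without touching real files, the same search-and-replace logic can be pulled into a function that works on an array of lines. This is only a sketch: `rewrite_link_line` and its parameter names are hypothetical and not part of the script in the thread. It also differs from that script in one detail, by appending "\n" to the rebuilt line so the following line is not glued to it.

```ruby
# Hypothetical helper: returns a new array with the Link-line rewritten,
# or nil when the Name-line, the Link-line, or the hash entry is missing.
def rewrite_link_line(lines, names, prefix, suffix)
  name = nil
  link_idx = nil
  lines.each_with_index do |line, i|
    if (m = line.match(/Name-line:\s+(.*)$/))
      name = m[1]
    elsif line.match(/Link-line/)
      link_idx = i
    end
    # Stop as soon as both pieces have been found.
    break if name && link_idx
  end
  return nil unless name && link_idx && names.key?(name)

  out = lines.dup
  # Keep the trailing newline so the next line stays on its own line.
  out[link_idx] = "Link-line: #{prefix}#{names[name]}#{suffix}\n"
  out
end

# Hypothetical data: the Name-line comes after the Link-line.
lines = ["CATEG: CITY=0;STATE=0;\n", "Link-line: old\n", "Name-line: New York\n"]
names = { 'New York' => 'NY' }
updated = rewrite_link_line(lines, names, 'blabla.something.', '&=bla')
puts updated.join
```

The result is joined before printing. On Ruby 1.9 and later, passing an Array straight to `write` emits its inspect form rather than the concatenated lines, so joining first is the safer habit when writing the file back.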

And I have another question since I couldn’t find anything about this
subject.

If in my YAML file I have a value that is like “New York: City: NYC”
it will obviously generate an error. What I want to know is whether it’s
possible to change the YAML separator “:” to something else like “;”
so I can have a value in the YAML file like this “New York: City; NYC”?

On Sep 20, 11:51 am, Stefano C. wrote:


Thank you Stefano! Works like a charm now :)

This part (#added missing = between name and $1) I figured out too, but
I should have been more careful about this part: “#checking that also
link_idx is not nil”

I liked the #comments part, thank you!

Now I’m gonna try and search for text files inside a zip archive and
modify them and then zip them back together.

Stefano, how long have you been using Ruby because you seem to know a
lot of stuff and I was wondering how long it will take me to know half
of what you know?

On Thursday 20 September 2007, Dan G. wrote:


I’ve been using Ruby for about two years. And don’t worry: you only need
a little time to get to know the language, and after that progress
becomes much quicker.

Stefano

On Sep 20, 3:17 pm, Dan G. wrote:

And I have another question since I couldn’t find anything about this
subject.

If in my YAML file I have a value that is like “New York: City: NYC”
it will obviously generate an error. What I want to know is whether it’s
possible to change the YAML separator “:” to something else like “;”
so I can have a value in the YAML file like this “New York: City; NYC”?

Never mind this… it was pretty obvious that I had to use
"New York: City": NYC in the YAML file :)
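
A quick way to check the quoting fix, as a sketch using in-memory YAML strings (both strings are hypothetical stand-ins for lines of name.yaml):

```ruby
require 'yaml'

# Quoting the key lets it contain ": " without confusing the parser.
quoted = <<~YAML
  "New York: City": NYC
YAML
names = YAML.load(quoted)
p names

# Without the quotes, the second ": " is rejected as a nested mapping.
failed = false
begin
  YAML.load("New York: City: NYC\n")
rescue Psych::SyntaxError => e
  failed = true
  puts "unquoted form rejected: #{e.class}"
end
```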