Hi everyone,
I expect this is a rather trivial problem, but I just started using Ruby
and am a bit stuck right now.
Here is what I want to do:
I have a text file that contains information in the following format:
KOG0003
  At2g36170
  At3g52590
  CE15495
  7295730
KOG0004
  Hs20476120
  YIL148w
  YKR094c
  SPAC11G7.04
Now, this has to go into a relational database. But right now this is
not really a table. The desired output would look something like this:
KOG0003 At2g36170
KOG0003 At3g52590
KOG0003 CE15495
KOG0003 7295730
KOG0004 Hs20476120
KOG0004 YIL148w
KOG0004 YKR094c
Well, you get the picture. What I tried to do is read the text file,
then look for lines that start with a blank and replace that blank with
the first word of the most recent line that does in fact start with a
word (those lines could also be selected using KOG[0-9]*). I
thought of storing the KOG[0-9]* identifier in a variable, but overall I
can't make it work and have no real idea how to solve this. Any help would be
greatly appreciated. I guess for an experienced user this is a three-liner
._.
Cheers,
Marc
2007/7/9, Marc H.:
Hm… Maybe something like this:
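key = nil
ARGF.each do |line|
  line.chomp!
  case line
  when /^(\S+)/
    # unindented line: remember it as the current key
    key = line.strip
  when /^\s+(\S+)/
    # indented line: print it prefixed with the current key
    print key, " ", $1, "\n" if key
  else
    # ignore
  end
end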
Kind regards
robert
Thanks you two, worked like a charm!
Cheers,
Marc
On 9 Jul 2007, at 16:42, Marc H. wrote:
--
Posted via http://www.ruby-forum.com/.
Not a very fancy solution, but it seems to work for the data you
posted. Also uses the pattern you suggested, storing the KOG*
identifier in a variable (field1):
[alexg@powerbook]/Users/alexg/Desktop(7): cat test.rb
field1 = nil
IO.foreach(ARGV[0]) do |l|
  if l.match(/^(\S+)/)
    # unindented line: a new KOG identifier
    field1 = $1
  else
    # indented line: prefix it with the current identifier
    puts "#{field1} #{l.strip}"
  end
end
[alexg@powerbook]/Users/alexg/Desktop(8): cat data.dat
KOG0003
  At2g36170
  At3g52590
  CE15495
  7295730
KOG0004
  Hs20476120
  YIL148w
  YKR094c
  SPAC11G7.04
[alexg@powerbook]/Users/alexg/Desktop(9): ruby test.rb data.dat
KOG0003 At2g36170
KOG0003 At3g52590
KOG0003 CE15495
KOG0003 7295730
KOG0004 Hs20476120
KOG0004 YIL148w
KOG0004 YKR094c
KOG0004 SPAC11G7.04
Alex G.
Bioinformatics Center
Kyoto University
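
For anyone finding this thread later: both scripts above assume there are no blank lines in the input, and the second one also assumes the file starts with a KOG line. Here is a minimal sketch of the same idea wrapped in a method that skips blank lines and drops indented lines that appear before any identifier. The method name expand_groups and the tab separator are my own assumptions and not something from the posts above, so adjust them to whatever your database import expects.

```ruby
# Hypothetical helper (not from the thread): turn the indented listing
# into [group, member] pairs.
def expand_groups(lines)
  group = nil
  pairs = []
  lines.each do |line|
    line = line.chomp
    next if line.strip.empty?      # skip blank lines
    if line =~ /\A\S/              # unindented line starts a new group
      group = line.strip
    elsif group                    # indented line belongs to the current group
      pairs << [group, line.strip]
    end
  end
  pairs
end

# Small inline sample so the sketch runs on its own.
sample = [
  "KOG0003",
  "  At2g36170",
  "  At3g52590",
  "",
  "KOG0004",
  "  Hs20476120"
]

# Print one tab-separated pair per line.
expand_groups(sample).each { |group, member| puts "#{group}\t#{member}" }
```

To run it on a real file, something like expand_groups(File.foreach("data.dat")) should work, since the method only needs an object that responds to each and yields lines.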