Formatting a listing

I have a listing which looks like this:
1
2
3

3
4
5

20
3
5
5

I would like a quick way to format that output so that I end up with 3
columns like the following:
1,2,20
2,4,3
3,5,5
5

any ideas?
thanks

On Mon, Jun 15, 2009 at 12:46 PM, George
George[email protected] wrote:

3
any ideas?
thanks

If this is a homework assignment I am terribly sorry, but I cannot
know.

puts DATA.inject( [ [] ] ){ | ary, ele |
  ele.strip.empty? ? ary << [] : ary.last << ele.to_i # or ele.chomp if you prefer
  ary
}
.reject( &:empty? ) # .map( &:join, ", " ) is not standard Ruby
.map{ | eles | eles.join( ", " )}

__END__
1
2

42
44
46

2000

HTH
Robert

Toutes les grandes personnes ont d’abord été des enfants, mais peu
d’entre elles s’en souviennent.

All adults have been children first, but not many remember.

[Antoine de Saint-Exupéry]

George G. [email protected] writes:

3
any ideas?
It’s hard to tell, since there seem to be random changes in the output
data vs. the input. Well, one random change. Or do you really mean
to subtract 1 from the first row, second column of the output?
What about the missing commas? Why are the numbers left-aligned?

(printf "%s\n\n" ,
(begin
(matrix = (Array . new))
(ncols = 1)
(matrix . push(Array . new))
(((IO . readlines '/tmp/test.data') .
map { | line | (line . strip) }) .
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix . push(Array . new))
else
((matrix . at(ncols - 1)) . push item)
end)
})
(height = ((matrix . map { | column | (column . size) }) . max))
(widths = (matrix . map { | column | ((column . map { | cell |
((sprintf("%s" , cell)) . size) }) . max) }))
((((matrix . map { | column | (column + (Array . new((height -
(column . size)) , ""))) }) .
transpose) .
map { | row | ((([ row , widths ] . transpose) .
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}) .
join "\n")
end))

1,3,20
2,4, 3
3,5, 5
, , 5

Pascal J. Bourguignon wrote:

George G. [email protected] writes:

3
any ideas?
It’s hard to tell, since there seem to be random changes in the output
data vs. the input. Well, one random change. Or do you really mean
to subtract 1 from the first row, second column of the output?
What about the missing commas? Why are the numbers left-aligned?

Thank you so much for the reply.
The input is a column of numbers, in blocks separated by some white space.
The idea is to capture the first block and create a column, then
capture the next one and create another column adjacent to the previous
one, such that
if
1
2
3
whitespace(one or more)
3
4
5
6
whitespace(one or more)
20
2
.
.
what I need is to transform that into (call it a matrix)
such that every block of numbers separated by white space becomes a new
row in the output, e.g. for the above case
1,3,20
2,4, 2
3,5,
, 6,

and so on. The comma separator is necessary so that I can later parse
it using any CSV library or read it into a spreadsheet and

Thank you
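
One way to read the description above, as a minimal sketch only: each block becomes one column (the "row" wording is corrected to "column" further down the thread), and short columns are padded with empty cells. The method name `blocks_to_rows` and the `filler` parameter are made up for illustration, and the sample string stands in for the real input file.

```ruby
# Hypothetical helper: blank-line-separated blocks become columns,
# and the result is returned as comma-joined rows.
def blocks_to_rows(text, filler = "")
  # A block ends at one or more blank (or whitespace-only) lines.
  blocks = text.strip.split(/\n(?:[ \t]*\n)+/).map do |block|
    block.split("\n").map(&:strip)
  end
  height = blocks.map(&:size).max || 0
  # Pad short blocks so every column has the same height, then flip.
  padded = blocks.map { |block| block + [filler] * (height - block.size) }
  padded.transpose.map { |row| row.join(",") }
end

input = "1\n2\n3\n\n\n3\n4\n5\n6\n \n20\n2\n"
puts blocks_to_rows(input)
# 1,3,20
# 2,4,2
# 3,5,
# ,6,
```

The empty filler keeps every row at the same number of fields, which is what most CSV readers and spreadsheets expect.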

On Mon, Jun 15, 2009 at 2:12 PM, George
George[email protected] wrote:

puts DATA.inject( [ [] ] ){ | ary, ele |

working on some huge datasets that run in to up to 20 000 columns and
was looking for an easy way of creating the columns.
If you have perf problems, do not worry: inject is slow. Replace

inject(x){ |a,e|
}

with

loc = x
each { |e|
  loc << e # e.g.
}
loc

and you will have a speedup of 2~3x.
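
To make that substitution concrete, here is a small sketch of the same grouping written both ways. The `lines` array is invented sample data standing in for DATA, and nothing here measures the speedup claimed above; timing both versions with the standard `Benchmark` module on real data would be the way to check it.

```ruby
# Sample input standing in for DATA: numbers in blocks separated by blank lines.
lines = ["1\n", "2\n", "\n", "42\n", "44\n", "46\n", "\n", "2000\n"]

# inject version: the block must return the accumulator every time.
with_inject = lines.inject([[]]) do |ary, ele|
  if ele.strip.empty?
    ary << []
  else
    ary.last << ele.to_i
  end
  ary
end

# each version: the accumulator is an ordinary local variable.
groups = [[]]
lines.each do |ele|
  if ele.strip.empty?
    groups << []
  else
    groups.last << ele.to_i
  end
end

p with_inject  # [[1, 2], [42, 44, 46], [2000]]
p groups       # [[1, 2], [42, 44, 46], [2000]]
```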

Robert D. wrote:

On Mon, Jun 15, 2009 at 12:46 PM, George
George[email protected] wrote:

3
any ideas?
thanks

If this is a homework assignement I am terribly sorry, but I cannot
know.

Thanks so much for pointing the way. And this is not an assignment! :) I am
working on some huge datasets that run up to 20,000 columns and
was looking for an easy way of creating the columns.

puts DATA.inject( [ [] ] ){ | ary, ele |
ele.strip.empty? ? ary << [] : ary.last << ele.to_i # or ele.chomp
#if you prefer

ary

}
#.reject( &:empty? ) # .map( &:join, ", " ) is not standard Ruby
#.map{ | eles | eles.join( ", " )}

__END__
1
2

42
44
46

I am not trying to ask too much, but your code is real witchcraft :)
It raises an error:
`<<': can't convert Array into Integer (TypeError)

On Mon, Jun 15, 2009 at 2:10 PM, Pascal J.
Bourguignon[email protected] wrote:


Pascal B.
Really Pascal, you do not need to sign your mails ;)



such that every block of numbers separated by white space becomes a new
row in the output e.g. for the above case …

What I meant was column, not row (sorry).
What I need is to transform that into (call it a matrix)
such that every block of numbers separated by white space becomes a new
column in the output, e.g. for the above case

(printf "%s\n\n" ,
(begin
(matrix = (Array.new))
(ncols = 1)
(matrix.push(Array.new))
(((IO.readlines '/home/george/test.data').
map { | line | (line.strip) }).
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix.push(Array.new))
else
((matrix.at(ncols - 1)).push item)
end)
})
(height = ((matrix.map { | column | (column.size) }).max))
(widths = (matrix.map { | column | ((column.map { | cell |
((sprintf("%s" , cell)).size) }).max) }))
((((matrix.map { | column | (column + (Array.new((height -
(column.size)) , ""))) }) .
transpose).
map { | row | ((([ row , widths ].transpose).
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}).
join "\n")
end))

produces an error:
in `sprintf': no implicit conversion from nil to integer (TypeError)

George G. [email protected] writes:

                  else
                  map { | r , w | (sprintf( "%*s" , w , r)) }) .
                 join ",")}).
 join "\n")

end))

produces an error:
in `sprintf': no implicit conversion from nil to integer (TypeError)

Well, I tried it with this file:

1
2
3

3
4
5

20
3
5
5


You could try executing it expression by expression, to see where that
nil comes from:

irb(main):850:0>
(matrix = (Array . new))
[]
irb(main):856:0>
(ncols = 1)
1
irb(main):862:0>
(matrix . push(Array . new))
[[]]
irb(main):868:0>
(((IO . readlines '/tmp/test.data') .
map { | line | (line . strip) }) .
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix . push(Array . new))
else
((matrix . at(ncols - 1)) . push item)
end)
})
["1", "2", "3", "", "3", "4", "5", "", "20", "3", "5", "5"]
irb(main):890:0>
(height = ((matrix . map { | column | (column . size) }) . max))
4
irb(main):896:0>
(widths = (matrix . map { | column | ((column . map { | cell |
((sprintf("%s" , cell)) . size) }) . max) }))
[1, 1, 2]
irb(main):902:0>
((((matrix . map { | column | (column + (Array . new((height - (column .
size)) , ""))) }) .
transpose) .
map { | row | ((([ row , widths ] . transpose) .
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}) .
join "\n")
"1,3,20\n2,4, 3\n3,5, 5\n , , 5"
irb(main):918:0>
(begin
(matrix = (Array . new))
(ncols = 1)
(matrix . push(Array . new))
(((IO . readlines '/tmp/test.data') .
map { | line | (line . strip) }) .
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix . push(Array . new))
else
((matrix . at(ncols - 1)) . push item)
end)
})
(height = ((matrix . map { | column | (column . size) }) . max))
(widths = (matrix . map { | column | ((column . map { | cell |
((sprintf("%s" , cell)) . size) }) . max) }))
matrix
((((matrix . map { | column | (column + (Array . new((height -
(column . size)) , ""))) }) .
transpose) .
map { | row | ((([ row , widths ] . transpose) .
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}) .
join "\n")
end)
"1,3,20\n2,4, 3\n3,5, 5\n , , 5"
irb(main):968:0>
(matrix = (Array . new))
[]
irb(main):974:0>
(ncols = 1)
1
irb(main):980:0>
(matrix . push(Array . new))
[[]]
irb(main):986:0>
(((IO . readlines '/tmp/test.data') .
map { | line | (line . strip) }) .
each { | item | (if (item == "")
(ncols = (ncols + 1))
(matrix . push(Array . new))
else
((matrix . at(ncols - 1)) . push item)
end)
})
["1", "2", "3", "", "3", "4", "5", "", "20", "3", "5", "5"]
irb(main):1005:0>
matrix
[["1", "2", "3"], ["3", "4", "5"], ["20", "3", "5", "5"]]
irb(main):1008:0>
(height = ((matrix . map { | column | (column . size) }) . max))
4
irb(main):1014:0>
(widths = (matrix . map { | column | ((column . map { | cell |
((sprintf("%s" , cell)) . size) }) . max) }))
[1, 1, 2]
irb(main):1020:0>
matrix
[["1", "2", "3"], ["3", "4", "5"], ["20", "3", "5", "5"]]
irb(main):1026:0>
((((matrix . map { | column | (column + (Array . new((height - (column .
size)) , ""))) }) .
transpose) .
map { | row | ((([ row , widths ] . transpose) .
map { | r , w | (sprintf( "%*s" , w , r)) }) .
join ",")}) .
join "\n")
"1,3,20\n2,4, 3\n3,5, 5\n , , 5"
irb(main):1042:0>

Thanks a lot Pascal! The code is working. It is a little complex, but a
very nice piece of code to learn from. One nitty-gritty thing is that not all
the columns are separated by a comma. For example, your code

File.open("/home/george/test_r.csv",'w') do |f|
printf "%s\n\n",
begin
matrix = Array.new
ncols = 1
matrix.push(Array.new)
(IO.readlines '/home/george/test.data').
map { | line | (line.strip) }.
each { | item | if item == ""
ncols = (ncols + 1)
matrix.push(Array.new)
else
matrix.at(ncols - 1).push item
end
}
height = (matrix.map { | column | (column.size) }).max
widths = matrix.map { | column | ((column.map { | cell |
((sprintf("%s" , cell)).size) }).max) }
f.puts((matrix.map { | column | (column + (Array.new((height -
column.size) , ""))) }).transpose.
map { | row | ((([ row , widths ].transpose).
map { | r , w | (sprintf( "%*s" , w , r)) }).join ",")}.join "\n")
end
end

when given a file(test.data) containing
1 2
2 4

1 4
2 3

1 3
2 3
3 5
4 6

it produces (test_r.csv)

1 2,1 4,1 3
2 4,2 3,2 3
, ,3 5
, ,4 6

which is almost there!
the ideal output would have been

1, 2,1, 4,1, 3
2, 4,2, 3,2, 3
, , ,3, 5
, , ,4, 6

Sorry, I am asking for the horse and the saddle :) What small modification
should I make to produce the second output?
Thank you so much!

George

George G. [email protected] writes:

(IO.readlines '/home/george/test.data').

when given a file(test.data) containing

1, 2,1, 4,1, 3
2, 4,2, 3,2, 3
, , ,3, 5
, , ,4, 6

Sorry, I am asking for the horse and the saddle :) What small modification
should I make to produce the second output?

You could add a split and an each somewhere.

irb(main):001:0> "1 2".split(" ").each{|x| puts x}
1
2
["1", "2"]
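
As a rough sketch of where that split could go: split every line of every block into fields, then pad missing cells so each output row has the same number of fields. The method name `blocks_to_wide_rows` is invented for illustration, and the uniform padding is an assumption (the "ideal output" quoted above has fewer commas on its last two rows).

```ruby
# Hypothetical helper: each block contributes as many output columns as it
# has whitespace-separated fields per line.
def blocks_to_wide_rows(text)
  # blocks -> lines -> fields
  blocks = text.strip.split(/\n(?:[ \t]*\n)+/).map do |block|
    block.split("\n").map { |line| line.split(" ") }
  end
  height = blocks.map(&:size).max || 0
  rows = Array.new(height) do |i|
    blocks.flat_map do |block|
      width = block.map(&:size).max || 0
      line = block[i] || []
      # Pad lines that are missing (or short) with empty cells.
      line + [""] * (width - line.size)
    end
  end
  rows.map { |row| row.join(",") }
end

input = "1 2\n2 4\n\n1 4\n2 3\n\n1 3\n2 3\n3 5\n4 6\n"
puts blocks_to_wide_rows(input)
# 1,2,1,4,1,3
# 2,4,2,3,2,3
# ,,,,3,5
# ,,,,4,6
```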

Brian C. wrote:

George G. wrote:

any ideas?

src = "1\n2\n3\n\n3\n4\n5\n\n20\n3\n5\n5"
rows = src.split("\n\n").collect { |b| b.split("\n") }
m = rows.collect { |r| r.size }.max
m.times { |i| puts rows.collect { |r| r[i] }.join(",") }

No witchcraft there.

I’d say it’s better to use fastercsv gem for outputting, as it handles
values which need quoting (e.g. values which themselves contain commas)

require 'rubygems'
require 'fastercsv'
src = "1\n2\n3\n\n3\n4\n5\n\n20\n3\n5,9\n5"
rows = src.split("\n\n").collect { |b| b.split("\n") }
m = rows.collect { |r| r.size }.max
FCSV { |out| m.times { |i| out << rows.collect { |r| r[i] } } }
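
For readers on Ruby 1.9 or later: FasterCSV became the standard `csv` library there, so a roughly equivalent sketch needs no extra gem on most installs (recent Ruby versions ship `csv` as a bundled gem, so a Bundler project may need to list it). The variable names below are illustrative, and the output goes to a string rather than standard output.

```ruby
require 'csv'

src = "1\n2\n3\n\n3\n4\n5\n\n20\n3\n5,9\n5"
columns = src.split("\n\n").map { |block| block.split("\n") }
height = columns.map(&:size).max

# CSV.generate builds a string; fields containing commas get quoted,
# and nil cells are written as empty fields.
csv_text = CSV.generate do |csv|
  height.times { |i| csv << columns.map { |column| column[i] } }
end

puts csv_text
# 1,3,20
# 2,4,3
# 3,5,"5,9"
# ,,5
```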

Hi Brian!
Thanks for the example. However, since I want the code to read from a
file and write to another file, I have done this:
File.open("/home/george/output.csv","w") do |f|
  src = ""
  File.open('/home/george/sequences/results_processed.csv').each do |rows|
    src << rows
  end
  rows = src.split("\n\n").collect { |b| b.split("\n") }
  m = rows.collect { |r| r.size }.max
  m.times { |i| f.puts rows.collect { |r| r[i] }.join(",") }
end

but it prints out the same old original file without modifications.

Any ideas?

Hi –

On Fri, 19 Jun 2009, George G. wrote:

ncols = 1
matrix.push(Array.new)
(IO.readlines '/home/george/test.data').
map { | line | (line.strip) }.

Don’t use the extra parentheses around expressions. They result in
unidiomatic and obfuscated code. They’re an artifact of some very
specific disgruntlement about the fact that Ruby differs from Common
Lisp, and they shouldn’t be emulated. (Search the ruby-talk archives
for more on this if interested.)

4 6

1, 2,1, 4,1, 3
2, 4,2, 3,2, 3
, , ,3, 5
, , ,4, 6

Sorry, I am asking for the horse and the saddle :) What small modification
should I make to produce the second output?

Could you write an example using letters, instead of digits? I’m
having trouble mapping the input to the output, and having unique
symbols would help.

Judging from the comma count, the last two rows have fewer fields than
the first two. Is that right?

David

George G. wrote:

but it prints out the same old original file without modifications

It works for me (I just copy-pasted your code and changed the
input/output filenames).

The simplest way to debug this is to add some extra debugging statements
to see if the variables contain what you expect, for example:


rows = src.split("\n\n").collect { |b| b.split("\n") }
p rows # << debugging statement

or more verbosely,

STDERR.puts "rows = #{rows.inspect}"

At a guess, perhaps you’re on a Windows platform and the paragraphs are
terminated by \r\n\r\n instead of \n\n. Try:

rows = src.split(/\r?\n\r?\n/).collect … etc

In a regular expression, \r? means 0 or 1 occurrences of \r
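
A slightly more forgiving variant, offered as a sketch rather than a diagnosis of the file in question: normalize the line endings first, then treat any run of blank or whitespace-only lines as one separator. If the "blank" lines in the input actually contain spaces or tabs, a plain split on "\n\n" finds nothing to split on and returns a single block. The method name `split_blocks` is made up for illustration.

```ruby
# Hypothetical helper: split text into blocks of stripped lines,
# tolerating CRLF endings and whitespace-only separator lines.
def split_blocks(src)
  src.gsub(/\r\n?/, "\n")       # normalize CRLF / lone CR to LF
     .strip
     .split(/\n(?:[ \t]*\n)+/)  # one or more blank-ish lines end a block
     .map { |block| block.split("\n").map(&:strip) }
end

p split_blocks("1\r\n2\r\n\r\n3\r\n4\r\n")  # [["1", "2"], ["3", "4"]]
p split_blocks("1\n2\n \n\n3\n4\n")         # [["1", "2"], ["3", "4"]]
```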

Brian C. wrote:

At a guess, perhaps you’re on a Windows platform and the paragraphs are
terminated by \r\n\r\n instead of \n\n. Try:

rows = src.split(/\r?\n\r?\n/).collect … etc

In a regular expression, \r? means 0 or 1 occurrences of \r

Thank you, Brian. I am on Ubuntu Linux and Ruby 1.8.7. Let me add some
debugging statements and work out why it's not printing the desired output.

@David Black:
My input consists of digits and not letters, that is why i used digits
in the examples above; in short what i really wanted is: Given a listing
1
2
3
white space
3.05
4
5
whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3
2,4,7
3,5

or
1,-3.05,-6.3
2,4,7
3,5,0

David A. Black wrote:

On Tue, 23 Jun 2009, George G. wrote:

whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3

That’s a spurious - before the 3, right?

2,4,7
3,5

or
1,-3.05,-6.3
2,4,7
3,5,0

The example that I couldn’t quite figure out was the one with input
like:

1 2,3 4

or something like that. Anyway, a simple version (which may not handle
that case) is:

$/ = "\n\n" # if the input is definitely \n\n delimited

cols = File.open("input.csv") do |fh|
  max = 0
  fh.map do |s|
    row = s.scan(/\S+/)
    max = [max, row.size].max
    row << "0" until row.size == max
    row
  end
end

David

Hi David!
Thanks for the reply, but the output of the above code again
has only a single column instead of multiple columns. What I am after is,
given
1
2
3
white space
3.05
4
5
whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3
2,4,7
3,5

If you don't mind, I can send you my actual input file off-list
to try with. My email is georgkam, hosted with the Google email
domain.

Hi –

On Wed, 24 Jun 2009, George G. wrote:

That’s a spurious - before the 3, right?
like:
fh.map do |s|
Hi David!
whitespace
-6.3
7

Produce an output such that:
1,-3.05,-6.3
2,4,7
3,5

Here’s my input and output, along with the program:

$ cat george.rb
$/ = "\n\n"

cols = File.open("input.csv") do |fh|
  max = 0
  fh.map do |s|
    row = s.scan(/\S+/)
    max = [max, row.size].max
    row << "0" until row.size == max
    row
  end
end

rows = cols.transpose.map {|row| row.join(",") }

puts rows
$ cat input.csv
1
2
3

3.05
4
5

-6.3
7
$ ruby george.rb
1,3.05,-6.3
2,4,7
3,5,0

The columns look OK. Note that the $/="\n\n" thing only works if your
input is separated by exactly that sequence. That might be the problem
you’re having.
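
One more thing that may be worth checking: the loop above pads each column only up to the longest column seen so far, so if a later block is longer than an earlier one (as in the very first example in this thread, 3, 3 and 4 lines), the columns end up with different sizes and `transpose` raises an IndexError. A minimal sketch that pads after all columns are known; `columns_to_rows` and its `filler` parameter are names invented for illustration.

```ruby
# Hypothetical helper: pad every column to the overall maximum height,
# then transpose into comma-joined rows.
def columns_to_rows(columns, filler = "0")
  height = columns.map(&:size).max || 0
  columns.map { |col| col + [filler] * (height - col.size) }
         .transpose
         .map { |row| row.join(",") }
end

columns = [%w[1 2 3], %w[3 4 5], %w[20 3 5 5]]
puts columns_to_rows(columns)
# 1,3,20
# 2,4,3
# 3,5,5
# 0,0,5
```

Padding with "0" follows the second of the two desired outputs given earlier; passing "" as the filler gives empty cells instead.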

David