Hi,

I have files full of numbers that I need to twiddle,
but the format of the numbers cannot change[1], e.g.,

  '0.4577' -> '0.7728'

or

  '-2.345e-02' -> ' 1.232e-03'

Using scanf for the output seems to be the solution to
the second half of the problem, but how does one derive
the format specifier string of the input fields, which vary?

Thanks,
--
Bil Kleb
http://fun3d.larc.nasa.gov

[1] Legacy formatted-Fortran data files.

On May 2, 2007, at 12:50 PM, Bil Kleb wrote:

> Hi,
>
> I have files full of numbers that I need to twiddle,
> but the format of the numbers cannot change[1], e.g.,
>
>   '0.4577' -> '0.7728'
>
> or
>
>   '-2.345e-02' -> ' 1.232e-03'

Are there many different formats?

-- fxn

On 02.05.2007 12:47, Bil Kleb wrote:
> the second half of the problem, but how does one derive
> the format specifier string of the input fields, which vary?

If there is a fixed number of formats you can probably use a cascade of RX matches. Otherwise it probably becomes a bit more complex, like matching sequences of digits and measuring their lengths.

  >> md = %r{^(\d+)\.(\d+)?$}.match('0.4577')
  => #<MatchData:0x7ef61250>
  >> pa="%#{md[0].size}.#{md[2].size}f"
  => "%6.4f"
  >> pa % 0.4577111
  => "0.4577"

HTH

robert

Hi --

On 5/2/07, Bil Kleb <Bil.Kleb@nasa.gov> wrote:
>
> Using scanf for the output seems to be the solution to
> the second half of the problem, but how does one derive
> the format specifier string of the input fields, which vary?

You could probably just do a gsub, like this:

  require 'scanf'

  re = /-?\d+\.\d+(e-\d+)?/
  a = "'0.4577' -> '0.7728'"
  b = "'-2.345e-02' -> ' 1.232e-03'"
  as = a.gsub(re, "%f")   # turn every number in a into a %f placeholder
  bs = b.gsub(re, "%f")   # same for b
  p a.scanf(as)
  p b.scanf(bs)

Output:

  [0.4577, 0.7728]
  [-0.02345, 0.001232]

David

On 5/2/07, Bil Kleb <Bil.Kleb@nasa.gov> wrote:
>
> Using scanf for the output seems to be the solution to
> the second half of the problem, but how does one derive
> the format specifier string of the input fields, which vary?
>

Bill,

How's this for a start? I wrote it leaning towards clarity vs. conciseness.

rick@frodo:/public/rubyscripts$ cat number_format.rb

  class String
    def to_number_format
      m = match(%r{^([ ]*)([+-]?)(.*)$})
      leading_blanks, sign, rest = m[1], m[2], m[3]
      plus_flag = sign == '+' ? sign : ''
      case rest
      when %r{^([\d]\.([\d]+)([eE])([+-][\d]+))(.*)$}
        # exponentiated float
        entirety, frac_part, e_or_E, exponent, suffix = $1, $2, $3, $4, $5
        entirety = leading_blanks << entirety
        "%#{entirety.length}.#{frac_part.length}#{e_or_E}#{suffix}"
      when %r{^([\d]+\.([\d]*))(.*)$}
        # simple float
        entirety, frac_part, suffix = $1, $2, $3
        zero = frac_part.match(/00$/) ? '0' : ''
        "%#{zero}#{entirety.length}.#{frac_part.length}f#{suffix}"
      when %r{^(0[\d]+)([^e.]*)$}
        # zero padded integer
        digits, suffix = $1, $2
        "#{leading_blanks}%#{plus_flag}0#{digits.length}d#{suffix}"
      when %r{^([\d]+)([^e.]*)$}
        # whitespace padded integer
        digits, suffix = $1, $2
        digits = leading_blanks << digits
        "%#{digits.length}d#{suffix}"
      else
        nil
      end
    end
  end

  x = '0.4577'
  puts x
  puts x.to_number_format
  puts x.to_number_format % x.to_f
  puts(x.to_number_format % 0.7728)
  puts((x.to_number_format % x.to_f) == x)
  puts

  x = '-2.345e-02'
  puts x
  puts x.to_number_format
  puts(x.to_number_format % x.to_f)
  puts(x.to_number_format % 1.232e-03)
  puts((x.to_number_format % x.to_f) == x)
  puts

  x = '12345'
  puts x
  puts x.to_number_format
  puts(x.to_number_format % x.to_i)
  puts(x.to_number_format % 765)
  puts((x.to_number_format % x.to_f) == x)
  puts

  x = ' 00012345'
  puts x
  puts x.to_number_format
  puts(x.to_number_format % x.to_i)
  puts(x.to_number_format % 765)
  puts((x.to_number_format % x.to_i) == x)
  puts

  x = '  12345'
  puts x
  puts x.to_number_format
  puts(x.to_number_format % x.to_i)
  puts(x.to_number_format % 765)
  puts((x.to_number_format % x.to_i) == x)

rick@frodo:/public/rubyscripts$ ruby number_format.rb

  0.4577
  %6.4f
  0.4577
  0.7728
  true

  -2.345e-02
  %9.3e
  -2.345e-02
  1.232e-03
  true

  12345
  %5d
  12345
    765
  true

   00012345
   %08d
   00012345
   00000765
  true

    12345
  %7d
    12345
      765
  true

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On May 2, 2007, at 2:50 PM, Bil Kleb wrote:

> Xavier Noria wrote:
>> Are there many different formats?
>
> Yes, in that the field lengths are different.
>
> No, in that there are really only three "types":
> integers, vanilla floats, and exponentials.

Then I think you could base the solution on String#index/regexps depending on the existence of "e" and ".", since we can assume numbers are well-formed. The idea would be:

  if none
    %d
  elsif "e"
    %e
  else
    %f with computed widths
  end

-- fxn
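The outline above could be sketched roughly as follows. This is only an illustration, not code from the thread: `guess_format` is a hypothetical helper, it assumes each field is a single well-formed number (including any leading blanks or sign), and it takes the whole field length as the width.

```ruby
# Hypothetical helper (not from the thread): pick %d, %e/%E or %f by looking
# for an exponent marker and a decimal point, and compute width and precision
# from their positions. Assumes the field is one well-formed number.
def guess_format(field)
  width = field.length
  if (e_pos = field.index(/[eE]/))
    dot = field.index('.')
    precision = dot ? e_pos - dot - 1 : 0
    "%#{width}.#{precision}#{field[e_pos]}"   # keeps the original e or E
  elsif (dot = field.index('.'))
    "%#{width}.#{width - dot - 1}f"
  else
    "%#{width}d"
  end
end

puts guess_format('5')                        # %1d
puts guess_format('0.4577')                   # %6.4f
puts guess_format('-2.345e-02')               # %10.3e
puts guess_format('-2.345e-02') % 1.232e-03   # " 1.232e-03"
```

Because the width covers the sign column, a positive replacement is padded with a blank, as in the original example. One caveat: sprintf always writes at least two exponent digits, so a field such as '3.2e-5' would not round-trip with this approach.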

On 02.05.2007 15:08, Bil Kleb wrote:
>> >> md = %r{^(\d+)\.(\d+)?$}.match('0.4577')
> capacity of the existing format.

For floating point numbers you might even get away with a single regexp if that is crafted appropriately and group values are evaluated accordingly.

Kind regards

robert

Rick DeNatale wrote:
> On 5/4/07, Bil Kleb <Bil.Kleb@nasa.gov> wrote:
>> assert_equal( '%8.7f', '.0001170'.to_number_format )
>
> Not sure how this one worked, it fails for me. As a matter of fact:
> irb(main):001:0> '%8.7f' % 0.0001170
> => "0.0001170"
>
> And I haven't been able to find an sprintf format string which
> suppresses a leading zero on a float.

You're correct; as you wrote, I wasn't testing round-trip.

Thanks,
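Since no sprintf flag seems to drop the leading zero, one possible workaround is to format normally and strip the zero afterwards. The sketch below is an assumption-laden illustration, not something proposed in the thread; `format_without_leading_zero` is a hypothetical name, and the caller is assumed to know the field width and precision.

```ruby
# Hypothetical workaround (not from the thread): format with the wanted
# precision, remove the "0" before the decimal point, then right-justify
# the result to the original field width.
def format_without_leading_zero(value, width, precision)
  text = format("%.#{precision}f", value).sub(/\A(-?)0\./, '\1.')
  text.rjust(width)
end

puts format_without_leading_zero(0.000117, 8, 7)   # .0001170
puts format_without_leading_zero(-0.5, 5, 3)       # -.500
```

Values with a nonzero integer part are left untouched, so the result can still be wider than the field.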

Bil Kleb wrote:
>
> Puzzling the minus sign part now...

  "%#{zero}#{sign.length+entirety.length}.#{frac_part.length}f#{suffix}"
             ^^^^^^^^^^^^

Later,

David A. Black wrote:
> Hi --

Hi.

> Output:
>
> [0.4577, 0.7728]
> [-0.02345, 0.001232]

The second output indicates that I failed to express my predicament clearly, as the numbers are no longer in exponential format?

A brief re-cast: The original file has numbers of the form

  5 0.4577 -2.345e-02

Something reads the numbers and spits out new numbers, but in exactly the same format as the original file, e.g.,

  8 0.7728  1.232e-03

I.e., I can't write the last number out as 0.001232 -- it has to be in exponential format with the same field lengths.

Regards,
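To make the re-cast concrete, here is a rough sketch of rewriting one line in place while keeping every field's width and notation. It is not code from the thread: `FIELD`, `field_format` and `twiddle` are hypothetical names, and it assumes fields are right-justified, blank-separated numbers of the three kinds mentioned (integer, plain float, exponential).

```ruby
# Hypothetical sketch (not from the thread). A field is a run of leading
# blanks plus one number, so the blanks count towards the field width.
FIELD = /[ ]*[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?/

# Derive a sprintf format with the same width, precision and notation.
def field_format(field)
  width = field.length
  case field
  when /\.(\d*)([eE])/ then "%#{width}.#{$1.length}#{$2}"
  when /[eE]/          then "%#{width}.0#{field[/[eE]/]}"
  when /\.(\d*)\z/     then "%#{width}.#{$1.length}f"
  else                      "%#{width}d"
  end
end

# Replace every field with the block's value, written in the old format.
def twiddle(line)
  line.gsub(FIELD) { |field| field_format(field) % yield(field) }
end

new_values = [8, 0.7728, 1.232e-03].each
puts twiddle('5 0.4577 -2.345e-02') { new_values.next }
# prints "8 0.7728  1.232e-03"
```

The block receives the original field text, so it could just as well compute the new value from the old one (for example `field.to_f * 2`). Values that no longer fit the field simply overflow it in this sketch.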

Robert Klemme wrote:
>
> If there is a fixed number of formats you can probably use a cascade of
> RX matches.

Unfortunately not.

> Otherwise it probably becomes a bit more complex like
> matching sequences of digits and measuring their lengths.
>
> >> md = %r{^(\d+)\.(\d+)?$}.match('0.4577')
> => #<MatchData:0x7ef61250>
> >> pa="%#{md[0].size}.#{md[2].size}f"

Hmmm, this looks like a viable path. I hadn't thought of using MatchData groups, but as you say, it may get ugly fast...

I'm thinking of edge cases like dealing with the leading space if positive numbers become negative, or accommodating the number of digits needed for exponentials or integers if the new number exceeds the capacity of the existing format.

Thanks,
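One way to at least detect these edge cases is to compare the formatted length against the field width and refuse anything that overflows. This is a minimal sketch under that assumption; `format_in_field` is a hypothetical helper, not something from the thread.

```ruby
# Hypothetical guard (not from the thread): sprintf never truncates, so a
# value that needs more columns than the field has shows up as a longer
# string. Raise instead of silently shifting the columns that follow.
def format_in_field(format, value, width)
  text = format % value
  if text.length > width
    raise ArgumentError,
          "#{value} needs #{text.length} columns, field has #{width}"
  end
  text
end

puts format_in_field('%6.4f', 0.7728, 6)   # 0.7728

begin
  format_in_field('%6.4f', -0.7728, 6)     # the sign needs a seventh column
rescue ArgumentError => e
  puts e.message
end
```

Whether to raise, widen the field, or drop precision is a policy decision that depends on what the Fortran reader on the other end tolerates.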

Xavier Noria wrote:
>   %f with computed widths
> end

This, coupled with Robert's computed field lengths, is beginning to look tractable...

Thanks,

On 5/5/07, Rick DeNatale <rick.denatale@gmail.com> wrote:
> On 5/4/07, Bil Kleb <Bil.Kleb@nasa.gov> wrote:
> > Rick DeNatale wrote:
> > >
> > > How's this for a start?
> >
> > Excellent! Thanks.

By the way Bill, seeing who you seem to work for, I'd like to dedicate whatever help I've given to you to the memory of Wally Schirra!

Are you a turtle? <G>

--
Rick DeNatale

Visit the Project Mercury Wiki Site
http://www.mercuryspacecraft.com/

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Rick DeNatale wrote:
>
> By the way Bill, seeing who you seem to work for, I'd like to dedicate
> whatever help I've given to you to the memory of Wally Schirra!

You helped me learn more Ruby; always a pure joy. Thank you.

I've since decided that I'm going to require that users specify the format instead of trying to back it out -- there are cases for which you just can't back out the correct format. Besides, the need is infrequent, and I have no sympathy for code that employs formatted reads...

> Are you a turtle? <G>

You bet your sweet ass I am! ;)

Regards,

Xavier Noria wrote:
>
> Are there many different formats?

Yes, in that the field lengths are different.

No, in that there are really only three "types": integers, vanilla floats, and exponentials.

Regards,

Rick DeNatale wrote:
>
> How's this for a start?

Excellent! Thanks.

All but my last test passed:

  require 'test/unit'
  require 'number_format'

  class TestNumberFormat < Test::Unit::TestCase
    def test_some_floats
      assert_equal( '%3.1f',  '8.3'.to_number_format )
      assert_equal( '%05.3f', '0.500'.to_number_format )
      assert_equal( '%8.7f',  '.0001170'.to_number_format )
      assert_equal( '%7.1f',  '14000.0'.to_number_format )
      assert_equal( '%9.3E',  '4.480E+09'.to_number_format )
      assert_equal( '%6.1e',  '3.2e-5'.to_number_format )
      assert_equal( '%6.1f',  '-254.2'.to_number_format )
    end
  end

  1) Failure:
  test_some_floats(TestNumberFormat) [-:11]:
  <"%6.1f"> expected but was
  <"%5.1f">.

Note: made the simple float leading digit match 0 or more to get the third test to pass.

Puzzling the minus sign part now...

Thanks again,

On 5/4/07, Bil Kleb <Bil.Kleb@nasa.gov> wrote:
> class TestNumberFormat < Test::Unit::TestCase
>   def test_some_floats
>     assert_equal( '%3.1f', '8.3'.to_number_format )
>     assert_equal( '%05.3f', '0.500'.to_number_format )
>     assert_equal( '%8.7f', '.0001170'.to_number_format )

Not sure how this one worked, it fails for me. As a matter of fact:

  irb(main):001:0> '%8.7f' % 0.0001170
  => "0.0001170"

And I haven't been able to find an sprintf format string which suppresses a leading zero on a float.

> <"%5.1f">.
>
> Note: made the simple float leading digit match 0
> or more to get the third test to pass.
>
> Puzzling the minus sign part now...

I see that you figured this out.

Another thing to test is that the values actually round trip. Here's my test:

rick@frodo:/public/rubyscripts$ cat test_number_format.rb

  require 'test/unit'
  require 'number_format'

  class TestNumberFormat < Test::Unit::TestCase
    def test_some_floats
      assert_equal( '%3.1f', '8.3'.to_number_format )
      assert_nf('8.3')
      assert_equal( '%05.3f', '0.500'.to_number_format )
      assert_nf('0.500')
      assert_equal( '%8.7f', '.0001170'.to_number_format )
      assert_nf('.0001170')
      assert_equal( '%7.1f', '14000.0'.to_number_format )
      assert_nf('14000.0')
      assert_equal( '%9.3E', '4.480E+09'.to_number_format )
      assert_nf('4.480E+09')
      assert_equal( '%6.1e', '3.2e-5'.to_number_format )
      assert_nf('3.2e-5')
      assert_equal( '%6.1f', '-254.2'.to_number_format )
      assert_nf('-254.2')
    end

    private

    # Round-trip check: formatting the parsed value must reproduce the string.
    def assert_nf(str)
      assert_equal(str, str.to_number_format % eval(str))
    end
  end

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/