Problem removing new line characters on Mac OS X

Hi, I’m pretty new to Ruby. I’ve got a text file where I need to
remove some new line characters. I’ve tried everything I can think of
to do this with no success, including:

line.gsub!("/r","")
line.gsub!("/n","")
line=line.chomp

I can’t seem to get the new line character to be recognised and dealt
with. Any advice appreciated.

Thanks

On 5/17/07, Singeo [email protected] wrote:

Hi, I’m pretty new to Ruby. I’ve got a text file where I need to
remove some new line characters. I’ve tried everything I can think of
to do this with no success, including:

line.gsub!("/r","")
            ~~ \r
line.gsub!("/n","")
            ~~ \n

line=line.chomp

I can’t seem to get the new line character to be recognised and dealt
with. Any advice appreciated.

Thanks

try it

On 5/17/07, Singeo [email protected] wrote:

Thanks

line.chomp! doesn’t work?
Would you show some code?

Harry

A Look into Japanese Ruby List in English
http://www.kakueki.com/

Singeo wrote:

Thanks

It looks like you should be using backslashes. If you want to match both
newlines and carriage returns, you can use:

line.gsub!(/[\n\r]/, "")
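
For anyone following along, here is a minimal sketch of the difference between the forward-slash strings from the original post and the character class above. The sample string and the two helper names are made up for illustration.

```ruby
# Hypothetical sample line ending in a carriage return plus a newline.
SAMPLE = "PSI reading\r\n"

# "/r" and "/n" are ordinary two-character strings (slash + letter),
# so nothing in the sample matches them.
def strip_with_forward_slashes(text)
  text.gsub("/r", "").gsub("/n", "")
end

# The character class matches either a newline or a carriage return.
def strip_line_endings(text)
  text.gsub(/[\n\r]/, "")
end

p strip_with_forward_slashes(SAMPLE)  # string comes back unchanged
p strip_line_endings(SAMPLE)          # both line-ending characters removed
```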

-Dan

Singeo wrote:

Hi, I’m pretty new to Ruby. I’ve got a text file where I need to
remove some new line characters. I’ve tried everything I can think of
to do this with no success, including:

line.gsub!("/r","")
line.gsub!("/n","")
line=line.chomp

In case your problem is just your Ruby syntax:

  1. Replace the forward slashes (like in "/r") by
    backward slashes ("\r") in your above mentioned
    solution.

  2. Make the first parameter to the gsub! method
    a Regexp instead of a string. The API docs say:
    “… if it is a String then no regular expression
    metacharacters will be interpreted …”.

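This is why "/r" (1) will not work. Note that a
double-quoted "\r" is already a carriage return and does
match as a plain string; a Regexp (2) is still handy for
matching "\r" and "\n" in one go.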
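
A small sketch of what the quoted documentation means in practice, using made-up sample strings and helper names. As far as I can tell, the escape "\r" is resolved by the double-quoted string literal itself, before gsub ever sees it, so it is not affected by the "no metacharacters" rule.

```ruby
# With a String pattern, "." is a literal dot.
def remove_with_string(text)
  text.gsub(".", "")
end

# With a Regexp pattern, "." is a metacharacter: any character except newline.
def remove_with_regexp(text)
  text.gsub(/./, "")
end

p remove_with_string("a.b")   # => "ab"
p remove_with_regexp("a.b")   # => ""

# "\r" in double quotes is one character (the carriage return itself),
# so a String pattern still finds it.
p "one\r".gsub("\r", "")      # => "one"

# In single quotes the backslash stays literal: two characters, not one.
p '\r'.length                 # => 2
p "\r".length                 # => 1
```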

If chomp does not work, you may be using a Mac
file under Linux or Windows. In that case you may
want to try something like

line.gsub!(/\015/, '')
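
In case the \015 looks cryptic: it is the octal escape for the carriage return, the line terminator used by classic (pre-OS X) Mac files, whereas Mac OS X itself uses "\n" like other Unix systems. A quick sketch, with a made-up sample string and helper name:

```ruby
CARRIAGE_RETURN = "\r"

puts CARRIAGE_RETURN.ord          # 13 in decimal
puts CARRIAGE_RETURN.ord.to_s(8)  # 15 in octal, hence \015
puts "\015" == "\r"               # true: two spellings of the same character

# Remove every carriage return from a string.
def strip_carriage_returns(text)
  text.gsub(/\015/, "")
end

# Hypothetical text that uses a bare CR as its line terminator.
p strip_carriage_returns("old Mac line\rnext line\r")
```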

Hermann

Apologies for the mis-understanding, I have been using backslashes.
Here’s my code, as you’ll see from the resulting file there are a
bunch of new line characters in the file I’d like to get rid of.
Thanks for the help so far.

require("rubygems")
require("scrubyt")
require("open-uri")
require 'time'
require 'date'

psi = Scrubyt::Extractor.define do
  fetch("http://app.nea.gov.sg/psi/")

  record("/html/body/div/table/tr/td/table/tbody/tr/td/div",
         { :generalize => true }) do
    title("/strong[1]/font[1]")
    item("/table/tbody/tr/td/table/tbody/tr", { :generalize => true }) do
      region("/td[1]")
      psi("/td[7]")
      aqd("/td[8]")
    end
  end
end

f = open("psiregions.xml", File::CREAT|File::TRUNC|File::RDWR) {|f|
  psi.to_xml.write(f, 1)
}

# Create the RSS file.

rssfile = File.new("sgpsi.xml", "w")
rssfile.puts('<?xml version="1.0" encoding="UTF-8"?>')
rssfile.puts('')
rssfile.puts(' ')
rssfile.puts(' http://app.nea.gov.sg/psi/')
rssfile.puts(' Singapore PSI Readings')
#rssfile.puts(' Singapore PSI Readings' + Time.now.rfc2822 + '')
rssfile.puts(' <lastBuildDate>' + Time.now.rfc2822 + '</lastBuildDate>')
rssfile.puts(' [email protected]')

File.open('psiregions.xml', 'r') do |f1|
  while line = f1.gets
    line=line.strip
    line=line.chomp
    line.gsub!(/[\n]/, "")
    line.gsub!(/<root>/, "")
    line.gsub!(/<\/root>/, "")
    line.gsub!(/<record>/, "")
    line.gsub!(/<\/record>/, "")
    line.gsub!("24-hr", "Singapore 24-hr")
    line.gsub!("Region", "")
    line.gsub!("Sulphur Dioxide", "")
    line.gsub!(/<region>/, "")
    line.gsub!(/<\/region>/, ":")
    line.gsub!(/<psi>/, " PSI Level ")
    line.gsub!(/<\/psi>/, "")
    line.gsub!(/<aqd>/, " - ")
    line.gsub!(/<\/aqd>/, "")
    line.gsub!(//, "" + Time.now.rfc2822 + "</pubDate>")
    rssfile.puts line
  end
end

rssfile.puts('')
rssfile.puts('')
rssfile.close

Singeo wrote:

Here’s my code, as you’ll see from the resulting file there are a
bunch of new line characters in the file I’d like to get rid of.
[…]
line=line.chomp
[…]
rssfile.puts line

puts adds a newline to the end of the string it writes (unless the
string already ends in one). If you don’t want that behaviour (which
you obviously don’t), use print instead.
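
A minimal sketch of the difference, writing to an in-memory StringIO instead of the real RSS file so the result can be inspected. The helper name and the sample reading are made up.

```ruby
require "stringio"

# Write one line with the given IO method (:puts or :print)
# and return exactly what ended up in the output.
def written_by(method_name, line)
  out = StringIO.new
  out.public_send(method_name, line)
  out.string
end

p written_by(:puts,  "Central: 45")    # => "Central: 45\n"  (newline added)
p written_by(:print, "Central: 45")    # => "Central: 45"    (written as-is)
p written_by(:puts,  "Central: 45\n")  # => "Central: 45\n"  (no second newline)
```

So a line that has been stripped or chomped gets its newline straight back when it is written with puts.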

Singeo wrote:

Hi Hermann, just tried your suggestion of:

line.gsub!(/\015/, '')

still no success. I’m creating and running the file on a Mac.

Are you sure that it is not successful?

It would be good to know how you read the lines,
how you (not) remove the carriage returns,
and how you perhaps put the lines together
(adding again \r characters by mistake?).

Are you removing the carriage returns line
by line (in which case the chomp should be perfect)
or are you trying it as a whole, i.e. do you have
not only one line but a whole file in ‘line’?
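
To illustrate why the distinction matters, here is a rough sketch with made-up file contents: chomp only touches the end of the string, so it behaves very differently on a single line and on a whole file held in one string.

```ruby
# Hypothetical file contents held in a single string.
WHOLE_FILE = "north\r\nsouth\r\neast\r\n"

# chomp removes one trailing line ending only; inner ones survive.
p WHOLE_FILE.chomp           # => "north\r\nsouth\r\neast"

# Line by line, chomp is enough for each individual line.
def chomp_each_line(text)
  text.each_line.map(&:chomp)
end
p chomp_each_line(WHOLE_FILE)  # => ["north", "south", "east"]

# On the whole string, delete removes every "\r" and "\n" character.
p WHOLE_FILE.delete("\r\n")    # => "northsoutheast"
```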

Rather than an answer to these questions I would
prefer to see some more code of the whole part
from opening the file to writing back or putting
out the strings.

Hermann

Hi Hermann, just tried your suggestion of:

line.gsub!(/\015/, '')

still no success. I’m creating and running the file on a Mac.

Hermann, I followed Sebastian’s advice to use print instead of puts and
that solved my problem. But I would still like to understand how to
remove the newline characters. Here’s my code as it currently stands
(with “rssfile.print line” in place of “rssfile.puts line”), hopefully
it will help you see how I was trying to tackle the problem.

require("rubygems")
require("scrubyt")
require("open-uri")
require 'time'
require 'date'

psi = Scrubyt::Extractor.define do
  fetch("http://app.nea.gov.sg/psi/")

  record("/html/body/div/table/tr/td/table/tbody/tr/td/div",
         { :generalize => true }) do
    title("/strong[1]/font[1]")
    item("/table/tbody/tr/td/table/tbody/tr", { :generalize => true }) do
      region("/td[1]")
      psi("/td[7]")
      aqd("/td[8]")
    end
  end
end

f = open("psiregions.xml", File::CREAT|File::TRUNC|File::RDWR) {|f|
  psi.to_xml.write(f, 1)
}

# Create the RSS file.

rssfile = File.new("sgpsi.xml", "w")
rssfile.puts('<?xml version="1.0" encoding="UTF-8"?>')
rssfile.puts('')
rssfile.puts(' ')
rssfile.puts(' http://app.nea.gov.sg/psi/')
rssfile.puts(' Singapore PSI Readings')
#rssfile.puts(' Singapore PSI Readings' + Time.now.rfc2822 + '')
rssfile.puts(' <lastBuildDate>' + Time.now.rfc2822 + '</lastBuildDate>')
rssfile.puts(' [email protected]')

File.open('psiregions.xml', 'r') do |f1|
  while line = f1.gets
    line=line.strip
    line.gsub!(/<root>/, "")
    line.gsub!(/<\/root>/, "")
    line.gsub!(/<record>/, "")
    line.gsub!(/<\/record>/, "")
    line.gsub!("24-hr", "Singapore 24-hr")
    line.gsub!("Region", "")
    line.gsub!("Sulphur Dioxide", "")
    line.gsub!(/<region>/, "")
    line.gsub!(/<\/region>/, ":")
    line.gsub!(/<psi>/, " PSI Level ")
    line.gsub!(/<\/psi>/, "")
    line.gsub!(/<aqd>/, " - ")
    line.gsub!(/<\/aqd>/, "")
    line.gsub!(//, "" + Time.now.rfc2822 + "</pubDate>")
    line.gsub!(/<\/item>/, "\n")
    rssfile.print line
  end
end

rssfile.puts('')
rssfile.puts('')
rssfile.puts('')

rssfile.close

On May 17, 6:20 pm, Hermann M. [email protected]

On 5/17/07, Singeo [email protected] wrote:

Hermann, I followed Sebastian’s advice

It’s good to know that nobody reads your posts ;), seriously.

Is all you can see from my post

?

Strange

Probably I did something stupid, but that is not important as somebody
else came up with it too :)

Robert

Hi Singeo (btw: is that your first name?),

Singeo wrote:

… I followed Sebastian’s advice to use print instead of puts and
that solved my problem.

Running out of time now, but I see that Sebastian
has given the correct answer already. I was on the
same track, which is why I was asking for the code
to see how you read and write the lines.

But I would still like to understand how to
remove the newline characters. Here’s my code as it currently stands
(with “rssfile.print line” in place of “rssfile.puts line”), hopefully
it will help you see how I was trying to tackle the problem.

You may have done well removing the newlines (to be
more precise: the carriage returns or "\r" characters),
but then when writing the lines with "rssfile.puts line"
you add them back again.

Hermann

On 17.05.2007 12:33, Singeo wrote:

Hermann, I followed Sebastian’s advice to use print instead of puts and
that solved my problem. But I would still like to understand how to
remove the newline characters. Here’s my code as it currently stands
(with “rssfile.print line” in place of “rssfile.puts line”), hopefully
it will help you see how I was trying to tackle the problem.

Maybe I’m being thick here but I don’t understand you: you have a
working solution and an explanation why the NL weren’t “removed”
(actually they were removed and then reinserted by using “puts”). Now
what is it that you want to understand about this?

Another remark: since you seem to be dealing with XML files, why don’t
you use an XML tool, such as REXML? That’s certainly less error-prone
when manipulating XML files.
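
A rough sketch of what the REXML route could look like. The element names (root, record, item, region, psi, aqd) are taken from the substitutions in the posted code, but the exact shape of psiregions.xml is an assumption, and the sample values and the readings helper are made up for illustration.

```ruby
require "rexml/document"

# Hypothetical sample; the real psiregions.xml may be structured differently.
SAMPLE_XML = <<~XML
  <root>
    <record>
      <item>
        <region>North</region>
        <psi>45</psi>
        <aqd>Good</aqd>
      </item>
      <item>
        <region>South</region>
        <psi>52</psi>
        <aqd>Moderate</aqd>
      </item>
    </record>
  </root>
XML

# Build one text line per <item>, mirroring the "region: PSI Level n - aqd"
# layout that the gsub! calls in the thread produce.
def readings(xml)
  doc = REXML::Document.new(xml)
  result = []
  doc.elements.each("root/record/item") do |item|
    region = item.elements["region"].text.strip
    psi    = item.elements["psi"].text.strip
    aqd    = item.elements["aqd"].text.strip
    result << "#{region}: PSI Level #{psi} - #{aqd}"
  end
  result
end

puts readings(SAMPLE_XML)
```

Because the values are read from the parsed elements, there are no tags to strip and no stray line breaks to chase.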

Regards

robert

Robert, thanks for trying, but all I could see was … as you
say, strange…

On 5/17/07, Singeo [email protected] wrote:

Robert, thanks for trying, but all I could see was … as you
say, strange…

Thx for confirming which means I just did something stupid as this
post got through.
R.

Singeo wrote:

I have a
solution to my problem but wanted to understand more about why my
previous approach failed (for future reference)

Your previous approach didn’t fail as such. You did remove the newline
as you wanted to. You just added it again afterwards by using puts
instead of print. puts adds a newline, that’s just what it does (and
what it says it does in the docs). That’s why it didn’t work. There’s
nothing more to it.

Robert, I’m pretty new to Ruby so ignore my ignorance… I have a
solution to my problem but wanted to understand more about why my
previous approach failed (for future reference). In particular, how do
I do a simple string substitution of a newline character on Mac OS X?
Nothing I’ve tried or that has been suggested has worked.

I’ll certainly look at REXML, thanks for the help.
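
For future reference, a minimal sketch of the plain substitution asked about here, using a made-up sample line and helper name. It should behave the same on Mac OS X as on any other platform.

```ruby
# Return a copy of the text with every newline character removed.
def remove_newlines(text)
  text.gsub("\n", "")
end

p remove_newlines("Central: 45\n")   # => "Central: 45"
p remove_newlines("one\ntwo\n")      # => "onetwo"

# gsub! edits the string in place, but returns nil when nothing matched,
# so avoid writing `line = line.gsub!(...)`.
already_clean = "Central: 45".dup
p already_clean.gsub!("\n", "")      # => nil
p already_clean                      # => "Central: 45"
```

The substitution itself works; the newline only reappears if the cleaned line is then written with puts rather than print.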