Regex woes with breaklines


#1

Hello, all. I’ve been banging my head against this for a little bit,
and I’m no closer to a solution. The issue is that I have a string
where I would like to gsub all line ends (‘\n’ or ‘\r\n’) into ‘<br />’,
while preserving the line ends. The catch is that I don’t want to gsub
anything that looks like ‘<br />\r\n’ or ‘<br />\n’, to prevent my gsub
from inserting breaklines on subsequent edits of this string. I’ve
tried gsub with the regex /(?!<br\s*/>)($)[^\z]/, but haven’t had any
luck getting it to match. For example, on the string
“test\r\ntest<br />\r\ntest”, it will match both of the ‘\r\n’ sets,
instead of only the first one (which would be optimal).

Thanks in advance for your time.


#2

On Fri, Apr 10, 2009 at 1:03 AM, Aa Wilson wrote:

Thanks in advance for your time.

Replace the <br /> along with the new line.
Also remember that a new line can be CR (classic Mac OS), LF (Linux and
other Unix-likes) or CRLF (Windows).

s = "test\r\ntest<br />\r\ntest\ntest\rtest<br />\ntest"
s.gsub(/(?:<br \/>)*(?:\r\n|\n|\r)/, "<br />\n")
#=> "test<br />\ntest<br />\ntest<br />\ntest<br />\ntest<br />\ntest"
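
The original concern was that repeated edits should not keep adding breaks. A minimal sketch of how that could be checked, assuming the pattern above with the slash escaped; the names `LINE_END` and `add_breaks` are made up for illustration:

```ruby
# Hypothetical constant and helper names. The pattern is the one from the
# post above, with the "/" in "<br />" escaped so the regex literal parses.
LINE_END = /(?:<br \/>)*(?:\r\n|\n|\r)/

def add_breaks(text)
  # Swallow any existing "<br />" together with the line end,
  # then write back exactly one "<br />" plus "\n".
  text.gsub(LINE_END, "<br />\n")
end

s = "test\r\ntest<br />\r\ntest\ntest\rtest<br />\ntest"
once  = add_breaks(s)
twice = add_breaks(once)

puts once == twice   # true: a second pass changes nothing
```

Note that this approach normalizes every line end to "\n" and collapses a run of several <br /> tags directly before a line end into a single one.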

Andrew T.


#3

Andrew T. wrote:

s = "test\r\ntest<br />\r\ntest\ntest\rtest<br />\ntest"
s.gsub(/(?:<br \/>)*(?:\r\n|\n|\r)/, "<br />\n")
#=> "test<br />\ntest<br />\ntest<br />\ntest<br />\ntest<br />\ntest"

Thanks a million. It feels like a waste replacing something that’s
already there, but I suppose that’s just my conceptions of real life
getting in the way of my abstractions.

I was actually hoping that ‘$’ (or possibly ‘$$?’) would cover all the
CR/LF/CRLF cases for me, was I mistaken about that?


#4

On 10.04.2009 14:52, Aa Wilson wrote:

Andrew T. wrote:

s = "test\r\ntest<br />\r\ntest\ntest\rtest<br />\ntest"
s.gsub(/(?:<br \/>)*(?:\r\n|\n|\r)/, "<br />\n")
#=> "test<br />\ntest<br />\ntest<br />\ntest<br />\ntest<br />\ntest"

Thanks a million. It feels like a waste replacing something that’s
already there, but I suppose that’s just my conceptions of real life
getting in the way of my abstractions.

Not necessarily: with 1.9’s regular expression engine there is negative
lookbehind as well. You could use that to prevent substitution of
newlines which are preceded by a <br /> already.

http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt
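
A rough sketch of what the lookbehind variant might look like on Ruby 1.9 or later. The names `UNBROKEN_LINE_END` and `mark_line_ends` are invented here, and the pattern only recognizes the literal spelling "<br />" (not "<br>" or "<br/>"), which is an assumption about the input:

```ruby
# Requires Ruby 1.9+ (the 1.8 regex engine has no lookbehind).
# Skip any line end that already follows a literal "<br />". The second
# lookbehind covers the "\n" half of "<br />\r\n", which would otherwise
# be matched on its own once the "\r" has been skipped.
UNBROKEN_LINE_END = %r{(?<!<br />)(?<!<br />\r)(?:\r\n?|\n)}

# Hypothetical helper: prefix each unmarked line end with "<br />",
# leaving the original line-end characters untouched.
def mark_line_ends(text)
  text.gsub(UNBROKEN_LINE_END) { |line_end| "<br />#{line_end}" }
end

s = "test\r\ntest<br />\r\ntest\ntest\rtest<br />\ntest"
p mark_line_ends(s)
#=> "test<br />\r\ntest<br />\r\ntest<br />\ntest<br />\rtest<br />\ntest"
```

Unlike the replace-everything approach, this keeps each CR, LF or CRLF exactly as it was and only touches line ends that have no <br /> in front of them yet, so running it again on its own output should be a no-op.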

I was actually hoping that ‘$’ (or possibly ‘$$?’) would cover all the
CR/LF/CRLF cases for me, was I mistaken about that?

I believe it does not cover \r alone. In every case it is safer to be
explicit IMHO. Btw, I believe you can simplify to (?:\r\n?|\n).
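
A few quick checks that illustrate both points, as far as I understand Ruby's behaviour: `$` is a zero-width anchor tied to "\n" (or the end of the string), and the shorter pattern finds the same line ends as the longer one. The sample strings are made up:

```ruby
# `$` matches the position just before "\n" (or at the end of the string)
# and consumes no characters, so it cannot replace the line end itself.
p("a\nb" =~ /a$/)     # 0:   "a" sits right before a "\n"
p("a\rb" =~ /a$/)     # nil: a lone "\r" is not treated as a line end
p("a\r\nb" =~ /a$/)   # nil: the "\r" sits between "a" and the line end

# The two explicit patterns pick out the same line ends.
sample = "one\r\ntwo\nthree\rfour"
p sample.scan(/\r\n|\n|\r/)   # ["\r\n", "\n", "\r"]
p sample.scan(/\r\n?|\n/)     # ["\r\n", "\n", "\r"]
```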

Kind regards

robert