Forum: Ruby regex woes with breaklines

Aa Wilson (aawilson)
on 2009-04-10 01:02
Hello, all.  I've been banging my head against this for a while, and
I'm no closer to a solution.  I have a string in which I would like to
gsub a '<br />' in front of every line end ('\n' or '\r\n'), while
preserving the line ends themselves.  The catch is that I don't want to
touch anything that already looks like '<br />\r\n' or '<br />\n', so
that my gsub doesn't insert extra line breaks on subsequent edits of the
string.  I've tried gsub with the regex /(?!<br\s*\/>)($)[^\z]/, but
haven't had any luck getting it to match correctly.  For example, on the
string "test\r\ntest<br>\r\ntest", it matches both of the '\r\n'
sequences instead of only the first one (which would be ideal).

Thanks in advance for your time.
Andrew Timberlake (andrewtimberlake)
on 2009-04-10 09:06
(Received via mailing list)
On Fri, Apr 10, 2009 at 1:03 AM, Aa Wilson <aawilso1@vt.edu> wrote:
>
> Thanks in advance for your time.

Replace the <br> along with the newline.
Also remember that a newline can be CR (classic Mac OS), LF (Linux/Unix)
or CRLF (Windows).

s = "test\r\ntest<br>\r\ntest\ntest\rtest<br/>\ntest"
s.gsub(/(?:<br *\/*>)*(?:\r\n|\n|\r)/, "<br />\n")
#=> "test<br />\ntest<br />\ntest<br />\ntest<br />\ntest<br />\ntest"
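If the original line endings have to survive (the question asked for that), one possible variation is to capture the line ending and write it back after the '<br />'.  This is only a sketch: the method name add_breaks_keeping_eol is made up for illustration, and the regex is the one above with a capturing group added.  Note that a run such as '<br><br>\n' would be collapsed to a single '<br />'.

```ruby
# Hypothetical helper, not from the thread: same pattern as above, but the
# line ending is captured so it can be re-emitted unchanged.
def add_breaks_keeping_eol(text)
  text.gsub(/(?:<br *\/*>)*(\r\n|\n|\r)/) { "<br />#{$1}" }
end

s     = "test\r\ntest<br>\r\ntest"
once  = add_breaks_keeping_eol(s)
twice = add_breaks_keeping_eol(once)

p once           # every line end now has exactly one <br /> in front of it
p once == twice  # a second pass should not change the string
```

Because an existing break tag is swallowed and rewritten together with its line ending, running the method again on its own output should leave the string as it is.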

Andrew Timberlake
http://ramblingsonrails.com
http://www.linkedin.com/in/andrewtimberlake

"I have never let my schooling interfere with my education" - Mark Twain
Aa Wilson (aawilson)
on 2009-04-10 14:51
Andrew Timberlake wrote:

> s = "test\r\ntest<br>\r\ntest\ntest\rtest<br/>\ntest"
> s.gsub(/(?:<br *\/*>)*(?:\r\n|\n|\r)/, "<br />\n")
> #=> "test<br />\ntest<br />\ntest<br />\ntest<br />\ntest<br />\ntest"

Thanks a million.  It feels like a waste replacing something that's
already there, but I suppose that's just my conceptions of real life
getting in the way of my abstractions.

I was actually hoping that '$' (or possibly '$$?') would cover all the
CR/LF/CRLF cases for me.  Was I mistaken about that?
Robert Klemme (Guest)
on 2009-04-10 23:47
(Received via mailing list)
On 10.04.2009 14:52, Aa Wilson wrote:
> Andrew Timberlake wrote:
>
>> s = "test\r\ntest<br>\r\ntest\ntest\rtest<br/>\ntest"
>> s.gsub(/(?:<br *\/*>)*(?:\r\n|\n|\r)/, "<br />\n")
>> #=> "test<br />\ntest<br />\ntest<br />\ntest<br />\ntest<br />\ntest"
>
> Thanks a million.  It feels like a waste replacing something that's
> already there, but I suppose that's just my conceptions of real life
> getting in the way of my abstractions.

Not necessarily: Ruby 1.9's regular expression engine supports negative
lookbehind as well.  You could use that to prevent substitution of
newlines which are already preceded by a <br/>.

http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt
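A minimal sketch of the lookbehind idea, assuming Ruby 1.9 or later and assuming the only tag spellings to skip are '<br>', '<br/>' and '<br />'.  The constant and method names are invented for this example.  The lookbehind lists fixed strings because the engine wants fixed-width lookbehind (alternatives of different lengths are allowed at the top level), and the inner (?<!\r) is there so that the '\n' of a skipped '\r\n' is not matched on its own.

```ruby
# Hypothetical names; matches a line ending only when no <br> variant
# stands directly in front of it.
EOL_WITHOUT_BR = %r{(?<!<br>|<br/>|<br />)(\r\n|\r|(?<!\r)\n)}

def add_missing_breaks(text)
  # Existing tags are left untouched; only bare line endings get a <br />.
  text.gsub(EOL_WITHOUT_BR) { "<br />#{$1}" }
end

s = "test\r\ntest<br>\r\ntest"
p add_missing_breaks(s)
p add_missing_breaks(add_missing_breaks(s)) == add_missing_breaks(s)
```

Unlike the replace-everything approach, this leaves an existing '<br>' exactly as it was written, which is closer to "don't replace what's already there".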

> I was actually hoping that '$' (or possibly '$$?') would cover all the
> CR/LF/CRLF cases for me.  Was I mistaken about that?

I believe it does not cover \r alone.  In any case it is safer to be
explicit IMHO.  Btw, I believe you can simplify the line ending
alternation to (?:\r\n?|\n).
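A quick way to check both points in irb; this is an illustrative sketch and the method name dollar_index is made up.  '$' is zero-width and, as far as Ruby's engine is concerned, only anchors before a '\n' or at the end of the string, so it never consumes the line ending and does not see a lone '\r'.

```ruby
# Hypothetical helper: index of the first position where $ matches.
def dollar_index(str)
  str.index(/$/)
end

p dollar_index("a\nb")    #=> 1, just before the "\n"
p dollar_index("a\r\nb")  #=> 2, between "\r" and "\n"
p dollar_index("a\rb")    #=> 3, end of string; the lone "\r" is ignored

# The long and the simplified alternation split mixed input the same way.
mixed = "a\r\nb\rc\nd"
p mixed.split(/\r\n|\n|\r/) == mixed.split(/\r\n?|\n/)  #=> true
```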

Kind regards

  robert