Hi,
How do I split the string below into words? A word can be either a
consecutive run of non-whitespace characters or anything within " ".
'hi hello "hello world" hey yo'
should return
[hi, hello, hello world, hey, yo]
I tried to somehow do a collect, but I'm not sure if there is a way to
retain a variable between two invocations and then concatenate them and
return them as one string…
Of course, if there is a smart way to do it in one shot using a regex,
then I can do a scan on the string.
But this returns “hello world” as two entries, not one as required.
The “should return” clause is not well-formed anyway…
On the (usually misappropriated, but hopefully not here) Occam’s Razor
principle[1], I would refrain from positing that there’s actually
supposed to be a comma between the second “hello” and “world”, or that
the quotation marks that were removed to illustrate the results are
actually supposed to be reinstated as literals. We can wait for a
ruling from Vivek, though; he's now got just about every permutation
to choose from (including shellwords, thanks to Harry, and that of
course is the best). Or at least, if Occam is right, then Harry is
right.
Thanks for the replies…Indeed I don’t want the quotes to be a part
of the string
This one suggested above by works for me
I presume that should capture pretty much any kind of combination,
and I don't have the case where there are nested " so that looks good
(unless someone can think of a case that breaks it).
Thanks so much… I had hit a dead end trying to do this!
Don’t forget the shellwords library though – a very convenient way to
do this.
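For reference, a minimal sketch of the shellwords approach mentioned above. `Shellwords.split` is part of Ruby's standard library; the wrapper name `split_with_shellwords` is just an illustrative helper, not something from this thread.

```ruby
require 'shellwords'

# Hypothetical helper: splits on whitespace, treating "..." as one word
# and dropping the surrounding quotes, following shell quoting rules.
def split_with_shellwords(str)
  Shellwords.split(str)
end

p split_with_shellwords('hi hello "hello world" hey yo')
# => ["hi", "hello", "hello world", "hey", "yo"]
```

Keep in mind that shellwords follows shell rules, so single quotes and backslashes are also treated specially, and an unmatched quote raises an ArgumentError.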
On Sun 13 Jul 2008 11:06:23, David A. Black wrote:
illustrate the results are actually supposed to be reinstated as
I suspect I copied the wrong line from my transcript!
But…
The "'s are returned as part of the string ‘“hello world”’. Also, you
get the wrong result if you have two quoted strings in a row, because
of the greediness:
str = 'hi hello "hello world" "hey yo"'
p str.scan(/(?:".*")|(?:\w+)/)
=> ["hi", "hello", "\"hello world\" \"hey yo\""] # bad
=> ["hi", "hello", "\"hello world\"", "\"hey yo\""] # good!
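The pattern that produced the "good" result is not shown here. One way to get it, and this is an assumption rather than the poster's exact code, is to make the quantifier non-greedy (`.*?`), so each quoted string stops at the nearest closing quote. The method names below are illustrative only.

```ruby
# Greedy: ".*" runs from the first quote to the LAST quote in the string.
def scan_greedy(str)
  str.scan(/(?:".*")|(?:\w+)/)
end

# Non-greedy: ".*?" stops at the nearest closing quote.
def scan_lazy(str)
  str.scan(/(?:".*?")|(?:\w+)/)
end

str = 'hi hello "hello world" "hey yo"'
p scan_greedy(str)  # the two quoted strings are merged into one entry
p scan_lazy(str)    # each quoted string is its own entry, quotes included
```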
I don't think the OP wanted the literal quotation marks as part of the
results, though. In other words, you'd want the third string to be
hello world, without the surrounding quotation marks.
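If the quotes should be dropped using a regex alone, one possible sketch (an illustration, not a pattern posted in this thread) is to capture the text inside the quotes in one group and bare words in another, then keep whichever group matched. `split_words` is a hypothetical name.

```ruby
# Group 1 captures the inside of "..." (no nested quotes supported);
# group 2 captures a run of non-whitespace characters.
# scan returns [quoted, bare] pairs with exactly one side non-nil.
def split_words(str)
  str.scan(/"([^"]*)"|(\S+)/).map { |quoted, bare| quoted || bare }
end

p split_words('hi hello "hello world" hey yo')
# => ["hi", "hello", "hello world", "hey", "yo"]
```

Using `[^"]*` instead of `.*` also avoids the greediness problem, since the match cannot run past a closing quote.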