Strange result using String#split

I am trying to split a string on all occurrences of ' AND ' except
where it appears within quotes. I have a regular expression which
works but generates strange output in a certain case:

"text_search '(large red spear OR axe) AND wood' AND material 1".split(
  /(?: AND )?(\S+ '.+')(?: AND )?|(?: AND )/ )

gives:

["", "text_search '(large red spear OR axe) AND wood'", "material 1"]

I don’t understand why my regular expression is producing the blank
entry at the beginning of the array. Can anyone lend some insight?

Thanks,
Ryan Wallace

Ryan Wallace wrote:

> "text_search '(large red spear OR axe) AND wood' AND material 1".split(
>   /(?: AND )?(\S+ '.+')(?: AND )?|(?: AND )/ )
>
> gives:
>
> ["", "text_search '(large red spear OR axe) AND wood'", "material 1"]
>
> I don't understand why my regular expression is producing the blank
> entry at the beginning of the array. Can anyone lend some insight?

"text_search '(large red spear OR axe) AND wood'" is what the capture
group in your regex matches. "" is what comes before the match and
"material 1" is what comes after it. If a split regex matches at the
beginning of a string, the first item in the returned array will be "".
When the regex contains a capture group, the captured text is included
in the result as well. Compare:

"a1b".split(/1/)
=> ["a", "b"]

"1b".split(/1/)
=> ["", "b"]

"1b".split(/(1)/)
=> ["", "1", "b"]
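If the leading "" is simply unwanted, one possible workaround (a sketch, not the only option) is to filter empty strings out of the result. The variable names below are made up for illustration; the string and regex are the ones from the question.

```ruby
# The string and regex from the original question.
query = "text_search '(large red spear OR axe) AND wood' AND material 1"
pattern = /(?: AND )?(\S+ '.+')(?: AND )?|(?: AND )/

# The match starts at index 0, so split reports the empty text before it.
parts = query.split(pattern)
p parts
# => ["", "text_search '(large red spear OR axe) AND wood'", "material 1"]

# Dropping empty strings keeps only the meaningful pieces.
cleaned = parts.reject(&:empty?)
p cleaned
# => ["text_search '(large red spear OR axe) AND wood'", "material 1"]
```

Note that reject(&:empty?) also removes any empty strings that appear in the middle of the array, which is probably what you want here.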

I also notice that you have a greedy quantifier (.+) between the two '
characters in the regex. This will likely cause unwanted results when
you have more than one quoted section in your string, because .+ will
run from the first ' to the last ' in the string.
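To illustrate the greedy quantifier problem, here is a small sketch using a made-up input with two quoted sections. Swapping .+ for the lazy .+? is one possible fix, assuming the quoted text never contains a ' itself; I have not tried it against other kinds of input.

```ruby
# Hypothetical input with two quoted sections (not from the original post).
query = "name 'a AND b' AND color 'red AND blue' AND size 1"

# Original pattern: .+ is greedy and runs to the last ' in the string.
greedy = /(?: AND )?(\S+ '.+')(?: AND )?|(?: AND )/
# Variant: .+? is lazy and stops at the first closing '.
lazy = /(?: AND )?(\S+ '.+?')(?: AND )?|(?: AND )/

greedy_parts = query.split(greedy).reject(&:empty?)
p greedy_parts
# => ["name 'a AND b' AND color 'red AND blue'", "size 1"]

lazy_parts = query.split(lazy).reject(&:empty?)
p lazy_parts
# => ["name 'a AND b'", "color 'red AND blue'", "size 1"]
```

The reject(&:empty?) call matters in the lazy case too: the two quoted terms are matched back to back, so split reports an empty string between them.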

HTH,
Sebastian