I have a problem with a regexp. I have a document like this:
A method for detecting a post-translationally modified protein with a
glycosyl group comprising contacting the protein with a glycosyl
transferase enzyme and a labeling agent, wherein the labeling agent
comprises a chemical handle and a transferable glycosyl group.
I want to split it into strings, following the rule that each string starts
with "a", "an", or "the", like this:
"A method for detecting "
"a post-translationally modified protein with "
"a glycosyl group comprising contacting "
"the protein with "
"a glycosyl transferase enzyme and "
"a labeling agent, wherein "
"the labeling agent comprises "
"a chemical handle and "
"a transferable glycosyl group."
I use this code:
while element.size do
  if element =~ /([Aa]|[Aa]n|[Tt]he)( [^ ]+)(?:[Aa]|[Aa]n|[Tt]he)?/
    temp_string = $1 + $2
    temp_array << temp_string
But it doesn't work. The result is:
A method a post-translationally a glycosyl the protein a glycosyl a labeling the labeling a chemical a transferable
Can anyone help me with the code?
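
One possible approach, offered as a sketch rather than a definitive answer: the group ( [^ ]+) in the pattern above captures only a single word after the article, which would explain why each result stops after two words. Instead of capturing, the text could be split at every position that comes just before a whole-word article, using a zero-width lookahead. The names text, ARTICLE_BOUNDARY and pieces are made up for this illustration, and the sample text is assumed to be joined into one line first.

```ruby
# Sample text from the question, joined into a single line (an assumption;
# the original document is wrapped across several lines).
text = "A method for detecting a post-translationally modified protein with a " \
       "glycosyl group comprising contacting the protein with a glycosyl " \
       "transferase enzyme and a labeling agent, wherein the labeling agent " \
       "comprises a chemical handle and a transferable glycosyl group."

# Zero-width lookahead: matches the position just before a whole-word
# "a", "an" or "the" (case-insensitive) without consuming any characters.
ARTICLE_BOUNDARY = /(?=\b(?:an?|the)\b)/i

# Splitting on a zero-width match keeps each article at the start of its piece.
pieces = text.split(ARTICLE_BOUNDARY)

# Print each piece in quotes so the trailing spaces are visible.
pieces.each { |piece| puts piece.inspect }
```

For the sample text this should print the nine strings listed above, each keeping its trailing space. One caveat: \b also treats a hyphen as a word boundary, so an article-like fragment inside a hyphenated word (for example the "a" in "a-priori") would start a new piece as well.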