Help with regexp in matcher


#1

This is giving me an error:

When /?:(log|sign)?:(i|o)n success message/ do
Then "welcome message"
end

To the effect that:

/usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require':
./features/components/login/step_definitions/login_steps.rb:21: invalid regular expression; there's no previous pattern, to which '?' would define cardinality at 1: /?:(am|is) not ?:(logg|sign)ed ?:(i|o)n/ (SyntaxError)
./features/components/login/step_definitions/login_steps.rb:25: invalid regular expression; there's no previous pattern, to which '?' would define cardinality at 1: /?:(log|sign) ??:(i|o)n request message/
./features/components/login/step_definitions/login_steps.rb:41: invalid regular expression; there's no previous pattern, to which '?' would define cardinality at 1: /?:(log|sign)?:(i|o)n success message/
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'

I would appreciate it very much if someone could tell me what I am doing
wrong here.


#2

This is giving me an error:

When /?:(log|sign)?:(i|o)n success message/ do
Then "welcome message"
end

I could be wrong, but I believe you’re looking for this instead?
When /(?:log|sign)(?:i|o)n success message/ do

hope that helps,
timg


#3

On 1/12/09 2:50 PM, James B. wrote:

This is giving me an error:

When /?:(log|sign)?:(i|o)n success message/ do
Then "welcome message"
end

The ?: needs to be inside your group if you don’t want to capture it…
so:

When /(?:log|sign)(?:i|o)n success message/ do
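
A minimal sketch of the difference between the two kinds of group, written in plain Ruby so that it runs without Cucumber. The variable names are illustrative only. In a step definition each capturing group becomes an argument passed to the step's block, which is why the non-capturing form is the one wanted here.

```ruby
# Capturing groups: each (...) stores the text it matched.
capturing     = /(log|sign)(i|o)n success message/
# Non-capturing groups: (?:...) groups the alternatives without storing them.
non_capturing = /(?:log|sign)(?:i|o)n success message/

text = "login success message"

capturing_result     = capturing.match(text).captures
non_capturing_result = non_capturing.match(text).captures

p capturing_result      # the two captured fragments
p non_capturing_result  # an empty list: nothing was captured
```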

FWIW, I have found http://www.rubular.com an excellent resource when
creating my steps.

-Ben


#4

On Mon, Jan 12, 2009 at 11:35 PM, Tim G. removed_email_address@domain.invalid wrote:

When /(?:log|sign)(?:i|o)n success message/ do

Actually - it should be: /(?:log|sign) (?:i|o)n success message/ (a
space was missing too)

http://www.rubular.com

Aslak


#5

Thanks for all the advice and corrections. I ended up with this:

When /\bsee a (?:log|sign)(?: ?)[io]n success message/ do
  # login | log in | logon | log on | signin …
  Then "see the login ok message"
end
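
A quick way to check which spellings this pattern accepts, sketched in plain Ruby outside Cucumber. The surrounding sentence "I should ..." and the names STEP_PATTERN, SIMPLER and variants are assumptions made for the example.

```ruby
STEP_PATTERN = /\bsee a (?:log|sign)(?: ?)[io]n success message/

variants = ["login", "log in", "logon", "log on",
            "signin", "sign in", "signon", "sign on"]

# Build a sample step for each spelling and keep the ones that match.
matched   = variants.select { |v| "I should see a #{v} success message" =~ STEP_PATTERN }
unmatched = variants - matched

puts "matched:   #{matched.join(', ')}"
puts "unmatched: #{unmatched.join(', ')}"

# The group around the optional space is not needed; a bare " ?" behaves the same.
SIMPLER = /\bsee a (?:log|sign) ?[io]n success message/
```

Note that the pattern only varies the spelling of the login term; anything else in the step text still has to match exactly.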

As to the issue of whether this is being too clever by half: Perhaps.

I have to consider though, that various people are going to be working
on features relating to this project under a wide range of circumstances
and that features will develop over a long period of time. While it
might appear attractive to simply insist that a session is always a
login, the fact is that language is not so precise; login, logon, log
in, log on, signin, signon, sign in and sign on are all common synonyms
for the same action. Internally, the action is just login.

Logins are a pervasive feature of this application and so, rather than
waste effort on policing the feature syntax, I thought it best just to
accommodate the likely variations from the start. Admittedly, I also
availed myself of this opportunity to gain additional knowledge
regarding regexp and so this example is perhaps overwrought for the
actual purpose at hand.

Finally, thank you Ben very much for the reference to
http://www.rubular.com


#6

On Mon, Jan 12, 2009 at 4:50 PM, James B. removed_email_address@domain.invalid
wrote:

When /?:(log|sign)?:(i|o)n success message/ do
Then "welcome message"
end

It’s a syntax error on that first question mark, the one right after
the slash. A ? in a regex signifies that whatever came just before it
may appear 0 or 1 times. Just like a + signifies that whatever came
before it must appear 1 or more times, and a * signifies that
whatever came before it can appear any number of times. There are a
few other things a ? can mean (non-greedy matching, etc.) but they all
come after something. Not at the beginning of your expression.
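
The quantifier rules described above can be sketched in plain Ruby as follows. The sample patterns and variable names are invented for illustration. Regexp.new is used for the broken pattern so that the failure can be rescued at run time; written as a regexp literal, the same mistake stops the file from loading with a SyntaxError, as in the original report. The exact error text differs between Ruby versions.

```ruby
# ? : the preceding item may appear 0 or 1 times
optional_u   = /\Acolou?r\z/
# + : the preceding item must appear 1 or more times
one_or_more  = /\Aab+c\z/
# * : the preceding item may appear any number of times, including 0
zero_or_more = /\Aab*c\z/

# Placed after another quantifier, ? makes that quantifier non-greedy.
greedy     = "<a><b>"[/<.+>/]    # takes as much as possible
non_greedy = "<a><b>"[/<.+?>/]   # stops at the first possible end

# A quantifier with nothing before it is rejected when the pattern is compiled.
leading_quantifier_error =
  begin
    Regexp.new("?:(log|sign)?:(i|o)n success message")
    nil
  rescue RegexpError => e
    e
  end

puts leading_quantifier_error.class
```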

That’s the bug. Beyond that, what you’re trying to do with the regex
itself seems just a little too clever; do your features or your app
messages really vary randomly between the terms “login,” “logon,”
“signin” and “signon,” all meaning the same thing? If so, jumping
through hoops to account for it in the tests might be a hint to change your
app language just for clarity. But if you have to have them all, just
writing /(login|logon|signin|signon) success message/ would be a lot
easier to read and understand.
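
A small sketch of how the enumerated pattern behaves, in plain Ruby. The sample sentences and the names EXPLICIT, term and spaced are assumptions for the example. Because the parentheses here are a capturing group, a step definition built on this pattern would receive the matched term as a block argument; (?:...) avoids that if the term is not needed.

```ruby
EXPLICIT = /(login|logon|signin|signon) success message/

match = EXPLICIT.match("When I see a signon success message")
term  = match[1]   # the alternative that was actually used

# Only the listed spellings are accepted; the spaced forms are not listed.
spaced = EXPLICIT.match("When I see a log in success message")

puts term
puts spaced.inspect
```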

(Final nit, because I’m a smellfungus: what is this step supposed to
do, anyway? Do you really have scenarios that include the line “When
login success message?” ‘When’ steps imply action taken by the
imaginary user. What’s the action here? What’s the verb?)


Have Fun,
Steve E. (removed_email_address@domain.invalid)
ESCAPE POD - The Science Fiction Podcast Magazine
http://www.escapepod.org


#7

On Tue, Jan 13, 2009 at 7:41 AM, James B. removed_email_address@domain.invalid
wrote:

Logins are a pervasive feature of this application and so, rather than
waste effort on policing the feature syntax, I thought it best just to
accommodate the likely variations from the start.

Premature flexibility is one of the roots of all evil. :)

Seriously, your code has two types of users. Yes, you should make writing
features easier for biz, but you should also make reading steps easier for
dev. Given that, I like the suggestion of explicitly enumerating the choices
of verbiage. A clear pointer toward that choice is the comment. A comment is
an apology for unclear code. All unclear code should be commented, but
unclear code should be avoided whenever possible.

All IMO, of course.

///ark


#8

On Tue, Jan 13, 2009 at 6:14 PM, Mark W. removed_email_address@domain.invalid wrote:

Seriously, your code has two types of users. Yes, you should make writing
features easier for biz, but you should also make reading steps easier for
dev. Given that, I like the suggestion of explicitly enumerating the choices
of verbiage. A clear pointer toward that choice is the comment. A comment is
an apology for unclear code. All unclear code should be commented, but
unclear code should be avoided whenever possible.

Another principle that calls for a more rigid Regex:
http://domaindrivendesign.org/discussion/messageboardarchive/UbiquitousLanguage.html.

Everybody should speak the same language and know what it means. Having 4
different ways of saying the same thing will just add to confusion.

Aslak


#9

On Tue, Jan 13, 2009 at 10:41 AM, James B. removed_email_address@domain.invalid
wrote:

Logins are a pervasive feature of this application

Which is exactly why you should standardize. If you try to be
accommodating toward unclear communication, you’re just going to
create confusion when people need to get things done. Someone won’t
remember whether the action is named “login” or “signin” or “sign_on”
and will waste time looking for the wrong thing – or worse, write the
same function over again under a different name. That’s not the fault
of the feature, but it doesn’t help. This step doesn’t do as much
as it could to accurately document your application.

and so, rather than
waste effort on policing the feature syntax, I thought it best just to
accommodate the likely variations from the start.

Well, first, if that was really your goal you’re not going nearly
far enough. If a teammate isn’t going to take the trouble to review
existing steps, he’s probably more likely to screw up the “success
message” part than the “login” part. How many variations on the
concept of “success message” do you think you can fit in a regex?
(“acknowledgement” and “acceptance” both start with “ac,” so I guess
you could start your conditional branching logic there… Just
remember that the number of e’s in “acknowledgment” can vary…)

Beyond that, though… It’s really not a problem. Issues like this
tend to be self-correcting. Most developers (well, most competent and
properly lazy ones) will read the existing features before writing
their own. They’ll know that if they saw a clause already that does
something they want, they should use it again. And if they
misremember and type something else, they’ll get a failure and think,
“Whathuh? James got his features to work yesterday and HE needs to
log in!” And then they’ll think to look back at the step code, see
one called ‘When “I see a success message”’, and figure out what they
really ought to say. In looking it up, they’ll come to understand the
existing functionality better.

And they’d certainly be able to look it up in less time than it takes
to figure out all the regexes. (Or me to be a smartass about them.
Hmm.)


Have Fun,
Steve E. (removed_email_address@domain.invalid)
ESCAPE POD - The Science Fiction Podcast Magazine
http://www.escapepod.org


#10

On 13 Jan 2009, at 17:14, Mark W. wrote:

writing features easier for biz, but you should also make reading
steps easier for dev. Given that, I like the suggestion of
explicitly enumerating the choices of verbiage. A clear pointer
toward that choice is the comment. A comment is an apology for
unclear code. All unclear code should be commented, but unclear code
should be avoided whenever possible.

All IMO, of course.

+1 to all that. I feel like you get lectured quite a bit by this list
James, but you’d do well to heed the advice of some battle-hardened
journeymen, IMO.

Read Eric Evans’ excellent book ‘Domain Driven Design’, which actually
inspired a lot of this BDD stuff you’re using, to hear how keeping
faithful to a ‘Ubiquitous Language’ can make a big difference to the
success or failure of a project.

You’re not just policing syntax when you encourage people to use the
same words for things, you’re actually protecting the integrity of
your system by reducing the opportunities for misunderstanding.

Matt W.
http://blog.mattwynne.net
http://www.songkick.com


#11

Matt W. wrote:

+1 to all that. I feel like you get lectured quite a bit by this list
James, but you’d do well to heed the advice of some battle-hardened
journeymen, IMO.

I do hope that I do not give the impression that I resent anything that
anyone has written in response to my many inquiries. I have been
enlightened on a number of things that I was either unaware of or only
dimly perceived. I am transitioning from a completely different
environment and need all the help and guidance I can obtain. I value
this and the ruby-on-rails list very much for that reason.

Regards,


#12

Stephen E. wrote:

On Tue, Jan 13, 2009 at 10:41 AM, James B. removed_email_address@domain.invalid
wrote:

Logins are a pervasive feature of this application

Which is exactly why you should standardize. If you try to be
accommodating toward unclear communication, you’re just going to
create confusion when people need to get things done.

I appreciate the advice and accept the wisdom that it contains. I have
no intention of handling with a regexp every situation where there might
be more than one English expression available to express a concept. Nor
do I intend to otherwise permit multiplicities of expression to exist.
However, on the matter of log in versus log on and its common
variations, I think I will stick with my initial instinct. Initially I
provided the different variants of login matchers along the lines shown
below:

When /see a login success message/ do
have_selector("#login_current")
end

When /see a log in success message/ do
Then "see a login success message"
end

When /see a sign on success message/ do

The revised regexp version simply puts all of these together in one
place for me. As for forcing people to remember that it is login and
not logon; well this project does not exist in a vacuum. The people
involved deal with at least three different operating systems every day,
each one of which has its own dialect with respect to what constitutes
an authenticated user. I will accept a little flexibility of expression
here in the service of user comfort.

In any case the term login, in the context of a web application
environment, seems a bit of a misnomer from the outset. I cannot get
too worked up over the idea of unclear communication when one is dealing
with as muddy a concept as that represented by login. Really, what I
should be saying is:

Given user "myuser" has a current authenticated session
And I see the session authenticated message
When I terminate my current session
Then the current session is destroyed
And I should see the user authentication request message

However, current authenticated session tends to be a little unwieldy in
casual speech.


#13

On 13 Jan 2009, at 20:10, James B. wrote:

However, current authenticated session tends to be a little unwieldy in
casual speech.

I actually rather like the unambiguous, and jargon-free ‘current
authenticated session’ phraseology, but I’m not one of your users, so
that doesn’t really matter!

I have one more suggestion. From all these different ways of saying
the same thing, pick a winner, then write a regexp step that catches
all the possible alternative ways of expressing it you can think of,
and raises an error saying “did you mean ‘current authenticated
session’” (or whatever).

That will meet both your original goal of making the features
approachable for your users who are awash with different terminology
from different systems, and the one that we’re pushing you towards of
trying to keep your domain vocabulary as simple as possible.
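
One way the catch-all idea could look, sketched in plain Ruby rather than as an actual step definition. PREFERRED_TERM, SYNONYMS and reject_synonyms are hypothetical names, not part of Cucumber or of the project discussed here, and the list of synonyms is only a guess at what would need catching. Inside a real step definition the same raise would make the step fail with the hint as its message.

```ruby
# Hypothetical names throughout; adjust the term and the synonym list to taste.
PREFERRED_TERM = "current authenticated session"

# Matches login, log in, log-in, logon, sign in, sign on and so on, in any case.
SYNONYMS = /\b(?:log|sign)[ -]?(?:in|on)\b/i

# Returns the step text unchanged when it uses the agreed vocabulary,
# otherwise raises with a hint that names the preferred wording.
def reject_synonyms(step_text)
  found = step_text[SYNONYMS]
  return step_text if found.nil?

  raise ArgumentError,
        "'#{found}' is not in the shared vocabulary; did you mean '#{PREFERRED_TERM}'?"
end

begin
  reject_synonyms("I see a log in success message")
rescue ArgumentError => e
  puts e.message
end
```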

Matt W.
http://blog.mattwynne.net
http://www.songkick.com


#14

On Tue, Jan 13, 2009 at 12:55 PM, Matt W. removed_email_address@domain.invalid wrote:

but you’d do well to heed the advice of some battle-hardened journeymen,
IMO.

Read Eric Evans’ excellent book ‘Domain Driven Design’, which actually
inspired a lot of this BDD stuff you’re using, to hear how keeping faithful
to a ‘Ubiquitous Language’ can make a big difference to the success or
failure of a project.

You’re not just policing syntax when you encourage people to use the same
words for things, you’re actually protecting the integrity of your system by
reducing the opportunities for misunderstanding.

This last paragraph was beautifully said, Matt. I am going to steal it
(and give you credit of course). :)


Zach D.
http://www.continuousthinking.com
http://www.mutuallyhuman.com


#15

Matt W. wrote:

+1 to all that. I feel like you get lectured quite a bit by this list
James, but you’d do well to heed the advice of some battle-hardened
journeymen, IMO.

I thought that you might like to know that, after reflecting on this
overnight, I took this matter up in a design meeting today. After a
“frank exchange of ideas” it was accepted that login, and all its
related ilk, did not describe what the application was doing and was
therefore misleading and deprecated. The clincher was when I raised
the issue of authentication via end user x.509 certificates.

So, what we have now are features that read like this:

When the user "myuser" authenticates with a password

When the user is not authenticated

Then they should see the authentication page

Then they should see an authentication request message

Then they press the authenticate button

When the user session is not current
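
For illustration, a pattern for the first of these steps might look like the following plain-Ruby sketch. The pattern, the constant name AUTHENTICATES and the sample "logs in" step are assumptions, not the project's actual step definitions; the step text is written with straight double quotes.

```ruby
# Hypothetical pattern for: When the user "myuser" authenticates with a password
AUTHENTICATES = /^the user "([^"]*)" authenticates with a password$/

step     = 'the user "myuser" authenticates with a password'
username = AUTHENTICATES.match(step)[1]   # the quoted user name

# With a single agreed term, the retired wording simply finds no step.
old_step  = 'the user "myuser" logs in with a password'
old_match = AUTHENTICATES.match(old_step)

puts username
puts old_match.inspect
```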

We have, in consequence, gone through and removed the term login from
all of the code as well, replacing it with authenticate. So, for example,
the authentication form now says: To Proceed Please Authenticate Yourself

I am sometimes (ok, mostly) slow to understand what I am told, but I do
hear it. Thanks for all the help on this. I believe that the
discussion here regarding the entire nomenclature issue provided a real
improvement to our design.


#16

Matt W. wrote:

I actually rather like the unambiguous, and jargon-free ‘current
authenticated session’ phraseology, but I’m not one of your users, so
that doesn’t really matter!

I do too, but I am not the president either…

I have one more suggestion. From all these different ways of saying
the same thing, pick a winner, then write a regexp step that catches
all the possible alternative ways of expressing it you can think of,
and raises an error saying “did you mean ‘current authenticated
session’” (or whatever).

That is a very good idea. Yesterday, I caught myself about to write
another multi-matching step definition regexp and I said, uuummm…
better not. But catching those types of variants and failing the step
with a meaningful warning message might just be the answer to this.

Regards,


#17

On Thu, Jan 15, 2009 at 11:13 PM, James B. removed_email_address@domain.invalid
wrote:

This is awesome. It sounds like you have all come to a better understanding
of how the application should behave through the use of more precise
language that is shared by everyone. Well done!

Aslak


#18

On 15 Jan 2009, at 22:13, James B. wrote:

real
improvement to our design.

Hooray! Ubiquitous Language FTW!

Matt W.
http://blog.mattwynne.net
http://www.songkick.com