[cucumber] Cucumber and CI


#1

The rhythm for working with Cucumber advertised by http://cukes.info/ is to
write tests that fail first, then the code that makes them pass. Now my
question is: what are the implications when we combine this with Continuous
Integration?

We all know that when we do TDD/BDD at the unit level, one test can be fixed
fairly quickly, in a couple of minutes, and we can check in and kick off a
build. It is an ideal scenario for doing CI: frequent checkins and fast
feedback on build results.

Cucumber, as far as my understanding goes, works at the feature level. It could
take people days to finish a cucumber feature. In the meantime, the cucumber
test remains broken. What do we do then? We cannot check in any code because
that’ll break the build. So we can only check in code after several days? It
doesn’t sound right to me. Any takes on this issue? Thanks in advance.

Yi


#2

On Sun, Feb 22, 2009 at 2:47 PM, Yi Wen removed_email_address@domain.invalid wrote:

take people days to finish a cucumber feature. In the meantime, the cucumber
test remains broken. What do we do then? We cannot check in any code because
that’ll break the build. So we can only checkin code after several days? It
doesn’t sound right to me. Any takes on this issue? Thanks in advance.

I use git, create a new branch for a feature, and work in that branch
while I’m implementing the feature. As I reach stable points in the
feature (even though it may not be done) I will merge into master and
push the changes. Usually at this point I’ve reached the end of a step
definition so the next step isn’t failing, it’s just pending.

If I’m unable to reach a stable point to merge back into master, one
option is to call “pending” inside the step definition I was last
working on, so CI doesn’t treat it as a sign that someone broke the
build. This lets you merge back into master and push.
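
To make the point about “pending” concrete, here is a minimal Ruby sketch of how a CI gate might treat step results. It is a hypothetical model, not Cucumber’s own code: the `build_green?` helper and the status symbols are invented for illustration, and it assumes, as described above, that pending steps are reported without failing the run (a strict mode, where one is enabled, would change that).

```ruby
# Hypothetical model of a CI gate; none of these names come from Cucumber.
# Each step result is assumed to be :passed, :pending, or :failed.
STEP_STATUSES = %i[passed pending failed].freeze

# The build is green as long as no step has failed.
# Pending steps are tolerated: they mark unfinished work, not breakage.
def build_green?(step_results)
  unknown = step_results - STEP_STATUSES
  raise ArgumentError, "unknown status: #{unknown.inspect}" unless unknown.empty?

  step_results.none? { |status| status == :failed }
end

puts build_green?(%i[passed passed pending]) # half-finished scenario
puts build_green?(%i[passed failed pending]) # genuinely broken step
```

Under this model a half-finished scenario whose remaining steps are pending can be merged and pushed, while a step that actually fails still stops the build.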

Most of the time I work in the feature branch, continually updating
and rebasing as others push changes so I’m not out of sync, and then I
merge back into master and push when the feature is either stable or
done.

I’m sure there are lots of ways to go about this though,


Zach D.
http://www.continuousthinking.com
http://www.mutuallyhuman.com


#3

On Sun, Feb 22, 2009 at 8:47 PM, Yi Wen removed_email_address@domain.invalid wrote:

The rhythm for working with cucumber advertised by http://cukes.info/ is to
write tests that fail first, then code that fixes it. Now my question is,
what is the implication when combine this with Continuous Integration?

  • Nobody checks in code with failing tests (cucumber features, rspec
    tests, anything else).
  • If someone accidentally does, CI will run all tests and tell the team.

We all know when we do TDD/BDD in unit level, one test can be fixed fairly
quick in a couple minutes and we can check in and kick off a build. It is an
ideal scene for doing CI: frequent checkin and fast feedback on build
results.

Cucumber, as far as my understanding goes, works on feature level. It could
take people days to finish a cucumber feature. In the meantime, the cucumber
test remains broken. What do we do then? We cannot check in any code because

A feature typically consists of several scenarios. You don’t have to
implement all scenarios before you commit. You don’t have to write all
scenarios when you start working on a feature. I recommend you never
have more than one yellow scenario at a time.

The same goes for scenarios, which consist of several steps.

I recommend you commit every time you have made a step go from yellow
to green (via red).
This way, many successive commits will gradually build the whole
feature.

In my experience, getting a step to go from yellow to green rarely
takes more than an hour (usually less).
Is there anything preventing you from working this way?

Aslak


#4

On Sun, Feb 22, 2009 at 1:36 PM, aslak hellesoy
removed_email_address@domain.invalid wrote:

You don’t have to write all
scenarios when you start working on a feature. I recommend you never
have more than one yellow scenario at a time.

Whereas I use scenarios as a “to-do” list. I’ll keep adding them as I
think of them or as they come up in discussion.

///ark


#5

I totally agree with you on this. I have a feeling a lot of people kind of
use cucumber as a sexy way for doing waterfall.


#6

On Sun, Feb 22, 2009 at 11:47 AM, Yi Wen removed_email_address@domain.invalid wrote:

Cucumber, as far as my understanding goes, works on feature level. It could
take people days to finish a cucumber feature. In the meantime, the cucumber
test remains broken. What do we do then? We cannot check in any code because
that’ll break the build. So we can only checkin code after several days? It
doesn’t sound right to me. Any takes on this issue? Thanks in advance.

What I do is write the feature, and check that in. Then I work on each
step definition TDD-wise, checking in each as it runs without error
and without failing the expectation.

So I check in yellow (with pending steps) and when there are no more
pending steps, I mark the feature as finished.
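
As a rough illustration of that rule of thumb (check in while yellow, call the feature finished only when nothing is pending), here is a small Ruby sketch. It is hypothetical bookkeeping, not part of Cucumber: the hash layout, the scenario names, and the `safe_to_check_in?` and `feature_finished?` helpers are all assumptions made for the example.

```ruby
# Hypothetical bookkeeping; none of these names come from Cucumber.
# A feature is modeled as a hash of scenario name => array of step statuses,
# where each status is :passed, :pending, or :failed.
def all_statuses(feature)
  feature.values.flatten
end

# Yellow (pending) is fine to check in; red (failed) is not.
def safe_to_check_in?(feature)
  all_statuses(feature).none? { |status| status == :failed }
end

# The feature counts as finished only when every step has passed.
def feature_finished?(feature)
  statuses = all_statuses(feature)
  !statuses.empty? && statuses.all? { |status| status == :passed }
end

feature = {
  "add item to cart" => %i[passed passed passed],
  "remove item"      => %i[passed pending pending]
}

puts safe_to_check_in?(feature) # still yellow, but nothing is failing
puts feature_finished?(feature) # pending steps remain
```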

///ark


#7

On Sun, Feb 22, 2009 at 6:14 PM, Yi Wen removed_email_address@domain.invalid wrote:

I totally agree with you on this. I have a feeling a lot of people kind of
use cucumber as a sexy way for doing waterfall.

That may be so, but one view of agile is that each iteration is a
mini-waterfall. BDD suggests that we should define all of the
scenarios in the iteration planning meeting because we use them as a
planning tool (how can we estimate a feature at all before we’ve
talked about the acceptance criteria?).

FWIW,
David


#8

Yi Wen wrote:

I totally agree with you on this. I have a feeling a lot of people kind
of use cucumber as a sexy way for doing waterfall.

“Storytests” are very well represented in the Agile development community in
general. Cucumber is a (slam-dunk) reinterpretation of Ward Cunningham’s FIT
concept.

(Naturally, born of Java, FIT had no direct translation to Ruby, and that’s
probably a good thing!)

It’s only waterfall if your product-owner writes or commissions thousands of
story tests before doing any of them.

I heavily suspect that the author of a cucumber “feature” can hardly wait to see
it pass, and I suspect they will refrain from diverting energy to writing
another one. That is the heart of Agile - the feedback loop.

So what’s the maximum number of cucumber features that anyone has ever seen
on-deck but not yet passing? That’s a bad metric, exactly like excess inventory
in a warehouse.


Phlip


#9

Waiting an hour before I can check something in is still too long for me.
I’d like to check in every couple of minutes most of the time.

But I think making each step pending first, and then making it green when
I finish the implementation for the step, makes sense. I will probably still
use passing unit tests as checkin points.

Thanks

Yi


#10

Yeah, you guys are probably right on this. I was just overstating. :)

Yi


#11

Yi Wen wrote:

yeah, you guys are probably right on this. I was just over stating. :)

Ah, but Waterfall is indeed like the Emissaries of the Shadow. If you defeat one
of them, in one form, another one will always appear, take shape, and grow…

I myself have heard well-meaning product managers say “When we go for the
rewrite, we are going to make sure we specify each feature, first, before coding
them!” They said that because the last effort - code-and-fix - had grown until
it died under the weight of its cruft. They perceived their inability to safely
request new features as evidence that they did not specify enough, up-front.

These managers knew better than to use classic Waterfall, but they still didn’t
understand that Big Requirements Up Front is essentially Waterfall’s worst
aspect. And so they re-invented Waterfall, yet again, in yet another form.

So keep flying that flag of vigilance, there!


Phlip


#12

On Sun, Feb 22, 2009 at 6:36 PM, Phlip removed_email_address@domain.invalid wrote:

So what’s the maximum number of cucumber features that anyone has ever seen
on-deck but not yet passing? That’s a bad metric, exactly like excess
inventory in a warehouse.

I don’t know that it’s bad. At the beginning of an iteration, I have
most of the features & scenarios that I’ll be working on. So I start
off with a big pile of yellows, and as the iteration moves on it
gradually turns green. I’d say we average 8-10 pending features at
the beginning of each iteration maybe.

Pat


#13

aslak hellesoy wrote:

So what’s the maximum number of cucumber features that anyone has ever seen
on-deck but not yet passing? That’s a bad metric, exactly like excess
inventory in a warehouse.

Do you mean “on filesystem”? I have used Cucumber on 5-6 projects now,
and I never exceed 1. If there is a bigger backlog “somewhere else”
(pile of cards, word documents…) then I keep them there for as long
as possible.

Among the Agile consultants, the metric there is:

Time between fully specifying a feature and profiting from its use.

You use each card as a tickler for one last conversation with an onsite
customer, before cutting the test and code, right?

BTW, another CI metric for cucumberists to answer is:

After passing a cucumber test, it latches, and gates integration.

I think cucumber builds that latch effect in with the ‘pending’ keyword, right?
Pass the cuke, take it out, integrate, and then it runs in your integration
batch, right?


Phlip


#14

I’d question the wisdom of checking into an integration server every couple
of minutes. I’m not sure if you meant that, but if you did, then I think these
sorts of checkins have to be in bigger chunks. The reason is that each
checkin to an integration server is asking my colleagues to check out my
code and integrate it into their current work. So everything I check in to
the integration server should be fit for them to use, and ideally it should
have been reviewed (self review, or even better a bit of peer review, which is
easy enough if pairing). You just can’t do that in 2 minutes. IMO a complete
scenario is about the smallest size chunk to integrate with, and a complete
feature about the largest.

Of course if you’re using Git (or any distributed VCS) you can just branch,
commit locally and rebase from the master. If you want to push to get a
backup as well, you can always have a backup target in addition to your
integration target. If you’re not using Git (or something similar) locally,
I’d highly recommend that.

I think it’s reasonable to make failing steps pending for an integration
commit, but it’s not something I would like to do regularly; I’d much prefer
to wait a bit longer before integrating and make them green.

I really don’t like working with more than 1 failing step, but find that
occasionally I end up doing that (normally because a new step prompts a
refactoring of an older step and that then breaks as well).

HTH

Andrew

2009/2/23 Yi Wen removed_email_address@domain.invalid


#15

On Mon, Feb 23, 2009 at 3:36 AM, Phlip removed_email_address@domain.invalid wrote:

probably a good thing!)

It actually did. It’s just that very few have ever used it:
http://fit.rubyforge.org/

It’s only waterfall if your product-owner writes or commissions thousands
of story tests before doing any of them.

I heavily suspect that the author of a cucumber “feature” can hardly wait to
see it pass, and I suspect they will refrain from diverting energy to
writing another one. That is the heart of Agile - the feedback loop.

Well said. That’s the feeling I get when I work with this Cucumber.

So what’s the maximum number of cucumber features that anyone has ever seen
on-deck but not yet passing? That’s a bad metric, exactly like excess

Do you mean “on filesystem”? I have used Cucumber on 5-6 projects now,
and I never exceed 1. If there is a bigger backlog “somewhere else”
(pile of cards, word documents…) then I keep them there for as long
as possible.


#16

Mark W. wrote:

I’d question the wisdom of checking into an integration server every couple
of minutes.

Our mantra is ABC: Always Be Committing. So we commit anytime we feel
like it, as long as it doesn’t break the build. This makes life a lot
easier when there is merging to do.

In a post-Agile world, we often need to remind the juniors about the Best
Practices that started the movement. Integrate every time you could use a
roll-back. Use incremental testing, and a test server. You can’t integrate if
your changed tests fail. The first step of integrating pulls

And work in one room, so if you know another pair is in the same module, you
just holler to them to integrate as soon as possible, each time you do it.


Phlip


#17

On Mon, Feb 23, 2009 at 6:56 AM, Andrew P. removed_email_address@domain.invalid
wrote:

I’d question the wisdom of checking into an integration server every couple
of minutes.

Our mantra is ABC: Always Be Committing. So we commit anytime we feel
like it, as long as it doesn’t break the build. This makes life a lot
easier when there is merging to do.

I’m not sure if you meant that but if you did then I think these
sort of checkins have to be in bigger chunks. The reason is that each
checkin to an integration server is asking my colleagues to checkout my
code and integrate it into their current work.

Just because I push doesn’t mean my coworkers have to pull.

IMO a complete
scenario is about the smallest size chunk to integrate with, and a complete
feature about the largest

A refactoring, a new method (and its tests), a new test, a fixed typo:
these are all appropriate chunks of code to check in.

I think this is far superior to making massive checkins at the end of
each iteration. We usually fall somewhere in between.

///ark


#18

On Mon, Feb 23, 2009 at 9:56 AM, Andrew P. removed_email_address@domain.invalid
wrote:

I’d question the wisdom of checking into an integration server every couple
of minutes. I’m not sure if you meant that but if you did then I think these
sort of checkins have to be in bigger chunks.

To me the answer is just what Zach said: commit as often as you want,
every couple of minutes or whatever, but do it in a separate branch.
A branch per actively developed feature isn’t unreasonable. You get
to decide whether that branch is shared remotely or just lives on your
machine. Across a team, if the project is big and structured enough,
you could even have an ‘integration’ branch that you merge into for
CI, then a ‘release’ branch for rolling into production, and leave
‘master’ for point releases or dispense with it entirely. There’s
nothing magic about the ‘master’ branch, it’s just the default name
when others aren’t specified.

If you do end up doing all your work on the master branch, for
whatever reason, it still doesn’t hurt to commit all the time. Just
don’t push it until everything works. Or if you do, don’t push it
to your integration server. Git gives you a lot of control over this
stuff.

Of course if you’re using Git (or any distributed vcs) you can just branch,
commit locally and rebase from the master. If you want to push to get a
backup as well, you can always have a backup target in addition to your
integration target. If you’re not using Git (or something similar) locally I’d
highly recommend that.

To me the single biggest advantage of Git over Subversion and other
prior ilk is the ease of branching. You can branch in Subversion, but
it’s a pain in the ass, requiring some manual repository configuration
and a lot of annoying drudgework on merging. It discourages
developers from doing it casually. In Git, branching is utterly
trivial: creating a branch takes seconds, and merging back is
automatic about 90% of the time. There is no reason not to branch as
often as convenient, and leave the main branch for stuff that’s known
to work.

That’s the win. Offline committing and networks of distributed
repositories are just sort of a bonus for most people. (Particularly
now that Github has helped to reestablish a ‘centralized repository’
culture for the majority of shared Ruby projects.)


Have Fun,
Steve E. (removed_email_address@domain.invalid)
ESCAPE POD - The Science Fiction Podcast Magazine
http://www.escapepod.org


#19

Just on a side note, how many features / stories have people seen on
their projects and how much of their project was covered by
features/stories? I refrain from the terms average and typical because
there ain’t no such thing. But I would be interested in getting an idea
of how many features and scenarios people have used to complete a
project along with a few brief comments giving the scale (number of
total/concurrent users) and nature (order entry/ inventory
control/social networking/financial services/government regulatory) of
the project concerned. Something like:

F=31, S=165, PC=100%, TU=30, CU=21, financial services (insurance claims)


#20

On Mon, Feb 23, 2009 at 10:22 AM, Mark W. removed_email_address@domain.invalid wrote:

Our mantra is ABC: Always Be Committing. So we commit anytime we feel
like it, as long as it doesn’t break the build. This makes life a lot
easier when there is merging to do.

I think your “doesn’t break the build” condition is a lot bigger than
you make it sound. >8-> What’s the definition of “the build” in your
work culture? Do you run all tests every time before committing? Or
just before pushing?


Have Fun,
Steve E. (removed_email_address@domain.invalid)
ESCAPE POD - The Science Fiction Podcast Magazine
http://www.escapepod.org