Mocking: brittle specs and tight coupling?


#1

Hi,

I’ve read of complaints that mocking can make the specs very brittle, in that if the method is changed at all, it will break the specs, even if the behavior remains the same. Do you find this to be the case? How do you deal with this if so?

Also, when I used to do TDD in PHP, well, there wasn’t the ability to modify the class on the fly like in Ruby, so you actually had to do dependency injection, but people generally looked at this as a good thing, for loose coupling. So if a controller method for instance used a User model, then it would be preferred to get that instance from somewhere, either passing it to the method itself or calling another method to instantiate it.

I notice this isn’t a concern in RSpec or in Ruby in general. Do you view this differently, or what is the preferred way to deal with dependencies? Is it fine just to do:

User.should_receive(:new).and_return(User.new)

Just as a very simple example?

Thanks,
Brandon



#2

On Sat, Apr 11, 2009 at 3:59 PM, Brandon O.
removed_email_address@domain.invalid wrote:

I’ve read of complaints that mocking can make the specs very brittle, in
that if the method is changed at all, it will break the specs, even if the
behavior remains the same. Do you find this to be the case? How do you deal
with this if so?

I change the spec. In theory, this may sound onerous, but I haven’t
found it to be so in practice. Testing in general can introduce
dependencies, just as testing can introduce bugs. The idea however is
that tests are simpler than the code they’re testing, so the bugs in
them are easier to fix and the dependencies fairly mechanical to deal
with.

///ark


#3

On Sat, Apr 11, 2009 at 3:59 PM, Brandon O.
removed_email_address@domain.invalid wrote:

Hi,

I’ve read of complaints that mocking can make the specs very brittle, in
that if the method is changed at all, it will break the specs, even if the
behavior remains the same. Do you find this to be the case? How do you deal
with this if so?

http://patmaddox.com/blog/you-probably-dont-get-mocks

User.should_receive(:new).and_return(User.new)

Just as a very simple example?

I have an example of this in my Legacy Rails talk and say it’s the
sort of thing that would make a Java programmer run for the fucking
hills. That’s not entirely true because there are a couple mock
frameworks that do let you do that, but in general they prefer to
avoid it because it requires bytecode manipulation.

Ruby is much more flexible and gives us a couple ways of injecting
dependencies. You’ve got traditional DI:

class Order
  def calculate_tax(calculator)
    calculator.calculate total, us_state
  end
end

You’ve got traditional DI + default args:

class Order
  def calculate_tax(calculator = TaxCalculator.new)
    calculator.calculate total, us_state
  end
end

You can partially mock on the target object:

class Order
  def calculate_tax
    calculator.calculate total, us_state
  end

  def calculator
    TaxCalculator.new
  end
end

order = Order.new
order.stub!(:calculator).and_return(mock('calculator', :calculate => 1.25))

or you can use partial mocks somewhere inside of the target object,
like you showed.
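For the curious, here is roughly the mechanism behind that style of class-method stubbing, with no framework involved. The User class and the helper name are invented for the sketch; stub!/should_receive arrange something similar (plus verification and restoration bookkeeping) for you:

```ruby
# Framework-free sketch of stubbing a class method: swap out User.new
# for the duration of a block, then restore the real constructor.
# Assumes the class doesn't already define its own singleton `new`.
class User
  attr_accessor :name
end

def with_stubbed_new(klass, canned)
  # Shadow the real constructor with one that returns the canned object.
  klass.define_singleton_method(:new) { |*| canned }
  yield
ensure
  # Remove the shadow so lookup falls back to the real Class#new.
  klass.singleton_class.send(:remove_method, :new)
end

canned = User.new
canned.name = "stub"

with_stubbed_new(User, canned) do
  User.new.name        # => "stub"
end
User.new.name          # => nil (real constructor is back)
```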

The pattern you showed is popular because often you won’t ever want to
pass in a different object. In your UsersController you’re only ever
going to deal with the User class as a repository, and if you change
it then it’s a fairly big change and you don’t mind updating your
tests.

I find that you can use mocks to express the intent of the class well. Don’t use constructor/setter/method dependency injection if you don’t need it; accept a bit of tighter coupling and use partial mocking if all you’re trying to do is isolate behaviors.
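To make the injection version concrete, here is a runnable, framework-free sketch. The flat 6% rate and the hand-rolled FakeCalculator are invented for illustration:

```ruby
# Runnable sketch of the "DI + default args" pattern above.
class TaxCalculator
  def calculate(total, us_state)
    total * 6 / 100   # pretend every state charges a flat 6%
  end
end

class Order
  attr_reader :total, :us_state

  def initialize(total, us_state)
    @total, @us_state = total, us_state
  end

  # Production callers rely on the default; a spec hands in a stand-in.
  def calculate_tax(calculator = TaxCalculator.new)
    calculator.calculate(total, us_state)
  end
end

# A hand-rolled fake records the interaction without any mock framework.
class FakeCalculator
  attr_reader :received

  def calculate(total, us_state)
    @received = [total, us_state]
    1.25
  end
end

order = Order.new(100, "WA")
fake  = FakeCalculator.new
order.calculate_tax(fake)   # => 1.25
fake.received               # => [100, "WA"]
order.calculate_tax         # => 6 (via the default TaxCalculator)
```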

Pat


#4

On 11 Apr 2009, at 23:59, Brandon O. wrote:

I’ve read of complaints that mocking can make the specs very brittle, in that if the method is changed at all, it will break the specs, even if the behavior remains the same. Do you find this to be the case? How do you deal with this if so?

I’ve found the opposite more often - changing the specs of one side of
an interaction leaves them passing, but the code no longer joins up.
You should, after all, usually be changing the specs first.

Recently, I’ve been leaning towards limiting mocks to points in the code where either the number of logic branches starts to explode (read: you’ve got if statements), or the interactions are more important than the implementation (ie - you’re more likely to change how the other object works than what its interface is). Basically, I use them when it makes life easier*. These are rules of thumb to me right now, so I probably couldn’t explain well what I mean, although I’m sure I should try.

I like to view it as a sliding scale…

Pure mock specs                                Pure integration specs
|--------------------------------------------------------------|
Lead more directly to good OOP                 Allow rapid hack-solutions
Force you to think things through              Encourage experimentation
Fragile                                        Sturdy
Fast to run                                    Slow to run
Localised failure errors                       Vague failure errors
High risk of false positives                   High risk of whack-a-mole debugging

Or at least those are how I find them.

I think I need to sit down some time and think through more formal
strategies to choose when to use mocks.

Right now I’d say, use mocks aggressively, work through the problems
you find (which tend to highlight poor OO design), learn the design/
refactoring tricks, then take a step back. But make sure you’ve got
some layer of integration spec (maybe Cucumber) above mocked specs,
and make sure you don’t give up before you feel like you’ve tamed
mocks. Ignoring either of those could leave you thinking they are a
bad tool that cause brittle specs, when actually they are a very
powerful tool, just hard to use at first*.

Ashley

  • some things that eventually make life easier seem to make it harder
    initially


http://www.patchspace.co.uk/
http://www.linkedin.com/in/ashleymoran


http://twitter.com/ashleymoran


#5

Ashley M. wrote:

I like to view it as a sliding scale…

Pure mock specs                                Pure integration specs
|--------------------------------------------------------------|
Lead more directly to good OOP                 Allow rapid hack-solutions

I think I need to sit down some time and think through more formal
strategies to choose when to use mocks.

There’s a third alternative: your sliding scale is really a pyramid with a peak.

The third alternative is you never need to mock, yet both your tests and target code are highly decoupled. That is the goal.

Under TDD, you get that by avoiding a design smell called “expensive setup”. Your scale assumes that something actually must be set up - either mocks, or end-to-end class models. The best tests only need a few stubbed objects, each easy to construct.

You avoid the design smell by writing the simplest new test you can think of, and writing simple code to pass the test. And if you don’t want that code to be so simplistic, you practice “triangulation”.

Triangulation occurs when you have a big bowl of spaghetti, a meatball under it, and two chopsticks. You want to lift the meatball out, so you probe with one chopstick until you feel it, then you probe with the other until you can seize it. The two chopsticks now form a triangle, locating the meatball.

The spaghetti is the design you don’t want. The meatball is the design you want. Each chopstick is a simple test, and the angle between the chopsticks represents the difference between the two tests. If each test is simple yet is set up differently, then the simplest code which can pass both simple tests approaches the most elegant design.
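In code, triangulation can look like this toy example (invented, not from the thread): the first assertion alone could be passed by hard-coding 30; the second, set up differently, forces the general implementation.

```ruby
# Toy triangulation: each assertion is one "chopstick". After only the
# first test you could cheat with `def subtotal(prices) 30 end`; the
# second test forces the general design.
def subtotal(prices)
  prices.reduce(0) { |sum, p| sum + p }
end

raise "first test failed"  unless subtotal([10, 20]) == 30
raise "second test failed" unless subtotal([5, 5, 5]) == 15
```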

Mock abuse just enables runaway dependencies.


Phlip


#6

Pat,

Thank you very much for the link and the dependency injection explanation. That helps a lot. I like your blog, too.

Brandon


#7

The third alternative is you never need to mock, yet both your tests
and target code are highly decoupled. That is the goal.

Another tip: To TDD a new feature, don’t clone a high-level test which calls code which calls code which calls the code you need to change.

Start by cloning the lowest level test possible. If there is none, write one. And if the test still wouldn’t be low level enough, start by refactoring the line you need to change, to pull it out into a method you can test directly.


#8

On Sun, Apr 12, 2009 at 12:32 PM, Phlip removed_email_address@domain.invalid wrote:

The third alternative is you never need to mock, yet both your tests and
target code are highly decoupled. That is the goal.

Another tip: To TDD a new feature, don’t clone a high-level test which calls
code which calls code which calls the code you need to change.

FWIW, what you propose is the exact opposite of BDD, which suggests we
start at the highest levels and work our way in.

How can you know what the lowest level code is supposed to do before
you have higher level code invoking it? You can certainly make good
guesses, and they might end up being good choices in the end, but they
tend to bind you to a pre-determined design.

Your recommendation also starts with cloning a pre-existing example,
so I’m assuming this is a very specific sort of situation, where you
have a certain amount of code in place. What about when you are
starting a project for the first time?

David


#9

David C. wrote:

Another tip: To TDD a new feature, don’t clone a high-level test which calls
code which calls code which calls the code you need to change.

FWIW, what you propose is the exact opposite of BDD, which suggests we
start at the highest levels and work our way in.

My current day-job’s most important project has a test suite that suffered from abuse of that concept. The original team, without enough refactoring, were cloning and modifying very-high-level tests, making them longer, and using them to TDD new features into the bottom of the models layer. Don’t do that.

How can you know what the lowest level code is supposed to do before
you have higher level code invoking it? You can certainly make good
guesses, and they might end up being good choices in the end, but they
tend to bind you to a pre-determined design.

That sounds like James K.'s over-educated arguments against TDD:

established fact, which wishful thinking seems to cause some
people to ignore. Tests definitely have their role, but until
you know what the code should do, you can’t begin to write them.
And until you’ve written something down in black and white, you
don’t know it.

From there, he’ll wander off into “too smart to just try it” bullshit…

Your recommendation also starts with cloning a pre-existing example,
so I’m assuming this is a very specific sort of situation, where you
have a certain amount of code in place. What about when you are
starting a project for the first time?

Programming bottom-up gives you decoupled lower layers. Top-down gives you a way to tunnel from a new View feature into the code that supports it. The goal is you could start a new feature either way, and you get the benefits of both techniques.

I thought of describing that specific tip while adding “any!” to assert_xhtml. It would have been too easy to start with the high-level tests:

def test_anybang_is_magic
  assert_xhtml SAMPLE_LIST do
    ul.kalika do
      any! 'Billings report'
    end
  end

  assert_xhtml_flunk SAMPLE_LIST do
    without! do
      any! 'Billings report'
    end
  end
end

Some of my features indeed started there, and some of them indeed do not yet have low-level tests.

But the entire call stack below that also at least has tests on each layer - except the actual code which converts a tag like select! into the fragment of XPath which matches //select[]. Oh, and the code around it had grown a little long. So in this case, I started there, and refactored out the single line that needed the change:

def translate_tag(element)
  element.name.sub(/!$/, '')
end

Then I can TDD translate_tag directly:

def test_any_element
  bhw = assemble_BeHtmlWith{ any :attribute => 'whatever' }
  element = bhw.builder.doc.root
  assert{ bhw.translate_tag(element) == 'any' }

  bhw = assemble_BeHtmlWith{ any! :attribute => 'whatever' }
  element = bhw.builder.doc.root
  assert{ bhw.translate_tag(element) == '*' }
end

def translate_tag(element)
  if element.name == 'any!'
    '*'
  else
    element.name.sub(/!$/, '')
  end
end

Only then I wrote the high-level tests, and they passed.

Note that RSpec requires the constructor to BeHtmlWith to be a little fruity, so I wrapped it and its Builder stuff up into assemble_BeHtmlWith…


Phlip


#10

On Sun, Apr 12, 2009 at 8:23 PM, Phlip removed_email_address@domain.invalid wrote:

from abuse of that concept. The original team, without enough refactoring,
were cloning and modifying very-high-level tests, making them longer, and
using them to TDD new features into the bottom of the models layer. Don’t do
that.

This sounds like an issue with the team.

James K. wrote:
you know what the code should do, you can’t begin to write them.
Programming bottom-up gives you decoupled lower layers. Top-down gives you a
way to tunnel from a new View feature into the code that supports it. The
goal is you could start a new feature either way, and you get the benefits
of both techniques.

Having done both for a few years each, in my experience I’ve found that outside-in gives me better decoupled layers, where each layer is more focused on its responsibility. Inside-out usually leaked a lot of things into outer layers, because as I was building out I was making assumptions on what I was going to need. Individually, each thing that leaked was small and manageable, but a few weeks and months later, all of those little manageable oddities made the code base hell to work with, slowing down progress and making it much harder for new devs to get ramped up.

I have worked bottom-up with Java and Ruby, and I’ve shipped those apps. Not all of them were webapps either. IME, the bottom-up approach works. But after sharpening my outside-in skills, I have found it to be an all-around much better development approach for the apps that I deliver.



Zach D.
http://www.continuousthinking.com
http://www.mutuallyhuman.com


#11

On Sun, Apr 12, 2009 at 9:23 PM, Phlip removed_email_address@domain.invalid wrote:

from abuse of that concept.

Any concept can be abused. No reason to ignore its merits. Similarly, ideas that work well in one situation shouldn’t be assumed to work well in all situations. It cuts both ways.

The original team, without enough refactoring,

“without enough refactoring” suggests that the problem was the team,
not the concept. No?

Your recommendation also starts with cloning a pre-existing example, so I’m assuming this is a very specific sort of situation, where you have a certain amount of code in place. What about when you are starting a project for the first time?

Programming bottom-up gives you decoupled lower layers. Top-down gives you a
way to tunnel from a new View feature into the code that supports it.

I don’t think these are mutually exclusive. You can develop decoupled
layers driving from the outside-in. And when you do, the inner layers,
which serve the needs of the outer layers, tend to be more focused on
that responsibility than if you start with the inner layers because
you’re not guessing what you’ll need, you’re satisfying an existing
need.


Only then I wrote the high-level tests, and they passed.

From what I’m reading, this seems like a very specific situation
you’re describing in which you did, in fact, start from the
outside-in. The lower level test you added was the result of
refactoring, which seems perfectly reasonable to me. But I’m not
seeing how this applies as a general all-purpose guideline. What am I
missing?

Note that RSpec requires the constructor to BeHtmlWith to be a little fruity

Huh? RSpec does not construct matcher instances for you, so how is it
enforcing any constructor restrictions, fruity or otherwise?
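For anyone following along: an RSpec-style matcher is just a plain object that responds to matches? (and optionally failure_message), returned by a helper method you write yourself, so the constructor signature is entirely yours. A made-up illustration of the protocol:

```ruby
# Hand-rolled matcher illustrating the protocol David describes: the
# framework only calls matches?(actual) on whatever your helper returns,
# so the constructor is up to you. BeDivisibleBy is invented for the
# example.
class BeDivisibleBy
  def initialize(divisor)
    @divisor = divisor
  end

  def matches?(actual)
    @actual = actual
    (actual % @divisor).zero?
  end

  def failure_message
    "expected #{@actual} to be divisible by #{@divisor}"
  end
end

# The helper method constructs the matcher instance - not RSpec.
def be_divisible_by(divisor)
  BeDivisibleBy.new(divisor)
end

be_divisible_by(3).matches?(9)   # => true
be_divisible_by(3).matches?(10)  # => false
```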


#12

Zach D. wrote:

My current day-job’s most important project has a test suite that suffered
from abuse of that concept. The original team, without enough refactoring,
were cloning and modifying very-high-level tests, making them longer, and
using them to TDD new features into the bottom of the models layer. Don’t do
that.

This sounds like an issue with the team.

And yet they didn’t abuse mocks. (-:


#13

On Sun, Apr 12, 2009 at 8:27 PM, Phlip removed_email_address@domain.invalid wrote:

My current day-job’s most important project has a test suite that suffered from abuse of that concept. The original team, without enough refactoring

Would you have called it abuse were the tests well-factored? I don’t
think it was abuse of acceptance tdd, or outside-in, or mocks, or any
other concept that caused them suffering. You’re always going to end
up fucking yourself if you don’t refactor.

Pat


#14

On 12 Apr 2009, at 05:19, Phlip wrote:

The spaghetti is the design you don’t want. The meatball is the
design you want. Each chopstick is a simple test, and the angle
between the chopsticks represents the difference between the two
tests. If each test is simple yet is set up differently, then the
simplest code which can pass both simple tests approaches the most
elegant design.

Hi Phlip

Sorry for the late reply epicfail. I’ve been trying to visualise what
you mean here. Can you give me a more concrete example of what you
mean by “meatball-alone” vs “spaghetti-laden” design?

BTW, as a pretty strict almost-carnivorous grain-avoider, I approve of
your analogy =)

Cheers
Ashley

