Testing Data Scrapes

Hello all,

I’m here not for code help, but more for some opinions. At the moment
I’m writing a library to interact with Snipt. However, I have no idea
how to test it. I don’t want to include my username and password in the
tests, and I know that if I create a test account and leave the credentials
in, some jerk will log in to the account and change the password. This
doesn’t just apply to Snipt, but rather to any library that connects to
a web service. Thoughts?

  • Michael B.

2008/12/25 Michael B.:

Hello all,

I’m here not for code help, but more for some opinions. At the moment
I’m writing a library to interact with Snipt. However, I have no idea
how to test it. I don’t want to include my username and password in the
tests,

I’m doing it this way:

# Creates a new Google spreadsheet object.
def initialize(spreadsheetkey, user = nil, password = nil)
  @filename = spreadsheetkey
  @spreadsheetkey = spreadsheetkey
  # Fall back to the environment when no credentials are passed in.
  user ||= ENV['GOOGLE_MAIL']
  password ||= ENV['GOOGLE_PASSWORD']
  @user = user
  @password = password
end

So, if I do not provide the username and password from the test script,
they will be taken from environment variables.
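
A test suite could lean on the same variables to decide whether the live
tests run at all. A minimal sketch, assuming the GOOGLE_MAIL and
GOOGLE_PASSWORD variables from above; the helper name credentials_from_env
is hypothetical and not part of any library:

```ruby
# Hypothetical helper: returns credentials from the environment, or nil
# when either variable is missing, so live tests can be skipped.
def credentials_from_env(env = ENV)
  user = env['GOOGLE_MAIL']
  password = env['GOOGLE_PASSWORD']
  return nil if user.nil? || user.empty? || password.nil? || password.empty?

  { user: user, password: password }
end

creds = credentials_from_env
if creds
  puts "Running live tests as #{creds[:user]}"
else
  puts 'GOOGLE_MAIL / GOOGLE_PASSWORD not set, skipping live tests'
end
```

Passing the environment in as an argument keeps the helper itself testable
without touching the real ENV.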

-Thomas

Thomas,

Thanks for the response. That’s a good idea, but after that, what? I
would need to test the data against hard data, wouldn’t I? And if each
person running the tests has a different set of credentials, then the
hard data would work only for me. Or were you talking about doing that
just to test to make sure methods return proper instances of Array,
GoogleSpreadsheet, etc.?

  • Michael B.

Michael,

Thanks for the link to Mocha, it looks like what I need. However, I
still don’t understand what to do to test live data. This is my project:
rsnipt (michaelboutros/rsnipt on GitHub), a Ruby library that interacts
with Snipt.net. How would you go about what I’m trying to do?

Thanks,
Michael B.

Michael G. wrote:

On Wed, Dec 24, 2008 at 7:20 PM, Michael B. wrote:

Hello all,

I’m here not for code help, but more for some opinions. At the moment
I’m writing a library to interact with Snipt. However, I have no idea
how to test it. I don’t want to include my username and password in the
tests, and I know that if I create a test account and leave the credentials
in, some jerk will log in to the account and change the password. This
doesn’t just apply to Snipt, but rather to any library that connects to
a web service. Thoughts?

Any time that I am testing an external service like this, I tend to
cache the response as a fixture and use mocks or stubs in place of the
actual call to the service.

If you’re using test/unit, I’d use Flexmock or Mocha; RSpec has a
mocking/stubbing component built in.

HTH,
Michael G.

Michael G. wrote (replying to Michael B.’s message of Wed, Dec 24, 2008
at 11:55 PM):

I am of the opinion that when testing external services you should not
actually be hitting that external resource during the test. As I said
before, the way I handle this is to capture the response, store it
as a fixture, and then mock/stub the appropriate method.

In your library’s particular case, I would start with your Snipt#login
method. First you may need to break your methods up into smaller
chunks in order to mock/stub the appropriate piece of the method. I
should also mention that I’ve never done any of this type of testing
with Mechanize before.

Disclaimer: I’m sure someone else can point out a better way to do
this. This is also largely untested…

The first line of your Snipt#login method:
login_form = @agent.get('http://www.snipt.net/login').forms.first

I would break this up into two new methods:

class Snipt
  def login_page
    agent.get('http://www.snipt.net/login')
  end

  def login_form
    login_page.forms.first
  end
end

By doing this, it allows you to mock the Snipt#login_page method.
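
To see why the split helps, here is a small self-contained sketch that
does not need Mechanize at all. SniptSketch and FakePage are stand-ins
invented for illustration (the real method returns a
WWW::Mechanize::Page); the point is only that login_form can be exercised
once login_page has been replaced:

```ruby
# Stand-in for a page object; it only responds to #forms.
FakePage = Struct.new(:forms)

# Illustrative class mirroring the two-method split above.
class SniptSketch
  def login_page
    raise 'would hit the network'
  end

  def login_form
    login_page.forms.first
  end
end

snipt = SniptSketch.new
fake_page = FakePage.new([:login_form_stub, :search_form_stub])

# Replace login_page on this one instance so no request is made.
snipt.define_singleton_method(:login_page) { fake_page }

puts snipt.login_form.inspect
```

A mocking library such as Flexmock or Mocha does the same replacement with
less ceremony and restores the original method after each test.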

Moving along, I now think to myself: what do I need to do to make
Snipt#login_page return the type of object that I am expecting? I
know that in this case, @agent.get is going to return a
WWW::Mechanize::Page object. Not only that, but it is going to
use the response from http://www.snipt.net/login in order to construct
this object.

The next step is to capture the response of http://www.snipt.net/login
in a test fixture.

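require 'open-uri'

# Save the live login page as a fixture. URI.open needs Ruby 2.5 or newer;
# older versions use Kernel#open with the URL instead.
File.open('test/fixtures/login.html', 'w') do |f|
  f.write URI.open('http://www.snipt.net/login').read
end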

Now this makes the assumption that the login page is never going to
change, which is reasonable since your library is only made to work
with this particular version of the form. If something should change
on snipt.net/login you could refresh the fixture with the same code as
above and run your tests to make sure nothing is broken. If you’re
testing against a fast-moving target, it may be a good idea to put the
above snippet in a Rake task so that you can easily refresh all of
your test fixtures.
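
One possible shape for such a task is sketched below. The FIXTURES map,
the refresh_fixtures name, and the idea of passing the fetcher in as a
lambda are all assumptions made for illustration; the lambda keeps the
network out of the method so it can be tried without a connection:

```ruby
require 'fileutils'

# Hypothetical map of fixture file name to the URL it is captured from.
FIXTURES = {
  'login.html' => 'http://www.snipt.net/login'
}.freeze

# Writes each fetched response body into dir. The fetcher is any object
# responding to #call(url), which keeps the network out of this method.
def refresh_fixtures(fixtures, dir, fetcher)
  FileUtils.mkdir_p(dir)
  fixtures.each do |name, url|
    File.write(File.join(dir, name), fetcher.call(url))
  end
end

# In a Rakefile this could be wired up roughly as:
#
#   require 'open-uri'
#   task :refresh_fixtures do
#     refresh_fixtures(FIXTURES, 'test/fixtures',
#                      ->(url) { URI.open(url).read })
#   end
```

Adding a new fixture then means adding one entry to the map rather than
another copy of the download snippet.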

Then I would look into constructing a WWW::Mechanize::Page object from
the fixture that you have saved in your test fixture directory and
setting up the mock to work properly. To do this I would create a
WWW::Mechanize::Page object and use that as the return for your mock.

As I reached this point in writing the response I noticed another
thing that I would do in order to make testing easier. I’d like to
reiterate that this is all just a matter of opinion, but I cannot see
any other way to do this.

I would make your constructor simply return the Snipt object and allow
another method to handle logging in.

class Snipt
  def initialize(username, password)
    @detailed_return = false

    @username, @password = username, password

    @logged_in = false
    @lexers = {}
  end

  # Convenience constructor: builds a Snipt object and logs it in.
  def self.login(username, password)
    snipt = new(username, password)
    snipt.login
    snipt
  end

  # Lazily builds the Mechanize agent so tests can mock it.
  def agent
    @agent ||= WWW::Mechanize.new
    @agent.user_agent_alias = 'Mac FireFox'
    @agent
  end
end

This gives you the ability to construct a Snipt object that is not
logged in for easier testing and gives you a class method for
convenience of creating a logged-in Snipt object. I also moved the
construction of the WWW::Mechanize object into its own method for
easier testing/mocking.

require 'snipt'
require 'test/unit'
require 'flexmock/test_unit'

class TestSniptLogin < Test::Unit::TestCase
  def test_login_should_be_false_when_unsuccessful
    snipt = Snipt.new('foo', 'bar')

    # Build a page object from the saved fixture instead of the live site.
    login_page = WWW::Mechanize::Page.new(
      nil, { 'content-type' => 'text/html' },
      File.read('test/fixtures/login.html'), 200
    )

    flexmock(snipt.agent) do |mock|
      mock.should_receive(:get).and_return(login_page)
    end

    # THIS IS INCOMPLETE
    # You will also need to mock the submission of the login_form in
    # your Snipt#login method.
    # You will have to create a fixture representing the failed login
    # form for this case.
    # For the case of a successful login, you will want to create a
    # fixture representing the response during success.

    assert !snipt.login
  end
end

Again, this code is largely untested, but it should give you more than
enough information to get started. Originally I was just going to
fork your project and submit a pull request with some changes, but I
decided it would probably be better to give an explanation and some
thoughts along the way.
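
As a rough illustration of the two remaining cases (failed and successful
login), the sketch below swaps Mechanize for hand-rolled fakes so that it
runs on its own. Everything in it is an assumption made for illustration:
LoginSketch, FakeAgent, FakeForm, StubPage, and especially the rule that a
response containing the text "Invalid login" means failure. The real
Snipt#login may decide success differently.

```ruby
# Stand-in for a page: a response body plus the forms found on it.
StubPage = Struct.new(:body, :forms)

# Fake form whose #submit returns a canned response page.
class FakeForm
  def initialize(response)
    @response = response
  end

  def submit
    @response
  end
end

# Fake agent that hands back the same page for any URL.
class FakeAgent
  def initialize(page)
    @page = page
  end

  def get(_url)
    @page
  end
end

class LoginSketch
  def initialize(agent)
    @agent = agent
  end

  # Assumed rule: the failed-login response contains "Invalid login".
  def login
    form = @agent.get('http://www.snipt.net/login').forms.first
    !form.submit.body.include?('Invalid login')
  end
end

# Builds a LoginSketch whose form submission returns response_body.
def sketch_with(response_body)
  response = StubPage.new(response_body, [])
  login_page = StubPage.new('<form></form>', [FakeForm.new(response)])
  LoginSketch.new(FakeAgent.new(login_page))
end

puts sketch_with('<p>Invalid login</p>').login
puts sketch_with('<p>Welcome back</p>').login
```

In a real test suite the two response bodies would be read from captured
fixture files rather than written inline.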

Happy Holidays,
Michael G.