Is this good OOP structuring?

derek · April 15, 2010, 2:46am

Hello everyone. I’m trying to get a hang of object-oriented programming
and I have a question about the best practices for my program.

I’m making a program to recommend courses to me for my school. To
simplify things, let’s just say every course has 3 variables (they
actually have a lot more): a name, a day, and a time.

I’ve made the following, let me know if I can improve on it or am making
any conventional errors (note, again I’ve simplified the classes by
omitting irrelevant methods):

class Course
attr_reader :title, :days, :time
def initialize(title, days, time)
@title = title
@days = days
@time = time
end
end

class CourseController
attr_reader :courses_all, :courses_MW, :courses_TR
def initialize(html_file)
# gets the HTML course data
# creates array where each element is a instance of Course
# creates arrays using original array’s select to get all courses on
MW
# and TR, so they can be manipulated separately if desired.
end
end

class Main
def initialize
courses = CourseController.new(“www.schoolcourselisting.com”)
courses.courses_MW each { |i| puts i } # Shows MW courses
end
end

What do you think? Any improvements I should make? CourseController is
doing a lot more work than I show. For example, it parses the HTML for
data, and returns that data.

– Derek

derek · April 16, 2010, 12:25am

On Thu, Apr 15, 2010 at 8:00 AM, Robert K.
[email protected] wrote:

any conventional errors (note, again I’ve simplified the classes by

That’s perfectly OK although you can shorten that by using Struct’s power:

Course = Struct.new :title, :day, :time

But please be aware that the Struct based implementation is a super
implementation of OP’s as it exposes the attributes write-ably too.
Cheers
R.

derek · April 15, 2010, 8:00am

On 15.04.2010 02:46, Derek C. wrote:

class Course
attr_reader :title, :days, :time
def initialize(title, days, time)
@title = title
@days = days
@time = time
end
end

That’s perfectly OK although you can shorten that by using Struct’s
power:

Course = Struct.new :title, :day, :time

If there are more methods you can do

Course = Struct.new :title, :day, :time do
def another_method
end
end

class CourseController
attr_reader :courses_all, :courses_MW, :courses_TR
def initialize(html_file)
# gets the HTML course data
# creates array where each element is a instance of Course
# creates arrays using original array’s select to get all courses on
MW
# and TR, so they can be manipulated separately if desired.

Depending on whether your CourseController is immutable or not: if it is
immutable (i.e. the list of courses does not change) then there is no
harm in separating courses. But if the list can change you need to
maintain consistency between various views on the courses. In that case
I’d start out with a single list and do the selection on demand.

What do you think? Any improvements I should make? CourseController is
doing a lot more work than I show. For example, it parses the HTML for
data, and returns that data.

And apparently it is also downloading the data from a URL, or is it? if
it does then I would probably keep the IO out of the constructor. If
you make the constructor accept the raw text and provide class methods
for convenient access you gain modularity:

class CourseController
def initialize(…) # whatever it needs
end

def self.read_url(url)
…
cc = new(…)
…
cc
end

def self.read_file(file_name)
end
end

Kind regards

robert

derek · April 16, 2010, 6:39am

class CourseController
def initialize(…) # whatever it needs
end

def self.read_url(url)
…
cc = new(…)
…
cc
end

def self.read_file(file_name)
end
end

“def self.name” makes the method static, right? Is there any particular
reason why it is best to do this than have the url part in the
constructor? What exactly is “modularity”? What do you gain from doing
this?

Thanks,
Derek

derek · April 16, 2010, 7:50am

On Thu, Apr 15, 2010 at 11:39 PM, Derek C.
[email protected]wrote:

“def self.name” makes the method static, right?

Yes (in the Java sense of the word).

Is there any particular reason why it is best to do this than have the url
part in the constructor?

Then you can create objects of this class that are not based off of that
url. What if you decide you want to group courses together, under
different
controllers? Maybe the website isn’t updated, and you want to enter them
from the course catalogue instead of the website, or maybe you want to
bring
them in from a second site, or maybe store them locally rather than
querying
the web every time you run the program. This way the class gives you the
ability to create through that website, but doesn’t mandate it, or tie
the
class itself to this particular implementation of how to get it started.

What exactly is “modularity”?

Code that doesn’t know about, or rely on internals of other code. Thus
it
will not break if used differently, or in a different context. It does
not
rely on special knowledge of implementation, or details that may change.
When this happens it is called coupling, when your code is not coupled,
you
can take it and combine it together with many other pieces of code and
it
will operate as desired. For example, if you download the web data into
a
file, you can then use it to pull from the file rather than the page,
because it is modular enough to support a different source.

Think how unix commands are small with a specific function, but you can
combine them in powerful ways. These small programs are modular.

Or relative vs absolute file paths. When they are relative, you can
place
them anywhere, and they still work. Drag and drop that directory, give
it to
friends, etc. If they are absolute, you must go alter that file path
every
time you move it. The absolute file path is coupled to the file’s
location.

What do you gain from doing this?

On a program that will never change, or that is very small, you don’t
really
see the benefits. But when you have to go back later and make a change,
you
want to be able to only make it in the places that logically deal with
that
change. Everything else should still work fine.

If your file is big, and coupled, and you need to make a change, you
will
know, because you will have this scared feeling in your stomache that
you
may have just made a change that could break something completely
different
somewhere else that you can’t even conceive of. Or you may want to make
some
specific change, but then wind up going through every line of code you
have
in order to update all sorts of things all over the place. This
shouldn’t
happen.

derek · April 16, 2010, 8:54am

2010/4/16 Josh C. [email protected]:

On Thu, Apr 15, 2010 at 11:39 PM, Derek C.
[email protected]wrote:

Derek,thanks for stepping in with good explanations!

What exactly is “modularity”?

Think how unix commands are small with a specific function, but you can
combine them in powerful ways. These small programs are modular.

I’d really like to stress this point of yours: for code to be properly
modularized it is essentially important that every single piece of
code (class, method) has a particular purpose and does one thing
well. This one thing can be even as complex as “managing lists of
items” (class Array does that) or “convert to an integer” (method
#to_i does that). But “managing lists of items and send them to
PostScript printers” would not be modular. Class Array would be
bloated with printer control code that most people do no need most of
the time. Plus, you change printing code and suddenly list management
needs to change as well which could have negative side effects on
other code using class Array. Keeping things focused not only helps
reuse but also understanding (if you need to maintain code someone
else wrote you will see what I mean).

Kind regards

robert

derek · April 16, 2010, 9:49am

Robert, I think you were meaning to thank Josh! Anyway, thanks very much
Josh for the great explanations. That answered all my questions! And
Robert, your example really helped – it was a great concrete example.

I guess my only question now is: when is it appropriate to make methods
static? In the example that was given earlier:

class CourseController
def initialize(…) # whatever it needs
end

def self.read_url(url)
…
cc = new(…)
…
cc
end

def self.read_file(file_name)
end
end

I understand how one would use CourseController.read_url(xxx) and
CourseController.read_file(xxx), but I don’t understand why I’d need it.

Without it being static, I could just make references to the instance
variables, in this case, the url – which would be passed through in the
initialize constructor). That would eliminate the need for the parameter
that both static methods carry, right?

Am I missing something about when static methods should be used?

Thanks,
Derek

derek · April 16, 2010, 9:58am

On Fri, Apr 16, 2010 at 9:49 AM, Derek C.
[email protected] wrote:

I understand how one would use CourseController.read_url(xxx) and
CourseController.read_file(xxx), but I don’t understand why I’d need it.

Without it being static, I could just make references to the instance
variables, in this case, the url – which would be passed through in the
initialize constructor). That would eliminate the need for the parameter
that both static methods carry, right?

Am I missing something about when static methods should be used?

In this particular idiom, the class methods (what you are calling
static) are used to construct the object in different ways. You
abstract what you need to construct your object, which is a string,
and provide utility class methods to obtains the string through
different means (url, file, etc). The object is more modular/reusable,
since it’s constructor doens’t depend on a URL, but on the final
string you will parse. If a certain client of that object gathers the
string from a place that you didn’t (or couldn’t) imagine, he can
still use your class, since he can call the constructor directly.

string = get_string_from_some_place
controller = CourseController.new string

This is the effect of CourseController being more modular: it can be
used with other pieces of code, because it has a single, well-defined
responsibility.

Jesus.

derek · April 16, 2010, 1:23pm

2010/4/16 Derek C. [email protected]:

Robert, I think you were meaning to thank Josh!

Oh, yes of course. Thanks for the correction!

Anyway, thanks very much
Josh for the great explanations. That answered all my questions! And
Robert, your example really helped – it was a great concrete example.

Good.

...
Without it being static, I could just make references to the instance
variables, in this case, the url – which would be passed through in the
initialize constructor). That would eliminate the need for the parameter
that both static methods carry, right?

No. The static methods would just be convenience methods for frequent
ways to construct the CourseController. Example

require ‘open-uri’

class CourseController
def initialize(courses_string)
# parse string and create internal structure
end

def self.read_file(file_name)
new(File.read(file_name))
end

def self.read_url(url)
open(url) {|io| new(io.read)}
end
end

You can even modularize it more by adding a method that will do the
parsing and output a complete initialized CourseController.
#initialize then might have an empty argument list or a different
list. Of course I am speculating a bit here since I do not know your
application case’s details.

Kind regards

robert

derek · April 17, 2010, 7:18am

Thanks Jesus and Robert for your responses.

You can even modularize it more by adding a method that will do the
parsing and output a complete initialized CourseController.
#initialize then might have an empty argument list or a different
list. Of course I am speculating a bit here since I do not know your
application case’s details.

To be more specific with you:

In my program, I’m getting data from an HTML file row by row. Each row
has 20 columns, and each of those columns makes up one variable in the
Course class. I’m using an HTML/CSS/XML parser known as “Nokogiri”.

In CourseController, I have 4 methods for parsing the HTML:

def count_rows

Returns total number of rows (by counting how many tags there
are)
end

def get_row(row)

Returns an array of 20 elements, each being a column from the

specified row

This is mainly focused around the code:

@doc.css(“tr:nth-child(#{row}) .dddefault”).each { temp.push(e) }
return temp
end

def collect_courses

Returns an array of arrays of courses. E.g.: [[course1,data,data],

[course2,data,data], etc…]

(Works by using total row count (@row) and the get_row method for

each course)
return all_courses
end

def get_courses

Returns an array of Course instances, where each instance is a

unique course.
all_courses = collect_courses
all_courses.each { |i| temp.push(i) }
returns temp
end

In the CourseController constructor, my code is as follows:

def initialize(url)
@doc = Nokogiri::HTML(open(url))
@row = count_rows
@courses = get_courses
end

How in this case do I improve my code to use the self.read_file and
self.read_url methods? I can’t entirely see how this would work, and
what would change.

And are this many methods acceptable for parsing the HTML or is it
better if I merge them to make one, clean method?

– Derek

derek · April 17, 2010, 1:13pm

On Sat, Apr 17, 2010 at 10:10 AM, Derek C.
[email protected] wrote:

And are this many methods acceptable for parsing the HTML or is it
better if I merge them to make one, clean method?

I guess I’m ultimately asking here: What should be the criteria for
creating something in a new method?

Because in this case, I could really just combine count_rows, get_row,
collect_courses, and get_courses… They’re not being used for any other
purpose but to ultimately serve get_courses.

Given that certain people (YHS included) consider methods of a length
of more than ~10 lines very bad[1] you will re-factor your “one clean
method”. Given that most of these folks consider long classes (and/or
source files) a problem too (for me 300 lines is a pain, I prefer 100)
you will get lots of modules to be mixed in. Sometimes I find that
this kind of decomposition is a joy and leads to a very modular and
readable design, sometimes, I have to admit, code fragmentation
becomes a pain too[2]. I guess there is no answer for “ultimately
asked questions” (forgive the pun please).
HTH
R.

[1] except the long case statement, which I try to avoid for that very
reason
[2] but I am a very bad large systems engineer

derek · April 17, 2010, 10:10am

And are this many methods acceptable for parsing the HTML or is it
better if I merge them to make one, clean method?

I guess I’m ultimately asking here: What should be the criteria for
creating something in a new method?

Because in this case, I could really just combine count_rows, get_row,
collect_courses, and get_courses… They’re not being used for any other
purpose but to ultimately serve get_courses.

derek · April 17, 2010, 2:38pm

Given that certain people (YHS included) consider methods of a length
of more than ~10 lines very bad[1] you will re-factor your “one clean
method”. Given that most of these folks consider long classes (and/or
source files) a problem too (for me 300 lines is a pain, I prefer 100)
you will get lots of modules to be mixed in.

Thanks for the info. Based on what you’ve said on classes, do you think
I should create another class for comparing the classes? For example:
Course (to hold individual course data), CourseController (to hold the
instantiated Courses), and CourseComparer (to provide methods to
determine if courses overlap, courses are compatible, etc). Does this
sound like better OOP?

derek · April 17, 2010, 3:18pm

On Sat, Apr 17, 2010 at 2:38 PM, Derek C.
[email protected] wrote:

determine if courses overlap, courses are compatible, etc). Does this
sound like better OOP?

Posted via http://www.ruby-forum.com/.
IMVHO it sounds like as a Course class should be responsible of
comparing itself, there is no a priori reason for refactoring this
into a different class. But there might be many reasons for doing so
in given contexts.
Maybe you should follow Einstein’s advice: “As simple as possible, but
not simpler” for starters and follow with your code as it evolves,
that’s a little bit what agile is about, right?
An alternative approach is to do implement it both ways and judge for
yourself what is “better” in ‘given contexts’, e.g. what would I need
to do if I abstracted over Courses, what if I wanted to split/unify
certain responsibilities, what if I needed a plugin system, etc. etc.
This all is much work, I agree, but you will probably learn much more
about your system than by any other means. Of course there are
“guidelines” and “philosophies” but the best way to construct a
given application is the best way to construct that given
application and I doubt that it can be known in advance. After all SW
design is an art, not a science.
So I only can humbly hope that you like some of my ideas, but if you
do not, just do not follow them and see what happens, might be as
valuable.

HTH
R.

Is this good OOP structuring?

Returns total number of rows (by counting how many tags there are) end def get_row(row)

Returns an array of 20 elements, each being a column from the

This is mainly focused around the code:

Returns an array of arrays of courses. E.g.: [[course1,data,data],

(Works by using total row count (@row) and the get_row method for

Returns an array of Course instances, where each instance is a

determine if courses overlap, courses are compatible, etc). Does this sound like better OOP?

Returns total number of rows (by counting how many tags there
are)
end

def get_row(row)

determine if courses overlap, courses are compatible, etc). Does this
sound like better OOP?