Hard drive backup program in 100% pure ruby

Just figured I’d mention that I’m working on this:

http://www.subspacefield.org/security/hdb/

I think that I’ll run into needing some features from the File or
FileUtils classes (such as mapping from filename to inode numbers) that
likely won’t be present, or won’t be implemented on certain file
systems, OSes, and so on, so I’ll end up being a contributor.

This is my first ruby project, however, and I’m still learning.

By the way, if you know anyone (e.g. a student) who could benefit from
a case study in applying OOP/OOD to a real project, I’m writing it up
here:

http://www.subspacefield.org/~travis/hdb_history.html

My background is python, and here were my reactions to ruby:

  1. Assignment is weird. Both names point to the same object,
    unless it’s a “small” type (fits in a register).

    a = 5
    b = a
    b = 3
    puts a # prints 5

    a = "hello"
    b = a
    b['hell'] = 't'
    puts a # prints to

    That really confused me for a bit. Fix is easy - always use dup in
    your initializers. It reminds me of the pass-by-value
    vs. pass-by-reference semantics.

  2. I originally wrote some fancy set operations (like Set class but
    not using a hash) using array operators like -=. Problem is that
    these operators remove the same object from an array, but not an
    equivalent (eql?) object. So when I compute metadata entries for
    a file based on the file, and then based on saved metadata, the
    same file with the same metadata is not the same object, so the -=
    operator won’t actually remove it from the array.

    I’m considering the work it would require to implement -= using eql?
    but it seems complicated, so I’m thinking of switching to using Set.
    Downside of this is memory overhead for hashing and the random access
    I won’t use much, also the fact that I build up these sets of
    metadata incrementally and so I’d be rehashing all the time. Building
    an array and then freezing it into a set seems possible, but
    cumbersome.

  3. What’s the syntax for class method constructors? I.E. I want four
    different ways to create metadata entries, some of which have
    overlapping signatures, so I can’t use a single initializer method.
    (Yes, I know initialize is not a constructor).

    Currently I do this by having an empty initializer and doing
    something like:
    a = Whatever.new().create_from_string(string_args)

In general I like ruby - it’s a very terse language but not overly
cryptic. The documentation could be improved, but that’s another list.

Also, whatever MLM you’re using (FML?) doesn’t recognize valid
RFC [2]822 addresses.

[email protected] IS a valid address, contrary to
what it believes. This seems to be the only MLM I’ve encountered that
doesn’t do correct parsing/identification of email addresses (web apps
are a different story… I wish people would RTFM…)

On Wednesday, July 28, 2010 01:06:56 pm
[email protected]
wrote:

Just figured I’d mention that I’m working on this:

Hard Drive Backups With HDB

Looks interesting…

I think that I’ll run into needing some features from the File or
FileUtils classes (such as mapping from filename to inode numbers)

Really? Check out File#stat.
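
For what it's worth, a minimal sketch of the filename-to-inode lookup. The temporary file and the helper name inode_for are only there to make the snippet self-contained, and the number is only meaningful on file systems that really have inodes (on some platforms it may just come back as 0):

```ruby
require 'tempfile'

# Look up the inode number for a path. File.stat follows symlinks;
# File.lstat would report on the link itself instead.
def inode_for(path)
  File.stat(path).ino
end

Tempfile.create('hdb_demo') do |f|
  by_name   = inode_for(f.path)
  by_handle = f.stat.ino   # File#stat on an already-open file
  puts "#{f.path} -> inode #{by_name}"
  puts "same inode via open handle? #{by_name == by_handle}"
end
```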

My background is python, and here were my reactions to ruby:

  1. Assignment is weird. Both names point to the same object,
    unless it’s a “small” type (fits in a register).

I don’t think so. For example:

irb(main):001:0> 5.instance_variable_set :@foo, 'five'
=> "five"
irb(main):002:0> 3.instance_variable_get :@foo
=> nil
irb(main):003:0> 5.instance_variable_get :@foo
=> "five"
irb(main):004:0>

In other words, the “small” types point to the same object. As an
implementation detail, Ruby likely does store them as plain old numbers
in a register, but you don’t have to care about that.
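
A short sketch of the distinction, in case it helps: assignment only rebinds a name, while a method like []= changes the object that both names refer to. Integers simply have no mutating methods, so the sharing never becomes visible.

```ruby
a = 5
b = a
puts a.equal?(b)        # true: both names refer to the same object
b = 3                   # rebinds b only; a is untouched
puts a                  # 5

s = String.new("hello") # a fresh, mutable string
t = s
puts s.equal?(t)        # true: one object, two names
t['hell'] = 't'         # mutates that one object in place
puts s                  # to

t = "other"             # rebinding t, by contrast, leaves s alone
puts s                  # to
```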

That really confused me for a bit. Fix is easy - always use dup in
your initializers.

What? Why?

I mean, I guess – there’s certainly precedent for it. But most of the
time, when I’m passing something to an initializer, I’m not keeping a
reference around – or if I am, it’s because I actually want the object
to be shared.
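
To make the trade-off concrete, here is a small sketch with two made-up classes (SharedEntry and CopiedEntry are hypothetical names, not anything from HDB): one keeps the caller's object, the other takes a defensive copy.

```ruby
class SharedEntry
  attr_reader :path
  def initialize(path)
    @path = path          # keeps a reference to the caller's object
  end
end

class CopiedEntry
  attr_reader :path
  def initialize(path)
    @path = path.dup      # private shallow copy
  end
end

name   = String.new("/etc/passwd")
shared = SharedEntry.new(name)
copied = CopiedEntry.new(name)

name << ".bak"            # the caller mutates its string afterwards

puts shared.path          # /etc/passwd.bak
puts copied.path          # /etc/passwd
```

Note that dup is shallow: an array copied this way still shares its elements with the original.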

  2. I originally wrote some fancy set operations (like Set class but
    not using a hash) using array operators like -=.

Careful. Remember:

a -= b

always expands to:

a = a - b

In other words, you’re duping your entire array every single time you do
that. Is that what you want? You could always do array.delete_if (or
array.reject, if you want the duping behavior) – those take blocks which
let you set your own comparison.
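
A rough sketch of the block-based approach, assuming a hypothetical MetaRecord class and a same_meta? helper that compares whichever fields matter:

```ruby
# Hypothetical record class that does not define == or eql?,
# so two records with the same fields are still different objects.
class MetaRecord
  attr_reader :path, :size
  def initialize(path, size)
    @path = path
    @size = size
  end
end

# Field-by-field comparison supplied by us, not by the class.
def same_meta?(x, y)
  x.path == y.path && x.size == y.size
end

on_disk = [MetaRecord.new("a.txt", 10), MetaRecord.new("b.txt", 20)]
saved   = [MetaRecord.new("a.txt", 10)]

# reject returns a new array and leaves on_disk alone (like a - b).
changed = on_disk.reject { |m| saved.any? { |s| same_meta?(m, s) } }
puts changed.map(&:path).inspect   # ["b.txt"]

# delete_if modifies the receiver in place, with no copy of the array.
on_disk.delete_if { |m| saved.any? { |s| same_meta?(m, s) } }
puts on_disk.map(&:path).inspect   # ["b.txt"]
```

This is a nested scan, so it costs roughly (size of one array) x (size of the other); fine for small lists, slow for a whole disk's worth of entries.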

I don’t know offhand how any of the builtin array operations compare
objects, or what you need to override. There are at least four builtin
equivalence operators, which all do slightly different things:

a == b
a === b
a.eql?(b)
a.equal?(b)

And I don’t know offhand what each of those does, but you should look
them up before overriding.
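
For reference, a rough sketch of how the four differ, and of what Array#- looks at. As far as I know, Array#- (like Set and Hash keys) compares elements using hash and eql?, so a class that defines both gets value-based removal. FileMeta is a made-up class for illustration only.

```ruby
puts 1 == 1.0          # true:  value equality, with numeric conversion
puts 1.eql?(1.0)       # false: eql? is stricter (no Integer/Float mixing)
puts Integer === 1     # true:  case equality, here "is 1 an Integer?"
s = String.new("abc")
puts s == s.dup        # true:  same contents
puts s.equal?(s.dup)   # false: equal? is object identity

class FileMeta
  attr_reader :path, :size
  def initialize(path, size)
    @path = path
    @size = size
  end

  def ==(other)
    other.is_a?(FileMeta) && path == other.path && size == other.size
  end
  alias eql? ==

  # Objects that are eql? must return the same hash value.
  def hash
    [path, size].hash
  end
end

from_disk  = [FileMeta.new("a.txt", 10), FileMeta.new("b.txt", 20)]
from_saved = [FileMeta.new("a.txt", 10)]

from_disk -= from_saved   # builds a new array, as noted above
puts from_disk.map(&:path).inspect   # ["b.txt"]
```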

  3. What’s the syntax for class method constructors?

…what?

I.E. I want four
different ways to create metadata entries, some of which have overlapping
signatures, so I can’t use a single initializer method.

Are you sure there’s no way your initializer can tell what it’s dealing
with? For example, say you ultimately expect a string:

def initialize anything
  @foo = anything.to_s
end

And while I don’t ordinarily encourage type checking, I think it’s OK if
you’re dealing with option arguments:

def initialize options
  if options.kind_of? Hash
    @a = options[:a]
    @b = options[:b]
  else
    @a = options.to_s
  end
end

Currently I do this by having an empty initializer and doing something
like: a = Whatever.new().create_from_string(string_args)

By the way: The parens are optional in Ruby. Even in the DataMapper
code, where they’re encouraged, I don’t think I ever see the empty
parens like that.

That said, there’s nothing stopping you from doing this:

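class Whatever
  class << self
    # Factory method: build an empty object, then let it fill itself in.
    def create_from_string(string_args)
      new.tap { |obj| obj.create_from_string(string_args) }
    end
  end

  # Instance method with the same name as the factory above.
  def create_from_string(string_args)
    # ... populate the object from string_args ...
  end
end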

That, or just do your initialization inside the body of
‘create_from_string’ – or accept a single standardized format (like an
options hash) in the initialize method, and just convert the specialized
argument (like the argument to create_from_string) into something more
generic.

Remember, there’s no special “new” syntax, so there’s absolutely nothing
stopping you from making as many factory methods as you like. (Perl is
the same way, and I’ve always wondered why more languages aren’t.)
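
As a sketch of that last variant: one generic initialize that takes an options hash, plus class-level factory methods that convert their own argument into that hash. MetadataEntry, its fields, and the "path:size" line format are all invented for illustration.

```ruby
class MetadataEntry
  attr_reader :path, :size

  # The one generic initializer: takes an options hash.
  def initialize(options)
    @path = options[:path]
    @size = options[:size]
  end

  # Factory: parse a "path:size" line (format assumed for illustration).
  def self.from_string(line)
    path, size = line.split(":", 2)
    new(:path => path, :size => Integer(size))
  end

  # Factory: take the size from the file system instead.
  def self.from_file(path)
    new(:path => path, :size => File.size(path))
  end
end

e = MetadataEntry.from_string("notes.txt:42")
puts e.path   # notes.txt
puts e.size   # 42
```

Each factory has its own name, so overlapping argument signatures stop being a problem: the caller says which one it means.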

There’s a limit to how much advice I can give you, though, without a
little more detail about what you’re trying to do. For example, what is
it that you’re trying to put into set-like behavior, and under what
circumstances? What kind of object are you trying to create_from_string,
and what are the possible ways you want of initializing it?