On Fri, Aug 21, 2009 at 1:45 PM, Robert Klemme wrote:
- All of your Content will be immediately deleted from the Service upon
cancellation. This information can not be recovered once your account is
cancelled.
IANAL but…
First, as I read this, it seems to apply to termination of paid github
accounts.
Github has both paid and unpaid accounts. They provide free accounts
for open source projects. All git repositories for unpaid accounts are
publicly accessible.
Paid accounts can have both public and private repositories.
Now I don’t know if _why’s account was paid or not, but the
repositories in question were clearly public.
Getting back to those terms of service, I may be wrong, but it seems
clear to me that this is about giving github the right to delete data
from a paid account if and when the account holder wants to stop
paying for the account and cancel it. If you cancel, YOU will lose
access to the data through that account. It’s not necessarily a
promise to delete the data immediately or even ever (see the third
point below), but a warning that deletion is a (likely) consequence of
canceling the account.
Second, here is the license from the hpricot gem (which used to be on
github along with other stuff). I’d suspect that _why used a similar
license for all his stuff:
file COPYING
Copyright © 2006 why the lucky stiff
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the
“Software”), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
So, as is typical of an open source license, anyone who has a copy of
the code is free to “use, copy, modify, merge, publish, distribute,
sublicense, and/or sell copies of the Software, and to permit persons
to whom the Software is furnished to do so” as long as any such copies
retain his copyright AND the license.
The very purpose of this restriction is to ensure eternal availability
of the software no matter what the original author does.
The social contract of open source relies on the promise of perpetuity
of such a license grant.
Third
If you haven’t been using github you might not realize that it
actually consists of networks of repositories. It leverages the
distributed nature of git, and provides a “social network” repository
which has powerfully altered the open-source experience for those who
use it. Git (and github) encourages forking a repository by cloning
it in order to customize someone else’s open source, and in many cases
to contribute back. Instead of submitting a patch to one of the
maintainers and hoping that it gets accepted into the mainline, you can
have your own repo, make changes, and yes, submit patches (or pull
requests) to the maintainer. Even if your contributions aren’t
accepted, you can maintain your changes on a branch and keep it up to
date with the mainline because git makes this much easier than older
SCM systems.
Now one of the features of the git architecture is that, rather than
keeping deltas between versions, it keeps the source code in
‘content-addressable’ files named with a SHA-1 hash of their contents.
This keeps the repository physically small because typically few files
change between any given commit and its parent, so there is a lot of
sharing of physical files between commits. This content
addressability is also what makes exchanging/syncing distributed
versions of git repositories so easy.
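To make the content-addressing idea concrete, here is a minimal Python sketch. It is a toy, not git's implementation: the names `ObjectStore` and `snapshot` are made up for illustration, and real git also has tree and commit objects and packs objects on disk. The one piece taken from git itself is the blob id, which is the SHA-1 of a short `blob <size>\0` header followed by the file contents.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the id git gives a file's contents (a "blob")."""
    # git hashes a small header plus the raw bytes, not the bytes alone.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ObjectStore:
    """Hypothetical toy store: one entry per distinct content."""

    def __init__(self):
        self.objects = {}  # blob id -> bytes

    def put(self, content: bytes) -> str:
        oid = blob_id(content)
        # Identical content maps to the same key, so nothing is duplicated.
        self.objects[oid] = content
        return oid


def snapshot(store, files):
    """Record a commit-like snapshot as a mapping of path -> blob id."""
    return {path: store.put(data) for path, data in files.items()}


store = ObjectStore()
first = snapshot(store, {"README": b"hello\n", "lib/a.rb": b"puts 1\n"})
second = snapshot(store, {"README": b"hello\n", "lib/a.rb": b"puts 2\n"})

# Only lib/a.rb changed, so the two snapshots share the README blob.
shared = [path for path in first if first[path] == second[path]]
print("unchanged paths share a blob:", shared)
print("distinct blobs stored:", len(store.objects))
```

Two snapshots of two files each end up storing three blobs rather than four, and syncing two stores only requires sending the ids the other side does not already have.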
I don’t know this for sure, but it’s quite possible that github
actually shares the files between different forks of a repository
which are on github.
In this case if an account is cancelled which has public repositories
which have been forked, the “data” couldn’t be simply deleted without
killing the forks. Instead, github would need to cut the ties between
the canceled account and the data, but not delete the data itself.
Think of it as a garbage collection problem: you don’t delete data
which has live references from ANY account’s repository.
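Here is a rough Python sketch of that garbage collection view. It rests entirely on my guess that forks share one object store; I have no knowledge of how github actually does it. The function names, account names, and object ids are all hypothetical.

```python
def live_objects(refs_by_account, parents):
    """Return every object reachable from any account's refs."""
    live = set()
    stack = [oid for refs in refs_by_account.values() for oid in refs]
    while stack:
        oid = stack.pop()
        if oid in live:
            continue
        live.add(oid)
        # Follow history: a commit keeps its parents alive.
        stack.extend(parents.get(oid, ()))
    return live


def cancel_account(account, refs_by_account, store, parents):
    """Cut one account's ties, then delete only unreachable objects."""
    refs_by_account.pop(account, None)
    live = live_objects(refs_by_account, parents)
    for oid in list(store):
        if oid not in live:
            del store[oid]


# Hypothetical shared store: c1 <- c2 <- c3, plus p1 branching off c1.
store = {"c1": b"first", "c2": b"second", "c3": b"third", "p1": b"unshared"}
parents = {"c2": ["c1"], "c3": ["c2"], "p1": ["c1"]}
refs_by_account = {
    "original_author": {"c3", "p1"},  # assumed account names
    "forker": {"c2"},
}

cancel_account("original_author", refs_by_account, store, parents)
print(sorted(store))  # the forker's history (c1, c2) survives
```

Only the objects that nobody else references (c3 and p1 here) go away; everything a fork still points at stays.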
And if it’s not the case, it’s clear that github wouldn’t intend the
cancellation clause to mean that OTHER users’ copies of the data in
forked repositories would be deleted out from under them.
I suspect that what really happened is that this might have been the
first time that an account with public repositories (or at least one
with such widespread interest) got cancelled, and exposed some bugs in
the ‘cut the ties’ code, or more likely in the github ui code which
allows you to navigate between forks.
But the bottom line is that preservation of the code is clearly
permitted by the very licenses under which _why originally published
it.
--
Rick DeNatale
Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale