SPDX (and the glazing of one’s eyes)

dubstep · June 25, 2011, 5:01pm

Never ceases to amaze me how complicated “enterprisey” peoples can
make things.

http://spdx.org/

I start with a very simple critique. Have any of these folks ever
heard of YAML?

[Example] http://spdx.org/system/files/spdxspreadsheetexample.rdf_.txt

tcsnide · June 25, 2011, 5:11pm

On Sat, Jun 25, 2011 at 10:00 AM, Intransition wrote:

Lol, what are you working on, Trans? Why the typecasting and the XML?

tcsnide · June 25, 2011, 5:55pm

On Sat, Jun 25, 2011 at 5:00 PM, Intransition wrote:

Never ceases to amaze me how complicated “enterprisey” peoples can
make things.

http://spdx.org/

I start with a very simple critique. Have any of these folks ever
heard of YAML?

Let’s look at all the OSI approved open source licenses:

That’s over 70 licenses. So, off the top of your head: How many, and
which ones, are compatible with the GPLv2.0 and GPLv3.0 at the same
time?

Which licenses require the contributors to a project to give a patent
grant? Which ones require a copyright grant? Which ones require both?
Are they LGPL compatible?

If you want to see a real monster of a specification, look at Dublin
Core. Then answer: What’s an author? Does your definition encompass
all possible scenarios (TV shows, plays, poetry, scientific works,
novels, ghost-authorship, translations, co-authorship)?

As a rule of thumb: Data formats are complex because the data is
complex.

–
Phillip G.

A method of solution is perfect if we can foresee from the start,
and even prove, that following that method we shall attain our aim.
– Leibnitz

tcsnide · June 25, 2011, 7:06pm

A method of solution is perfect if we can foresee from the start,
and even prove, that following that method we shall attain our aim.
– Leibnitz

Sorry for the OT post, but that kind of sounds like the most
anti-agile quote I have ever seen.

tcsnide · June 25, 2011, 7:24pm

On Sun, Jun 26, 2011 at 02:05:43AM +0900, Robert D. wrote:

A method of solution is perfect if we can foresee from the start, and
even prove, that following that method we shall attain our aim.
– Leibnitz

Sorry for the OT post, but that kind of sounds like the most
anti-agile quote I have ever seen.

. . . unless you take it as meaning “There is no such thing as a perfect
method of solution.”

tcsnide · June 25, 2011, 9:22pm

On Sat, Jun 25, 2011 at 12:05 PM, Robert D. wrote:

A method of solution is perfect if we can foresee from the start,
and even prove, that following that method we shall attain our aim.
– Leibnitz

Sorry for the OT post, but that kind of sounds like the most
anti-agile quote I have ever seen.

As a domain, Software Engineering probably fails the antecedent.

tcsnide · June 25, 2011, 7:50pm

On Sat, Jun 25, 2011 at 7:23 PM, Chad P. wrote:

. . . unless you take it as meaning “There is no such thing as a perfect
method of solution.”

Good point, I guess I am too upset with Russell’s and Hilbert’s way of
thinking (which went as far as refuting Gödel’s theorem) but I really
should get lost here (blush).

tcsnide · June 25, 2011, 9:41pm

On Jun 25, 11:11am, Josh C. wrote:

Lol, what are you working on, Trans? Why the typecasting and the XML?

In this case, how best to detail copyright and license info in my
project’s metadata. I thought about leaving it out altogether and
seeing if something like SPDX offered a viable alternative. (obviously
my conclusion was HELL NO!).

tcsnide · June 25, 2011, 11:53pm

On Sun, Jun 26, 2011 at 04:41:06AM +0900, Intransition wrote:

On Jun 25, 11:11am, Josh C. wrote:

Lol, what are you working on, Trans? Why the typecasting and the XML?

In this case, how best to detail copyright and license info in my
project’s metadata. I thought about leaving it out altogether and
seeing if something like SPDX offered a viable alternative. (obviously
my conclusion was HELL NO!).

Have you considered a LICENSE or COPYING file?

tcsnide · June 25, 2011, 9:37pm

On Jun 25, 11:54am, Phillip G. wrote:

As a rule of thumb: Data formats are complex because the data is complex.

No doubt. It’s not the basis of the whole idea that boggles my mind.
Yes, that is pretty complex by its very nature (though damn, maybe
someone should work on simplifying that too). It’s the way they
implement the final result. I mean, the XML in itself is enough to make
you want to run away waving your arms. But how is anyone supposed to
implement this? The format speaks of scanning each file for copyright
info. That should be a fun trick.

I think a better approach would be a simple dependency code list. They
already have the codes. So you just pick yours, maybe pick them up
automatically from your dependencies list, and feed them into a web
API and it spits out the full text of a license conforming to all of
them, or reports an incompatibility issue and its details.
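
A minimal sketch of the "feed in the codes, get back the conflicts" idea
from the paragraph above, in Python. Everything here is an assumption
made for illustration: the function name check_licenses, the
INCOMPATIBLE table, and its two entries are hypothetical and are not
data or an API published by SPDX. A real service would need a far
larger, legally reviewed table.

```python
from itertools import combinations

# Hypothetical, deliberately tiny table of license pairs treated as
# incompatible. The entries are illustrative assumptions only; they are
# not legal advice and not data published by SPDX.
INCOMPATIBLE = {
    frozenset({"GPL-2.0", "Apache-2.0"}),
    frozenset({"GPL-2.0", "GPL-3.0"}),
}


def check_licenses(codes):
    """Return the (a, b) pairs from `codes` that the table flags."""
    conflicts = []
    # De-duplicate and sort so the report order is stable.
    for a, b in combinations(sorted(set(codes)), 2):
        if frozenset({a, b}) in INCOMPATIBLE:
            conflicts.append((a, b))
    return conflicts


if __name__ == "__main__":
    # License codes as they might be collected from a dependency list.
    deps = ["MIT", "Apache-2.0", "GPL-2.0"]
    for a, b in check_licenses(deps):
        print(f"incompatible: {a} and {b}")
```

The lookup itself is trivial; the hard part is filling and maintaining
the table.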

Maybe I’m missing the point, but I think generating this XML is going to
prove a nightmare, probably won’t work half the time, will waste lots
of disk space, and what the hell is anyone going to do with the XML
once they have it anyway? Really, this looks like nothing more than a
new “hype-tech” to push high $$$ asset management apps (that will
basically just do what the web API I just mentioned will).

tcsnide · June 25, 2011, 9:52pm

On Sat, Jun 25, 2011 at 9:36 PM, Intransition wrote:

I think a better approach would be a simple dependency code list. They
already have the codes. So you just pick yours, maybe pick them up
automatically from your dependencies list, and feed them into a web
API and it spits out the full text of a license conforming to all of
them, or reports an incompatibility issue and its details.

And how do you propose such a dependency list is generated? Remember
that you have to conform to several, very different, national laws
concerning corporate governance and reporting.

Maybe I’m missing the point, but I think generating this XML is going to
prove a nightmare

As much as any sort of XML generation will be a nightmare.

probably won’t work half the time

That’s the implementor’s fault. Skimming the spec, it’s well formed,
and actually a good use of XML as it was intended.

Of course, the trick is generating the proper licensing, but that’s
left as an exercise for the reader: This is definitely a data-exchange
format, not storage.

will waste lots of disk space

Yeah, I really need a way to fill up the remaining 100 GB on my
internal 500GB HD…

Point being: Unless you are stuck on tiny HDs (smaller than your
average SSD these days), disk space is a non-issue. Your average “one
file per class” approach wastes much more disk space than this would
(on NTFS, the default cluster size is 4KB, which means a file smaller
than 4KB still occupies a full cluster and the rest is lost. This is
worse on *NIX filesystems with inodes).
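
The arithmetic behind that slack-space argument can be sketched in a few
lines of Python. This is a simplified model, assuming space is handed
out in whole 4KB units and ignoring any small-file optimisations a real
file system may apply; the names allocated_size and slack are
hypothetical helpers made up for this illustration.

```python
CLUSTER_SIZE = 4096  # bytes; the 4KB figure from the post, assumed here


def allocated_size(file_size, cluster_size=CLUSTER_SIZE):
    """Bytes occupied on disk when space is handed out in whole clusters."""
    # Integer ceiling division: round up to the next full cluster.
    clusters = (file_size + cluster_size - 1) // cluster_size
    return clusters * cluster_size


def slack(file_size, cluster_size=CLUSTER_SIZE):
    """Bytes allocated to the file but not used by its contents."""
    return allocated_size(file_size, cluster_size) - file_size


if __name__ == "__main__":
    for size in (1000, 4096, 4097):
        print(size, allocated_size(size), slack(size))
```

Under this model a 1000-byte source file wastes 3096 bytes, so a tree of
many small files loses far more to rounding than a single larger
metadata file would.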

and what the hell is anyone going to do with the XML
once they have it anyway?

Avoid loads of fines, and / or bills for licensing of supposedly
unlicensed software. The only time you don’t have to deal with
licensing compliance as a corporation is when you are dogfooding. As
soon as you buy something external, you’ll have to manage your
licenses.

Really, this looks like nothing more than a
new “hype-tech” to push high $$$ asset management apps (that will
basically just do what the web API I just mentioned will).

Yeah, those apps actually exist already:
http://www.google.com/search?&q=license+management+software

That’s because there’s a real need to manage the licensing of
workstations and servers. Nobody is going to do this by hand, since
these sorts of things are what we invented computers for to start with.

–
Phillip G.


tcsnide · June 26, 2011, 12:28am

On Jun 25, 5:45pm, Chad P. wrote:

Have you considered a LICENSE or COPYING file?

Yes, of course. But there is no consistent structure to them, though I
certainly try to make mine as consistent as possible. What I started
doing was to have a COPYING or NOTICE file that lists all licenses and
copyrights in brief, with a reference to the full license in a separate
file. If there is only one license, the file is in the project root; if
more than one, they go in a license/ directory. I try to name the
files more or less after the standard codes, e.g. GPL-3.0.txt.
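
The layout described above can be written down as a small Python helper.
This is only a sketch of the convention as stated in the post; the
function name license_paths is hypothetical, and sorting the codes is an
assumption added to keep the output stable.

```python
from pathlib import PurePosixPath


def license_paths(codes):
    """Map license codes to the text-file paths the layout implies."""
    unique = sorted(set(codes))
    if len(unique) == 1:
        # A single license lives in the project root.
        return [PurePosixPath(f"{unique[0]}.txt")]
    # Several licenses are collected in a license/ directory.
    return [PurePosixPath("license") / f"{code}.txt" for code in unique]


if __name__ == "__main__":
    for path in license_paths(["GPL-3.0"]):
        print(path)
    for path in license_paths(["MIT", "GPL-3.0"]):
        print(path)
```

A check like this could run in a build step to confirm that every code
named in COPYING or NOTICE has its full license text where a reader
would expect it.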

tcsnide · June 25, 2011, 11:56pm

On Sun, Jun 26, 2011 at 04:51:55AM +0900, Phillip G. wrote:

Point being: Unless you are stuck on tiny HDs (smaller than your
average SSD these days), disk space is a non-issue. Your average “one
file per class” approach wastes much more disk space than this would
(on NTFS, the default cluster size is 4KB, which means a file smaller
than 4KB still occupies a full cluster and the rest is lost. This is
worse on *NIX filesystems with inodes).

On the other hand, disk space sometimes serves well as a rule of thumb
for determining human readability – where the more space it takes up
relative to the actual data it contains, the harder it is likely to be
for a human to make sense of it in a text file.

tcsnide · June 26, 2011, 1:05am

On Sat, Jun 25, 2011 at 11:52 PM, Chad P. wrote:

On the other hand, disk space sometimes serves well as a rule of thumb
for determining human readability – where the more space it takes up
relative to the actual data it contains, the harder it is likely to be
for a human to make sense of it in a text file.

Not when I’m using a computer for data storage and retrieval. Calling
a data-format “human readable” is a nice piece of marketing, but is
worthless as a feature: Flat text always lacks the ability to
visualize data in an easy manner, something a computer program can
provide with ease (not to mention the side effects of easy search).

That’s why we use diagrams to present statistics, for example: A table
can present numerical data, too, but a chart or diagram is much faster
to digest, no reading required.

Just for laughs, here’s the first record of US Census 2010 data for
the state of Mississippi, delivered as plain text:

DPST,MS,000,01,0000001,2967297,210956,205672,208248,224619,210894,199082,188171,187368,187579,208369,208607,186569,160756,120523,93946,69876,51703,44359,1441240,107465,105042,106606,114145,105433,98610,92461,91364,91664,101271,100994,89321,77111,56200,42228,28913,19366,13046,1526057,103491,100630,101642,110474,105461,100472,95710,96004,95915,107098,107613,97248,83645,64323,51718,40963,32337,31313,36.0,34.5,37.5,2299852,1100634,1199218,2211742,1055477,1156265,2072004,985147,1086857,473309,204243,269066,380407,159753,220654,2967297,2933190,1754684,1098385,15030,25742,5494,4474,3562,807,1537,7025,2843,1187,252,560,135,240,38162,34107,6714,4205,11088,3715,1782807,1115801,25910,32560,2776,44114,2967297,81481,52459,5888,2063,21071,2885816,2967297,81481,32397,4873,1185,265,239,36334,6188,2885816,1722287,1093512,13845,25477,948,1828,27919,2967297,2875333,1115768,506633,880481,620956,238272,119879,21751,134179,10837,5334,63270,91964,55135,40253,14882,36829,18760,18069,1115768,770266,336280,506633,198257,57661,26355,205972,111668,345502,293807,133519,30324,160288,75845,399211,279605,2.58,3.11,1274719,1115768,158951,44735,1920,16886,4915,28867,61628,2.1,11.6,1115768,777073,338695,2017902,857431,2.60,2.53

That’s a nice and concise 1212 bytes, isn’t it?

–
Phillip G.


tcsnide · June 26, 2011, 1:41am

On Jun 25, 7:04pm, Phillip G. wrote:

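Calling a data-format “human readable” is a nice piece of marketing,
but is worthless as a feature: Flat text always lacks the ability to
visualize data in an easy manner, something a computer program can
provide with ease (not to mention the side effects of easy search).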

Haha. Human readable is worthless. I now think you must be arguing for
arguing’s sake.

DPST,MS,000,01,0000001,2967297,210956,205672,208248,224619,210894,199082,188171,187368,187579,208369,208607,186569,160756,120523,93946,69876,51703,44359,1441240,107465,105042,106606,114145,105433,98610,92461,91364,91664,101271,100994,89321,77111,56200,42228,28913,19366,13046,1526057,103491,100630,101642,110474,105461,100472,95710,96004,95915,107098,107613,97248,83645,64323,51718,40963,32337,31313,36.0,34.5,37.5,2299852,1100634,1199218,2211742,1055477,1156265,2072004,985147,1086857,473309,204243,269066,380407,159753,220654,2967297,2933190,1754684,1098385,15030,25742,5494,4474,3562,807,1537,7025,2843,1187,252,560,135,240,38162,34107,6714,4205,11088,3715,1782807,1115801,25910,32560,2776,44114,2967297,81481,52459,5888,2063,21071,2885816,2967297,81481,32397,4873,1185,265,239,36334,6188,2885816,1722287,1093512,13845,25477,948,1828,27919,2967297,2875333,1115768,506633,880481,620956,238272,119879,21751,134179,10837,5334,63270,91964,55135,40253,14882,36829,18760,18069,1115768,770266,336280,506633,198257,57661,26355,205972,111668,345502,293807,133519,30324,160288,75845,399211,279605,2.58,3.11,1274719,1115768,158951,44735,1920,16886,4915,28867,61628,2.1,11.6,1115768,777073,338695,2017902,857431,2.60,2.53

That’s a nice and concise 1212 bytes, isn’t it?

Lose the program that understands this gibberish and it’s worthless.
On the other hand, it doesn’t need a verbose rendering in XML at 100x
the size either. The best formats are Goldilocks.
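
To make the trade-off concrete, here is a small Python sketch that
writes the same three values as bare CSV, as YAML-style "key: value"
lines, and as XML, then reports the size of each. The values come from
the record quoted above, but the field names (file, state,
total_population) and the XML element layout are hypothetical labels
invented for this illustration, not the official Census layout or
anything SPDX defines.

```python
import xml.etree.ElementTree as ET

# Three values from the quoted record. The keys are hypothetical labels
# chosen for this sketch, not the official field names.
record = {"file": "DPST", "state": "MS", "total_population": "2967297"}


def as_csv(rec):
    """Bare values only: compact, but meaningless without the layout."""
    return ",".join(rec.values())


def as_labeled(rec):
    """One 'key: value' pair per line, in the style of a YAML mapping."""
    return "\n".join(f"{key}: {value}" for key, value in rec.items())


def as_xml(rec):
    """One <field name="..."> element per value, an assumed layout."""
    root = ET.Element("record")
    for key, value in rec.items():
        ET.SubElement(root, "field", name=key).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for label, text in (("csv", as_csv(record)),
                        ("labeled", as_labeled(record)),
                        ("xml", as_xml(record))):
        print(label, len(text.encode("utf-8")), "bytes")
```

The labeled form costs more bytes than the CSV but carries its own
field names, and it stays well short of the XML. That middle ground is
one reading of "Goldilocks" here.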

tcsnide · June 27, 2011, 7:22am

On Jun 25, 6:27pm, Intransition wrote:

I try to name the files more or less after the standard codes, e.g.
GPL-3.0.txt.

I should note that it was Phillip who basically suggested this
design. Thank you Mr. Gawlowski!

tcsnide · June 27, 2011, 8:21am

On Mon, Jun 27, 2011 at 7:21 AM, Intransition wrote:

I should note that it was Phillip who basically suggested this
design. Thank you Mr. Gawlowski!

You are welcome. However, it’s merely the distillation of what I’ve
seen in the wild, and what makes the most sense to me. That it works
for you: Great!

–
Phillip G.
