Samizdat 0.6.1: Security, MVC, RSS import

What is Samizdat?

Samizdat is a generic RDF-based engine for building collaboration and
open publishing web sites. Samizdat provides users with means to
cooperate and coordinate on all kinds of activities, including media
activism, resource sharing, education and research, advocacy, and so on.
Samizdat intends to promote values of freedom, openness, equality, and
cooperation.

Samizdat library includes four stand-alone modules that can be used
outside the Samizdat engine: Cache (thread-safe time-limited object
cache with flexible replacement policy), Storage (RDF storage over a
relational database), Sanitize (whitelist XSS filter based on HTMLTidy
and REXML), and Antispam (simple wiki spam filter).

What’s new in Samizdat 0.6.1?

The main goal of the 0.6.x series is to address the shortcomings that
were identified in the IMC CMS Survey in November 2006 [0]. This version
takes care of the most important part: security. New security features
in Samizdat 0.6.1 include CSRF protection, the Antispam module,
per-resource moderation logs, and a moderation requests tracker.

[0] http://samizdat.nongnu.org/doc/CMSSurveyReportSamizdat.html

Samizdat’s internals have changed beyond recognition since the previous
release. The engine code has been refactored into an MVC architecture,
Samizdat Cache now uses a deadlock-proof two-level locking algorithm,
and RDF Storage has undergone a massive overhaul that made it possible
to add support for optional sub-patterns in Squish queries. The
Apache/PostgreSQL combo is no longer the only way to install Samizdat:
the Lighttpd web server and the MySQL and SQLite3 databases are now
supported. The database schema has changed once again; see below on how
to upgrade.

There are also a lot of small features and usability improvements here
and there. The tired “next page” link is replaced with a proper
pagination system, file sizes are displayed next to download links,
replies are sorted by id instead of last edit date, posting a comment to
a multi-page thread redirects to the thread’s last page, translations
don’t appear in the replies list and can’t be replied to, and error
reporting is more detailed and less confusing to users. The user
interface was translated into several more languages, with varying
degrees of completeness.

And the “cherry on top” prize goes to the RSS import module, with
special thanks to Boud, who evangelized this feature for a long time and
created the first implementation.

Changes in more detail:

  • RSS import: each site can configure a list of feeds to be syndicated
    into the front page, feeds are stored in persistent DRb cache and
    updated by samizdat-import-feeds script

  • CSRF protection: cross-site request forgery is a type of web exploit
    that relies on one site fooling user’s browser into submitting to
    another site a request that would apply changes on user’s behalf; to
    prevent that, every form that submits changes to a Samizdat site
    includes a unique ID that is also stored on the server and
    cross-checked when the form is submitted

  • Antispam module: a list of regular expressions is loaded from a
    configured location and stored in persistent DRb cache, messages by
    users with configured access levels (by default, only guests) are
    compared against the list and rejected if a match is found

  • per-resource moderation logs: whenever a resource has been moderated
    (for a message, this includes moderation actions applied to its
    replies), a link to the resource’s moderation log appears in its
    header; the “show hidden” option was removed from the UI as it was
    made obsolete by this feature

  • moderation requests tracker: users with roles that allow posting
    messages are also able to request moderation of a message;
    unacknowledged moderation requests are listed on a tracker page that
    allows moderators to act on requests on the spot

  • new translations: Spanish, German, Japanese (very rough), Chinese (in
    progress)

  • MVC architecture: originally, the Nitro and Rails frameworks were
    reviewed as candidates, but both were discarded: Nitro was missing
    some important features, while Rails required too many code changes
    and wasn’t friendly to “multiple sites per host” setups; instead, a
    tiny Samizdat-specific MVC library was implemented: in just 300
    lines, it provides dispatcher, controller, and view classes (no ORM
    as we already have RDF for that); later, a 100-line DataSet class was
    added to support the new pagination system

  • deadlock-proof Cache: the new two-level locking algorithm has made
    Samizdat Cache re-entrant (it’s now possible to invoke fetch_or_add
    from inside a block passed to an outer fetch_or_add invocation) and
    protected it against deadlocks and livelocks (beware: the algorithm
    relies on RubyForge patch #11680, which as of today is included in
    the Debian package of Ruby 1.8, but not upstream); the replacement
    policy can now be overridden (see CacheEntry#replacement_index); a
    configurable rate limit ensures that at least a given amount of time
    passes between two flushes and prevents a situation where rapid site
    updates don’t give the engine enough time to update the cache

  • optional sub-patterns in Squish: the new OPTIONAL section of a Squish
    query makes it possible to augment the query pattern graph with
    sub-graphs that may or may not match against the site knowledge base;
    per-statement FILTER conditions help to put restrictions on variable
    values closer to where the variables are defined; these changes bring
    Samizdat Squish semantically closer to the W3C-recommended SPARQL RDF
    query language

  • Lighttpd: see doc/examples/lighttpd.conf on how to set up Samizdat
    with Lighttpd in FastCGI mode; due to limitations of Lighty’s rewrite
    capabilities, it’s tricky to make it properly handle static content
    (e.g. site logo); otherwise this setup is well-tested and stable

  • MySQL: now that MySQL 5 supports triggers and transactions, it is
    possible to run Samizdat on MySQL, database generation scripts are
    included in the package (and no, there’s no performance difference
    between PostgreSQL and MySQL); one gotcha when migrating from
    PostgreSQL to MySQL is the latter’s peculiar understanding of Unicode
    string equality: if your database has member logins that only differ
    in case, you will not be able to migrate it to MySQL as it will
    consider it a clash on unique field; to prevent that from happening in
    the future, Samizdat now enforces lowercase login names

  • SQLite3: if you want to play with Samizdat without installing a
    heavy-duty DBMS, or if your hosting only allows you to run scripts
    from your home directory and doesn’t provide database access, you can
    hook Samizdat to SQLite3 and still get a functional site; beware that
    SQLite3 (or at least its Ruby/DBI driver) has a tendency to lock up
    under heavy load, so if you expect lots of traffic, PostgreSQL is
    still the way to go
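
The CSRF protection described in the list above can be sketched in Ruby
roughly as follows. This is not Samizdat’s actual code: the
FormTokenStore class, its method names, and the use of a session id as
the lookup key are all assumptions made for illustration. It only shows
the general idea of issuing a unique ID with each form, keeping a copy
on the server, and cross-checking it on submission.

```ruby
require 'securerandom'

# Hypothetical server-side store of form tokens (not part of Samizdat).
class FormTokenStore
  def initialize
    @tokens = {}   # session id => token issued with the last form
  end

  # Generate a unique ID to embed in a form as a hidden field,
  # and remember it on the server.
  def issue(session_id)
    token = SecureRandom.hex(16)
    @tokens[session_id] = token
    token
  end

  # Cross-check the ID that came back with the submitted form.
  # The stored token is discarded either way, so each token can be
  # checked only once.
  def valid?(session_id, submitted)
    expected = @tokens.delete(session_id)
    !expected.nil? && expected == submitted
  end
end

store = FormTokenStore.new
token = store.issue('session-1')    # rendered into the form
store.valid?('session-1', token)    # genuine submission is accepted
store.valid?('session-1', token)    # replay of the same token is rejected
```

A request forged by another site cannot know the token, so it fails the
check even though the browser sends the user’s cookies along with it.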

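The Antispam behaviour described in the list above (a list of regular
expressions, applied only to messages from configured access levels)
might look something like the sketch below. The SpamFilter class, the
role names, and the case-insensitive matching are assumptions for
illustration, not the API of Samizdat’s Antispam module, and loading
the list from a configured location and caching it in DRb is left out.

```ruby
# Hypothetical filter, not the real Samizdat Antispam module.
class SpamFilter
  # patterns: array of regular expression sources
  # filtered_roles: access levels whose messages are checked
  #                 (assumed default: guests only)
  def initialize(patterns, filtered_roles = ['guest'])
    @patterns = patterns.map { |p| Regexp.new(p, Regexp::IGNORECASE) }
    @filtered_roles = filtered_roles
  end

  # True if a message from a user with this role matches any pattern.
  def spam?(role, text)
    return false unless @filtered_roles.include?(role)
    @patterns.any? { |re| re =~ text }
  end
end

filter = SpamFilter.new(['cheap\s+pills', 'casino'])
filter.spam?('guest', 'Visit our CASINO')    # => true
filter.spam?('member', 'Visit our CASINO')   # => false, role not filtered
```
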
How to upgrade?

The Apache configuration was changed to rewrite everything to the MVC
dispatcher; review the changes in doc/examples/apache.conf and merge
them into the configurations of your sites.

The following SQL commands need to be run by the database user that owns
your tables to bring them up to date with version 0.6.1:

ALTER TABLE Member RENAME COLUMN passwd TO password;
UPDATE Moderation SET action = 'replace' WHERE action = 'displace';
CREATE INDEX Resource_uriref_idx ON Resource (uriref);
CREATE INDEX Resource_published_date_idx ON Resource (published_date);
CREATE INDEX Statement_object_idx ON Statement (object);
CREATE INDEX Vote_proposition_idx ON Vote (proposition);
CREATE INDEX Moderation_resource_idx ON Moderation (resource);

Where to get it?

Project page: http://samizdat.nongnu.org/
Download:
http://savannah.nongnu.org/download/samizdat/samizdat-0.6.1.tar.gz
Debian package: apt-get install samizdat
(Debian Package Tracking System - samizdat)

Are there any plans for factoring out some of these modules into
separate gems, such as the RDF module?

On Wed, Mar 5, 2008 at 12:13 AM, Avdi G. wrote:

Are there any plans for factoring out some of these modules into
separate gems, such as the RDF module?

Maybe, but I don’t see it as a priority: nothing stops you from
installing the whole Samizdat package and then only using the parts that
you need. In the case of the RDF module, all you need is
require 'samizdat/storage'.

That doesn’t mean I oppose the idea of splitting it into smaller bits,
I merely don’t want to do it myself ;-)