Best Practices: Escaping text on input or output?

olbrich · January 31, 2006, 4:10pm

In web applications that have user generated content, it is clearly
necessary to provide some ability to ‘escape’ user generated text to
avoid SQL injection, XSS, and other nasty attacks. The existing dogma
on this point seems to favor escaping text as it comes out of the
database, rather than doing it on the way in.

I’m not sure that I understand the logic behind escaping text as it
comes out of the database. This approach seems to put more load on the
processor (as an escape call is generated every time data is served),
and it seems to be more prone to detrimental errors. This is a ‘default
insecure’ system that relies on the programmer to remember to escape
fields in all their views. Failure to escape an attribute leaves you
vulnerable to malicious users and provides no warning that this is the
case.

If the system escapes all text going into a database, then the
responsibility for security falls to one or two well defined functions
in the code (the ones that actually insert or update records), instead
of a multitude of views. This would be a ‘default secure’ system.
Typically there are more places that data is displayed than where it is
entered. Of course eventually one will want to display html from the
database, but this can be accommodated by unescaping the text as needed.
Using this approach, the programmer is required to explicitly over-ride
the defaults to get at the ‘unsafe’ data. Failure to properly unescape
code when needed will result in broken functionality, but no security
risk.

Am I missing something here?
So, for all you web application development professionals out
there…why escape text on output and not on input?

_Kevin

olbrich · January 31, 2006, 4:33pm

So what kind of escaping do you suggest? SQL, HTML, Javascript,
everything combined? PHP’s magic_quotes is a good example that automatic
escaping can create a lot of headache and corrupted data.

Escaping depends on the output context - maybe you’ll just need pure
text?
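
To make the "depends on the output context" point concrete, here is a minimal Ruby sketch that takes one raw string and escapes it three different ways. The helper name escape_for_contexts is made up for illustration; the calls it wraps are from Ruby's standard library (erb, uri, json).

```ruby
require "erb"
require "uri"
require "json"

# Hypothetical helper: the same raw text, escaped for three output contexts.
def escape_for_contexts(raw)
  {
    html: ERB::Util.html_escape(raw),          # text inside an HTML element
    url:  URI.encode_www_form_component(raw),  # value inside a query string
    json: raw.to_json                          # string literal in a JSON document
  }
end

raw = 'Tom & "Jerry" <b>'
escape_for_contexts(raw).each do |context, value|
  puts "#{context}: #{value}"
end
```

Each context produces a different string, so no single escaping applied at input time can be right for all of them.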

olbrich · January 31, 2006, 4:41pm

More than one application might need to use the data. If you booger
it up on input it might be unusable for the other applications.
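
A rough sketch of that concern, assuming the "escape on input" step is an HTML escape applied before saving. The store_escaped name is hypothetical and just stands in for whatever code writes the record.

```ruby
require "erb"

# Hypothetical "escape on input" save step: HTML-escape before storing.
def store_escaped(raw)
  ERB::Util.html_escape(raw)
end

stored = store_escaped("AT&T <billing>")

# An HTML view can print the stored value as-is...
puts "<td>#{stored}</td>"

# ...but a plain-text consumer (an email, a CSV export, another app)
# now sees entity codes unless it knows to undo the escaping.
puts "Subject: Invoice for #{stored}"
```

The second line prints the entity-encoded text rather than what the user typed, which is the kind of breakage other applications would run into.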

On 1/31/06, Guest [email protected] wrote:

Rails mailing list
[email protected]
http://lists.rubyonrails.org/mailman/listinfo/rails

–
"Her faults were those of her race and sex; her virtues were her own.
Farewell, and if for ever - "

– “Travels with a Donkey in the Cevennes” by Robert Louis Stevenson

olbrich · January 31, 2006, 4:49pm

Guest wrote:

So what kind of escaping do you suggest? SQL, HTML, Javascript,
everything combined? PHP’s magic_quotes is a good example that automatic
escaping can create a lot of headache and corrupted data.

Escaping depends on the output context - maybe you’ll just need pure
text?

Not being that familiar with magic_quotes, could you describe how it
caused problems?

FYI, for simplicity I’m looking at essentially using the equivalent of
h() on input vs. output. I think that just escapes the angle brackets
and ampersand. Escaping quotes may also be necessary.
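
A small sketch of why quotes matter, using Ruby's standard-library ERB::Util.html_escape (h is an alias for it there). The escape_brackets_and_amp helper is hypothetical and exists only to show what happens if just the angle brackets and ampersand are handled.

```ruby
require "erb"

# Naive variant that handles only angle brackets and the ampersand.
# Hypothetical name; shown only for comparison.
def escape_brackets_and_amp(text)
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
end

# Attacker-supplied value that contains no angle brackets at all.
value = '" onmouseover="alert(1)'

# The naive version lets the value close the attribute and add a new one.
puts %(<input value="#{escape_brackets_and_amp(value)}">)

# html_escape also turns double quotes into &quot;, so the value stays inert.
puts %(<input value="#{ERB::Util.html_escape(value)}">)
```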

And as for other applications using the data, that seems like a problem
that can be handled by proper coding and doesn’t justify compromising
the security of your application for convenience.

_Kevin

olbrich · January 31, 2006, 6:10pm

And as for other applications using the data, that seems like a problem
that can be handled by proper coding and doesn’t justify compromising
the security of your application for convenience.

Ditto to your proposal. Applications are responsible for ensuring
validity of input from untrusted sources. This is a tree that has
been barked up before.

_Kevin

–
Posted via http://www.ruby-forum.com/.

olbrich · January 31, 2006, 9:10pm

Kevin O. wrote:

Am I missing something here?
So, for all you web application development professionals out
there…why escape text on output and not on input?

Because Rails’ standard methods handle input escaping automatically. As
for output, you only need to worry about values that could contain
malicious code (like XSS), so use h() and/or sanitize().

Joe

olbrich · January 31, 2006, 6:34pm

On 1/02/2006, at 4:49 AM, Kevin O. wrote:

Not being that familiar with magic_quotes, could you describe how it
caused problems?

I’m a full time PHP developer at the moment (ugh), so I have a fair
idea of how it works.

magic_quotes_gpc works by inspecting all the input values (Get/Post/
Cookie) and escaping all quotes and backslashes with a backslash,
though single quotes can be escaped with '' if you set an ini option.

Sounds great, but it doesn’t unescape, so you have to do that
manually on every data retrieval, otherwise it escapes the escape on
postback. I’ve seen PHP sites that haven\\\\\\\\'t unescaped
the text many times.
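
The compounding effect is easy to reproduce. This is a toy Ruby re-implementation of the escaping step described above, not PHP's actual code; add_slashes and round_trips are made-up names.

```ruby
# Toy version of the escaping step: put a backslash in front of every
# single quote, double quote, and backslash.
def add_slashes(text)
  text.gsub(/['"\\]/) { |char| "\\" + char }
end

# Simulate a form value that is escaped on each postback and never unescaped.
def round_trips(text, times)
  times.times.reduce(text) { |current, _| add_slashes(current) }
end

(0..3).each do |n|
  puts "after #{n} postbacks: #{round_trips("haven't", n)}"
end
```

Every pass escapes the backslashes added by the previous pass, so the count roughly doubles each time the form is resubmitted.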

magic_quotes_runtime also escapes programmatically generated text in a
dumb way, so you can imagine what it does to "can't " + "work".

I think it’s always best to explicitly escape, then you know that
you’ve done it.

–
Phillip H.
[email protected]
http://www.sitharus.com/

olbrich · February 1, 2006, 1:34am

On 1/31/06, Ben M. [email protected] wrote:

“integration” dbs – it is extremely likely that someday the powers that be will want
another app to talk to the same db.

Multiple apps accessing the same db is why you put your business logic
and validation rules into stored procedures. Then set the privs on the
tables so that no one can get to them, thus forcing everyone to use
the stored procedures.

If you try to reimplement the business logic in every app that touches
the db you violate DRY. And for sure, somewhere the two
implementations won’t be identical.

I’d like to see better integration of stored procedures into RoR.

–
Jon S.
[email protected]

olbrich · February 1, 2006, 1:19am

I have been gradually working my way around to the opinion that the
application architecture as layer cake is a flawed model. Rather, my new
paradigm states that the application code – the model – is the defensive
core and all approaches must be thoroughly guarded, including data from
the db.

Historically this is a reasonable position simply because there’s always
someone with command-line access to the db. I have seen plenty of
situations where bad data caused an error… and the client didn’t give a
damn that it was his data that caused the error. More recently, web
services have created another route into the app that avoids the UI
layer. And – despite DHH’s claim to using “application” dbs over
“integration” dbs – it is extremely likely that someday the powers that
be will want another app to talk to the same db.

If you simply write your model with the assumption that any data that is
handed to it is suspect, then you’re way ahead of the game. It’s a bit
more work but it makes for a more robust, long-lived system.

This is somewhat tangential to the original point, but its applicability
is this: just put your data into the database as plain as possible and
do all necessary checking, validating and converting in the application:
validation in the model and converting in the view, since each view (and
I use that term generically: a “view” could be an html page, a pdf, a
word doc, an email, an xml file, another app’s input, etc.) knows best
how it wants to see the data.

Of course, I haven’t yet applied this as rigorously as I would like…
all in good time.

b

olbrich · February 1, 2006, 1:49pm

Phillip H. wrote:

I think it’s always best to explicitly escape, then you know that you’ve
done it.

+1 to that. Moreover, if you get used to magic_quotes, you’re more
likely to make the assumption that they apply when you’re developing
something to be deployed where they’re actually turned off. Can be
nasty.

olbrich · February 1, 2006, 3:11pm

Adam D. wrote:

I think this is something that someone should add to the BestPractices
page on the rails wiki.

Not sure how relevant that would be, given that it’s a PHP tip.

–
Alex

olbrich · February 1, 2006, 2:44pm

I think this is something that someone should add to the BestPractices
page on the rails wiki.

http://wiki.rubyonrails.org/rails/pages/RailsBestPractices

adam

olbrich · February 1, 2006, 6:45pm

Alex Y. wrote:

Adam D. wrote:

I think this is something that someone should add to the BestPractices
page on the rails wiki.
Not sure how relevant that would be, given that it’s a PHP tip

–
Alex

This discussion is more general than PHP, and is intended to be a
discussion of when and how is the best way to escape potentially
dangerous text.

To summarize so far:

escaping on input:
    improves performance slightly
    requires you to trust all your input routines
    may confuse some applications that utilize data
escaping on output:
    slight performance hit
    requires you to trust all output routines
default escaped:
    slight performance hit unless data was escaped on entry
    failure to unescape when needed just breaks functionality and not
    security
default unescaped:
    failure to escape properly results in security hole

Right now the rails default is unescaped data with default unescaped
output.
This is not a problem if programmers manually escape stuff as necessary.

It is also not a problem if programmers write tests that explicitly
check for proper escaping of text fields.
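
A minimal sketch of such a check, written in plain Ruby with the standard-library ERB rather than the Rails test helpers. render_comment and assert_escaped are hypothetical names standing in for a real view and a real test assertion.

```ruby
require "erb"

# Hypothetical view: renders a user comment in a paragraph, escaping on output.
def render_comment(comment)
  template = ERB.new("<p><%= ERB::Util.html_escape(comment) %></p>")
  template.result_with_hash(comment: comment)
end

# Minimal check in the spirit of a view test: hostile input must not
# survive as live markup in the rendered page.
def assert_escaped(html, hostile)
  raise "unescaped markup in output: #{html}" if html.include?(hostile)
end

hostile = "<script>alert(1)</script>"
html = render_comment(hostile)
assert_escaped(html, hostile)
puts html
```

If someone later drops the escape call from the template, the check fails instead of the hole going unnoticed.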

_Kevin

olbrich · February 1, 2006, 7:13pm

On 2/1/06, Jon S. [email protected] wrote:

avoids the UI layer. And – despite DHH’s claim to using “application” dbs over
implementations won’t be identical.

I’d like to see better integration of stored procedures into RoR.

Hi! In my world, things are not all The Rails Way. I am considering
migrating a huge system that has plenty of functions (stored
procedures) that are used for data access.

How in the world do you use them in Rails? I thought I could simply
Repeat Myself by re-writing the logic in ruby code, but I would die of
old age before I finished. I would be grateful for any pointers in
how to call a function with different arguments in place of the
default insert/update/deletes for object
creation/modification/destruction and catch database errors, returning
the painstakingly created custom error messages to the .errors of the
offending object.

Thanks in advance…

Ian

olbrich · February 1, 2006, 6:49pm

Jon S. wrote:

Multiple apps accessing the same db is why you put your business logic
and validation rules into stored procedures. Then set the privs on the
tables so that no one can get to them, thus forcing everyone to use
the stored procedures.

If you try to reimplement the business logic in every app that touches
the db you violate DRY. And for sure, somewhere the two
implementations won’t be identical.

I’d like to see better integration of stored procedures into RoR.

That’s a database centric view of the world. Rails doesn’t work like
that. In Rails the database is little more than a filesystem with a
fancy search engine. If ReiserFS ever fulfills its promise, or the
Longhorn filesystem ever sees the light of day, then I suspect the
database will be ejected in favour of more modern technology.

In Rails the constraints and structure are in your model,
object-oriented, in Ruby. If you have multiple apps then they should all
be using the same lower layer to the model from a single set of Ruby
scripts.

In RoR all access to the database goes via the model.

If you have multiple languages accessing the bottom end of a database,
then you have an integration database and lots of other problems.

olbrich · February 1, 2006, 7:22pm

Kevin O. wrote:

Alex Y. wrote:

Adam D. wrote:

I think this is something that someone should add to the BestPractices
page on the rails wiki.
Not sure how relevant that would be, given that it’s a PHP tip

This discussion is more general than PHP, and is intended to be a
discussion of when and how is the best way to escape potentially
dangerous text.

Oh, I know. Magic_quotes is PHP-specific (which was the context of my
reply), but the concept behind it isn’t.

slight performance hit unless data was escaped on entry
failure to unescape when needed just breaks functionality and not
security

default unescaped
failure to escape properly results in security hole

That security hole only exists on input. Unless cross-site is an issue,
of course, but that really depends on what you classify as a security
hole.

Right now the rails default is unescaped data with default unescaped
output.

Not really - it’s only unescaped on input if you assume that all the
data going in is HTML. It’s escaped as SQL very well, which is correct.
The data is appropriately escaped on translation to its new context.
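
A rough illustration of "escaped on translation to its new context", in plain Ruby. The sql_string_literal helper is a toy stand-in for what a database adapter does (it just doubles single quotes, the standard SQL rule) and is an assumption for illustration; real code should rely on the adapter's own quoting or bind parameters. The table and column names are made up.

```ruby
require "erb"

# Toy stand-in for an adapter's string quoting. Hypothetical helper,
# for illustration only.
def sql_string_literal(raw)
  "'" + raw.gsub("'", "''") + "'"
end

raw = "O'Reilly <admin>"   # what the user typed, and what gets stored

# Entering the SQL context: escape as SQL.
puts "INSERT INTO authors (name) VALUES (#{sql_string_literal(raw)})"

# Entering the HTML context: escape as HTML.
puts "<li>#{ERB::Util.html_escape(raw)}</li>"
```

The stored value stays exactly as typed; each escaping is applied only at the boundary where the text changes context.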

This is not a problem if programmers manually escape stuff as necessary.

The alternative is to keep enough metadata kicking around to fully
automate all of the escaping, both in and out - but you’d have to
guarantee that it happened without fail…

olbrich · February 1, 2006, 7:25pm

On 2/1/06, Neil W. [email protected] wrote:

be using the same lower layer to the model from a single set of Ruby
scripts.

You are making a big assumption that the other applications are
written in Ruby. In practice they are written in all different
languages and sometimes they are proprietary without source available.
Plus there is always someone that uses the command line db tools or
something like phpadmin to play with the data thinking they know how
it is formatted.

There is also the issue of security. The db centric model can keep
people out of things they shouldn’t be into like salary data. In the
RoR world you can’t give someone a function for generating an average
salary for a group without also giving them access to all of the
individual salaries. With sp’s you can give them a function for the
average and lock down the detail.

The database world started out like RoR. After a while they got smart
and added stored procedures.

In general the danger is not from users of the database app; it is from
other programmers trying to access the data. If your RoR app becomes
successful it will collect a lot of data and people will want to
access it in different ways.

–
Jon S.
[email protected]

olbrich · February 1, 2006, 7:37pm

Jon S. wrote:

You are making a big assumption that the other applications are
written in Ruby.

Not really. In short, that’s what web services are for. If you want to
expose Rails’ data to another app, force it over port 80.

The database world started out like RoR. After a while they got smart
and added stored procedures.

In a sense, Rails sits at the same level as stored procedures. It’s
another way of forcing access through a set of hoops, with the added
bonus that we get to write the validation, constraints, and other
functionality, in Ruby. Big win, as far as I’m concerned.

Rails does what it does well. There’s no sense trying to shoe-horn it
into places it wasn’t designed to go, and then crying that it doesn’t
fit. SP-heavy legacy databases spring to mind as the perfect example.

olbrich · February 1, 2006, 7:43pm

Ian H. wrote:

Hi! In my world, things are not all The Rails Way. I am considering
migrating a huge system that has plenty of functions (stored
procedures) that are used for data access.

How in the world do you use them in Rails?

ActiveRecord::Base.connection.execute is your one and only friend in
those circumstances (except, maybe, find_by_sql). This is one of those
places that Rails wasn’t designed to go.

olbrich · February 1, 2006, 7:52pm

On 2/1/06, Alex Y. [email protected] wrote:

Ian H. wrote:

Hi! In my world, things are not all The Rails Way. I am considering
migrating a huge system that has plenty of functions (stored
procedures) that are used for data access.

How in the world do you use them in Rails?
ActiveRecord::Base.connection.execute is your one and only friend in
those circumstances (except, maybe, find_by_sql). This is one of those
places that Rails wasn’t designed to go.

Thanks! I will go bark up that tree, but after 2 weeks I am leaning
away from Rails for this project. I wish I had read this first:

http://www.javaranch.com/journal/200601/rails.html

This should be linked to from www.rubyonrails.org under “Get Excited.”
Oh well, it fits my hobby projects well.