Ruby, Web Apps and Cross Site Scripting

Something I’ve seen in the blogosphere lately has been the
vulnerability of web applications to cross site scripting (XSS)
attacks. I think it was LiveJournal I was reading about which got hit
with a nasty one that compromised quite a few accounts.

This got me thinking about how to protect against this kind of stuff in
an automatic way. Because I’ve done all of my ruby web apps using
Rails, I thought about modifing view templates so they always
HTML-escape “dangerous” content. ERB templates as rhtml files is the
most common way for Rails to generate content, so that was the first
place I looked.

In erb.rb, the Buffer#compile method actually builds the template which
will get evaluated. I modified the source where it outputs code when a
close tag (i.e. “%>”) is found. The code went from (this starts on line
549 for 1.8.3):

case token
when ‘%>’
case scanner.stag
when ‘<%’

snip

when ‘<%=’
out.push("#{@put_cmd}((#{content}).to_s)")

snip

To:

snip

when ‘<%=’
out.push("#{@put_cmd}((#{content}).tainted? ?
html_escape((#{content}).to_s) : (#{content}).to_s)")

snip

I thought it was logical that if a piece of code was tainted, it should
be HTML escaped. Unfortuntately, this was a little too broad and it
ended up escaping some things I didn’t want escaped.

I didn’t have a chance to take this much further but does it strike a
thought in anyone else?

Justin

Quoting m4dc4p [email protected]:

I thought it was logical that if a piece of code was tainted, it
should be HTML escaped. Unfortuntately, this was a little too
broad and it ended up escaping some things I didn’t want escaped.

Hmm, that approach makes sense, actually. Maybe you really should
be inspecting and untainting those additional things.

-mental

m4dc4p wrote:

Something I’ve seen in the blogosphere lately has been the
vulnerability of web applications to cross site scripting (XSS)
attacks. I think it was LiveJournal I was reading about which got hit
with a nasty one that compromised quite a few accounts.

This got me thinking about how to protect against this kind of stuff in
an automatic way. Because I’ve done all of my ruby web apps using
Rails, I thought about modifing view templates so they always
HTML-escape “dangerous” content. ERB templates as rhtml files is the
most common way for Rails to generate content, so that was the first
place I looked.

In erb.rb, the Buffer#compile method actually builds the template which
will get evaluated. I modified the source where it outputs code when a
close tag (i.e. “%>”) is found. The code went from (this starts on line
549 for 1.8.3):

case token
when ‘%>’
case scanner.stag
when ‘<%’

snip

when ‘<%=’
out.push("#{@put_cmd}((#{content}).to_s)")

snip

To:

snip

when ‘<%=’
out.push("#{@put_cmd}((#{content}).tainted? ?
html_escape((#{content}).to_s) : (#{content}).to_s)")

snip

I thought it was logical that if a piece of code was tainted, it should
be HTML escaped. Unfortuntately, this was a little too broad and it
ended up escaping some things I didn’t want escaped.

I didn’t have a chance to take this much further but does it strike a
thought in anyone else?

It is a good idea for many use cases but sometimes that behaviour
just is not desirable; currently one could use #html_escape in the
ERB template to force escaping (Rails aliases this to #h, as far as
I know). In your case, with slight processing overhead, you could
maybe alias #u for #html_unescape for those cases.

Or just manually stick an escape there yourself :slight_smile:

Justin

E

What I was trying for was an automatic way to escape dangerous content.
Having to put h() or html_escape() into each of my views directly is
the problem I’m trying to solve :wink:

You got me thinking with that. The ERB solution was too broad, so I
looked at the view itself. What about the instance variables? Many
times, those are how you are getting information back to the page
(either through params or direct copy of instance variables from the
controller to the view).

My idea was to iterate through the instance variables on the view and,
for any tainted ones, override their to_s method so it called
CGI::escapeHTML. With a little hacking (and the realization I needed to
look inside Hashes and Arrays), I came up with the below. Put an HTML
element in the ‘name’ field and it will tell you what you submitted,
properly escaped. Go on - try a tag - I dare you!

<%
def escape_tainted(val)
if val.is_a?(Array)
return val.each { |v| escape_tainted(v) }.any?
elsif val.is_a?(Hash)
return val.each { |v, k | escape_tainted(v) || escape_tainted(k)
}.any?
elsif val.tainted?
class << val
def to_s
CGI::escapeHTML(super)
end
end
return true
else
return false
end
end

self.instance_variables.each { |o|
escape_tainted(instance_variable_get(o)) }
%>
<%= start_form_tag :action => “form_test” %>
Name: <input type=“text” name=“name” value="<%=@params[“name”]%>">

<%= submit_tag “Submit” %>

<%= end_form_tag %>

It’s still only about 50% of the solution but I did think it was
interesting. One problem with it is that Rails is smart enough to HTML
encode values passed to many of its functions, so you will see
double-escaping. I also noticed that if I dumped something like
<%=params.%> on the page (to see all key-value pairs) the “name” value
inside would NOT be escaped. I don’t how that value is displayed
without to_s being called, though.

Funny you should mention that - I just blogged about a solution very
much like what you describe. Check it out:

http://blog.explorationage.com/articles/2006/01/25/how-to-protect-your-rails-apps-against-cross-site-scripting-attacks

m4dc4p wrote:

Funny you should mention that - I just blogged about a solution very
much like what you describe. Check it out:

http://blog.explorationage.com/articles/2006/01/25/how-to-protect-your-rails-apps-against-cross-site-scripting-attacks

Good stuff, I posted a comment on the subject. The gist of it being
that is probably better make escaping the default action on all columns
and set up a mechanism to specifically over-ride the escaping to get the
real value. (yeah, you could probably get smart and not bother to
escape non-text columns).

_Kevin

m4dc4p wrote:

It’s still only about 50% of the solution but I did think it was
interesting. One problem with it is that Rails is smart enough to HTML
encode values passed to many of its functions, so you will see
double-escaping. I also noticed that if I dumped something like
<%=params.%> on the page (to see all key-value pairs) the “name” value
inside would NOT be escaped. I don’t how that value is displayed
without to_s being called, though.

I tried to stir up a little discussion about this a few days ago.
Personally I think the default behaviour of rails apps should be to
escape data by default and only return unescaped stuff when requested.

I don’t think mucking with ERb is such a great idea since it is used for
lots of other non-rails stuff.

One solution I have been pondering is to modify the method_missing
function so that when you request an attribute, it escapes the text
coming back. It should also define an alternate attribute called
‘unsafe_#{attribute}’ or ‘unescaped_#{attribute}’ in case you really
want the raw data.

This should protect against any text being retrieved by the database.
It won’t clean up instance variables if they don’t originate in the
database, but this should avoid the majority of problems.

_Kevin

m4dc4p wrote:

What about something where you subclass any strings coming out of
‘self[]’? This is some real quick and dirty pseudo code but you could
do:

This way, any time a string comes out of an ActiveRecord column, you
override it to automatically product safe HTML. If you don’t want that
behavior, you just call unsafe_to_s.

Perfect, safe by default is what is needed.

What about something where you subclass any strings coming out of
‘self[]’? This is some real quick and dirty pseudo code but you could
do:

class ActiveRecord::Base
def
v = super(value)
if (! v.nil?) && v.is_a?(String)
class << v
def to_s
CGI::escapeHTML(super)
end

    def unsafe_to_s
      superclass.to_s # Not sure if this is valid- but basically

get the raw string
end
end
end
end

This way, any time a string comes out of an ActiveRecord column, you
override it to automatically product safe HTML. If you don’t want that
behavior, you just call unsafe_to_s.