HTTP Accept header wildcard breaks rails app

The Thunderstone crawler (Thunderstone Texis Default Script
websearch/about.html) sends the following HTTP Accept header when
requesting pages:

Accept: text/*, application/javascript, application/x-javascript

This results in a “Missing template” exception.

text/* is valid. How do I tell my rails app to treat this as rhtml by
default instead of returning a 500?

Missing template [controller]/[method] with
{:handlers=>[:erb, :rjs, :builder, :rhtml, :rxml],
:formats=>[:"text/*", :js], :locale=>[:en, :en]}

I’ll post a response if I figure it out
Tony P.

Well, adding

Mime::Type.register "text/*", :html

to config/initializers/mime_types.rb works, but it gives a warning on
startup:

actionpack-3.0.3/lib/action_dispatch/http/mime_type.rb:98: warning:
already initialized constant HTML

Seems like rails doesn’t want us to extend the definition of an HTML
mime type. Probably for the better. Any suggestions? I could ping
Thunderstone and have them fix their crawler, but I’m wondering if
there is a better way Rails can handle this case.

It should be noted that while adding the Mime::Type.register line fixed
my test cases, it breaks the application when viewed from a browser. :-\

I guess this is just the rails way. Unless a specific Accept header
is sent, rails does not know what to return.

AFAIK there is no way to specify a default handler. It seems safe to
assume that erb should be used when the Accept header doesn’t map to a
specific mime-type. Is there a way to make this happen or is there a
reason this shouldn’t be done?
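
To make the idea concrete, here is a rough sketch in plain Ruby (outside Rails) of the fallback I have in mind. It is only an illustration: KNOWN_FORMATS and resolve_format are names made up for this example, not Rails APIs, and q-values in the header are ignored.

```ruby
# Hypothetical table of MIME types we know how to serve (not a Rails constant).
KNOWN_FORMATS = {
  "text/html"                => :html,
  "application/xml"          => :xml,
  "application/json"         => :json,
  "application/javascript"   => :js,
  "application/x-javascript" => :js
}.freeze

# Walks the Accept header in order and returns a format symbol.
# A missing header, "*/*" or "text/*" falls back to :html.
# Returns nil when no listed type can be served.
# Assumption: q-values are stripped and ignored.
def resolve_format(accept_header)
  return :html if accept_header.nil? || accept_header.strip.empty?

  accept_header.split(",").each do |entry|
    type = entry.split(";").first.to_s.strip.downcase
    return :html if type == "*/*" || type == "text/*"
    return KNOWN_FORMATS[type] if KNOWN_FORMATS.key?(type)
  end
  nil
end

# The crawler's header from the first post resolves to :html here.
resolve_format("text/*, application/javascript, application/x-javascript")
```

With this logic the crawler gets HTML, while a header that lists only an unknown type still yields nil, so the caller can decide what to do with it.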

On Dec 31 2010, 6:21pm, Tony P.

I’m seeing a few of these per day from a few different IPs. As far as
I can see, and what I know of our customer base, I believe they’re
only crawlers.

Is there a real workaround for this, or does anyone know if the Rails
core has been notified of this issue?

For anyone else who has this issue, the solution was simple.

Instead of
render :show_page

I used
render 'show_page.html'

Don’t think. Just render the HTML!

In 99% of my code I let rails decide but there are a few places it
made sense for me to do this.

Of course, if a search engine sends “Accept: image/jpg” they will
get HTML from me. I tried this on some other sites and they returned
html even if an image was requested, so I am no worse than them :)

Same problem here, also only crawlers that use “text/*” (on a Rails 3.0.4
application). “text/*” should of course render the format.html in most
cases.

I also had an instance of a request (on a Rails 3.0.0 application) which
only accepted “application/jxw”, in which case Rails should have thrown a
406 (Not Acceptable), but it didn’t.
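
As a rough sketch of the behaviour I would have expected, here is some plain Ruby (not Rails internals) that picks between 200 and 406 from an Accept header. range_covers? and status_for are hypothetical names for this example, and q-values are ignored for simplicity.

```ruby
# True when a media range from an Accept header (for example "text/*")
# covers a concrete MIME type (for example "text/html").
def range_covers?(range, mime)
  r_type, r_sub = range.split("/", 2)
  m_type, m_sub = mime.split("/", 2)
  (r_type == "*" || r_type == m_type) && (r_sub == "*" || r_sub == m_sub)
end

# Returns 200 when at least one available MIME type is acceptable to the
# client, otherwise 406. A missing or empty Accept header accepts anything.
# Assumption: q-values (including q=0) are stripped and ignored.
def status_for(accept_header, available)
  ranges = accept_header.to_s.split(",").map do |entry|
    entry.split(";").first.to_s.strip.downcase
  end
  ranges.reject!(&:empty?)
  return 200 if ranges.empty?

  acceptable = available.any? do |mime|
    ranges.any? { |range| range_covers?(range, mime) }
  end
  acceptable ? 200 : 406
end

status_for("application/jxw", ["text/html"])                # no match, so 406
status_for("text/*, application/javascript", ["text/html"]) # wildcard matches, so 200
```

Under these assumptions the “application/jxw” request gets a 406, and the crawlers sending “text/*” get the HTML page.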

http://neeraj.name/2010/11/23/mime-type-resolution-in-rails.html provides
more info.