More information out of error log?


#1

Hi All,

We are getting many of these errors:

  2009/01/13 13:04:28 [error] 27100#0: *782718 open() "/home/spellcit/public_html/letters/.mp3" failed (2: No such file or directory), client: 204.38.160.220, server: www.spellingcity.com, request: "GET /letters/.mp3 HTTP/1.1", host: "www.spellingcity.com"

Essentially one of 2 things is happening:

(1) One of our php or flash files is trying to access this invalid file
“/letters/.mp3” or
(2) A user is trying to access this directly (unlikely).

The problem is that the log entry does not, on the surface, reveal
enough information about where the request is coming from, so we’re
having a tough time identifying its source. Is there a way to add more
information to the log entry to better identify the source of the
request (swf, php, html, direct, etc.)?

Thanks


#2

You could turn on the access log; it might show the referrer, although
media players typically don’t seem to send one.
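
For what it's worth, nginx's predefined "combined" format already
records both the referrer and the user agent, so no custom format is
needed to try this. A minimal sketch, with the log path as a
placeholder:

  # "combined" is built in; it logs "$http_referer" and
  # "$http_user_agent" as the last two quoted fields of each line.
  access_log  /path/to/access.log  combined;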


#3

Hi Ilan,

Try changing the output format for the error log file to include the
user agent. Or, look for requests to /letters/.mp3 in your access log,
which may already contain the user agent. You may find that these
requests are coming from a bot that is parsing the links in your site
incorrectly. I see this consistently on the ten or so sites I host,
although the errors I usually see are for the directory and filename
with no file extension (such as /forms/document when the link actually
points to /forms/document.pdf).

It might also help to compare the remote IP addresses to see whether
the bad requests always come from the same IP or IP range, which might
indicate that it is a bot.


#4

Ilan,

A script kiddie is targeting your site.

Aníbal Rojas
http://hasmanydevelopers.com
http://rubycorner.com
http://anibal.rojas.com.ve


#5

Mike,

I will go ahead and turn that on and see if I get more info.

Normally our .mp3 files are accessed via Flash (swf) files, so it could
be that a swf has a problem (which is why I’m trying to get a more
direct referral source).

Thanks


#6

In the current error log that I’m looking at, 50% of the bad requests
came from one IP address and 50% of them came from another. It is
unlikely that it’s a bot, as both IP addresses are registered to
educational institutions, which are our primary users, but good point,
I will keep an eye on it.

Thanks


#7

It is possible; however, the traffic that is causing this error is
spread out across more than one distinct IP address. Nonetheless, it
remains a possibility.


#8

So I turned on the access log but I’m concerned that its size is getting
larger by the second (and this is off peak). By tomorrow AM, its size
may exceed available disk space.

Is there a way to programmatically turn it on or off via the
configuration file depending on whether or not an error was generated
or a particular type of URL requested?


#9

On Tue, Jan 13, 2009 at 11:19:55PM -0500, Ilan B. wrote:

> So I turned on the access log but I’m concerned that its size is getting
> larger by the second (and this is off peak). By tomorrow AM, its size may
> exceed available disk space.
>
> Is there a way to programmatically turn it on or off via the configuration
> file depending on whether or not an error was generated or a particular type
> of URL requested?

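For a particular URL:

  # Logging is off by default...
  access_log off;

  # ...and enabled only for the exact URL being investigated.
  location = /letters/.mp3 {
      access_log  /path/to/log  format_name;
  }

For 404 errors:

  location / {
      # The error page URI must match the location below.
      error_page  404  /404.html;
      access_log  off;
  }

  # Requests that end up at the error page are logged here.
  location = /404.html {
      root        /path/to/page;
      access_log  /path/to/log  format_name;
  }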
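
On newer nginx versions (1.7.0 and later, which postdates this thread)
the access_log directive also takes an if= parameter, so logging can be
keyed on the response status without a dedicated error page location. A
sketch, assuming the hypothetical variable name $log_404 and the same
placeholder path and format name as above:

  # http context: set $log_404 to 1 only for 404 responses
  map $status $log_404 {
      404      1;
      default  0;
  }

  # a request is not logged when the condition is "0" or empty
  access_log  /path/to/log  format_name  if=$log_404;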

#10

On Tue, Jan 13, 2009 at 05:42:00PM -0600, Nick P. wrote:

> Hi Ilan,
>
> Try changing the output format for the error log file to include the user
> agent.

The error_log directive does not allow changing the format.
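
Since the error_log format is fixed, the extra fields have to come from
the access log instead. A minimal sketch, assuming a hypothetical format
name "mp3_debug" and a placeholder log path; log_format goes in the http
context:

  # http context: record client, request, status, referrer, user agent
  log_format  mp3_debug  '$remote_addr [$time_local] "$request" $status '
                         'referer="$http_referer" agent="$http_user_agent"';

  # server context: apply the format only to the URL being investigated
  location = /letters/.mp3 {
      access_log  /path/to/log  mp3_debug;
  }

If the referrer comes through as "-", the client did not send one.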