On Thu, Jan 14, 2010 at 01:52:02PM +0100, Dennis J. wrote:
> Why is logging into a pipe considered a waste of CPU?
> The log parser throws away some data, aggregates the rest, and then
> writes it to a remote database. The “tail -f” approach would waste
> local disk I/O by writing data to disk unnecessarily, only for the
> script to read it back again.
> Why is this considered more efficient than handing the data directly
> over to a script?
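For concreteness, the “tail -f” style follower under discussion would look roughly like the sketch below (Python; the log path is a placeholder and the parse/aggregate step is left as a comment, neither is from this thread):

    import os
    import time

    LOG_PATH = "/var/log/nginx/access.log"  # placeholder path

    def follow(path):
        # Mimic "tail -f": start at the end of the file and poll for
        # newly appended lines.
        with open(path, "r") as f:
            f.seek(0, os.SEEK_END)
            while True:
                line = f.readline()
                if not line:
                    time.sleep(0.5)  # nothing new yet
                    continue
                yield line

    for line in follow(LOG_PATH):
        # parse, discard/aggregate, and ship to the remote DB here
        print(line.rstrip("\n"))

A real follower would also have to notice log rotation and reopen the file; the sketch ignores that.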
It is not considered more efficient; it may actually be more efficient
because of bulk data processing. Note also that while logged data are
written to disk, they are not read from disk again: they are already in
the OS cache, so reading them back is just a memory copy.
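As an illustration of what bulk processing buys, a batched loader might look like this (sqlite3 stands in for the remote database, and the table layout is made up):

    import sqlite3  # stand-in for the remote database

    BATCH_SIZE = 1000

    def flush(db, rows):
        # One bulk INSERT per batch instead of one round trip per line.
        db.executemany("INSERT INTO hits (line) VALUES (?)", rows)
        db.commit()

    db = sqlite3.connect("stats.db")
    db.execute("CREATE TABLE IF NOT EXISTS hits (line TEXT)")
    rows = []
    for line in open("/var/log/nginx/access.log"):
        rows.append((line.rstrip("\n"),))
        if len(rows) >= BATCH_SIZE:
            flush(db, rows)
            rows = []
    if rows:  # final partial batch
        flush(db, rows)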
Logging to a pipe wastes CPU because it causes many context switches
and memory copies for every log operation:
- nginx writes to the pipe,
- context switch to the script,
- the script reads from the pipe,
- the script processes the line,
- the script writes to the database,
- context switch back to nginx,
instead of a single memory copy to a log file.
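The script side of that per-line exchange would look roughly like this (a sketch; the actual processing is left as a comment):

    import sys

    # Each iteration pays the sequence above: a blocking read on the
    # pipe (context switch), per-line processing, and a per-line
    # database write.
    for line in sys.stdin:
        record = line.rstrip("\n")
        # parse `record` and write it to the database here
        print(record)  # stand-in for the per-line database write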
> Is there an nginx equivalent to Apache's CustomLog directive with the
> “|” prefix so it logs into stdin of another program/script? I need to
> do real-time processing of the access log data and I’m wondering how I
> can accomplish this once I switch to nginx.
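For the record: nginx has no equivalent of Apache's “|” prefix, so it cannot spawn the script itself. A workaround sometimes used (not something endorsed in this thread) is to point access_log at a named pipe and run the reader yourself; the path below is made up:

    import os

    FIFO = "/var/log/nginx/access.pipe"  # made-up path

    if not os.path.exists(FIFO):
        os.mkfifo(FIFO)

    # In nginx.conf, point the access log at the FIFO as if it were
    # a regular file:
    #     access_log  /var/log/nginx/access.pipe;
    # Caveat: opening a FIFO for writing blocks until a reader has it
    # open, so this script must be running before nginx (re)opens its
    # logs, and a stalled reader can eventually block nginx workers.
    with open(FIFO, "r") as f:
        for line in f:
            # real-time processing of each request record goes here
            print(line.rstrip("\n"))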