On Sat, Jun 28, 2008 at 9:54 PM, Grzegorz N.
[email protected] wrote:
I’d like to gather ideas about
how to notify the outside world. A log message? Sending a signal
somewhere? An SNMP trap? Every way has its advantages and disadvantages,
so I’d like to pick the one that sucks the least.
Why just one? A status page supplemented by machine-readable log
output is a good solution that I think would satisfy most sysadmins.
Pardon me for asking a naive question, but to change the list of
backends, would you not simply edit the config file and do a SIGHUP? I
would reset whatever internal structures are kept by the workers, but
I can’t think of anything that’s not okay to lose.
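For concreteness, the backend list under discussion is an upstream
block in nginx.conf, something like the following (the addresses and
the weight are made up for illustration):

```
upstream backend {
    # hypothetical backend list; edit this block and SIGHUP the
    # master process to change it
    server 127.0.0.1:10000;
    server 127.0.0.1:10001 weight=2;
}
```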
Yes. That’s the obvious solution but apparently not always acceptable,
especially when you want an external monitoring system to do this
automatically.
What’s simpler for an external monitoring system than sending a signal
to a process?
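Concretely, the reload amounts to a one-liner that any monitoring
system can run (the pid-file path is an assumption; it varies by build
and distribution):

```shell
# Re-read nginx.conf (and thus the new backend list) by sending
# SIGHUP to the master process. Assumption: the pid file lives at
# /var/run/nginx.pid.
kill -HUP "$(cat /var/run/nginx.pid)"
```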
Of course, you could go all the way and do a Varnish-style admin
interface. I have mentioned Varnish before on this list. Varnish has a
pretty clever admin/monitoring infrastructure. For example, you can
load multiple configs and selectively enable them:
$ varnishadm vcl.load test /etc/varnish/test.vcl
$ varnishadm vcl.use test
… something goes horribly wrong …
$ varnishadm vcl.use boot
The use of named configs means the input can be anything (even your
default set of config files). You can load it, try it out, and unload
it.
You could do worse than looking at Varnish’s logging system for ideas.
Varnish uses circular buffers in shared memory for logging, and its
logs are explicitly machine-readable, each line being a tag followed
by a value. So log output looks like this:
14 Debug c "Hash Match:
/-/cache/border/w=6;h=6;sw=true;sx=0;sy=3;sbr=10;sbs=5;sm=10;sp=0;c=fff;t=r_24.png#origo.no#"
14 Hit c 1402130806
14 VCL_call c hit
14 VCL_return c deliver
14 Length c 217
14 VCL_call c deliver
14 VCL_return c deliver
14 TxProtocol c HTTP/1.1
14 TxStatus c 200
14 TxResponse c OK
14 TxHeader c Status: 200 OK
and so on.
In addition to making it superbly easy for scripts to graph, analyze
and monitor activity in real time, this lets you tail the log for
specific events or strings, and since it’s all RAM-based, you can get
real-time, low-overhead debug log output immediately without changing
any configuration settings or reloading the daemon. As far as I know,
Varnish only logs when something is listening to the log output, and
only the entries being listened for, but I could be wrong.
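For example, varnishlog can filter the shared-memory log live by tag;
I’m going from my memory of the manpage here, so double-check the
flags:

```
$ varnishlog -i TxStatus        # show only response status lines
$ varnishlog -c                 # client-side records only
```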
Using shared memory with Nginx’s worker process model should not pose
any problems as each worker could maintain its own shared memory and
thus avoid the need for locking.
same reason. I presume that means the overloading of weight=X is at
least acceptable.
I think you have to push Igor for a more flexible internal
infrastructure.
Even something string-based would work, though it would be hackier
than a true syntax:
server 127.0.0.1:10000 option = [option …];
E.g.,
server 127.0.0.1:10000 option fair.max_conns=5;
You should only return an error if a request cannot be served within a
given timeout, not when all backends are full.
Will have to think about it. This has the potential of busy-looping when
all the backends are indeed full (or down, but then one can just send a
hard error and be done with it). I don’t think nginx has a way to be
told “everything is unavailable now, come back to me in a second or
two” or even better “I’ll tell you when to ask me again”.
I think Nginx needs something like this.
Alexander.