On Sat, Oct 5, 2013 at 2:31 AM, mex wrote:
your points are valid, but i talk about heisenbugs and the ability
to monitor a certain ip; you know, theres WTF??? - errors
It’s trivial to do with dynamic tracing tools like systemtap and
dtrace. You don’t even need to reload Nginx for this or parse any log
We have also had great success tracking down the real cause of those
rare request timeouts on our production Nginx servers. Again, we
didn’t change the Nginx configuration file or reload the server.
please note, on the infrastructure i talk about we have usually
debug-logs disabled, and the bottleneck is usually the app-servers.
For our online systems, Nginx is the application server.
but thanx for your answer, i’ll invest some time and check your toolchains,
especially systemtap. is systemtap included in openresty? looks like the
perfect tool to create some nagios-plugins upon.
systemtap is a tool framework that can answer almost any question
that can be formulated in its scripting language. Real-world
questions may involve many software layers at once, for example
Nginx and the Linux kernel’s TCP/IP stack (and even the LuaJIT VM) at the
same time. systemtap can associate events happening at different
layers of the software stack easily and efficiently.
The biggest selling point is that you don’t have to modify the
software or its configuration yourself to make things work,
and you can always aggregate data at the data source, which saves a lot of
resources otherwise spent dumping, storing, and parsing the raw logging data.
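
To make the "aggregate at the data source" point concrete, here is a minimal
Python sketch. It is not systemtap code and not what we run in production; it
only illustrates the idea, loosely in the spirit of systemtap's statistics
aggregates. The names LatencyAggregator and log2_bucket, and the sample
latencies, are hypothetical and chosen just for this illustration.

```python
from collections import Counter


def log2_bucket(value_ms: int) -> int:
    """Return the power-of-two lower bound of the bucket holding value_ms."""
    if value_ms <= 0:
        return 0
    return 1 << (value_ms.bit_length() - 1)


class LatencyAggregator:
    """Keeps only running statistics; no raw per-request record is stored."""

    def __init__(self) -> None:
        self.count = 0
        self.total_ms = 0
        self.max_ms = 0
        self.histogram = Counter()

    def record(self, latency_ms: int) -> None:
        # Update the aggregates in place at the point where the event occurs.
        self.count += 1
        self.total_ms += latency_ms
        self.max_ms = max(self.max_ms, latency_ms)
        self.histogram[log2_bucket(latency_ms)] += 1

    def summary(self) -> dict:
        avg = self.total_ms / self.count if self.count else 0.0
        return {
            "count": self.count,
            "avg_ms": avg,
            "max_ms": self.max_ms,
            "histogram": dict(sorted(self.histogram.items())),
        }


# Hypothetical request latencies in milliseconds (made-up data).
samples = [3, 5, 4, 120, 6, 2, 3000, 7]

agg = LatencyAggregator()
for latency in samples:
    agg.record(latency)

result = agg.summary()
print(result)
```

Memory use here is bounded by the number of histogram buckets rather than by
the number of requests, so nothing has to be dumped to disk and parsed later.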