Strange behavior for cache manager


#1

Hi,
We are currently running nginx 1.7.6, primarily as a reverse proxy on
Linux.
We have run into strange behavior in the nginx cache manager. Right
after nginx is restarted everything is fine: the cache manager
periodically spawns a new process to check the metadata and honors the
max cache size we have set.
But after running for about 6 hours, it stops honoring the max cache
size; the cache grows past the limit and eventually fills the disk.
No matter what we do (reduce the cache size to half of the disk, reduce
the active time for the cache), once the cache has gone over the limit
it just keeps growing.
I ran strace on the cache manager, and it shows only normal epoll_wait
calls; nothing ever gets unlinked. The cache manager process itself is
spawned perfectly fine.

PS. Each time I restart nginx, once the cache loader process has
completed, strace on the cache manager shows it starting to unlink
files, and everything goes back to normal. The cache manager again
controls the cache and keeps the total cache size under the max cache
size we set. After a certain period of time, it fails again.

What could potentially cause this?
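
For anyone who wants to confirm the symptom without strace, here is a minimal Python sketch that sums the on-disk size of a cache directory and compares it to a limit. The function names are made up for illustration and are not part of nginx. It also counts plain file sizes, which may differ slightly from how nginx accounts for cache size internally, so treat the result as an approximation.

```python
import os
import stat


def cache_usage_bytes(cache_dir):
    """Return the summed size of all regular files under cache_dir."""
    total = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                info = os.lstat(path)
            except FileNotFoundError:
                # A file may be unlinked between the directory listing
                # and the stat call; that is expected on a live cache.
                continue
            if stat.S_ISREG(info.st_mode):
                total += info.st_size
    return total


def over_limit(cache_dir, max_size_bytes):
    """True if the directory currently holds more than max_size_bytes."""
    return cache_usage_bytes(cache_dir) > max_size_bytes
```

Run periodically (for example from cron) with your own cache path and the byte value of your configured max size; a result that stays True for a long time would match the behavior described above.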

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,256168,256168#msg-256168


#2

I can confirm this bug; we have the same problem. But I don't know yet
how to reproduce it.
There is nothing strange in the logs. error_log is set to the notice
level.

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,256168,257943#msg-257943


#3

Hello,

On 10 Apr 2015, at 13:36, stanojr removed_email_address@domain.invalid wrote:

> Can confirm this bug, we have same problem. But i don't know yet how
> to reproduce it.
> Nothing strange in logs. error_log set to notice level

Here's a patch that provides a workaround for the problem and logs more
information about cache entry lock problems.

http://mailman.nginx.org/pipermail/nginx-ru/2015-May/055937.html

Note that this problem most probably occurs when one of the nginx
workers is killed manually. With the patch applied, the nginx cache
manager skips cache entries that seem to be locked by a killed worker
and logs an error.
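
To illustrate the idea of the workaround, here is a toy Python model. It is not nginx code and not the patch itself. The `Entry` class, the `count` field, and `evict_until_under` are hypothetical names, and the model assumes an entry counts as locked simply when its reference count is non-zero; how the real patch decides that an entry is locked by a killed worker is not covered here.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class Entry:
    size: int       # bytes occupied by the cached response
    count: int = 0  # hypothetical: number of workers holding the entry


def evict_until_under(entries, max_size, log=print):
    """Evict oldest-first until the total size is <= max_size.

    entries is an OrderedDict ordered from oldest to newest.
    Entries with a non-zero count are skipped and reported,
    rather than blocking eviction of the entries behind them.
    Returns the list of evicted keys.
    """
    total = sum(e.size for e in entries.values())
    evicted = []
    for key in list(entries):
        if total <= max_size:
            break
        entry = entries[key]
        if entry.count > 0:
            log(f"skipping locked cache entry {key} (count={entry.count})")
            continue
        total -= entry.size
        del entries[key]
        evicted.append(key)
    return evicted
```

In this model a worker that dies while holding an entry leaves its count above zero forever; skipping such entries lets the rest of the cache still be trimmed, and the log line is what points at the stuck entry.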

We'd like to receive more feedback from people experiencing this
problem. To help, please apply the patch and post (or just watch) the
error.log, recorded at the notice log level from nginx start.


Roman A.


#4

I'm interested: which Linux kernel version and distribution do you use?