ReadyNAS NV+ Kernel Out of Memory

I recently decided to pick up a ReadyNAS NV+ to supplement our file storage at the office. I have the ReadyNAS Duo at home and have been quite happy with it. It is a Linux-based NAS appliance, and after installing a few add-ons you get a remote root shell and apt, so you can install packages such as Munin and NRPE, which happen to be what I use for monitoring.

As with most NAS appliances, it supports SMB, AFP and FTP. It can also run an rsync daemon, or, with shell access, you can set up cron jobs and run rsync from the shell just like on a normal Linux box. Other features, which I don’t use, include support for Apple’s Time Machine, Squeezebox, a web-based BitTorrent client and a handful of other things.

After I set the device up, I figured I would burn it in for a couple of days to make sure everything was happy. I had upgraded the RAM from the default 256 MB to 1 GB because I had read some benchmarking reports that indicated a noticeable increase in I/O performance. Much to my surprise, after a day or two of little or no activity I got a message from Nagios saying it could not talk to the NRPE instance on the device. It was still responding to pings, but all services on the device were horked. I rebooted it and checked the logs to find a bunch of “kernel: Out of memory” messages, indicating that both RAM and swap were entirely consumed and the kernel was killing off processes in an attempt to recover:

nas1corp kernel: Out of memory: Killed process 7823 (cron).
nas1corp kernel: Out of memory: Killed process 7824 (sh).
nas1corp kernel: Out of memory: Killed process 7840 (sh).
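
For anyone chasing the same symptom, a quick way to see which processes the kernel has been killing is to tally the log lines. The following is only a minimal sketch: it assumes the messages look exactly like the three lines above, and the function names `parse_oom_kills` and `count_by_process` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the ones quoted above, e.g.
#   nas1corp kernel: Out of memory: Killed process 7823 (cron).
# Assumption: other kernel versions may word this message differently.
OOM_RE = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")


def parse_oom_kills(lines):
    """Return a list of (pid, process_name) for every OOM-kill line found."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


def count_by_process(kills):
    """Count how many times each process name was killed."""
    return Counter(name for _pid, name in kills)
```

Feeding it the contents of the system log (wherever your firmware keeps it) gives a per-process count of how often the OOM killer fired.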

This looked like an obvious memory leak. However, some quick searching and ad-hoc research turned up no obvious answers, so I decided to shelve it for a while.
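
In hindsight, one way to confirm a slow leak on a mostly idle box would have been to sample `/proc/meminfo` periodically and watch whether reclaimable memory keeps shrinking. This is a rough sketch, not what I actually ran: the helper names are hypothetical, and treating MemFree + Buffers + Cached + SwapFree as "available" is a simplifying assumption.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field_name: kilobytes} dict."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Field: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_kb(info):
    """Rough estimate of memory the kernel could still hand out, in kB.

    Assumption: free RAM + buffers + page cache + free swap. Missing
    fields count as zero.
    """
    fields = ("MemFree", "Buffers", "Cached", "SwapFree")
    return sum(info.get(field, 0) for field in fields)


def is_steadily_shrinking(samples):
    """True if each sample is strictly lower than the one before it."""
    return len(samples) >= 2 and all(
        later < earlier for earlier, later in zip(samples, samples[1:])
    )
```

On the NAS itself you would pass in `open("/proc/meminfo").read()` from a cron job and keep the results. A steady decline over many hours with nothing running is a crude heuristic, but it would be a strong hint that something is leaking.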

Today I finally came across this thread, where at least a few other people had the same problem. The person reporting the problem was also running the Munin client on his ReadyNAS.

So it appears to be a kernel memory leak that in this case was triggered by bash, which is used in a bunch of the Munin plugins. I just upgraded to the firmware release posted on the mentioned thread and have noticed a huge difference already. I think it’s great that Netgear finally fixed this problem. I also think it’s great that the openness of their product allowed customers to troubleshoot it for them; however, I was not all that impressed that it took them almost a year to get the issue resolved.
