Twice today my server became so unresponsive that I had to power it off and back on. While it was in this state, I noticed that the hard drive was thrashing. After the first reboot, I saw that Apache had died because it could not allocate memory.
I think one of the processes is leaking memory so badly that it eventually uses up all of the swap space. I don't think the problem is caused by anything external, since I can see a log of all of the connections through my firewall. Now I just need to find out what is leaking the memory.
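One way to hunt for the culprit is to take a snapshot of memory use per program and compare snapshots a few minutes apart. The following is only a rough sketch, assuming a Linux box with the procps `ps` command; the function names `summarize_ps` and `snapshot` are made up for this example. It groups processes by command name, so both a single bloated process and a pile of small processes that never exit will float to the top.

```python
import subprocess
from collections import defaultdict


def summarize_ps(ps_text):
    """Group `ps` output lines of the form '<rss_kib> <command>' by command.

    Returns a list of (command, process_count, total_rss_kib) tuples,
    largest total first.
    """
    totals = defaultdict(lambda: [0, 0])  # command -> [count, rss_kib]
    for line in ps_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        rss_kib, command = int(parts[0]), parts[1].strip()
        totals[command][0] += 1
        totals[command][1] += rss_kib
    rows = [(cmd, count, rss) for cmd, (count, rss) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


def snapshot():
    # The trailing "=" suppresses each column header; separate -o options
    # are used so the two columns are parsed the same way everywhere.
    out = subprocess.run(
        ["ps", "-e", "-o", "rss=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return summarize_ps(out)


if __name__ == "__main__":
    # Print the ten heaviest commands by total resident memory.
    for command, count, rss_kib in snapshot()[:10]:
        print(f"{rss_kib / 1024:8.1f} MiB  {count:4d} procs  {command}")
```

Keep in mind that RSS only counts memory currently resident in RAM, so pages already pushed out to swap will not show up. Watching which entries keep growing (in size or in process count) between runs is more telling than any single snapshot.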
Update: I found out what was causing the problem. I had installed version 1.3.0 of munin. I had a cron job that collected the statistics every 5 minutes, and it appears that some of those processes were not finishing, so eventually the computer would run out of memory. I updated munin to 1.3.2, and it doesn't seem to have this problem.
I feel for you! Had a similar issue this morning: our web/email server was giving erratic responses due to insufficient memory.
Only real problem being that nothing was touched in the way of changes! Turned out to be a corrupt email that was causing one of our scripts to go crazy.
Fun :)