Odi's astoundingly incomplete notes

OOM killer is not for userspace

When you look at Android, it seems to rely on the Linux Out-Of-Memory (OOM) killer a lot. Also, every now and then someone on the kernel mailing list sends in "improvements" for the OOM killer so that it selects a more suitable task to kill. To me this looks like a severely flawed concept.

The OOM killer is a last-resort instrument for the Linux kernel when it runs out of memory, a situation that should ideally never occur. It is a desperate act: the kernel has to kill a userspace task in order to get back some memory to satisfy another allocation request. It is worth noting that the allocation request can come from within the kernel or from userspace.

The OOM killer is NOT an instrument for userspace to terminate unused applications. The kernel has insufficient information about the user's perception of the system. So it will always be poor at choosing an appropriate application to kill, no matter how many fancy heuristics you try to stuff into the OOM killer. Should it kill httpd? Maybe yes on a developer machine where a self-written Apache module went mad -- maybe not such a bright choice on a production webserver. Should it kill the window manager? No problem on a webserver. Not really a good choice on a desktop workstation. Should it kill the task that requested the memory? Might be clever. Might just as well be horribly stupid and lead to thrashing. Should it rather kill a minimized editor that has unsaved files open, or the video player that is playing a DVD?

In the end it would be much wiser to let userspace manage this problem on its own. Userspace can monitor memory use. It has access to a lot of information from the desktop environment, user interaction, system profile and hardware, and can make a much better decision about whether it's wise to kill an unused background task. Userspace can also act much earlier than the OOM killer. When memory use reaches a limit, it could actively signal applications to save memory, ask them to terminate gracefully, kill inactive background stuff, or even interact with the user and ask which one to close. And you can easily customize and exchange the logic.
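To make that a bit more concrete, here is a rough sketch of what such a userspace policy could look like. It is only an illustration under assumptions of mine: the thresholds (15% and 5%), the action names and the function names are all made up, and the fallback estimate for kernels that don't report MemAvailable in /proc/meminfo is crude. A real daemon would take its candidate list from the desktop environment rather than from a plain list of PIDs.

```python
import os
import signal


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_fraction(meminfo):
    """Return the fraction of RAM that could still be handed out."""
    available = meminfo.get("MemAvailable")
    if available is None:
        # Assumption: crude estimate for kernels without MemAvailable.
        available = (meminfo.get("MemFree", 0)
                     + meminfo.get("Buffers", 0)
                     + meminfo.get("Cached", 0))
    return available / meminfo["MemTotal"]


def choose_action(fraction, warn_below=0.15, kill_below=0.05):
    """Map the available fraction to a (hypothetical) policy decision."""
    if fraction < kill_below:
        return "terminate"      # ask a background task to exit
    if fraction < warn_below:
        return "ask_to_trim"    # e.g. tell applications to drop caches
    return "ok"


def terminate_gracefully(pid):
    """Send SIGTERM so the task gets a chance to save its state."""
    os.kill(pid, signal.SIGTERM)


def check_once(background_pids, meminfo_path="/proc/meminfo"):
    """One polling step: read memory state, act on the least important task."""
    with open(meminfo_path) as f:
        fraction = available_fraction(parse_meminfo(f.read()))
    action = choose_action(fraction)
    if action == "terminate" and background_pids:
        terminate_gracefully(background_pids[0])
    return action
```

The point of the sketch is that the whole policy lives in `choose_action` and in whatever builds `background_pids`, so it can be swapped out per machine: a webserver and a desktop would simply ship different versions of it.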

KDE is already trying to detect unresponsive tasks and kill them. For instance when you click the close button of a window but the window doesn't close after a while, KDE will prompt you to kill the task.

So why are people still building userspace that relies on the OOM killer instead of making sure the OOM killer never has to kick in?

posted on 2010-04-06 19:19 UTC in Code
