Debugging uninterruptible sleep
Andrew Stacey 118 posts |
I’m seeing a slew of processes going into “uninterruptible sleep” and staying there. They sit there, eating CPU and memory, until the system slows down enough that folks complain. Do you happen to know how to debug these? From what I’ve read, this is likely to be something getting stuck on I/O. One thing that occurred to me was that all the processes are logging to the same file. Could they get stuck in some sort of queue for that? (Also from my reading around, it would appear that the root cause is more likely to be at the kernel end, and thus an issue with drivers and hardware, than with Instiki. Still, I’d like to know what it is that is triggering the sleep.) |
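As a starting point for the question above, here is a minimal Python sketch (Linux only) that lists processes currently in uninterruptible sleep by reading `/proc/<pid>/stat`, where that state is reported as `D`. It also prints `/proc/<pid>/wchan`, which on many kernels names the kernel function the process is waiting in. The function names `parse_stat_line` and `find_uninterruptible` are made up for this illustration, and `wchan` may be empty or `0` depending on kernel configuration and permissions.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def read_text(path):
    """Read a small /proc file, returning '' if it vanished or is unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""


def find_uninterruptible(proc_root="/proc"):
    """Return a list of (pid, comm, wchan) for processes in state 'D'."""
    stuck = []
    if not os.path.isdir(proc_root):
        return stuck
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        stat = read_text(os.path.join(proc_root, entry, "stat"))
        if not stat:
            continue  # the process exited between listdir and open
        try:
            pid, comm, state = parse_stat_line(stat)
        except (ValueError, IndexError):
            continue  # malformed line; ignore rather than crash
        if state == "D":
            wchan = read_text(os.path.join(proc_root, entry, "wchan"))
            stuck.append((pid, comm, wchan))
    return stuck


if __name__ == "__main__":
    for pid, comm, wchan in find_uninterruptible():
        print(pid, comm, wchan or "?")
```

Running this a few times in a row helps separate processes that pass through `D` briefly (normal during disk I/O) from ones that stay there.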
admin
Administrator
63 posts
edited 12 years ago |
Attach gdb to the process and try to figure out where it’s stuck. It would be interesting to me if there was an Instiki-specific reason these processes were getting stuck. But, so far, there’s no evidence for that. |
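The gdb suggestion can be scripted. The following is a rough Python sketch that runs gdb non-interactively against a pid and returns the backtrace of every thread; the helper names `gdb_backtrace_command` and `dump_backtrace` are invented for this example. It assumes gdb is installed and that you have permission to attach (typically root or the process owner). Note that a process which is genuinely in uninterruptible sleep may not stop for the debugger until it wakes, which is why the call carries a timeout.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches, prints all backtraces, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_backtrace(pid, timeout=30):
    """Run gdb against pid and return its output.

    Raises subprocess.TimeoutExpired if gdb cannot finish within
    `timeout` seconds, for example because the attach never completes.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout + result.stderr
```

For a Ruby process this shows the interpreter's C-level stack rather than the Ruby-level one, but the innermost frames usually reveal which system call the process is sitting in.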
Andrew Stacey 118 posts |
I strongly doubt that it is due to instiki itself, but it would be nice to isolate exactly what causes the problem. Thanks for the link. After watching To help in debugging this, I’m going to add the process pid to the logging messages as that’ll make it easier to link what I see in |
Andrew Stacey 118 posts |
I’ve added the timestamps and process ids to the logs and that’s made things a lot clearer. It looks as though it is the “All Pages” request that is clogging up the works, and there appear to be some spiders that don’t respect robots.txt and find “All Pages” fairly early on in their crawl. You’ve mentioned before the possibility of adding a paginate routine to “All Pages”. Would that help me, do you think? Or is it easier just to disable it? (At 7000 pages it’s a bit cumbersome, to say the least, so I’ve no compunction about simply disabling it altogether.) Incidentally, I found I’d forgotten that bzr doesn’t set permissions so some stuff in |
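To illustrate the idea of tagging each log line with a timestamp and process id, here is a small Python sketch using the standard `logging` module. It is not Instiki's logging code (Instiki is a Ruby on Rails application); the logger name `wiki`, the helper `make_logger`, and the sample message are all assumptions made for the example.

```python
import logging
import sys


def make_logger(stream=sys.stderr, name="wiki"):
    """Return a logger whose lines carry a timestamp and the writer's pid."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s [pid %(process)d] %(levelname)s %(message)s")
    )
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger


if __name__ == "__main__":
    log = make_logger()
    log.info("started handling request for All Pages")
```

With the pid in every line, the last entry written by a stuck process can be matched against the pid shown by process-listing tools, which is how a single slow request type can be picked out of a shared log file.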
distler Moderator 123 posts |
I have no idea why that would be an issue. It’s not as if Instiki has to do anything with those 7000 pages, apart from retrieving an alphabetical list of their names (and URLs). If that’s indeed your problem, it would be nice to know why. |
Andrew Stacey 118 posts |
Agree completely. The processes that were handling the My suspicion is therefore that it relates to writing the file to disk for caching. So I suspect that there really is a problem with the hardware and that having several processes trying to write the same file was exposing it. (There was a change in hardware underpinning the VPS recently.) |
20tsed56 1 post |
that is huge data to handle … |