Forums Instiki

Debugging uninterruptible sleep

7 posts, 4 voices

 
Andrew Stacey 118 posts

I’m seeing a slew of processes go into “uninterruptible sleep” and stay there. They sit there, eating CPU and memory, until the system slows down enough that folks complain.

Do you happen to know how to debug these? From what I’ve read, this is likely to be something getting stuck on I/O.

One thing that occurred to me was that all the processes are logging to the same file. Could they get stuck in some sort of queue for that?

(Also from my reading around, it would appear that the root cause of this is more likely to be at the kernel end, and thus an issue with drivers and hardware, than with instiki. Still, I’d like to know what it is that is triggering the sleep.)

 
admin Administrator 63 posts

edited 11 years ago

Attach gdb to the process and try to figure out where it’s stuck. It would be interesting to me if there were an Instiki-specific reason these processes were getting stuck.

But, so far, there’s no evidence for that.
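Before attaching gdb, it can help to see which processes are actually in D state and which kernel function each one is waiting in. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function names `parse_stat` and `d_state_processes` are made up for this illustration and are not part of Instiki.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split around the first "(" and the last ")".
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """List (pid, comm, wchan) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited, or the file was unreadable
        if state != "D":
            continue
        try:
            with open(os.path.join(base, "wchan")) as f:
                wchan = f.read().strip()
        except OSError:
            wchan = "?"  # wchan may be unreadable without privileges
        found.append((pid, comm, wchan))
    return sorted(found)
```

Calling `d_state_processes()` every few seconds and printing the result gives a rough picture of where the stuck processes are waiting. A wchan value that points at block I/O or filesystem code would be consistent with the I/O theory, though it would not prove it.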

 
Andrew Stacey 118 posts

I strongly doubt that it is due to instiki itself, but it would be nice to isolate exactly what causes the problem. Thanks for the link.

After watching top all day, I think processes are going into D state far more often than I would expect, so I’m going to contact our server provider.

To help in debugging this, I’m going to add the process ID to the logging messages, as that’ll make it easier to link what I see in top to what I see in production.log.
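Once each log line carries a pid, matching a stuck process from top to its last logged request is a simple lookup. Here is a rough sketch of that idea. The line format in `LINE_RE` (timestamp, pid in square brackets, message) is an assumption made for this example, not Instiki’s actual log format, and `last_entry_per_pid` is a hypothetical helper name.

```python
import re

# Assumed (hypothetical) line format, for example:
#   2011-05-04 12:00:01 [1234] Processing list
LINE_RE = re.compile(r"^(\S+ \S+) \[(\d+)\] (.*)$")


def last_entry_per_pid(lines):
    """Map each pid to the (timestamp, message) of its most recent line."""
    last = {}
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            timestamp, pid, message = match.groups()
            last[int(pid)] = (timestamp, message)  # later lines overwrite
    return last
```

Given a pid that top shows in D state, `last_entry_per_pid(open("production.log"))[pid]` would then return the last thing that process logged before it got stuck, provided the real log lines are adapted to match the pattern.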

 
Andrew Stacey 118 posts

I’ve added timestamps and process IDs to the logs and that’s made things a lot clearer. It looks as though it is the “All Pages” request that is clogging up the works, and there appear to be some spiders that don’t respect robots.txt and find “All Pages” fairly early on in their crawl.

You’ve mentioned before the possibility of adding a pagination routine to “All Pages”. Would that help me, do you think? Or is it easier just to disable it? At 7000 pages it’s a bit cumbersome, to say the least, so I’ve no compunction about simply disabling it altogether.
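To make the pagination idea concrete, here is a minimal sketch of how a sorted list of page names could be served in slices. It is only an illustration and not how Instiki implements “All Pages”; `paginate` and the page size of 100 are assumptions made for the example.

```python
def paginate(names, page, per_page=100):
    """Return (names on the requested page, total number of pages)."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    ordered = sorted(names, key=str.lower)  # case-insensitive alphabetical
    total_pages = max(1, -(-len(ordered) // per_page))  # ceiling division
    start = (page - 1) * per_page
    return ordered[start:start + per_page], total_pages
```

With 7000 names and 100 per page, each request would render 100 links out of 70 pages. In a real application the sorting and slicing would more likely be pushed down into the database query, so that only one slice is ever loaded.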

Incidentally, I found I’d forgotten that bzr doesn’t set permissions so some stuff in public was unreadable. That might explain why the SVG editor wasn’t working for me as there were a couple of files from that affected.

 
distler Moderator 123 posts

It looks as though it is the “All Pages” request that is clogging up the works, … at 7000 pages then it’s a bit cumbersome, to say the least,

I have no idea why that would be an issue.

It’s not as if Instiki has to do anything with those 7000 pages, apart from retrieving an alphabetical list of their names (and URLs). If that’s indeed your problem, it would be nice to know why.

 
Andrew Stacey 118 posts

Agree completely.

The processes that were handling the list call were entering uninterruptible sleep and were using a large amount of memory, of the order of 300 MB to 500 MB each. The system would get bogged down if there were more than one of them, but even one would take a fair amount of time to complete.

My suspicion is therefore that it relates to writing the file to disk for caching. So I suspect that there really is a problem with the hardware and that having several processes trying to write the same file was exposing it.

(There was a change in hardware underpinning the VPS recently.)

 
20tsed56 1 post

that is huge data to handle …
