Recent Posts by Andrew Stacey
posted 11 years ago
Andrew Stacey
118 posts
|
There’s a bug with maruku’s table handling: it doesn’t like whitespace at the end of the line (a previous version did, and it would seem to me that whitespace here should be fine). This causes tables that used to render to no longer do so. The fix is to modify the regexp for splitting cells: line 515 of
(I notice that this isn’t the same as the version of maruku on github, so don’t know if this would be superseded by updating to the latest version from there.) |
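A minimal sketch of the kind of cell splitting that tolerates trailing whitespace. This is not maruku's actual code: `split_table_cells` is a hypothetical helper written only to illustrate the idea of trimming the row before splitting on the pipes.

```ruby
# Hypothetical helper (not taken from maruku): split one table row into
# cells, ignoring any whitespace that follows the final pipe.
def split_table_cells(line)
  row = line.strip            # drop surrounding whitespace, including the newline
  row = row.sub(/\A\|/, "")   # drop the optional leading pipe
  row = row.sub(/\|\z/, "")   # drop the optional trailing pipe
  row.split("|", -1).map(&:strip)  # -1 keeps empty cells
end

p split_table_cells("| a | b |   \n")
p split_table_cells("| a |  | c |")
```

With this approach a row with spaces after the last pipe gives the same cells as one without.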
posted almost 12 years ago
With the above changes, then
|
posted almost 12 years ago
I’ve put those changes in place and will keep an eye on what happens. Incidentally, when manually clearing out the cache I note that |
posted almost 12 years ago
Spoke too soon. The cache bug is not dead. I shoved in a load of
Anyway, what I found was interesting. I got different behaviour depending on whether I was creating a new revision or updating an old one (to switch between the two I used two author strings). Here’s the “same author” sequence of events:
As I said above, only the
But here’s what happens when I change the author name to force a new revision:
Notice that the
With this sequence then the
Looking at the two different sequences of events, it would seem that the safest place to sweep the cache is when
Of course, my analysis may well be incorrect as instiki (well, ruby-on-rails) is very much a black box to me so I have no idea as to what’s going on under the bonnet.
Incidentally, the above was carried out using sqlite as the database, but I got the same results with mysql with the updated gem. |
posted almost 12 years ago
I put 2.9.1 in the Gemfile but I don’t know if that’s the minimum value. Interestingly, bundle won’t update to the latest version unless you specify such a minimum. Comparing the logs, it would appear that something in the mysql module was returning the updated page name when instiki was expecting the old page name. So when instiki swept the cache using what it thought was the old page name, it was actually using the new one, and thus the old cached page wasn’t being deleted. |
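As a rough illustration of what such a minimum means, here is a sketch using RubyGems' own version classes. The `">= 2.9.1"` form is an assumption about how the constraint was written; the post only says that 2.9.1 went into the Gemfile.

```ruby
require "rubygems"

# In a Gemfile the constraint would read something like (assumed form):
#   gem "mysql", ">= 2.9.1"
minimum = Gem::Requirement.new(">= 2.9.1")

old_gem = Gem::Version.new("2.8.1")  # version on the live installs
new_gem = Gem::Version.new("2.9.1")  # version on the fresh install

puts minimum.satisfied_by?(old_gem)  # false
puts minimum.satisfied_by?(new_gem)  # true
```

So an install still on 2.8.1 fails the requirement and has to be updated, while anything from 2.9.1 upwards passes.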
posted almost 12 years ago
I think I’ve killed the cache bug! I did a fresh install of Instiki+MySQL on a Debian virtual machine and … no cache bug. So I went back to my live installs and … cache bug. Poking around, I discovered that my fresh install was using version 2.9.1 of the mysql gem but the live installs were using 2.8.1. So, on a hunch, I updated to 2.9.1 and tried reproducing the cache bug … and it had gone. I don’t know if I’ve well and truly killed it, but the steps I put above were reliably showing it for me on the nlab and on mathsnotes. Now that I’ve updated the mysql gem, those steps no longer exhibit it, so I figure that’s enough for a small celebration. |
posted almost 12 years ago
Forum: Instiki – Topic: Feature Requests Thanks. |
posted 12 years ago
Forum: Instiki – Topic: Feature Requests
Okay, so that was a pretty dubious feature request! How about this one: if a page exists (meaning, really exists - not just a redirect) then a request to
(This came up most recently because a Google search for a page led to the |
posted 12 years ago
… and in the time since you asked for clarification, it would appear that you’ve fixed it anyway, as it no longer appears now that I’ve updated instiki. Thanks. |
posted 12 years ago
Distillation for Distler (sorry, …)
|
posted 12 years ago
Forum: Instiki – Topic: Feature Requests
Not sure if this is a bug or a feature request … Google searches now include author information which it tries to glean from the page. It would appear that it uses the “Revised by XYZ” information to do this. It’s been suggested that this is because that is in a div with class name
I’ll report back on whether or not it works. If it does, consider this a feature request for changing |
posted 12 years ago
There would appear to be a bug on pages with apostrophes in their names. See http://nforum.mathforge.org/discussion/4757/apostrophes-in-page-titles-lead-to-weird-behaviour for details. |
posted 12 years ago
I just tried on a completely fresh install and found that it happened just as you described: the name used was the old name (and it gets swept twice). That was a sqlite3 database. I’ve tried to install mysql on my Mac to test this there, but I get a crash when I run instiki. I’m not sure why; it seems to be related to the gem not finding my mysql lib files, but it’s already taken too much of my time to try to fix it and test further. I can try it on my Linux machine later. However, the evidence certainly suggests that there is a difference between mysql and sqlite3 on this one. |
posted 12 years ago
I know you don’t believe in the Cache Bug … I had a page on the nLab which I renamed. Looking at the logs, I see the save command. It ends with:
The next log items are:
There is then a slew of further expirations, the first ones being
I just tried on my course wiki. Here are the steps I took:
Result: the expiration sweep does not include the old name and includes the new name twice. I’m wondering if this could be the culprit:
I notice that in the
Perhaps the |
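To make the symptom concrete, here is a toy sketch of how a rename could end up sweeping the new name twice and the old name not at all. None of this is instiki's code: `Page`, `names_to_sweep_buggy` and `names_to_sweep` are hypothetical names, and the point is only the order in which the name is read.

```ruby
# Toy model only; the names here are hypothetical, not instiki's.
Page = Struct.new(:name)

# Reads the "old" name after the record has been updated, so the old
# name is already lost and the new name is listed twice.
def names_to_sweep_buggy(page, new_name)
  page.name = new_name
  [page.name, new_name]
end

# Remembers the old name before the update, so both cached copies
# (old name and new name) are swept.
def names_to_sweep(page, new_name)
  old_name = page.name
  page.name = new_name
  [old_name, new_name].uniq
end

p names_to_sweep_buggy(Page.new("Old Name"), "New Name")
p names_to_sweep(Page.new("Old Name"), "New Name")
```

The first call matches the result described above (new name twice, old name missing); the second is what one would hope the sweep does.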
posted 12 years ago
Forum: Instiki – Topic: Debugging uninterruptible sleep
Agree completely. The processes that were handling the
My suspicion is therefore that it relates to writing the file to disk for caching. So I suspect that there really is a problem with the hardware and that having several processes trying to write the same file was exposing it. (There was a change in hardware underpinning the VPS recently.) |
posted 12 years ago
Forum: Instiki – Topic: Debugging uninterruptible sleep
I’ve added the timestamps and process ids to the logs and that’s made things a lot clearer. It looks as though it is the “All Pages” request that is clogging up the works, and there appear to be some spiders that don’t respect robots.txt and find “All Pages” fairly early on in their crawl. You’ve mentioned before the possibility of adding a paginate routine to “All Pages”. Would that help me, do you think? Or is it easier just to disable it? (At 7000 pages it’s a bit cumbersome, to say the least, so I’ve no compunction about simply disabling it altogether.)
Incidentally, I found I’d forgotten that bzr doesn’t set permissions so some stuff in |
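For a sense of what paginating “All Pages” would involve, here is a small sketch in plain Ruby. It is not instiki's implementation: `PER_PAGE`, `paginate` and `page_count` are made-up names, and 100 names per page is an arbitrary assumption.

```ruby
PER_PAGE = 100  # assumed page size, chosen arbitrarily

# Return the page names for the given 1-based page number
# (an empty list if the page number is out of range).
def paginate(names, page, per_page = PER_PAGE)
  names.sort.each_slice(per_page).to_a[page - 1] || []
end

# Number of pages needed to list `total` names.
def page_count(total, per_page = PER_PAGE)
  (total + per_page - 1) / per_page
end

puts page_count(7000)  # 70
```

At 7000 pages that would mean 70 short listings instead of one very long one, so a spider that hits “All Pages” only costs one small request at a time.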
posted 12 years ago
Forum: Instiki – Topic: Debugging uninterruptible sleep I strongly doubt that it is due to instiki itself, but it would be nice to isolate exactly what causes the problem. Thanks for the link. After watching To help in debugging this, I’m going to add the process pid to the logging messages as that’ll make it easier to link what I see in |
posted 12 years ago
Forum: Instiki – Topic: Debugging uninterruptible sleep I’m getting a slew of processes getting into “uninterruptible sleep” and staying there. They sit there, eating CPU and memory, until the system slows down enough that folks complain. Do you happen to know how to debug these? From what I’ve read, this is likely to be something getting stuck on I/O. One thing that occurred to me was that all the processes are logging to the same file. Could they get stuck in some sort of queue for that? (Also from my reading around, it would appear that the root cause of this is more likely to be at the kernel end, and thus an issue with drivers and hardware, than with instiki. Still, I’d like to know what it is that is triggering the sleep.) |
posted 12 years ago
And another one, this time in how maruku parses its meta-data. It would seem that not leaving a space at the end of Presumably this is the case here as well:
I get:
in the source. |
posted 12 years ago
I’m unable to run the inbuilt SVG editor on my computer (running Mac OS X, Lion). The window launches but none of the icons are present and although the buttons highlight when I hover over them, nothing happens when I click on one. This is with Firefox, Chrome, and Safari. Not sure what additional information you would like on this. |
posted 13 years ago
Forum: itex2MML – Topic: itex and other languages
I thought it might be worth noting that the nForum (and the other mathematical forums that I run) now use itex directly in PHP. That is (and I probably won’t get the words right on this), I’ve compiled itex2MML into a PHP extension (using swig) and am calling that now instead of farming the conversion off to the nLab. I’ve also installed MathJax to support (as best I can) non-compliant browsers. |
posted 13 years ago
Forum: Instiki – Topic: Feature Requests
I was checking this forum to see if you’d posted a notice that you’d fixed the bug, but I didn’t see that you’d edited your previous comment rather than posting a new one, and at the start of the week I don’t always click through links or check RSS feeds. All of my instiki installations are now up to date. Thanks. |
posted 13 years ago
Forum: Instiki – Topic: Feature Requests Wrong thread, then! |
posted 13 years ago
edited 13 years ago |
Forum: Instiki – Topic: Feature Requests
Take a look at http://ncatlab.org/nlab/show/Sandbox (feel free to ignore the request, I put that there to ensure that no-one messed with them before you’d seen it). Each theorem has
Does the same issue happen here? Let’s experiment:
Theorem. The first letter of the English alphabet is A.
Theorem. The second letter of the English alphabet is B.
Theorem. The third letter of the English alphabet is C.
Theorem. The fourth letter of the English alphabet is D.
Theorem. The fifth letter of the English alphabet is E.
Theorem. The sixth letter of the English alphabet is F.
Theorem 1 refers to A, Theorem 3 to C, and Theorem 6 to F.
Yes, so it’s the same here. Thus, no need to check the Sandbox. Here’s the source of what I just typed:
|
posted 13 years ago
Forum: Instiki – Topic: Feature Requests
The issue of needing ids to keep theorem numbers in step has been raised (again). Mike came up with a suggestion which seems reasonable on first reading: instiki (or whichever part of it is appropriate) adds an automatic id if one isn’t present, much in the way that the table of contents part does. So if I write
then instiki adds an automatic id, say |
posted 13 years ago
Forum: Instiki – Topic: Feature Requests
I, for one, would be against this. It would actually hinder collaboration as everyone would have to learn the local conventions every time they wanted to edit a page, and stuff that worked on one wouldn’t work on another. |
posted 13 years ago
Forum: Instiki – Topic: Feature Requests (Hopefully minor) feature request: this came up in a discussion on citing the nLab. It would be convenient to have the current revision number displayed on the page somewhere obvious. Perhaps the footer could read:
I know it’s easy to deduce - take the number after “Back in time” and add 1 - which is why I said “convenient” rather than anything stronger. |
posted 13 years ago
The “save”s were all due to one bot and none actually made it to the database. |
posted 13 years ago
Okay, so looking through the week’s log for bots (bot, spider, crawler), I get 33,517 hits (actual time period: 11th December 6:25am to 16th December 11:27am, so that’s an average of a little over 4 hits per minute). These break down as follows:
There are a few that I’ve missed out in between - there are clearly some bad links to the nlab. I’d say that only
Next is to analyse how those are distributed. |
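The “little over 4 hits per minute” figure can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted in the post (33,517 hits, 11th December 6:25am to 16th December 11:27am); `minutes_into_month` is a helper name made up for the purpose, and the year is deliberately left out since the post doesn't give one.

```ruby
HITS = 33_517  # bot hits counted in the week's log

# Minutes elapsed since the start of the month for a given day and time.
def minutes_into_month(day, hour, minute)
  (day * 24 + hour) * 60 + minute
end

elapsed = minutes_into_month(16, 11, 27) - minutes_into_month(11, 6, 25)
rate = HITS.to_f / elapsed

puts elapsed        # 7502
puts rate.round(2)  # 4.47
```

So the log covers 7502 minutes and the average works out at roughly 4.5 bot hits per minute.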
posted 13 years ago
Any thoughts on the following idea? From time to time, the nLab gets a whole host of spiders and other bots crawling all over it. While I understand that they’re part of what makes the internet work, they can be a bit annoying and slow down the server for everyone else.
So I thought of channelling requests a little more cleverly than I currently do. At the moment, I use a global queue in passenger which is fine until all the slots get a slow request. So what I thought was to have a semi-global queue with slow requests (like feeds and lists) and bots being handled by a few dedicated processes, normal requests by some others, and maybe a “priority” list as well.
Since passenger doesn’t do this itself (it either has global queue or individual queues) I think that what I’d have to do is to have three virtual versions of the nLab, at least as far as apache and passenger are concerned. Then apache would examine the request and classify it according to which type it was and send it to the right version of the nLab. Passenger wouldn’t know that these are the same so would have a global queue for each, and that way requests get segregated and so don’t hold up others in other segments. The way that I’d have three virtual versions is simply with symlinks in the filesystem: “nlab”, “nlabPriority”, and “nlabSlow” would all be symlinks to the same instiki installation.
Can you see any immediate problems with that? As far as Instiki is concerned, it’s just like being run under passenger as there will be multiple instances of instiki running concurrently, which is what already happens. So that shouldn’t be affected. Apache, also, eats this sort of thing for breakfast, and passenger can cope with different programs as well. So I don’t see an immediate flaw. (Of course, it may be that this won’t solve the blockage, but it’s less drastic than moving servers which is the other option.) |
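The classification step might look something like the sketch below, written in Ruby purely to pin down the logic (in practice it would live in the apache configuration). The three target names are the symlinks mentioned above; the patterns `BOT_PATTERN`, `SLOW_PATH` and `PRIORITY_PATH`, and the choice of which paths count as slow or priority, are assumptions made for illustration.

```ruby
# Assumed patterns, for illustration only.
BOT_PATTERN   = /bot|spider|crawler/i            # user agents treated as bots
SLOW_PATH     = %r{\A/nlab/(list|feed)\b}        # hypothetical slow requests
PRIORITY_PATH = %r{\A/nlab/(edit|save)/}         # hypothetical priority requests

# Decide which "virtual" nLab a request should be sent to.
def classify(path, user_agent)
  return "nlabSlow"     if user_agent.to_s =~ BOT_PATTERN
  return "nlabSlow"     if SLOW_PATH.match?(path)
  return "nlabPriority" if PRIORITY_PATH.match?(path)
  "nlab"
end

puts classify("/nlab/show/HomePage", "Mozilla/5.0")
puts classify("/nlab/show/HomePage", "Googlebot/2.1")
puts classify("/nlab/list", "Mozilla/5.0")
```

Bots are checked first so that a spider requesting an otherwise ordinary page still lands in the slow queue.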