Software Forum - Recent Posts by distler

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: nlab

Given the number of links that Urs has on some of the nLab pages…

It’s the number of inbound links that matters, but yeah.

If the page name doesn’t change, surely then you don’t have to expire any of the pages that refer to it?

You do, for a newly-created page… but not, I agree, for a revision of an existing page. I was, somewhat crudely, not distinguishing between those cases. It occurs to me that I can use an after_create hook to distinguish between those cases.

Thanks.

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: nlab

It seems to be the list and recently_revised ones that get expired several times in a row.

That would be a consequence of

When a page is saved, expire all pages that reference that page.
When you expire a page, also expire the corresponding “index pages” (list, recently-revised, atom feeds).

The first is further-complicated by the facility for renaming pages. That means we need to expire all the pages that refer to the old page and all the pages that refer to the new page.

I guess that could be optimized better for the case where the page doesn’t change names, as we don’t have to expire the same pages twice. I think the current procedure was motivated by complaints (from y’all) that, in some circumstances, pages were not being expired when they should.
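The optimization mentioned above can be sketched in plain Ruby. This is only an illustration, not Instiki's sweeper: the `REFERENCES` table, the `pages_to_expire` helper and the page names are all made up for the example, and the index pages are left out. Collecting names into a `Set` means that, when the page keeps its name, the referring pages are looked up and expired once rather than twice.

```ruby
require 'set'

# Hypothetical reference table: page name => names of pages that link to it.
REFERENCES = {
  'Foo'     => ['Home', 'Index'],
  'Foo Bar' => ['Index', 'Sandbox']
}.freeze

# Collect the pages whose cached copies should be expired when a page is saved.
# old_name and new_name differ only when the page was renamed.
def pages_to_expire(old_name, new_name)
  names = [old_name, new_name].uniq # a single lookup when the name is unchanged
  expired = Set.new(names)
  names.each { |name| expired.merge(REFERENCES.fetch(name, [])) }
  expired
end

pages_to_expire('Foo', 'Foo')     # Foo plus the pages that refer to it
pages_to_expire('Foo', 'Foo Bar') # both names plus the referrers of each
```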

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Instiki – Topic: nlab

On my machine, each Expired fragment takes 0.1-0.2 ms. So to get 10-20 s, you’re talking about expiring $\sim 10^{5}$ fragments?!

Wow! That’s impressive.

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: Bugs

I’m a little baffled. It should work. Sometimes it does (saving the file triggers the cache sweeper); sometimes it doesn’t. I can’t see what the difference is.

Will have to investigate further …

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Instiki – Topic: Feature Requests

I feel that there is a qualitative difference between a wikilink and a hyperlink

Indeed, there is. And that’s why there are no anchors allowed.

If page “Foo” has multiple WikiLinks to [[Bar]], there’s still only one corresponding entry in the database. That could no longer be true if “Foo” could link to [[Bar]] and to [[Bar#baz]].

What you want is a hyperlink, and Markdown provides a syntax for hyperlinks which permits anchors.

Update:

On reflection, I suppose that allowing the presence of anchors doesn’t strictly conflict with having just one entry in the database. I should think some more …
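To illustrate the point in the update: one way anchors could coexist with a single database entry per linked page is to split the anchor off before recording the reference. This is a hypothetical sketch; the `WIKILINK` pattern and the `referenced_pages` helper are invented here, and Instiki's real WikiLink syntax has more cases than this pattern handles.

```ruby
# Hypothetical pattern: [[Page]] or [[Page#anchor]].
WIKILINK = /\[\[([^\]#|]+)(?:#([^\]|]+))?\]\]/

# Return the distinct page names referenced by the text, ignoring any anchors,
# so that [[Bar]] and [[Bar#baz]] count as one reference to "Bar".
def referenced_pages(text)
  text.scan(WIKILINK).map { |page, _anchor| page.strip }.uniq
end

referenced_pages('See [[Bar]], [[Bar#baz]] and [[Qux]].')
```

The anchor would still be available at render time, but it would not need to appear in the table of references.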

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Instiki – Topic: Bugs

Because of that last, my guess is that the pre-upload version is still in the cache, but that the page shown when the file is uploaded doesn’t read the cache version.

You’re probably correct. The rule is that pages with Flash messages on them (like the one that tells you that the file was successfully uploaded) are not cached. So you get to see the correct page once, but if the incorrect one wasn’t deleted from the cache, that’s what you’ll see the second time.

Fixed in ~~Revision 756~~ Revision 757.

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Heterotic Beast – Topic: Rails 3.1.0

I upgraded Heterotic Beast to Rails 3.1.0. Despite all my prior testing, the process didn’t go as smoothly as I would have liked, and this forum was pretty disrupted for most of Friday.

Should be back to normal now. But leave a comment here, if something’s still broken for you.

The main new feature is the Asset pipeline, which supposedly speeds the delivery of static files (CSS, JavaScript, and images). Unfortunately, the result seems buggy.

The reference
```
"#{asset_path('something.png')}"
```
sometimes turns into (the correct)
```
"/forum/assets/something-5c4374aa4b1911ebbabb73883b3cd5c0.png"
```
and sometimes it turns into (the incorrect)
```
"/assets/something-5c4374aa4b1911ebbabb73883b3cd5c0.png"
```
I’m using some Apache-fu to redirect the latter, but that shouldn’t be necessary.
I had to switch to Sass (from .css.erb) to get URLs for background images to include the fingerprint. I.e., within a .css.erb file, the above generates
```
"/assets/something.png"
```

Other minor bugs include:

The acts_as_state_machine gem uses some deprecated methods, which generate a warning in the User model. There’s a Rails 3.1 fork which fixes the problem. But it’s unclear when, if ever, that will be released as a gem.
Prototype 1.7.0 generates a JavaScript error:
```
 Error: mismatched tag. Expected </link>.
 Source File:
 <div xmlns="http://www.w3.org/1999/xhtml"><link></div>
```
The problem is in the LINK_ELEMENT_INNERHTML_BUGGY function, where you could replace <link> with <link/> to silence the error. This was not a problem in Prototype 1.6.x.

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: Bugs

Of course they are different.

“<a> </a>” is an “a” element with a single child node, which is a text node, containing “ ”.
“<a></a>” is an “a” element with no children.

In X(HT)ML, the latter is equivalent to “<a/>”.

The short-tag construction does not exist in HTML, and all browsers interpret “<a/>” as the opening tag, “<a>”, of an “a” element (which is not what you want).
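The XML side of this can be checked with an XML parser. The sketch below assumes Ruby's REXML library is available and uses a made-up helper, `child_count`, to count the child nodes of the root element.

```ruby
require 'rexml/document'

# Count the child nodes of the root element of an XML fragment.
def child_count(xml)
  REXML::Document.new(xml).root.children.size
end

child_count('<a> </a>') # one child: a text node holding a single space
child_count('<a></a>')  # no children
child_count('<a/>')     # no children; to an XML parser this is the same element
```

An HTML parser gives a different answer for the third case, which is the whole problem.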

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: Bugs

<a id="anchor"> </a>

Here's a [[wikilink]]

would not have triggered the bug. Only empty elements (which get converted to short-tag syntax, <a id="anchor"/>, in the output) triggered this bug. Since you probably don’t want empty a or code elements (they are perfectly correct in XHTML, but wreak havoc when the same document is parsed as HTML), you probably didn’t want the problematic (if you prefer that to useless) empty elements in the first place.

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: Bugs

Same thing happen(ed) when you typed (the equally useless)

<code></code>


Here's a [[wikilink]].

Fixed in Revision 744.

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Heterotic Beast – Topic: Bugs

No it doesn’t.

Deleting the only post in a topic also deletes the topic. Possibly, the redirect (which normally goes to topic#show but, in this case, should go to the forum#show because the topic no longer exists) is incorrect.

Fixed.

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Instiki – Topic: nlab

Thinking about Recently Revised and All Pages, you suggested (somewhere) taking them out of the sweeper as a way of stopping them being regenerated every time a page is edited (I don’t know if this was one of your “If you’re going to do something crazy, here’s a way of limiting how crazy you’re going to be” suggestions or if you thought this was actually a good idea).

The former. You’re trading off workload on the server for stale data. Since computers are supposed to serve humans, rather than the other way around, the question is: does this improve the user experience?

Say you implement the above suggestion. On the one hand, the user always (or almost always, depending on implementation) receives the cached page, i.e. gets a quick response. On the other hand, the data is invariably stale.

Leaving these alone, the user is guaranteed to receive fresh data, but there could be a significant delay, if the page has to be regenerated. What percentage of requests for these pages hit the cache?

A better solution is to pull in the will_paginate gem, and paginate the data returned. That makes the request $O(1)$, again, instead of $O(N)$. So the user gets both a quick response AND up-to-date data.
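The idea behind pagination can be shown without the gem. The `paginate` helper below is a hypothetical, in-memory stand-in; in the real application the slicing would presumably be done by the database rather than on a Ruby array, so that only one page's worth of rows is ever fetched and rendered.

```ruby
# Hypothetical helper: return one page of an already-ordered collection.
# The work done per request depends on per_page, not on the total number of items.
def paginate(items, page:, per_page: 30)
  offset = (page - 1) * per_page
  items[offset, per_page] || []
end

pages = (1..95).map { |i| "Page #{i}" }
paginate(pages, page: 1, per_page: 30) # the first 30 names
paginate(pages, page: 4, per_page: 30) # the remaining 5 names
```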

As to moving away from Maruku, to some peg-markdown-based clone, note that

This will benefit Heterotic Beast as well.
The task I’m asking of your guys is not “programming,” per se. It involves writing a formal PEG grammar for Maruku’s extended Markdown syntax (starting with the existing PEG grammar, already in peg-markdown).
On the other hand, if there were someone handy in C, that would be most appreciated, too, because I am crappy at C. Hooking in itex2MML would take mere minutes for a competent C-programmer.
I imagine that such a fork of peg-markdown (as it’s written in C), would be useful in your other projects, as well.

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: nlab

Did a lot of profiling this weekend, and produced a few tweaks to Maruku’s parsing, which speeded it up a little.

Unfortunately, the main discovery was that (with that test page as input), 3/4 of Maruku’s time is spent in the #to_html output method; only 1/4 is spent in parsing the original input. Thus, my efforts, which maybe improved the parsing speed by 5%, contributed at best a 1% speedup in the total Instiki processing time, i.e., something you would never notice.
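The arithmetic is just Amdahl's law. As a rough check, assuming “5% faster parsing” means the parsing step runs 1.05 times as fast and that parsing is 1/4 of the run time (the `overall_speedup` helper is made up for this example):

```ruby
# Overall speedup when `fraction` of the run time is made `factor` times faster
# (Amdahl's law).
def overall_speedup(fraction, factor)
  1.0 / ((1.0 - fraction) + fraction / factor)
end

overall_speedup(0.25, 1.05) # about 1.012, i.e. roughly a 1% gain overall
```

Even an infinitely fast parser would only give `overall_speedup(0.25, Float::INFINITY)`, a factor of about 1.33, as long as #to_html is untouched.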

I hope that one of your guys finds formal grammars sufficiently “categorical” to be worthy of a small bit of their attention.

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: nlab

Pretty much everything beyond the standard Markdown syntax needs to be written, though some folks in the peg-markdown network seem to have included Michel Fortin’s Markdown-Extra extensions (albeit, along with some of their own, incompatible, extensions).

Presumably, this fork of peg-markdown would be directly linked to the itex2MML functions which process inline and display equations (ie, it would not use the Ruby bindings provided by the itextomml gem). So there’s a little bit more to do than write the peg grammar. But not a lot more …

As to whether you want to ask someone to do this, I’ve already explained my desire to replace Maruku (for licensing reasons). Here’s another motivation, from efficiency. Instiki’s performance does genuinely suck, in this instance.

The question is, are they (your nlab colleagues) willing to do anything about improving it?

Feel free to point them to this discussion, and to the previous one on Markdown alternatives.

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Instiki – Topic: nlab

On golem (the iMac on my desk), rendering a copy of that page takes

Completed in 27588ms (View: 27356, DB: 192)

Note that it does spend (what I consider to be) a significant amount of time querying the database, but it is totally dwarfed by the rendering time. (I don’t know why yours is spending an order of magnitude longer querying the database. Seems like something’s very wrong, there, even though the conclusion is the same.)

The page, of course, contains a number of markup errors, like

... [[transversal map]s ...

which sent Maruku into convulsions. Surprisingly, correcting those errors did not appreciably affect the rendering time (the above-reported time is after making the relevant markup corrections).

On my laptop, a typical time was

Completed in 48775ms (View: 48635, DB: 77)

SQLite3 is faster than MySQL, but the machine itself is significantly slower than the iMac.

Of those 49 seconds, spent rendering the page, 43 of them were spent in Maruku (for obscure reasons, Maruku has to be run twice, so really, we’re talking about 22 seconds to process the 175KB source).

Maruku doesn’t particularly care about the number of WikiLinks, so that has nothing to do with why it takes so long to render this particular page.

Of the remaining 6 seconds, 4 seconds were spent in the Instiki Sanitizer. I don’t think there’s much to optimize there.

The remaining 2 seconds were, largely, spent in the Chunk-Handler – the thing that processes Wikilinks (and, presumably, cares about how many of them there are). 2 seconds is still a long time, but it’s not surprising. Doing on the order of $10^{3}$ RegExp substitutions (5360 chunk-masking and 686 chunk-unmasking operations, to be precise) on a 175KB string, takes significant time. Using Regexps to process long strings sucks.
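For anyone unfamiliar with the idea of chunk masking, here is a toy version in plain Ruby. It is not Instiki's Chunk-Handler code: the token format, the `/show/` URL and both helper names are invented for the illustration. Each WikiLink is swapped for an opaque token, so that the Markdown processor leaves it alone, and the rendered link is swapped back in afterwards.

```ruby
# Hypothetical masking step: replace each [[WikiLink]] with a numbered token
# and remember what was removed.
def mask_wikilinks(text)
  chunks = []
  masked = text.gsub(/\[\[([^\]]+)\]\]/) do
    chunks << Regexp.last_match(1)
    "chunk#{chunks.size - 1}chunk"
  end
  [masked, chunks]
end

# Hypothetical unmasking step: replace each token with a rendered link.
def unmask_wikilinks(text, chunks)
  text.gsub(/chunk(\d+)chunk/) do
    name = chunks[Regexp.last_match(1).to_i]
    %(<a href="/show/#{name}">#{name}</a>)
  end
end

masked, chunks = mask_wikilinks('See [[Foo]] and [[Bar]].')
html = unmask_wikilinks(masked, chunks)
```

Every substitution of this kind has to scan the string, which is why thousands of them on a 175KB page add up.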

I have looked at various optimizations of the Chunk-Handler code, but nothing I can do will contribute much to the speedup of rendering this page, which is dominated by Maruku.

Now, if one of your nLab folks were to volunteer to write a PEG grammar for Maruku’s extended Markdown syntax, …

Update:

Now, if one of your nLab folks were to volunteer to write a PEG grammar for Maruku’s extended Markdown syntax, …

Since I’m not gonna hold my breath for that to happen, I decided to spend some time (alas, more than I expected) making Maruku faster. The new rendering times for that page are

Completed in 13228ms (View: 13024, DB: 198)

on golem and

Completed in 21666ms (View: 20979, DB: 83)

on my laptop. Roughly a factor of 2 improvement in the total rendering time, in both cases.

Still not great, but it’s the best that I am going to achieve.

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Heterotic Beast – Topic: Bugs

It would be more sensible if it took me to the point where I last read up to (which I presume it knows).

How does Vanilla keep track of what posts you’ve read?

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: nlab

One thing that I am pretty sure that slows down a page load is if the page has a lot of wikilinks on it. I don’t know how it checks all the links, but is there some way that that could be speeded up?

Could you compare (by deleting the cached page and reloading) how long it takes to build a wikilink-heavy page, versus a “normal” one?

I’m particularly interested in the database-lookup times. As I said via email, the WikiReferences model uses a lot of raw SQL queries (which, therefore, do not benefit from ActiveRecord caching). But I am a little skeptical that is the cause of much of a slowdown.

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Instiki – Topic: nlab

It keeps 25 files, of size 1MB each.

Both of these numbers are configurable.

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Instiki – Topic: nlab

I wouldn’t count this morning’s crash as anything special.

You seem to be inured to the idea of Instiki crashing. I am not. It shouldn’t crash, and there’s something wrong if it does. I’m not even convinced that my PassengerPoolIdleTime theory explains the phenomenon. Let’s see if it can go a week without hiccup.

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: Feature Requests

As far as I can tell, updating the application files on a running Rails application (in production mode) has no effect, until the application is restarted. I, honestly, haven’t thought about the bundled Gems, but I expect the answer is the same.

In any case, ruby bundle is pretty fast (of course, that’s because it’s actually superfluous) if the Gemfile hasn’t changed. I suppose you can stat the Gemfile to see whether you actually need to run ruby bundle at all.
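A minimal sketch of the stat check, assuming the update script keeps a stamp file that it touches after each successful bundle run. The stamp file and both helper names are hypothetical; nothing like this ships with Instiki.

```ruby
require 'fileutils'

# Hypothetical check: has the Gemfile been modified since the stamp file
# was last touched? With no stamp file, assume a bundle run is needed.
def bundle_needed?(gemfile, stamp)
  return true unless File.exist?(stamp)
  File.mtime(gemfile) > File.mtime(stamp)
end

# After a successful bundle run, record when it happened.
def mark_bundled(stamp)
  FileUtils.touch(stamp)
end
```

An update script would then run ruby bundle only when `bundle_needed?('Gemfile', '.bundle_stamp')` is true, and call `mark_bundled('.bundle_stamp')` afterwards.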

But I don’t think that’s your issue…

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Instiki – Topic: Feature Requests

Given that instiki can be installed as a gem …

Instiki cannot be installed as a Gem.

There’s an old (~0.10.x) version, which worked as a Gem, and which is probably still floating around (on the internets, nothing ever really disappears). But that was long before my time, and I have not even thought about packaging the current version as a Gem.

Of course, under Passenger, you can run multiple instances of a Rails application (including Instiki), under different subdirectories (or subdomains, if you have virtual hosts enabled).

(At least with Instiki, that would require separate copies of the code, as each instance would have to point to its own database (in config/database.yml). I suppose one could use soft-links astutely, so that there was really only one copy of the source code, shared by these different instances.)

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Instiki – Topic: nlab

Let’s keep it “off” for the time-being; I don’t have any plans to change anything for the next week, at least.

Do you have any other scripts/cron-jobs running?

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: Feature Requests

Something that is a “dummy” is not necessarily “dumb”. A “dummy” simply means a fake …

I am familiar with the usage (there’s not a UK/US distinction).

I was making a lame attempt at humour, whilst making the serious point that these CSS classes are used for styling (generated content), and as structural hooks (for converting the \ref{}s into hyperlinks), hence are not “dummy” in either sense.

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Heterotic Beast – Topic: Bugs

Hmmm. /users/nnn/posts?monitored=true doesn’t seem to return anything, here on Golem. But it works fine on my test installation.

Both are running in production mode. The test installation uses sqlite3; this one uses mysql. The (admittedly complicated) SQL join seems not to work on the latter.

Hah! Fixed, now. Boolean comparisons are not the same in SQLite3 and MySQL. Finding a syntax that works in both was … umh … fun.

posted almost 14 years ago

distler 123 posts

Forum: Heterotic Beast – Topic: Feature Requests

Hmm. I think tagging a post as having been edited, after the grace period, should suffice.

See what you think.

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: Feature Requests

…dummy CSS classes.

I bridle, only at referring to these as “dummy.” I thought that the whole mechanism I invented with those classes was very clever. And even more versatile than I had originally envisioned.

Not “dummy” at all….

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: Feature Requests

I’m not sure I understand. If an element has an explicit id, because you gave it a {: #foo} IAL, then you can refer to it as [fubar](#foo). \ref{...} is used for linking to things with generated numbers (like theorems, ‘n such).

I think your motivation is to have something that converts to LaTeX. \ref{foo} converts just fine. There are two issues, if I understand correctly.

The IAL on the list-item doesn’t get converted to a \label{foo} on the \item in the LaTeX output.
For the XHTML output to behave as you would like, you need to invoke Maruku’s internal counter, so that \ref{foo} is converted to a hyperlink, with anchor-text being the number of the list-item.

The latter is a bit awkward, as there can be several ordered-lists on a page. We’d need a separate counter for each one, no?

posted almost 14 years ago

distler 123 posts

edited almost 14 years ago

Forum: Heterotic Beast – Topic: Feature Requests

With the anonymous posting, then I can specify it on a “per category” basis in Vanilla. The hierarchy in Vanilla is:
Forum: Category: [Subcategory:...] Discussion: Post

I think this maps onto

Site: Forum: Topic: Post

with the proviso that (at least as currently implemented) “Sites” need to live on separate subdomains.

So you’re talking about allowing anonymous postings on a per-forum basis?

To avoid spam, the person has to solve a reCaptcha to post.

There’s some captcha mechanism built-in (but disabled) in Beast. Will have to explore …

Behind the scenes, there is a “guest” user and the software logs in the guest user, posts the post, and then logs out again. All such posts get authored by “Guest” and (this is less than ideal) …

I suppose just creating a “Guest” user, with a blank password, would not suffice, as “Guest” could then post to any forum.

Regarding “editable” posts, I’d go for “created_at”. A discussion is a linear thing, and most of the time edits will be for minor typos which certainly shouldn’t change the order. Deciding on the difference between minor and major is a human thing.

If you wanted the best of both worlds, an edit to a post could insert a line at the relevant time point saying “Post X was edited at …”. That wouldn’t disrupt the flow of the conversation but would signal that someone had potentially thrown a stone in to the water.

Or simply tag the post, itself, as having been “Edited at …”

Maybe best of all would be a 5-minute grace period: if updated_at time - created_at time > 5 minutes, then include such a tag.

… Like on this post.
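The grace-period rule amounts to a one-line comparison. A sketch in plain Ruby, where the helper name and the example timestamps are made up and `created_at` / `updated_at` are taken to be ordinary Time values:

```ruby
GRACE_PERIOD = 5 * 60 # five minutes, in seconds

# Hypothetical helper: should the post be displayed with an "edited" tag?
def show_edited_tag?(created_at, updated_at)
  updated_at - created_at > GRACE_PERIOD
end

posted = Time.utc(2011, 9, 1, 12, 0, 0) # arbitrary example time
show_edited_tag?(posted, posted + 120)  # false: edited within the grace period
show_edited_tag?(posted, posted + 600)  # true: edited ten minutes later
```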

posted almost 14 years ago

distler 123 posts

Forum: Instiki – Topic: Bugs

Ah, that’s unfortunate. What other markdown engines are available for Ruby?

There are several.

There’s rdiscount, based on discount.
There’s BlueCloth, which was the “original” Markdown interpreter for Ruby, which sucked bigtime. But Bluecloth 2.x has been rewritten to use discount.
There’s rpeg-markdown, based on peg-markdown.

The latter is probably the most promising. One “just” needs to write a PEG grammar for Maruku’s extended Markdown syntax, and then drop it in as a replacement.

“Just” …

posted almost 14 years ago

distler 123 posts

Forum: Heterotic Beast – Topic: Feature Requests

Andrew Stacey wants:

The ability to selectively allow anonymous posts (ie, posts by unregistered users). It’s not clear whether he wants that on a per-forum basis, or on a per-topic basis. It’s also not clear who gets to decide (moderator or admin).
Themes. I think themes_for_rails looks promising.
More fine-grained access-controls. Apparently, admin/moderator/user(/anon) is insufficiently fine-grained.

Other things that bear looking at:

The Signup process.
Since posts are editable, should they be ordered by updated_at instead of by created_at dates?