Recent Posts by distler

81 posts found

posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Instiki – Topic: nlab

Thinking about Recently Revised and All Pages, you suggested (somewhere) taking them out of the sweeper as a way of stopping them being regenerated every time a page is edited (I don’t know if this was one of your “If you’re going to do something crazy, here’s a way of limiting how crazy you’re going to be” suggestions or if you thought this was actually a good idea).

The former. You’re trading off workload on the server for stale data. Since computers are supposed to serve humans, rather than the other way around, the question is: does this improve the user experience?

Say you implement the above suggestion. On the one hand, the user always (or almost always, depending on implementation) receives the cached page, i.e. gets a quick response. On the other hand, the data is invariably stale.

Leaving these alone, the user is guaranteed to receive fresh data, but there could be a significant delay, if the page has to be regenerated. What percentage of requests for these pages hit the cache?

A better solution is to pull in the will_paginate gem, and paginate the data returned. That makes the request O(1), again, instead of O(N). So the user gets both a quick response AND up-to-date data.
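To make the idea concrete, here is a minimal plain-Ruby sketch of pagination. It does not use the will_paginate API; the `paginate` helper, the `ALL_PAGES` array and the page size of 25 are all made up for illustration. The gem does the equivalent slicing in the database (LIMIT/OFFSET), which is where the real saving comes from.

```ruby
# Hypothetical stand-in for the list of page names; in Instiki this
# would come from the database, not from an in-memory array.
ALL_PAGES = (1..1000).map { |i| "Page #{i}" }

# Return one "page" of results plus the total number of pages.
# per_page is fixed, so the amount of data rendered per request
# stays the same however large the wiki grows.
def paginate(items, page, per_page = 25)
  page = [page.to_i, 1].max                    # treat bad input as page 1
  offset = (page - 1) * per_page
  total_pages = (items.size.to_f / per_page).ceil
  { :entries => items[offset, per_page] || [],
    :page => page,
    :total_pages => total_pages }
end

result = paginate(ALL_PAGES, 2)
puts result[:entries].first   # first entry shown on page 2
puts result[:total_pages]
```

Only the requested slice is rendered, so the cached-versus-fresh trade-off mostly goes away.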

As to moving away from Maruku, to some peg-markdown-based clone, note that

  1. This will benefit Heterotic Beast as well.
  2. The task I’m asking of your guys is not “programming,” per se. It involves writing a formal PEG grammar for Maruku’s extended Markdown syntax (starting with the existing PEG grammar, already in peg-markdown).
  3. On the other hand, if there were someone handy in C, that would be most appreciated, too, because I am crappy at C. Hooking in itex2MML would take mere minutes for a competent C-programmer.
  4. I imagine that such a fork of peg-markdown (as it’s written in C), would be useful in your other projects, as well.
 
posted 3 years ago
distler 81 posts

Forum: Instiki – Topic: nlab

Did a lot of profiling this weekend, and produced a few tweaks to Maruku’s parsing, which speeded it up a little.

Unfortunately, the main discovery was that (with that test page as input), 3/4 of Maruku’s time is spent in the #to_html output method; only 1/4 is spent in parsing the original input. Thus, my efforts, which maybe improved the parsing speed by 5%, contributed at best a 1% speedup in the total Instiki processing time, i.e., something you would never notice.
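The arithmetic behind that estimate, as a small sketch. It assumes that “improved the parsing speed by 5%” means the parsing step takes 5% less time; the variable names are illustrative.

```ruby
# Fractions of Maruku's run time, as measured on the test page.
parse_fraction  = 0.25
output_fraction = 0.75

# Assumption: a 5% parsing improvement means 5% less parsing time.
parse_gain = 0.05

# Only the parsing share shrinks; the #to_html share is untouched.
new_total = output_fraction + parse_fraction * (1.0 - parse_gain)
overall_gain_percent = (1.0 - new_total) * 100.0

puts format("gain within Maruku: %.2f%%", overall_gain_percent)
```

That comes to about 1.25% of Maruku’s time, and less than that of the whole request, since Maruku is not the only thing Instiki runs.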

I hope that one of your guys finds formal grammars sufficiently “categorical” to be worthy of a small bit of their attention.

 
posted 3 years ago
distler 81 posts

Forum: Instiki – Topic: nlab

Pretty much everything beyond the standard Markdown syntax needs to be written, though some folks in the peg-markdown network seem to have included Michel Fortin’s Markdown-Extra extensions (albeit, along with some of their own, incompatible, extensions).

Presumably, this fork of peg-markdown would be directly linked to the itex2MML functions which process inline and display equations (i.e., it would not use the Ruby bindings provided by the itextomml gem). So there’s a little bit more to do than write the PEG grammar. But not a lot more …

As to whether you want to ask someone to do this, I’ve already explained my desire to replace Maruku (for licensing reasons). Here’s another motivation, from efficiency. Instiki’s performance does genuinely suck, in this instance.

The question is, are they (your nlab colleagues) willing to do anything about improving it?

Feel free to point them to this discussion, and to the previous one on Markdown alternatives.

 
posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Instiki – Topic: nlab

On golem (the iMac on my desk), rendering a copy of that page takes

Completed in 27588ms (View: 27356, DB: 192)

Note that it does spend (what I consider to be) a significant amount of time querying the database, but it is totally dwarfed by the rendering time. (I don’t know why yours is spending an order of magnitude longer querying the database. Seems like something’s very wrong, there, even though the conclusion is the same.)

The page, of course, contains a number of markup errors, like

... [[transversal map]s ...

which sent Maruku into convulsions. Surprisingly, correcting those errors did not appreciably affect the rendering time (the above-reported time is after making the relevant markup corrections).

On my laptop, a typical time was

Completed in 48775ms (View: 48635, DB: 77)

SQLite3 is faster than MySQL, but the machine itself is significantly slower than the iMac.

Of those 49 seconds, spent rendering the page, 43 of them were spent in Maruku (for obscure reasons, Maruku has to be run twice, so really, we’re talking about 22 seconds to process the 175KB source).

Maruku doesn’t particularly care about the number of WikiLinks, so that has nothing to do with why it takes so long to render this particular page.

Of the remaining 6 seconds, 4 seconds were spent in the Instiki Sanitizer. I don’t think there’s much to optimize there.

The remaining 2 seconds were, largely, spent in the Chunk-Handler – the thing that processes Wikilinks (and, presumably, cares about how many of them there are). 2 seconds is still a long time, but it’s not surprising. Doing on the order of 10^3 RegExp substitutions (5360 chunk-masking and 686 chunk-unmasking operations, to be precise) on a 175KB string, takes significant time. Using Regexps to process long strings sucks.
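For readers who have not looked at the Chunk-Handler, here is a rough sketch of what masking and unmasking mean. This is not Instiki’s code: the `WIKILINK` pattern, the `chunkNchunk` token format and the `/show/...` URL are all invented for illustration, and the sketch does each step in a single pass, whereas the numbers above refer to thousands of separate substitutions.

```ruby
# Hypothetical pattern for [[wikilink]] chunks; the real patterns
# are more involved.
WIKILINK = /\[\[([^\]\n]+)\]\]/

# Replace each wikilink with an opaque token, so that the Markdown
# engine never sees it, and remember what each token stood for.
def mask_chunks(text)
  chunks = []
  masked = text.gsub(WIKILINK) do
    chunks << Regexp.last_match(1)
    "chunk#{chunks.size - 1}chunk"
  end
  [masked, chunks]
end

# After the Markdown pass, swap the tokens back for rendered links.
def unmask_chunks(text, chunks)
  text.gsub(/chunk(\d+)chunk/) do
    name = chunks[Regexp.last_match(1).to_i]
    %(<a href="/show/#{name.gsub(' ', '+')}">#{name}</a>)
  end
end

source = "See [[transversal map]] and [[smooth manifold]]."
masked, chunks = mask_chunks(source)
puts masked
puts unmask_chunks(masked, chunks)
```

Every substitution has to scan the string, so the cost grows with both the number of chunks and the length of the page.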

I have looked at various optimizations of the Chunk-Handler code, but nothing I can do will contribute much to the speedup of rendering this page, which is dominated by Maruku.

Now, if one of your nLab folks were to volunteer to write a PEG grammar for Maruku’s extended Markdown syntax, …

Update:

Now, if one of your nLab folks were to volunteer to write a PEG grammar for Maruku’s extended Markdown syntax, …

Since I’m not gonna hold my breath for that to happen, I decided to spend some time (alas, more than I expected) making Maruku faster. The new rendering times for that page are

Completed in 13228ms (View: 13024, DB: 198)

on golem and

Completed in 21666ms (View: 20979, DB: 83)

on my laptop. Roughly a factor of 2 improvement in the total rendering time, in both cases.

Still not great, but it’s the best that I am going to achieve.
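A quick check of the “factor of 2” claim, using the total request times from the log lines quoted above (the hash layout is just for illustration):

```ruby
# Total request times in milliseconds, copied from the log lines.
timings = {
  "golem"  => { :before => 27588, :after => 13228 },
  "laptop" => { :before => 48775, :after => 21666 }
}

# Speedup factor = old time / new time, per machine.
speedups = timings.map do |machine, t|
  [machine, t[:before].to_f / t[:after]]
end

speedups.each do |machine, factor|
  puts format("%-6s %.2fx", machine, factor)
end
```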

 
posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Heterotic Beast – Topic: Bugs

It would be more sensible if it took me to the point where I last read up to (which I presume it knows).

How does Vanilla keep track of what posts you’ve read?

 
posted 3 years ago
distler 81 posts

Forum: Instiki – Topic: nlab

One thing that I am pretty sure slows down a page load is if the page has a lot of wikilinks on it. I don’t know how it checks all the links, but is there some way that that could be sped up?

Could you compare (by deleting the cached page and reloading) how long it takes to build a wikilink-heavy page, versus a “normal” one?

I’m particularly interested in the database-lookup times. As I said via email, the WikiReferences model uses a lot of raw SQL queries (which, therefore, do not benefit from ActiveRecord caching). But I am a little skeptical that is the cause of much of a slowdown.

 
posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Instiki – Topic: nlab

It keeps 25 files, of size 1MB each.

Both of these numbers are configurable.
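If the files in question are rotated log files (an assumption; the post does not say so), the two numbers would correspond to the second and third arguments of Ruby’s standard `Logger.new`. A minimal sketch, with a throwaway temp directory standing in for the real location and with 1MB taken to mean 1024 × 1024 bytes:

```ruby
require 'logger'
require 'tmpdir'

# Assumed values: keep 25 rotated files of roughly 1 MB each.
LOG_FILES_KEPT = 25
LOG_FILE_SIZE  = 1024 * 1024

# Throwaway location, not the application's real log directory.
dir = Dir.mktmpdir
log_path = File.join(dir, "production.log")

# Logger.new(logdev, shift_age, shift_size): once the file reaches
# shift_size bytes it is rotated, and at most shift_age files are kept.
logger = Logger.new(log_path, LOG_FILES_KEPT, LOG_FILE_SIZE)
logger.info("rotation configured")
logger.close
```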

 
posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Instiki – Topic: nlab

I wouldn’t count this morning’s crash as anything special.

You seem to be inured to the idea of Instiki crashing. I am not. It shouldn’t crash, and there’s something wrong if it does. I’m not even convinced that my PassengerPoolIdleTime theory explains the phenomenon. Let’s see if it can go a week without hiccup.

 
posted 3 years ago
distler 81 posts

Forum: Instiki – Topic: Feature Requests

As far as I can tell, updating the application files on a running Rails application (in production mode) has no effect, until the application is restarted. I, honestly, haven’t thought about the bundled Gems, but I expect the answer is the same.

In any case, ruby bundle is pretty fast (of course, that’s because it’s actually superfluous) if the Gemfile hasn’t changed. I suppose you can stat the Gemfile to see whether you actually need to run ruby bundle at all.
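A sketch of the stat-the-Gemfile idea: compare the Gemfile’s modification time with that of a stamp file which is touched after each successful run. The stamp file and both helper names are hypothetical, and the bundle step itself is passed in as a block, so nothing here depends on how it is actually invoked.

```ruby
require 'fileutils'
require 'tmpdir'

# True when the Gemfile has changed since the stamp file was last
# touched, or when there is no stamp file yet.
def bundle_needed?(gemfile, stamp)
  return true unless File.exist?(stamp)
  File.mtime(gemfile) > File.mtime(stamp)
end

# Run the caller-supplied bundle step only when needed, then refresh
# the stamp. Returns true if the step was run.
def bundle_if_needed(gemfile, stamp)
  return false unless bundle_needed?(gemfile, stamp)
  yield
  FileUtils.touch(stamp)
  true
end

# Demonstration in a temporary directory.
Dir.mktmpdir do |dir|
  gemfile = File.join(dir, "Gemfile")
  stamp   = File.join(dir, ".bundle_stamp")
  File.write(gemfile, "source 'https://rubygems.org'\n")
  puts bundle_if_needed(gemfile, stamp) { puts "running bundle step" }
  puts bundle_if_needed(gemfile, stamp) { puts "running bundle step" }
end
```

In a deploy script, the block would be something like `{ system("ruby bundle") }`.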

But I don’t think that’s your issue…

 
posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Instiki – Topic: Feature Requests

Given that instiki can be installed as a gem …

Instiki cannot be installed as a Gem.

There’s an old (~0.10.x) version, which worked as a Gem, and which is probably still floating around (on the internets, nothing ever really disappears). But that was long before my time, and I have not even thought about packaging the current version as a Gem.

Of course, under Passenger, you can run multiple instances of a Rails application (including Instiki), under different subdirectories (or subdomains, if you have virtual hosts enabled).

(At least with Instiki, that would require separate copies of the code, as each instance would have to point to its own database (in config/database.yml). I suppose one could use soft-links astutely, so that there was really only one copy of the source code, shared by these different instances.)

 
posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Instiki – Topic: nlab

Let’s keep it “off” for the time-being; I don’t have any plans to change anything for the next week, at least.

Do you have any other scripts/cron-jobs running?

 
posted 3 years ago
distler 81 posts

Forum: Instiki – Topic: Feature Requests

Something that is a “dummy” is not necessarily “dumb”. A “dummy” simply means a fake …

I am familiar with the usage (there’s not a UK/US distinction).

I was making a lame attempt at humour, whilst making the serious point that these CSS classes are used for styling (generated content), and as structural hooks (for converting the \ref{}s into hyperlinks), hence are not “dummy” in either sense.

 
posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Heterotic Beast – Topic: Bugs

Hmmm. /users/nnn/posts?monitored=true doesn’t seem to return anything, here on Golem. But it works fine on my test installation.

Both are running in production mode. The test installation uses sqlite3; this one uses mysql. The (admittedly complicated) SQL join seems not to work on the latter.

Hah! Fixed, now. Boolean comparisons are not the same in SQLite3 and MySQL. Finding a syntax that works in both was … umh … fun.
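One common way this difference shows up in Rails of that era (an assumption; the post does not say which syntax was settled on): the SQLite3 adapter stored booleans as the strings 't'/'f', while MySQL uses 1/0. The sketch below only illustrates the mismatch. The table and column names are hypothetical, and in a real application one would let the adapter quote the value via a bind parameter rather than build the string by hand.

```ruby
# Assumed boolean literals per adapter.
BOOLEAN_LITERALS = {
  :sqlite3 => { true => "'t'", false => "'f'" },
  :mysql   => { true => "1",   false => "0" }
}

# Build a WHERE fragment with the literal the given adapter expects.
# "monitorships.active" is a made-up column for illustration.
def monitored_condition(adapter, value)
  literal = BOOLEAN_LITERALS.fetch(adapter).fetch(value)
  "monitorships.active = #{literal}"
end

puts monitored_condition(:sqlite3, true)
puts monitored_condition(:mysql, true)
```

A hand-written comparison that hard-codes one of these literals will match rows on one database and nothing on the other.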

 
posted 3 years ago
distler 81 posts

Forum: Heterotic Beast – Topic: Feature Requests

Hmm. I think tagging a post as having been edited, after the grace period, should suffice.

See what you think.

 
posted 3 years ago
distler 81 posts

Forum: Instiki – Topic: Feature Requests

…dummy CSS classes.

I bridle, only at referring to these as “dummy.” I thought that the whole mechanism I invented with those classes was very clever. And even more versatile than I had originally envisioned.

Not “dummy” at all….

 
posted 3 years ago
distler 81 posts

Forum: Instiki – Topic: Feature Requests

I’m not sure I understand. If an element has an explicit id, because you gave it a {: #foo} IAL, then you can refer to it as [fubar](#foo). \ref{...} is used for linking to things with generated numbers (like theorems, ‘n such).

I think your motivation is to have something that converts to LaTeX. \ref{foo} converts just fine. There are two issues, if I understand correctly.

  1. The IAL on the list-item doesn’t get converted to a \label{foo} on the \item in the LaTeX output.
  2. For the XHTML output to behave as you would like, you need to invoke Maruku’s internal counter, so that \ref{foo} is converted to a hyperlink, with anchor-text being the number of the list-item.

The latter is a bit awkward, as there can be several ordered-lists on a page. We’d need a separate counter for each one, no?
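A rough sketch of what “a separate counter for each one” could look like. This is not Maruku’s internal counter mechanism; the data model (each ordered list as an array of item ids, nil for items without an IAL) and both function names are made up.

```ruby
# Map each labelled list item to its number within its own list.
def number_list_items(lists)
  refs = {}
  lists.each do |labels|
    counter = 0                      # fresh counter for every ordered list
    labels.each do |label|
      counter += 1
      refs[label] = counter if label
    end
  end
  refs
end

# Turn \ref{foo} into a hyperlink whose anchor text is the item number.
# Unknown labels are left untouched.
def resolve_refs(text, refs)
  text.gsub(/\\ref\{([^}]+)\}/) do
    label = Regexp.last_match(1)
    refs.key?(label) ? %(<a href="##{label}">#{refs[label]}</a>) : "\\ref{#{label}}"
  end
end

lists = [["foo", nil, "bar"], [nil, "baz"]]
refs = number_list_items(lists)
puts resolve_refs('See \ref{bar} and \ref{baz}.', refs)
```

Note that the number alone does not say which list an item belongs to, which is part of the awkwardness.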

 
posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Heterotic Beast – Topic: Feature Requests

With the anonymous posting, then I can specify it on a “per category” basis in Vanilla. The hierarchy in Vanilla is:

Forum: Category: [Subcategory:...] Discussion: Post

I think this maps onto

Site: Forum: Topic: Post

with the proviso that (at least as currently implemented) “Sites” need to live on separate subdomains.

So you’re talking about allowing anonymous postings on a per-forum basis?

To avoid spam, the person has to solve a reCaptcha to post.

There’s some captcha mechanism built-in (but disabled) in Beast. Will have to explore …

Behind the scenes, there is a “guest” user and the software logs in the guest user, posts the post, and then logs out again. All such posts get authored by “Guest” and (this is less than ideal) …

I suppose just creating a “Guest” user, with a blank password, would not suffice, as “Guest” could then post to any forum.

Regarding “editable” posts, I’d go for “created_at”. A discussion is a linear thing, and most of the time edits will be for minor typos which certainly shouldn’t change the order. Deciding on the difference between minor and major is a human thing.

If you wanted the best of both worlds, an edit to a post could insert a line at the relevant time point saying “Post X was edited at …”. That wouldn’t disrupt the flow of the conversation but would signal that someone had potentially thrown a stone in to the water.

Or simply tag the post, itself, as having been “Edited at …”

Maybe best of all would be a 5-minute grace period: if updated_at time - created_at time > 5 minutes, then include such a tag.

… Like on this post.
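The grace-period rule, as a small sketch. The `Post` struct is a hypothetical stand-in for the real model, and the wording and time format of the tag are invented; only the `updated_at - created_at > 5 minutes` test comes from the discussion above.

```ruby
GRACE_PERIOD = 5 * 60  # seconds

# Hypothetical stand-in for a post record with its two timestamps.
Post = Struct.new(:created_at, :updated_at)

# Return an "Edited at ..." tag only when the edit happened after the
# grace period; quick typo fixes produce no tag.
def edited_tag(post)
  return nil unless post.updated_at - post.created_at > GRACE_PERIOD
  "Edited at #{post.updated_at.utc.strftime('%Y-%m-%d %H:%M')} UTC"
end

created = Time.utc(2011, 1, 1, 12, 0, 0)
p edited_tag(Post.new(created, created + 120))   # edit within the grace period
p edited_tag(Post.new(created, created + 600))   # edit after the grace period
```

Ordering stays on created_at either way; the tag is purely informational.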

 
posted 3 years ago
distler 81 posts

Forum: Instiki – Topic: Bugs

Ah, that’s unfortunate. What other markdown engines are available for Ruby?

There are several.

  1. There’s rdiscount, based on discount.
  2. There’s BlueCloth, which was the “original” Markdown interpreter for Ruby, which sucked bigtime. But BlueCloth 2.x has been rewritten to use discount.
  3. There’s rpeg-markdown, based on peg-markdown.

The last is probably the most promising. One “just” needs to write a PEG grammar for Maruku’s extended Markdown syntax, and then drop it in as a replacement.

“Just”

 
posted 3 years ago
distler 81 posts

Forum: Heterotic Beast – Topic: Feature Requests

Andrew Stacey wants:

  • The ability to selectively allow anonymous posts (ie, posts by unregistered users). It’s not clear whether he wants that on a per-forum basis, or on a per-topic basis. It’s also not clear who gets to decide (moderator or admin).
  • Themes. I think themes_for_rails looks promising.
  • More fine-grained access-controls. Apparently, admin/moderator/user(/anon) is insufficiently fine-grained.

Other things that bear looking at:

  • The Signup process.
  • Since posts are editable, should they be ordered by updated_at instead of by created_at dates?
 
posted 3 years ago
distler 81 posts

edited 3 years ago

Forum: Instiki – Topic: Bugs

Both this request (for “fenced” quotations and lists) and this one, in the other thread, are for extensions to the Markdown syntax in Maruku.

Frankly, I’m very reluctant to spend any time working on Maruku.

The author (who is no longer actively developing the software) insists on a GPL license, which conflicts with the licenses for both Instiki (Ruby) and Heterotic Beast (MIT). Unless he changes his mind (which seems unlikely, as I’ve asked several times), I would prefer to ditch Maruku, in favour of another Markdown engine.

Consequently, I’d rather spend my time extending that engine (whatever it turns out to be). Of course, for the present, I am still going to fix bugs in Maruku.

Update:

Well, OK, I didn’t exactly keep that promise…

 
posted 3 years ago
distler 81 posts

edited 8 months ago

Forum: Instiki – Topic: Shiny new forum

We can type equations

$$\fghighlight{red}{\begin{pmatrix}a&b \\ c&d\end{pmatrix}} \begin{pmatrix}x\\ y\end{pmatrix}=0 \qquad (1)$$

or put them in SVG graphics

[SVG graphic containing the equation $\begin{pmatrix}a&b \\ c&d\end{pmatrix} \begin{pmatrix}x\\ y\end{pmatrix}=0$]

or type some code

require 'chunks/chunk'

# Contains all the methods for finding and replacing wiki links.
module WikiChunk
  include Chunk

  # A wiki reference is the top-level class for anything that refers to
  # another wiki page.
  class WikiReference < Chunk::Abstract
	
    # Name of the referenced page
    attr_reader :page_name

    # Name of the web containing the referenced page
    attr_reader :web_name

    # the referenced page
    def refpage
      @content.web.page(@page_name)
    end

  end
end

and so forth.

Theorem

(Distler’s Theorem). Any given programming task is easier with Rails.