May 16, 2026

Code

I’ve been playing around with Claude Code (Claude Opus 4.7 (1M context)) because, well, who hasn’t?

I am, so far, only moderately impressed by its abilities in physics. The most impressive bit so far was when, in response to a question about nilpotent orbits, it responded

I don’t know. I could waste your time by guessing, but …

I had never had an LLM tell me that it doesn’t know something, much less that it didn’t want to waste my time by making stuff up. So this was positively shocking to read.

Claude Code is, however, unfathomably good at generating code, so I set it the task of modernizing Instiki and Heterotic Beast, my forum software. Both are Rails applications and both have extensive test suites. So they use a software framework Claude is familiar with and have an objective standard for whether the changes made are correct.

When there is no test, however, things can go wildly off the rails (pun intended). For instance, consider the following snippet of Ruby code:


def foo(text)
  ...
  con = text
  ...
  # now mutate con
  ...
  con
end

If text is a frozen string, this will generate an error, as you can’t mutate a frozen string. Obviously, what you should do is write


def foo(text)
  ...
  con = text.dup
  ...
  con
end

which copies the caller’s string to a new unfrozen string which you can mutate to your heart’s content.
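
To make the failure mode concrete, here is a minimal sketch of the two versions. The method names `foo_aliasing` and `foo_dup`, and the choice of `<<` as the mutation, are assumptions for illustration only; the original `foo` elides those details.

```ruby
# Hypothetical stand-ins for the post's `foo`. The names and the
# specific mutation (`<<`) are illustrative assumptions.
def foo_aliasing(text)
  con = text        # con is the very same object as the caller's string
  con << "!"        # raises FrozenError if text is frozen
  con
end

def foo_dup(text)
  con = text.dup    # a new, unfrozen copy of the caller's string
  con << "!"        # safe: only the copy is mutated
  con
end

frozen = "hello".freeze

begin
  foo_aliasing(frozen)
rescue FrozenError => e
  puts "aliasing version failed: #{e.class}"
end

result = foo_dup(frozen)
puts result           # hello!
puts frozen           # hello (the caller's string is untouched)
puts result.frozen?   # false
```

Note that `dup`, unlike `clone`, does not carry over the frozen state, which is why it fits here. Even when `text` is not frozen, the aliasing version silently modifies the caller's string, so the copy is worth making either way.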

What did Claude do?


def foo(text)
  ...
  con = text.encode
  ...
  con
end

which also produces a new unfrozen string, transcoded from the caller’s encoding to Encoding.default_internal (which turns out to be nil). This (a) is nondeterministic and (b) blows up spectacularly when text contains astral plane characters, like “𝔸”. I had to tell Claude not to do that, and to write some tests to check that astral plane characters are handled correctly.

Still …

I would set Claude the task of rewriting this blogging software, but alas I don’t have a test suite to compare with.

What I really should do, though, is find some physics I would trust it to work on.

Posted by distler at May 16, 2026 4:01 PM

5 Comments & 0 Trackbacks

Re: Code

Welcome back!

find some physics I would trust it to work on

Something to do with the polynomial constraints of type-S SCFTs of class E8? Not that I have understood every stage of that, but polynomials seem like something it could handle…

Posted by: Mitchell Porter on May 20, 2026 4:56 PM

Re: Code

I’ve been working with Andreas Karch on a project which requires evaluating a bunch of Feynman diagrams in a certain 3d (large-N) QFT.

He has been trying to get Claude to do the computation. And the results have been nothing short of hopeless. Now, mind you, there have been software packages available for decades that can do exactly this (as can any second-year graduate student). So Claude’s incompetence at this task is pretty remarkable.

I’m just gonna have to sit down and do the whole computation myself.

Posted by: Jacques Distler on May 20, 2026 11:13 PM

Re: Code

Why not give the computation to your grad student to complete? :-)

Posted by: Derek on May 21, 2026 10:35 AM

Re: Code

Most projects (including this one) have their interesting, intellectually-stimulating bits and their boring, tedious bits.

If you want to teach graduate students something, you want to involve them in the intellectually-stimulating part, and not just foist upon them the tedious part.

Posted by: Jacques Distler on May 21, 2026 10:52 AM

Re: Code

I understand. I look forward to seeing the paper on arXiv!

Posted by: Derek on May 21, 2026 1:24 PM
