Code
I’ve been playing around with Claude Code (Claude Opus 4.7 (1M context)) because, well, who hasn’t?
I am, so far, only moderately impressed by its abilities in physics. The most impressive bit was when, in response to a question about nilpotent orbits, it responded:
“I don’t know. I could waste your time by guessing, but …”
I had never had an LLM tell me that it didn’t know something, much less that it didn’t want to waste my time by making stuff up. So this was positively shocking to read.
Claude Code is, however, unfathomably good at generating code, so I set it the task of modernizing Instiki and Heterotic Beast, my forum software. Both are Rails applications and both have extensive test suites. So they use a software framework Claude is familiar with and have an objective standard for whether the changes made are correct.
When there is no test, however, things can go wildly off the rails (pun intended). For instance, consider the following snippet of Ruby code:
def foo(text)
  ...
  con = text
  ...
  # (now mutate con)
  ...
  con
end
If text is a frozen string, this will raise an error, as you can’t mutate a frozen string. Obviously, what you should do is write:
def foo(text)
  ...
  con = text.dup
  ...
  con
end
which copies the caller’s string to a new unfrozen string which you can mutate to your heart’s content.
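To make the difference concrete, here is a minimal sketch. The method names are hypothetical, and appending with `<<` is only an assumed stand-in for whatever mutation the elided code performs.

```ruby
# Hypothetical method names; "<<" stands in for the real (elided) mutation.

def append_bang_aliased(text)
  con = text          # con is the very same object as the caller's string
  con << "!"          # raises FrozenError when text is frozen
  con
end

def append_bang_copied(text)
  con = text.dup      # new String with the same contents; dup does not copy the frozen state
  con << "!"          # safe: only the copy is mutated
  con
end

original = "hello".freeze

begin
  append_bang_aliased(original)
rescue FrozenError => e
  puts "aliased version failed with #{e.class}"
end

puts append_bang_copied(original)   # the mutated copy
puts original                       # the caller's string, left untouched
```

Since `dup` returns an unfrozen copy, the caller's string is never modified, whether or not it was frozen to begin with.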
What did Claude do?
def foo(text)
  ...
  con = text.encode
  ...
  con
end
which also produces a new unfrozen string, but one transcoded from the caller’s encoding to Encoding.default_internal (which turns out to be nil). This (a) is nondeterministic and (b) blows up spectacularly when text contains astral plane characters, like “𝔸”. I had to tell Claude not to do that, and to write some tests to check that astral plane characters are handled correctly.
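A check of that sort might look roughly like the sketch below. It is not the test Claude actually wrote: the `annotate` method and the appended suffix are hypothetical stand-ins for the real method, and plain `raise` is used instead of the applications' test framework to keep it self-contained. The last part uses an explicit target encoding, so that it does not depend on the process-wide Encoding.default_internal setting.

```ruby
# Hypothetical stand-in for the method under test; the real body is elided in the post.
def annotate(text)
  con = text.dup
  con << " (checked)"
  con
end

# True if transcoding str to target raises because a character has no mapping there.
def transcode_fails?(str, target)
  str.encode(target)
  false
rescue Encoding::UndefinedConversionError
  true
end

# U+1D538 ("𝔸") lies outside the Basic Multilingual Plane: one character, four bytes in UTF-8.
ASTRAL = "\u{1D538}".freeze

out = annotate(ASTRAL)
raise "astral character was lost"  unless out.start_with?(ASTRAL)
raise "encoding changed"           unless out.encoding == ASTRAL.encoding
raise "result is not valid UTF-8"  unless out.valid_encoding?

# ISO-8859-1 has no slot for U+1D538, so an explicit transcode to it must fail.
raise "expected a conversion error" unless transcode_fails?(ASTRAL, "ISO-8859-1")

puts "astral plane checks passed"
```

The first three checks pin down the property that matters here: the copy keeps the caller's characters and encoding intact. The final one illustrates how a transcode can fail on such a character when the target encoding cannot represent it.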
Still …
I would set Claude the task of rewriting this blogging software, but alas I don’t have a test suite to compare with.
What I really should do, though, is find some physics I would trust it to work on.
Posted by distler at May 16, 2026 4:01 PM
