Sister blog of Physicists of the Caribbean in which I babble about non-astronomy stuff, because everyone needs a hobby

Monday, 16 March 2026

The Logician's Swindle

What makes a puzzle annoying ? When is solving a problem rewarding, and when is finding out the answer just frustrating ? If we could answer this, we might get a long way towards making the world a happier place. Getting people to actually enjoy solving problems, rather than being pissed off at their opponents for discovering a flaw in their arguments, would surely benefit political discourse enormously.

I don't propose to try and answer all of this today. Instead, what I can do is address one particular aspect of the problem. I say that at least one major cause of puzzles being annoying rather than enjoyable is when you've been outright cheated, and that this happens far more often than it should.

Specifically, consider Newcomb's Paradox as described on Veritasium. The video begins :

You walk into a room, and there's a supercomputer and two boxes on the table. One box is open, and it's got $1,000 in it. There's no trick. You know it's $1,000. The other box is a mystery box, you can't see inside.

Now, the supercomputer says you can either take both boxes, that is the mystery box and the $1,000, or you can just take the mystery box.

So, what's in that mystery box?

Well, the supercomputer tells you that before you walked into the room, it made a prediction about your choice. If the supercomputer predicted you would just take the mystery box and you'd leave the $1,000 on the table, well, then it put $1 million into the mystery box. But if the supercomputer predicted that you would take both boxes, then it put nothing in the mystery box.

The supercomputer made its prediction before you knew about the problem and it has already set up the boxes. It's not trying to trick you, it's not trying to deprive you of any money. Its only goal is to make the correct prediction.

So, what do you do? Do you take both boxes or do you just take the mystery box?

Don't worry about how the supercomputer is making its prediction. Instead of a computer, you could think of it as a super intelligent alien, a cunning demon, or even a team of the world's best psychologists. It really doesn't matter who or what is making the prediction. All you need to know is that they are extremely accurate and that they made that prediction before you walked into the room.

I highlight certain parts because they feel crucial. To me, this is saying very explicitly, "don't think about this aspect of the problem, it's not important at all". Were this not so, I would object that such a thing couldn't be possible, and the details would certainly matter : was the machine tested on a diverse sample of people, or was there something particular about them that helped its accuracy ? But no, this apparently isn't important, so whatever misgivings I have about free will and suchlike, I willingly surrender for the purpose of the test. I put them aside, still fully expecting to be fooled (I suck at logical puzzles) but in some other way.

Having made that assumption, the answer is obvious. If the machine is essentially always accurate, I take one box. It knows, magically, that this box will contain a million dollars, and I walk out happy and rich and in search of a bank offering a good exchange rate to a proper currency.

But later in the video we get :

Here's how I think about the problem in a way that actually makes sense. You know that the supercomputer has already set up the boxes, so whatever I decide to do now, it doesn't change whether there's zero or $1 million in that mystery box, and that gives us four possible options that I've written down here.

If there is $0 in a mystery box, then I could one-box and get $0 or I could two-box and get $1,000, but there could also be $1 million in a mystery box. And in that case, I would get $1 million if I one-box or I would get $1,001,000 if I two-box. So, I'm always better off by picking both boxes.

Rubbish. Complete twaddle. You just told us that the machine is accurate and that we shouldn't factor how it makes its prediction into our calculations, yet this way of thinking cannot possibly ignore how the machine works. It isn't even self-consistent ! By saying that the machine is essentially perfectly accurate, you've eliminated the very possibility of $1,001,000. That outcome can only happen if the machine actually is inaccurate in some cases, which to my mind you've all but told us directly to discount.
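To put some numbers on the objection – this is just my own back-of-the-envelope sketch, not anything from the video – here's what the expected payoffs look like as a function of the predictor's accuracy :

```python
# Expected payoffs in Newcomb's problem as a function of predictor accuracy p.
# Purely illustrative; the dollar amounts are the ones used in the video.

def expected_payoffs(p):
    """p = probability that the predictor correctly anticipates your choice."""
    # One-boxing : with probability p the predictor foresaw it and put $1M in the box.
    one_box = p * 1_000_000 + (1 - p) * 0
    # Two-boxing : with probability p the predictor foresaw it and left the box empty,
    # so you only get the visible $1,000. The $1,001,000 jackpot requires the
    # predictor to have been wrong.
    two_box = p * 1_000 + (1 - p) * 1_001_000
    return one_box, two_box

for p in (0.5, 0.9, 0.99, 1.0):
    one, two = expected_payoffs(p)
    print(f"accuracy {p:.2f} : one-box ${one:,.0f}, two-box ${two:,.0f}")
```

At anything close to perfect accuracy, one-boxing wins by a mile; the $1,001,000 outcome lives entirely in the part of the table we were told to disregard.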

This, then, is a swindle, and one common to various logical puzzles. "Don't think about this aspect of the problem", they say, only later to say, "Hah ! You should have thought about this aspect of the problem after all, you fool !". Right, so you expect me to think you're a liar ? How is that a fair test ?

The rest of the video is a perfectly decent discussion of free will etc. (Veritasium is one of my favourite YouTube channels), but the poor description from the outset makes the whole thing a mess. Having been told that accuracy was not an issue, I expect something else I've overlooked to come into play. Naturally I overlooked determinism and all that because you told me to overlook it. The pettiness of it all annoys me quite intensely.

Don't worry, I'm not going down the free will avenue with this post. Rather, I just want to briefly outline that this kind of swindle is common to logic problems, and is itself one particular expression of a more general reason they're so often very irritating.


The closest similarity is surely the Monty Hall problem (the one with the prize goats). That one always confused the heck out of me because people never properly explained that I should have been paying crucial attention to the host's knowledge, not how many goats there are or how many doors. But any logic puzzle can suffer if you're not properly informed about what the key aspect of the problem is, or worse, if you're actively told to ignore it.
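For what it's worth, a quick simulation (my own, not taken from any of the explanations that confused me) shows exactly why the host's knowledge is the thing to attend to : with a host who knows where the car is and always reveals a goat, switching wins about two thirds of the time; with a host who opens an unchosen door at random and merely happens to reveal a goat, switching wins only about half the time.

```python
import random

def switch_wins(host_knows_where_car_is):
    """One round of the game; returns True if switching wins, None if the round is void."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    if host_knows_where_car_is:
        # The informed host always opens a door that is neither your pick nor the car.
        opened = random.choice([d for d in doors if d not in (pick, car)])
    else:
        # The ignorant host opens any other door; if he reveals the car, the round is void.
        opened = random.choice([d for d in doors if d != pick])
        if opened == car:
            return None
    switched_to = next(d for d in doors if d not in (pick, opened))
    return switched_to == car

def switch_win_rate(host_knows, trials=100_000):
    results = [switch_wins(host_knows) for _ in range(trials)]
    valid = [r for r in results if r is not None]
    return sum(valid) / len(valid)

print("informed host :", round(switch_win_rate(True), 3))   # ~0.667
print("ignorant host :", round(switch_win_rate(False), 3))  # ~0.5
```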

Not that framing doesn't sometimes reveal something very interesting. Wason's selection task is fascinating in showing how the same people can have much more difficulty solving the same task depending on how it's described – especially when the alternative form involves things they're perfectly familiar with. But there, the whole point is to study psychology. No deception is employed, no swindle pulls the solution out from beneath the solver's feet. The facts are laid bare and it presents a straightforward yet surprising challenge to many people who take it. No, framing is only annoying when it's done to deliberately thwart the participant.
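To be concrete about what "the same task" means here, a small sketch – mine, using the usual illustrative card and age values – showing that the abstract and the social framings have exactly the same logical structure, even though most people find only one of them easy :

```python
def cards_to_turn(visible_faces, is_p, is_not_q):
    """Which faces must be turned to test 'if P on one side, then Q on the other' ?
    Turn anything showing P (the far side might fail Q) and anything showing not-Q
    (the far side might show P). Faces showing not-P or Q can't falsify the rule."""
    return [f for f in visible_faces if is_p(f) or is_not_q(f)]

# Abstract framing : cards show A, K, 4, 7; rule is "vowel on one side -> even number on the other".
abstract = cards_to_turn(["A", "K", "4", "7"],
                         is_p=lambda f: f.isalpha() and f in "AEIOU",
                         is_not_q=lambda f: f.isdigit() and int(f) % 2 == 1)
print(abstract)  # ['A', '7'] -- the common wrong answer is A and 4

# Social framing : cards show beer, coke, 25, 16; rule is "drinking alcohol -> over 18".
social = cards_to_turn(["beer", "coke", "25", "16"],
                       is_p=lambda f: f == "beer",
                       is_not_q=lambda f: f.isdigit() and int(f) < 18)
print(social)  # ['beer', '16'] -- same structure, but far easier to get right
```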

There's also a common tendency for the puzzle-setter to declare the rational solution from authority, saying "this is obviously the correct solution because the alternative doesn't make sense to me". A classic example concerns people refusing small amounts of compensation when they would normally expect a much bigger payout. Time and time again we hear people declaring that accepting the small offer would be rational since they come out with a net cash gain. But to any sensible person there are a multitude of reasons why this would be an extremely foolish thing to do : accepting the initial offer may deny them any chance at the larger amount; they may simply feel insulted and disrespected, and giving in to such behaviour is essentially letting the bully get away with it. It is only rational in an incredibly narrow and naive economic sense, and more broadly simply isn't rational at all*.

* Veritasium does this with a peculiar twist, openly acknowledging that the "irrational" decision of choosing one box is the more profitable one. I find this goes deep into "what's wrong with you ?" territory.

Again, this is a sort of swindle, denying the opposing argument by forbidding debate rather than engaging with it on an equal footing. You thought things were going to be fair and above-board only to find out that they were anything but, that the answer had already been decided without you.

Another similarity is the pettiness. Veritasium didn't have to pull the rug out from under the viewer's feet any more than anyone has to accept that getting a smaller payout is somehow rational. 

Very occasionally, I've run public surveys to help me with my own research. I've tried to ensure the wording was extremely careful, including omitting details when this would bias the result. For example I once ran a public poll on how many groups of points – galaxies – people could see in a plot, deliberately not telling them what they were looking at. Some people objected that there wasn't enough information (e.g. what sort of scales they should be considering), and I sympathise that they might find this annoying. But for me this was the whole point, to gauge what people's natural reactions were : I wanted to know if they would instinctively identify the same groupings that appeared natural and obvious to me (most of them did, as it turned out). I needed to know if my additional knowledge was biasing me, or if the groupings I identified would be readily visible without this extra information. 

The point here is that there's absolutely no reason for misdirection. It's perfectly possible to account for this in a way that will give you a meaningful result to the question you're asking. Sometimes, this can only become apparent after the fact, but in those cases the participant should feel relieved, not annoyed. Annoyance only happens when the misdirection was unnecessary. 

A second personal example : group meetings back in my PhD days. These served the valuable purpose of getting the students used to dealing with tough questions. But they also turned the experience into a weekly grilling that made the whole thing quite intensely annoying... instead of having an enjoyable, low-stakes discussion about science, we had to deal with supervisors being deliberately over-critical. That we all knew full well what was going on didn't help in the least. It would have been fine if such sessions had been clearly demarcated and set aside as such, with regular meetings more about science for its own sake. Trying to pretend this was how scientific discussion should happen, though, was just unfair.

Again, there was no reason for the misdirection. This too was a sort of swindle. Oh, you think you're here to discuss your work ? You thought I was being harsh because I wanted to be ? Hah hah, fooled you ! The idea that maybe they could have just not done that was never raised.

On a grander scale, consider the problems with the alternatives to dark matter. This too feels like something of a swindle : proponents often raise objections to dark matter which are based entirely on the properties of the ordinary matter we can see. They make highly dubious inferences about the supposedly necessary connection to the dark matter they're trying to demonstrate doesn't exist, saying that the lack of a naively-expected correlation proves it can't exist. Some of these problems can become obvious, but sometimes it's worth spelling this out at the high level because it's all too easy to lose sight of the forest for the trees. Once you start questioning the underlying assumption and realising that maybe the connection isn't so direct after all, often the whole thing falls apart.

And in other arenas too we find possible swindles. As I've covered before, thought experiments become extremely annoying when changing a small detail would profoundly alter the result but the instigator refuses to consider any variation : no, you must focus on this aspect of the problem because I said so, even if my scenario is actually bunk. Just like insisting someone should accept a minuscule payout, it's disrespectful not to think the other person's opinion might have some value.

Likewise with analogies. An indirect analogy can be extremely powerful when the relevant aspect is sufficiently similar to its comparison subject, becoming thought-provoking in both its similarities and in its minor, extraneous differences. When an analogy is intended to be direct, though, the seemingly-extraneous details can become crucial, so expecting people to shut up and ignore them is not realistic. It's extremely difficult to focus on the "relevant" bit (usually declared by authority) when there are obvious deficiencies in the whole thing. Conversely, it does no good to pretend similarities don't exist when they do, or to overlook them on grounds which are actually minor details or only quantitative differences.


All this sets out some conditions for when puzzles become annoying, and gives us a rough working definition : The Logician's Swindle is the use of unnecessary misdirection from a position of unjustified authority.

This is similar to but not quite the same as the Magician's Choice. In the latter, we know we're being denied crucial information, misdirected, and otherwise deceived. We go in with eyes open knowing we'll almost certainly be tricked and often paying for the privilege of suspension of disbelief. We know we won't be able to solve the problem and we enjoy our failed attempts to work out what's going on.

The Logician's Swindle is altogether nastier. Here, we're supposed to have all the information we need to reach the "correct" conclusion, but we find only afterwards that actually we don't – with the swindler often denying this for the sake of making us look foolish. And the conclusion itself may be open to dispute, but the proponent argues from a completely artificial authority that it isn't. Worst of all is that "mistakes" can (though do not always) carry real-world consequences. In short, it's a scam : a discussion that should be in good faith but actually isn't.

And that's why I hate logic puzzles.

Friday, 13 March 2026

I'd Like To Teach Machines To Think

Today, a couple of contrarian pieces claiming that maybe LLMs do think and reason after all.

That is, not in the namby-pamby sense of "it's just something similar enough to thinking that we might as well call it that". That much is perfectly reasonable, and I stand by it myself. To get hung up on saying "they're not really thinking" every time someone casually uses this instead of "processing data" is frankly just annoying, not productive. Likewise for intelligence : if they're taking input data and producing coherent output, well, I call that a form of intelligence at the very least.

No such linguistic sleights of hand are to be found here though. No no, these pieces are much closer to the dreaded C-word... consciousness.

The first article makes much the weaker claims of the two. This one touches on the self-awareness issue, but its main point is simply that they're doing something more than pure word prediction.

Modern LLMs (Claude, GPT-4, and others) have an interesting feature, the humble thinking/reasoning tokens. Before generating a response, the model can generate intermediate tokens that the user never sees (optional). These tokens aren't part of the answer. They exist between the prompt and the response, modifying the context that the final answer is generated from and associated via the attention mechanism. A final better output is then generated.

Every token between the prompt and the response is, in information-theory terms, an opportunity for drift. The prompt signal should attenuate with distance. Adding hundreds of intermediate tokens into the context should make the answer worse, not better. But reasoning tokens do the opposite. They add additional machine generated context and the answer improves. The signal gets stronger through a process that logically should weaken it.

This process improves performance. That thought should give you pause, just like how a thinking model pauses to consider hard problems before answering. That fact should stop you cold.

As someone who's not in any way, to any degree, expert in LLMs, this is highly unintuitive. After all, when I start thinking deeply, I don't expect that the more I think, the more off-track I'll go and the worse my output will become. I expect that the more things I take into consideration, the more deeply I delve into the problem, the more accurate and insightful will be my final output. I guess I presume LLMs are doing something broadly analogous in that they are (in effect) considering ever-more voluminous amounts of text in producing their final response after a chain-of-thought. The more they consider the relations between different words, the better should be their final output.
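If it helps, here's a toy sketch of the shape of that computation as I understand it – nothing here is a real LLM or a real API, just string manipulation standing in for one – in which the visible answer is produced from the prompt plus self-generated scratch work rather than from the prompt alone :

```python
# Toy illustration (mine, not from the article) of how "thinking" tokens fit in :
# generate intermediate text, append it to the context, then answer from the
# enlarged context. Requires Python 3.9+ for str.removesuffix.

def scratchpad_answer(prompt: str) -> str:
    a, b, c = [int(x) for x in prompt.split() if x.isdigit()]
    # Pass 1 : hidden "reasoning" tokens the user never sees, conditioned on the prompt.
    reasoning = f"first {a} * {b} = {a * b}, then {a * b} + {c} = {a * b + c}"
    # Pass 2 : the final answer is generated from prompt + reasoning, not the prompt
    # alone, turning one hard prediction into several easier intermediate ones.
    context = prompt + " <think>" + reasoning + "</think>"
    return context.rsplit("= ", 1)[-1].removesuffix("</think>")

print(scratchpad_answer("What is 17 * 24 + 9 ?"))  # 417
```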

I asked ChatGPT about this, and its (considerably shortened) response was as follows :

I think the article is smuggling in a dodgy intuition there. The right baseline is not “extra tokens should obviously make things worse”. In a transformer, later tokens are not forced to overwrite earlier ones in a simple serial way. Attention lets the model keep referring back to whatever parts of the prompt or scratch work are useful, so intermediate text can genuinely help by turning one hard prediction into several easier ones. That is exactly why scratchpad methods and chain-of-thought prompting often improve performance on multi-step tasks.

So I would rephrase the issue like this : intermediate steps should not be expected to degrade performance in general. They should be expected to help when they function as a good scratchpad, and to hurt when they function as self-generated noise. The article seems to assume that the mere presence of extra tokens ought to be harmful. That is too simple. The real trade-off is not between “direct answer” and “more text”, but between “useful decomposition” and “error propagation”. One small clue that this trade-off is real is that CoT can also make models more confident when they are wrong, which is exactly what you would expect if self-generated reasoning sometimes stabilises mistakes instead of correcting them.

This seems not crazy. It then seems too much of a stretch, to me, for the article to claim that because the model is reasoning "in the context of a probability distribution", it's still doing something directly (and I emphasise the emphasis here most emphatically) analogous to some aspects of human reasoning. I think we have the capacity for a much deeper, truer understanding than any LLM has or ever will have.

If you wish to reduce this to "just" token prediction, then your "just" has to carry the weight of a system that monitors itself, evaluates its own sufficiency for a posed problem, decides when to intervene, generates targeted modifications to its own operating context, and produces objectively improved outcomes. That "just" isn't explaining anything anymore. It's refusing to engage with what the system is observably doing by utilizing a thought terminating cliche in place of observation.

None of this requires an LLM to have consciousness. However, it does require an artificial neural network to be engaging in processes that clearly resemble how meta-cognitive awareness works in the human mind. At what point does "this person is engaged in silly anthropomorphism" turn into "this other person is using anthropocentrism to dismiss what is happening in front of them"?

This doesn't feel warranted to me. For sure, humans probably use linguistic heuristics in place of "actual" reasoning more than we like to, err, think. But LLMs manipulating text, in however arbitrarily complex a way, does not constitute evidence of actual, true reasoning and thought. It's just an incredibly clever way to predict words, and I see no evidence of anything deeper going on at all. No sentience, no awareness, no emotions, no preferences, no inner light. The LLM literally does not exist when it isn't prompted. It has no consciousness, no subconsciousness, no self of any kind.

The irony is of course delectable... I prefer the LLM's claim that it isn't thinking to the human's assertion that it is !

Before I go to the next article, I also have to mention a recent discussion with a very interesting claim indeed :

The thing is, Yann LeCun was actually right. Purely text-based LLMs never learned that if you push a table, you also move the objects on the table. What happened instead is that LLMs became “multi-modal” and made to accept images, audio, and video as input as well as text. So “AI” did learn that if you push a table, you also move the objects on the table, but Yann LeCun was right that they did not learn it purely from text.

Despite this coming from a genuinely proper expert, I struggle to believe it. It simply doesn't match my experience with LLMs at all... well, maybe a little bit with GPT-3.5, but even with that hilarious dumbass, only a little bit. Causal connections between objects don't seem particularly difficult to establish via pure text : if you move a table, you move all objects on that table... and LLMs are surely very good at knowing what word represents a literal object. This really doesn't feel like something that should present any difficulty. 

I wish 3.5 were still with us... I'd love to test it.

GPT-5.4 says that this claim isn't correct, but that there is an interesting point behind it. That is, multi-modal models do help LLMs learn things that humans never bother writing down, but that "text-only models clearly do learn a fair amount of everyday physical regularity from language". This I would definitely believe. Without some very specific documentation though, the claim that they can't learn something as basic as objects being moved when their supporting table moves is something I'd be very reluctant to concede. The current "car wash" problem is an interesting reminder that LLMs aren't human, but not, I think, categorical proof that they're totally lacking in any sort of reasoning capacity whatsoever.

On to the second, much more full-throated article.

The fundamental case against the “I” in AI is that intelligence is organic, derived from sensory interaction with a physical environment. Agüera y Arcas turns the tables with the premise that computation is the substrate for intelligence in all life forms. The claim builds on an apparently crude proposition: prediction is the fundamental principle behind intelligence and “may be the whole story”.

I react quite instinctively against this, essentially with, "fascinating, tell me more about your stupid idea !". That is, there are some things I think are gloriously weird. I love their sheer audacity, may or may not hold them respectable, but don't believe them for a microsecond. I do not mean "stupid" here in an especially pejorative sense, but if you can't already understand it, then I probably can't explain it.

A central tenet of What is Intelligence? is that every form of life is an aggregation of cooperative parts. Links proliferate through patterns that enable increasingly complex functions. When Agüera y Arcas says the brain is computational, it’s not a metaphor: it is not that brains are like computers, they are computers.

He is erasing a familiar conceptual boundary here: intelligence does not prompt function, it is function. Intelligence, he argues, is a property of systems rather than beings, and function is its primary indicator. A rock does not function, but a kidney does. This is demonstrated simply by cutting them in half. The rock becomes two rocks, but the kidney is no longer a kidney.

So does a kidney have intelligence? Or an amoeba? Or a leaf? These questions are opened up, along with the question of whether Large Language Models have intelligence, which may be a better way to frame it than asking whether they are intelligent.

In another discussion, I could not make myself understood when I tried to say that I think awareness is something one has, not what one is. This is extraordinarily hard to explain if you don't already "get it", but I cannot for the life of me understand the claim that experiences are literally the same as physical brain states. To me this is an absolute non sequitur, with the evidence being possibly the clearest that could ever be presented for anything ever. I won't try and do it again – this blog is chock-full of that kind of stuff as it is – but maybe it's still useful to frame how I think about LLMs.

I do not have any issue with humans eventually constructing some sort of true AI. I think it's perfectly possible we could construct a chip or something which would give rise to (or otherwise access) consciousness. I do not think an LLM will ever do that, because it's literally just rearranging text. I see no more reason an LLM could be conscious than the text of a newspaper if it were cut to pieces and thrown into a whirlwind. This is why, when I say it makes good sense to describe LLMs as intelligent and reasoning, I do so with a quite monumental proviso that this in no way whatsofuckingever implies they do so much like humans – at least, not with regard to the full scope of how humans think.

 Maybe some bullet points would help ? Well, let's try. I claim :

  • LLMs can be said to think and reason in that they process input data to produce sensible outputs. In certain domains, they can already do so with an accuracy that rivals and exceeds humans.
  • LLMs function as more than pure word predictors; they can generalise and abstract to a useful degree, though highly imperfectly and not on a par with humans except for the stupid ones (of whom, unfortunately, there are many). They do some things with appreciable, meaningful similarity to humans.
  • LLMs are nevertheless just mechanical. They don't have an inner life, are incapable of feeling anything, have no desires, no sensations, and no awareness of even the smallest degree. You can't program a true AI, you have to build one. BUT...
  • ...who on Earth would want a mechanical mind that would potentially suffer or try and eat us or whatnot ? Pointless. Far better to stick with an LLM-like route and go for pure tool development; if you want to bring new souls into the world, there's exactly one way to do that, and it ain't about building robots in your garden shed.
There, does that clear things up ? I doubt it. This has been the most decoherent of Decoherency posts, so if you think I'm being self-contradictory, you're wrong and I don't care. But I feel slightly better, anyway.

Monday, 2 March 2026

Everything Is Too Easy

Today, a short look at a BBC article claiming that the modern world is making everything too easy. Yes, really.

It is, of course, all about the age-old argument as to whether making things easier makes us stupider. There's a fair point to be made for "use it or lose it", but the question should really be about when and whether skill atrophy actually matters. There's no point worrying that your driving abilities will degrade if you switch entirely to public transport, and so long as you can guarantee your new option is reliable and available, there seems no point at all in bemoaning this (I've covered this issue before, of course). Indeed, the switch should be the whole goal. That's how progress is supposed to work, or it wouldn't be progress at all.

Likewise, the ancient Greeks believed that warm showers would make them weak and effeminate, whereas most modern people would say that cold showers just make them miserable rather than toughening them up. There are more than enough other issues to deal with without also having to suffer needlessly. I believe we generally do better when our basic needs are met rather than having to struggle for them : I prefer to live in Star Trek's comfortable, scientific Federation rather than with Dune's tough, religiously fanatical Fremen. Struggle should be reserved for the things you want, not the things you need.

Still, there is a case to be made that it would be a Bad Thing if your overall intelligence started decreasing because you stopped trying to think for yourself. You do need to actually engage your brain, and all skills are maintained by continuously operating near their peak levels. What I struggle to see is how it's ever possible (except maybe if you're far too rich) to make your life such utterly comfortable, so devoid of any challenge whatsoever, that this ever becomes of any actual concern to anyone at all.

While modern technology can streamline day-to-day life, making everything from dating to food delivery more efficient, it may come at a cost: early data suggests that our attention span may be shortening, critical thinking capabilities weakening, emotional intelligence fading, and spatial memory getting worse as we offload human tasks to our devices. The technological optimisation doesn't seem to be making us happier, either: despite the continual digital assists and enhanced communication of social networks, people still report high levels of stress and loneliness.

Yeah, but a lot of that is because social media ends up with interactions with people we'd never want to engage with in real life. You then have to spend ages dealing with stupid people on the dubious grounds that you'll otherwise end up in an echo chamber*. This is partly because most people are, in fact, very stupid, and partly due to algorithmic rage-bait manipulation.

* This apparently being something unique to social media : of course, you can walk away from morons in real life, but online we're expected to deal with them calmly and rationally and take them very seriously for some reason.

That's why a growing number of people are resorting to the hottest new trend: "friction-maxxing", or rebuilding tolerance for inconveniences. The idea is to find tasks or ways of doing things that involve a level of difficulty, time or patience. This could, for example, involve going "old school" and swapping digital tech tools for analogue solutions, such as reading rather than watching YouTube, navigating by road signs in place of Google Maps or calling a friend for advice instead of consulting ChatGPT.

But our brains operate on a "use it or lose it" principle, says Mark. Experiments in animal models show that effortful learning keeps new neurons in the brain alive. Studies also show that cognitively-stimulating activities like learning an instrument, reading, playing games and doing puzzles can preserve cognitive function as we age.

I don't get it. Who is finding their life so convenient that they miss the difficulties ? More importantly, HOW CAN I BECOME ONE OF THESE PEOPLE ? I find that one incomprehensibly weird. How are you not just exchanging one set of difficulties for another ? For example, if I no longer have to worry about how to write the code, I simply have more time for thinking about the problem I wanted to solve in the first place. I can't imagine ever reaching the end of the chain, as it were. Yes, it's good to have some amount of very low-level technical knowledge (some degree of grit for the mill, I guess), but I can't imagine who's living in such a utopia that they've already reached a state of post-scarcity difficulties. I'm a lot happier, not stupider, if I don't have to deal with segmentation faults.

According to one band of experts, the features of our digitised existence – constant notifications, 24-hour news and endless social feeds – can hijack this attention system, resulting in cognitive overload, mental fatigue and trouble focusing.

Much like the research on the effects of technology on our mental capacities, the studies of digital detoxes show mixed results. Some breaks from technology lead to better mood, improved focus, lower stress and more social connectedness, while others show the opposite or null effects. One 2014 study found that restricting screen time at a five-day nature camp improved preteens' emotional and social intelligence, while another 2019 study of university students found an increase in loneliness after abstaining from social media for one week.

Now that one makes a lot more sense. I can definitely see problems with "everything devices", like phones : I have a tendency to stockpile stuff for later consumption rather than immediately reading what interests me; I definitely fall into the infinite scrolling trap.

As per a quite different recent post, this is not necessarily a bad thing. Indeed, to a degree I even welcome distractions of my own choosing. I find it helps keep my focus maintained for longer overall if I split it up into chunks, every so often checking something very simple (like a news feed or social media) that doesn't derail my train of thought. Usually, this sort of back-and-forth doesn't become a problem when working, but I'm as guilty as anyone of scrolling on my phone rather than actually watching TV. Especially if I'm watching a show on my computer... the temptation to just quickly check the feed again is, for whatever reason, quite hard to resist. 

But to me this is not a technological issue, it's a design choice. I love my digital notebook because its total lack of features means I use it for writing notes and reading long web articles, and that's it. Single-purpose items, be they digital or otherwise, innately demand more attention because you can't just flick a button for more content. Moreover, so long as I put my phone away, I'm not tempted to go and check it. A single-purpose device doesn't feel boring : once you're given something to do, and nothing to distract you from it, you'll just do that thing instead of trying to do dozens of things at once. Printed books do this, but so do well-designed digital products.

But even if friction-maxxing isn't the end-all solution we've been waiting for, "it doesn't hurt", says Mark. "If people are putting in effort, it makes them more intentional and thoughtful." Analogue hobbies such as crafting, gardening or reading – which involve friction as opposed to scrolling or streaming – can act as "active meditation", calming the mind and reducing stress. One 2024 study of more than 7,000 adults living in England found that those who engaged in crafting or the creative arts were more likely to report significantly higher life satisfaction, a greater sense that life is worthwhile and increased happiness. 

"I realised that a good life isn't an easy life," Semple says. "There's an enjoyment that you're cheated out of when you take the easy route."

And this is fair. Most analogue activities are innately more focused because there are no equivalents of "everything devices" or constant notifications. Multi-functionality has its place, but it's not easy to stay on track with such things. The real difficulty is perhaps the web itself. It doesn't make any sense to have a separate physical device for every website or digital tool : some purposes, like photography, can easily and sensibly be separated, but many others simply can't.

What's the solution ? Self discipline is part of it, but it's hardly the whole answer. On a totally different video, I liked the point that it's simply not fair to expect people to be able to resist something that's designed to capture their whole attention. So better designs are also needed, e.g. alternatives to infinite scrolling, more nuanced control of feeds, easy options to control which notifications come through and when (Gmail and DuoLingo, shut the fuck up already)... I don't think this would actually be difficult to do and it would cost exactly nothing.

The problem is getting corporations to see that making us engaged with more and more things but each for less and less time is, in the long term, a stupid metric to gauge product success, and that providing us with one good service is better than the option of a hundred crappy ones. How we rein in this tendency, I don't know.

Saturday, 28 February 2026

Listening To The Voices In Other People's Heads

Here's a very nice long read from the Guardian about trying to understand what's really going on inside our heads.

Recently I concluded that any attempt to understand what consciousness actually is is likely hopeless. Trying to understand what we mean by experience when literally all we have access to is experience is inevitably circular. But the effort itself isn't fruitless. We can all of us have different preferences for what we think is going on – non-physical, spiritual, purely materialistic – and that discussion is often productive, if only in understanding how people reach radically different conclusions from the same data. More promisingly, it gives us a better handle on how we go about defining things, how we grapple with the imperfections of mapping language onto reality.

But there were also two more directly productive outcomes described in the Aeon essay I looked at last time. One was that we could understand in some detail the neural correlates of consciousness, the processes occurring within the brain that are associated with what we think and experience. The second, somewhat subtler issue, was that we can still take a reductive approach to different aspects of consciousness : we can describe it in terms of different levels and content, and in so doing get back to something we can discuss in familiar scientific terms. Just as we don't have to worry about what a quark is really made of to understand how it behaves, so too we can tackle the subject of minds.

The Guardian piece is complementary to the Aeon article in that it leans more heavily in this direction. As it begins :

A neuroscientific perspective on consciousness might tell us something about its neural correlates, but it is unlikely to tell us much, if anything, about the nature of thoughts or the textures of inner experience; it’s the wrong tool for that job. So what might we learn about consciousness if we gave more weight to the view from inside the experience – the phenomenological viewpoint ?

For example, it describes William James' comments on something I've found extremely strange for many years :

“Suppose we try to recall a forgotten name,” he writes. “The state of our consciousness is peculiar. There is a gap therein; but no mere gap. It is a gap that is intensely active.” A sort of ghost of the absent name haunts the empty space in our consciousness, he suggests, making us “tingle with the sense of our closeness, and then letting us sink back without the longed-​for term”.

He goes on: let someone propose a candidate for the missing name, he suggests, and even though we have no consciousness of what the name is, we are somehow conscious of what it is not, and so summarily reject it. How strange! Our consciousness of one absence is completely different from our consciousness of another. But, he asks, “how can the two consciousnesses be different when the terms which might make them different are not there ?” 

The feeling of an absence in our minds is nothing like the absence of a feeling; to the contrary, this is an absence that is highly specific and intensely felt. Thoughts glimpsed from some height of awareness but somehow not yet formed, much less put into words or images – this is the subtle terrain James invites us to explore with him.

I very much agree with this. It's always seemed weird to me how we can think in complete sentences. We – or rather I, and I'll get back to why the distinction matters later on – don't really sense the words falling into place, they just come out like that. Clearly they must have been assembled at some point, but somehow this happens without us knowing about it ! 

Perhaps even weirder is that sensation when grappling with a complex problem : at the point of reaching a possible solution comes a very distinct sensation of raw, unstructured thought, some quasi-awareness that the answer has been reached but without being able to articulate it. That moment is crucial. If interrupted in this momentary phase, the proto-thought may be lost entirely. But if it's seen through to completion, the thought crystallises into language : something we can easily memorise, recall, and communicate with others.

Much of the article is then concerned with whether we can access this much lower level of thinking in some way. If we want to understand consciousness, can we go beyond language ? Maybe that's too ambitious. To start with, can we at least access thoughts at the stage that they become coherent ?

Step forth, Russell T. Hurlburt :

For half a century now, Hurlburt has been scrupulously collecting reports of people’s inner experiences at random moments – and just as scrupulously resisting the urge to draw premature conclusions. A die-​hard empiricist, he is as devoted to data as he is allergic to theories... I’ve been going around with a beeper wired to an earpiece that sends a sudden sharp note into my left ear at random times of the day. This is my cue to recall and jot down whatever was going on in my head immediately before I registered the beep. The idea is to capture a snapshot of the contents of consciousness at a specific moment in time by dipping a ladle into the onrushing stream.

What he is after in his research is the “pristine inner experience”, by which he means a sample of human thought “unspoiled by the act of observation or reflection”. Like James, Hurlburt acknowledges that the act of recalling and describing an experience is bound to alter it, but he believes that his method can get us closer to the uncontaminated ideal than any other.

In some ways this should be extremely easy. Again, we definitely do have coherent, structured, linguistic thoughts, and writing those down is straightforward enough. Indeed, as this fascinating BBC article describes, it's even possible to read these directly from the brain. This doesn't require participants to mentally try and speak : to a degree, thoughts can now be extracted at a lower level than this.

But then again, even the most fully-developed thoughts can be extremely slippery. As I've noted, once you put pen to paper you engage in a sort of self-conversation, thoughts and beliefs becoming highly flexible once you start reflecting back on them from an external input. Take a thought from in here and put it out there and it inevitably changes, even if only just a little. 

Still, this can largely be avoided : once a sentence is formed, it can be written down. A more difficult aspect of the problem is trying to disentangle that bit of coherency from everything else we're thinking about :

I took out the little pad provided by Hurlburt and jotted down this thought: “Deciding whether or not to buy a roll.” I know, not terribly exciting, but it seems very few of my mental contents are. I was thinking ahead to lunch and wordlessly deliberating whether to buy a fresh roll for a sandwich or do the responsible thing and use up the heel of bread I had at home. I was also conscious of the pattern of the skirt – an unflatteringly large plaid – worn by the woman standing in line ahead of me. 
Was that observation part of the moment in question, or did it come immediately before or after? I couldn’t say for sure. (How long does a moment in consciousness last ?) And what about the pervasive smells of freshly baked goods and cheese ? These both preceded and followed the moment under examination, but were they present to my awareness at the beep ?

Throw in just a few complications and suddenly the problem becomes much more difficult, maybe even impossible. Words ? We can attend to them. The whole facet of experience, including our different senses, how they affect us, when they occur, when we assemble a sentence ? That's much more fuzzy. Trying to describe anything before the point of coherency may even be a non sequitur. Maybe we can pin down a bit more about the process by determining what we're currently experiencing (e.g. which senses are given priority over the others, where and when we give our conscious attention to language rather than sensory experience, what counts as thought, etc.) but there will be some limits as to how far we can get with this.

Some really fascinating things have come out of the research though :

The first finding, to which I can personally attest, is just how little most of us know about the characteristics of our own inner experiences. “That’s probably the most important finding that I’ve got,” Hurlburt said.

Important, yes, but I think the other findings are a lot more interesting : 

Inner speech, which many of us – including many philosophers and neuroscientists – believe is the common currency of consciousness, may actually not be all that common. Hurlburt estimates that only a minority of us are “inner speakers”. So why do we think we talk to ourselves all the time? Perhaps because we have little choice but to resort to language when asked to express what we are thinking. As a result, we’re “likely to assume that’s the medium for inner thought”.

But that doesn’t make it true for everyone. Fewer than a quarter of the samples that Hurlburt has gathered report experiences of inner speech. A slightly lower percentage report either inner seeing, feeling, or sensory awareness. Still another fifth of his samples report experiences of “unsymbolised” thought – complete thoughts made up of neither words nor images. Hurlburt has suggested that we fail to recognise the diversity of thinking styles because we lump them all together under that single word – thinking – and assume we mean the same thing by it, though in actuality we don’t.

Aphantasia and the lack of inner speech is something I've covered many times before, but this is something beyond that. It's something I know I must have but am absolutely incapable of imagining. A truly pure thought consisting of... what, exactly ? Not language. Not any of the senses. Just pure electrochemistry, I guess. That's absolutely wild. 

EDIT : To a degree, I can imagine this. My inner monologue is pretty incessant, but it's not constant. There are times when it shuts up and all I have is sensory experience and emotions. What I absolutely cannot do is properly articulate what's going on. Is there still some processing going on using language at the lower levels, ready to be raised to my awareness for perusal when required ? Or is it fundamentally different at the earlier stages and only converted to language later on ? More below.

I wonder if the brain scans described in the BBC would be capable of interpreting these in the same way as for the (to me normal) case of thinking with an inner voice or eye. Perhaps it's like blindsight. That is, maybe our brains are all doing basically the same low-level stuff, but sometimes not everything is raised to whatever part it is that brings it to conscious awareness. Or maybe, even more interestingly, we don't all work in quite the same way. Regardless, scans of people who aren't thinking with inner speech, imagery, or any kind of structured thoughts would surely make for a fascinating comparison. Would the scan reveal the same thing as in those with well-defined internal monologues or would it show something else altogether ?

Another researcher suggests a different and more holistic approach :

The field’s focus on conscious perception has led it to overlook the 30-50% of mental experience that is fed to us by our minds rather than our senses, Kalina Christoff Hadjiilieva contends. “Consciousness is just one function of the mind,” Hadjiilieva told me during one of a half-​dozen interviews, this session over a cup of tea in my garden. “To focus on conscious thoughts is like focusing on the leaves of a tree and trying to understand them in isolation,” she said. “The tree is the mind, and there’s a lot more to the mind than consciousness.”

The degree to which the mind wanders appears to be surprisingly important :

Hadjiilieva conducted an experiment with long-​term meditators (mindfulness practitioners). These are people who have been trained to still their minds but also to notice the precise moment when that stillness is broken by an errant thought, which Hadjiilieva found happens every 10 to 20 seconds or so even in these trained minds. (“The big lesson of meditation,” she said, “is that the mind cannot be controlled.”)

This makes intuitive sense to me, and again maybe reveals something about the structure of thought processes. The way I like to work, even when in a state of relative focus, is often to flit back and forth between a couple of different things at once. I like to check my emails and glance at the news quite frequently, only going into really deep focus every once in a while*. With some tasks this works very well : it's like I have my brain keep working on the other thing in the background while giving my consciousness time to rest by attending to something easier. What's crucial for me, though, is that these must be activities of my own choosing. Being disturbed by an external influence is a big no-no : if someone interrupts me then the process is instantly broken.

* Though my work habits vary considerably. For code I nearly always concentrate on the code and absolutely nothing else, with a similar situation for most difficult problems. It's for the routine, less cognitively demanding tasks that I prefer to have multiple tabs open, as it were.

This all ties in quite nicely to the earlier discussion :

Hadjiilieva and her colleagues noted a jump in activity within the hippocampus, a key component of the default mode network that is involved in not only memory but also learning and spatial navigation. To their surprise, the leap in hippocampal activity preceded the arrival of the thought in the meditator’s consciousness by nearly four seconds – an epoch in brain time, and far longer than it takes for a sensory impression to cross the threshold of our awareness.

You might wonder if this further shifts my uncertainty about the apparently non-physical nature of consciousness. In this case, it doesn't. I've already covered similar experiments regarding free will, and here it seems to me that no neural correlate could be anything remotely like subjective experience : how do some electrons whizzing about resemble the smell of a daffodil or the feeling of anger ? They simply don't, and to assume otherwise is to completely miss the point of the Hard Problem. But to build on from the previous post, this is still very interesting stuff :

“Something is going on prior to awareness,” Christoff Hadjiilieva said, but she’s not sure exactly what it is or why it takes so long. This finding indicates that a spontaneous thought must undergo some sort of complicated unconscious processing before finding (or forcing) its way into the stream of consciousness. For Hadjiilieva, the mystery she’s uncovered points to what she regards as the “really hard problem of consciousness” – how the contents of the unconscious form into thoughts that sometimes find their way into our awareness, and sometimes don’t.

Well, that's definitely a hard problem, and maybe it would even be better to call it the hard problem. The Hard Problem of the philosophical sense may well turn out to be the Impossible Problem : we literally can't understand our own subjective experience, since by definition this is all we have access to. In that case some relabelling makes good sense. It seems very reasonable that the time delay points towards the brain doing some unconscious information processing before raising it to our awareness, and understanding how and why this happens seems extremely difficult but far from outright impossible.

The article concludes, in typical Guardian fashion, with a warning of the dangers of capitalism in preventing our minds from going about their productive, unguided wanderings, as well as the difficulties of persuading people to treat research into the subconscious as serious science. Perhaps the author should have read that BBC story. If you can access this with a machine, the danger may not be that nobody takes it seriously, but the exact opposite.

Monday, 23 February 2026

The Truth About Utility

What makes a useful definition ? Originally I had a much more philosophically pretentious post semi-drafted for this, and I may still do that one separately. But various recent discussions have taken me down a very different path, one which might be more, err... useful. So let's start with this one and see if I ever get back to the nature-of-reality version in a future post. 

A good definition, I think, must surely be something which is widely applicable but also specific : it should describe things which happen frequently but not always. It should be readily distinguished from similar counter-examples. Crucially, it cannot be something which either never happens or always happens. It shouldn't be framed so as to forbid the thing entirely or make it inevitable or ubiquitous. It should describe a specific thing that actually sometimes happens, or is at least conceptually valid and distinct from other, similar terms.

What I see people doing is trying to make things true or false by setting their definition up in such a way that it cannot ever fail, and that to me seems like a mistake. This doesn't mean we can't have productive discussions, but it does, I think, impose some extremely unhelpful limitations. 

Let's do this one by example. First I'll look at cases where people define things such that they can't ever happen, and then the reverse, related case of defining them such that they're inevitable. Both in my view are counterproductive mistakes. They are terminology problems but they prevent us from getting at what we really mean, which is usually much more interesting.


1) Defining things out of existence

If we define something with such precision, with such high standards, or with such an inbuilt logical contradiction that it can't ever be true, then I submit that this isn't a useful definition at all. Furthermore, it's likely not what we really mean when we use the term in everyday discourse.


Malevolence : Plato and other ancient philosophers held that nobody would knowingly do evil. I forget who it was who described it explicitly (possibly several people), but the basic idea was that if you knew something was wrong, you couldn't possibly do it. You might still carry out an immoral action, but you'd be misjudging, thinking that the gratification you would get would outweigh any negative consequences there might be for anyone else. Alternatively, you might do so only because you hadn't realised the existence, extent, or nature of those negative consequences.

I think this is a deeply mistaken view of humanity. As per the link, people certainly carry out heinous acts in full knowledge of their consequences, sometimes this being the very reason for their behaviour rather than a side-effect. Or they may know but simply not care. But they aren't, I think, carrying out a mental calculus of where the balance lies. Even if they were, this would still make the word – or the notion of wilful harm – meaningless. The point for most discussion is that people cause each other harm sometimes because they want this to happen, not out of ignorance. Anything more beyond this rapidly leads into such convoluted nuances that the definition collapses into uselessness.

Or to put it another way : "Sure, he committed the murder, but it wasn't out of malice : that's impossible, so he must've done it because he mistakenly thought his pleasure at the victim's suffering would outweigh their actual suffering". 

To me this makes the word unproductively useless, trying to define the thing out of existence. Surely that points towards this meaning not being what we truly meant : the important thing is that people inflict harm on others for its own sake, and inferring anything further is best avoided altogether.


Altruism : In the opposite case, my partner likes to say that nobody is really altruistic. Everyone acts, she says, because they believe there will be some benefit to themselves, even if that reward is purely emotional. In the extreme case, someone might give their own life to save others, not because they think the value of the lives they save outweighs their own, but because of the fleeting emotional reward they themselves will get from knowing the others will live.

This too I think is surely putting more on the word than it can bear. The point of altruism, I'd say, is that we sometimes value others more than ourselves and act to bring a net benefit to them even at the expense of our own status. Start demanding that we get no emotional reward at all and again the term has been defined out of meaningful existence. This makes it utterly useless, and surely, therefore, this can't be what most of us mean most of the time when we use it. I'll qualify this a bit more later, but that general-case point is the one I want to focus on.


Knowledge and understanding : I've covered the nature of LLM outputs several times, most recently here but also e.g. here and (tangentially) here. More on those in a minute, but a closely related question is whether they can be said to truly understand anything. I think they can, in the carefully qualified sense that a) they have access to some form of information; b) they form connections between different pieces of information; c) they act in a logical, coherent way to predict how things behave in novel situations. Not perfectly, it's true, but more interesting by far is that they do it at all.

Now for sure, this is not the same sort of understanding that humans have. But its qualitative similarities, in my view, outweigh and are far more interesting than the quantitative differences between silicon and neural understanding. I think it's just not at all useful to say that "meaning only comes from humans ascribing this to the output". This is so inevitably necessary that it adds nothing useful to the discussion : well, who else was going to be reading the output then ? And if you define "understanding" to be only a human thing, then it's tautologous that no non-human will ever have it. That's cheating.


2) Defining things into existence

We can now see how the reverse is also true : if we define a thing as being completely unavoidable, we won't get anywhere.


Hallucinations : See the links in the previous entry as this follows directly from the previous definition. I did initially agree that it was sensible to describe all LLM output as a hallucination, but I changed my mind some time ago. Given that they are now able to process complex (and multi-modal) information in a way that closely aligns with human expectations, and can in fact exceed our own predictive capacities at least some of the time, I no longer think describing their output as purely hallucinatory makes much sense.

It's more useful, I think, to say they're hallucinating (in their own peculiar way) when their output has no connection whatever to the input data or prompt. This is much more analogous to human hallucinations in which we see things which aren't there. I would still agree, provisionally, that LLMs treat all information as having a much more similar level of validity than humans do, and undeniably they have some qualitative as well as quantitative differences from human thought. But they are very clearly not purely fabricating stuff all the time : more often than not, they're processing their inputs quite sensibly.

Importantly, the claim that all LLM output is a hallucination is consistent with the notion that they don't understand anything. I'm not claiming incoherency here : I'm claiming that these definitions should be discarded because they aren't useful, not because there's any inherent problem as such. The alternative definitions I've suggested are, I think, better only because they are more flexible and specific, allowing us to describe things in more detail, not because they eliminate any inconsistency.


God : Don't say I'm not ambitious ! The old argument that god is necessarily perfect and perfection necessarily exists... well, surely this is the ultimate case of trying to assert truth by definition. God is a perfect what, exactly ? A perfect square ? A perfect teapot ? Well, if a perfect teapot exists, where is it ? Could it be Russell's Teapot, somewhere beyond Earth's orbit ? Surely not, because if it were perfect, it would be in my hands whenever I need it. But it isn't, and therefore the perfect teapot clearly does not exist.

And if even the perfect teapot doesn't exist, I see no reason to say that the abstract concept of perfection itself – a Spinozan notion of God – also has to exist in any sense beyond a mental construct. Clearly, I can imagine what I think a perfect teapot should be like, but that has no further existence outside my head. There's no reason to think that perfection itself is any different.

So here too, "perfect" in the everyday sense does not mean the same thing as St. Anselm would have it mean. Nobody uses "perfect" to mean "something which must exist" : indeed, we often use it to describe things which can't exist precisely because they're perfect ! "Platonic ideal" might be one of Plato's better ideas here, if only as a concept : we can conceive of better examples of chairs and circles and virtues even if we can't bring them into being. That's generally how we use the word, to describe something specific in aspect, not the singularity-like God of the Upanishads.

As far as the existence of God goes, and very much with my agnostic hat on, I think definitions here are of no help whatever. We can conceive of perfect examples of things we fundamentally do understand, like circles. But perfection itself ? That would require understanding all facets of existence, which as imperfect beings we simply can't do and never will. A general understanding of perfection is beyond our limitations : we can no more say that "god's perfection means he exists" than we can say what a perfect dinosaur would be like. The concept may simply be incomprehensible, or it may not even make sense at all.




That's my idea of a good definition then. It should be specific, flexible, distinct from alternatives, and describe things which occur at a finite rate (even if only conceptually). If a definition forbids the thing it describes from ever existing, or would make it always true, then it has no use cases and should be discarded. Those kinds of definitions usually twist readily-comprehensible everyday meanings into something convoluted, unproductive, and useless.

I'll stress "usually" a little bit though, as I don't say that the extremes don't matter at all. For example, what do we mean when we say we know something, or that we're certain of it ? Usually, that our own belief is well-formed and our confidence is beyond reasonable or routine doubt. We don't usually mean that we have found Truth Itself, that we can state our claim with literally zero chance of it being wrong, and that all unbelievers are evil and/or stupid.

Like the case of purest selflessness, this kind of concept definitely does have value, but more in the philosophy classroom than the real world. The extreme cases let us frame our own actual beliefs and compare them to those of others, rather than providing useful, workable definitions in themselves. For example, we can all agree on what true certainty would actually mean, but to use the word more practically, we have to scale things back. That's where the discussions start to get interesting, trying to figure out the limits of our own underlying reasoning as well as those of the others in the debate.

To a very large extent, I think the question of how we use a definition is very much the same as what we think it means at the most basic level. But then, others may have a different understanding. I don't always agree with the alternatives, but trying to figure them out is usually the fun bit.

Thursday, 19 February 2026

AI For Fun Or Profit ?

The Czech Academy of Sciences, the research council which funds my own employment, recently put on a five-part webinar giving detailed guidance on how to use LLMs in research. It was a good course with some useful tips and tricks and a few tools I'll try to check out eventually. The presenter seems like someone who really uses AI a lot, like for absolutely everything, but as you'd expect in a decent course, it was full of caveats : don't use it to do X, always check its citations, don't take its output for granted. 

The best line was to "treat it like a skilled researcher who's on drugs". You wouldn't discount everything they say, but you wouldn't trust them either.

It's all very common sense really. The course had about 80 attendees, and the discussion pretty much matched my real-world experience with colleagues. In every discussion, every single one, everyone gets in a bit of a circle-jerk about how useful AI is but how it can't be trusted. There's basically no-one who isn't using AI to some degree, and similarly nobody who's trusting its output without question.

After this course, I wonder if perhaps I'm not using AI enough. Would I be more productive if I did ? Possibly, to some degree. But I think not a great deal. Personally I simply see no point at all in using AI to replace my own voice : when I want to express myself, in any medium, if it's not me doing it then I might as well not bother at all. Okay, for the final polish here and there, or checking if I'd got the basics correct, or follow-ups... sure. But the basic gist of the text has got to be my own, even if it's imperfect. 

So I just don't see how it can be any help in preparing PowerPoint files* or writing whole paragraphs in a grant application, let alone anything in a publication**. For writing code I'm perfectly happy to let it go nuts, so long as it's doing grunt work and/or I just want something quick that works. But even then, if it's something I'm going to want to maintain, or if I want to understand what's going on, I find it far more valuable as an assistant, someone who can teach me and simplify things where necessary, not do the whole thing for me.

* No, we're not calling them "slide decks", thankyouverymuch. WTAF is wrong with people ? 
** The author has prompt instructions for just about everything. If nothing else, maybe some of these will at least be useful guidelines for people to follow when doing the tasks the old-fashioned way.

The use of AI chatbots is very possibly where my real-world and online lives are most dramatically at odds. Offline, AI is already normal. Like totally normal. Online, there's a far bigger fraction who are still clinging to the idea that it doesn't work, won't work, can't work, is innately immoral, etc. etc. These are sentiments I've barely encountered at all in everyday life, which veers much more towards thinking that people not using AI are either a) old or b) weirdos.

What concerns me for today's post is why we don't appear to have seen any transformation in the economy as a result of AI. It seems abundantly clear to me that AI does work, so where are the productivity gains ?

Now I'm not expecting any instant revolution. The hype train that AI will lead to FTL travel and immortality and a utopian world by next Tuesday is not worth considering. Nor do I think it's capable of fully automating any significant number of jobs anytime soon. But it is, unarguably, an extraordinarily useful tool for people using it properly. It's not unreasonable to expect that we see some measurable effects of this, so here's a round-up of some recent articles giving some different perspectives.




One widely-reported study found that 95% of companies had seen no measurable impact from generative AI. I asked the lecturer about this : she said she didn't know, but speculated that maybe it was down to earlier models, particularly unrepresentative samples, etc. This post presents some plausible rebuttals : most crucially, the question is "95% of what ?". Apparently it's 95% of all companies, not companies that actually tried using AI ! So much for that.

This Nature piece leans pretty much in my direction that AI has tangible benefits and will, like the internet, restructure things to such an extent that it's difficult to know which metric to use for judgements. It makes the perfectly reasonable point that AI is advancing so rapidly that it's already difficult to know what we should be measuring; related to this is that adoption does not necessarily keep pace with AI capabilities. All very reasonable, but still... where are the gains ? Where's the money ?

A much more bullish piece* on "Noahpinion" (I never heard of it before) looks more at the different attitudes to AI. This at least partly explains the discrepancy in my real and online worlds : Americans are among the most "AI-concerned" people on the planet. Which fits with typical American bipolarism : let's invent a thing we spend crazy amounts of money on which we really hate. And in fairness, it seems to me that Americans are vastly more likely to be shafted by their employer than Europeans, so this attitude is not at all without foundation.

* Interestingly, the author is convinced that data center water usage is unimportant but that their electricity consumption is extremely high. I've seen other articles claiming the exact opposite. I don't know. To me it feels like this is all a massive distraction on the environmentalism front : what we need is to switch generation methods to renewables and nuclear and invest heavily in storage. Bitching about AI is pointless, and usually when I look at the claims and counter-claims, it seems to me that the impact of AI is heavily overstated at best.


To go off on a slight tangent, the author also notes that complete omniscience is a myth. Yes, AI makes mistakes that humans don't, and its error fraction is higher than that of true experts. I would also note that real experts are generally more self-aware of their own limitations and vastly more likely to say "I don't know" when asked about things outside of their own domains. But still, the problem is fundamentally the same : here is a claim, how do we know to trust it ?

The answer is simple. Everyone's worried about AI fakes and manipulation, but ultimately we have to treat it just like any other source. We literally do not have access to perfection. Everything requires a degree of trust and verification; we should apply the same standards to AI as for anything else. That is, when things aren't critical, we go ahead and provisionally accept its claims. When there are consequences we need to double-check what it comes up with. That's it.

Though, a BBC piece presents an intriguing example of poisoning the well : deliberately writing a credible-sounding blog post to fool LLMs which rely on web searches. For me, ChatGPT wasn't fooled, but it's for sure an important point. Seeking out independent sources of evidence will become more important than ever.


To return to my main theme, Noahpinion claims that the effects of the AI bubble bursting are exaggerated and AI will lead to more jobs rather than less. This is all getting very murky.

For balance, a couple of more negative pieces. "Marcus on AI" is convinced that while AGI is achievable (I am not), ChatGPT will never live up to its promises (I think it's already doing better than I expected it would when GPT-3.5 was released). I think it's a strawman to say that because it didn't reach the absurd standards promised by the same techbros who initially claimed it was too dangerous to release (a bit of marketing genius, that), it hasn't massively improved. Similarly, I think his acceptance of the famous "95%" claim is clearly flawed, as he doesn't explain his own reasoning : sure, another study finds that not many companies are using AI intensely, but this says nothing about what stage of adoption they're at or their long-term plans. Maybe the original study he cites does say this, I don't know, but this needs to be included in any analysis.

More interesting is the claim that AI use at work is flatlining or declining. But the timeline here is rather muddled, and from my own direct experience, I too became tired of GPT-4 for work use as it just offered nothing of substance. It could proofread for language and make very basic comments on the substance of the text, but it was pretty shite for discussing actual science. It would have some hits, true, but they were buried in a mountain of faff that was often just not worth the effort of digging through to get to the good stuff. And of course, it would hallucinate like nobody's business.

ChatGPT-5, by contrast, was a massive, game-changing improvement on all counts, and that was only released six months ago : surely we should wait and see to determine what its effects are. This is of course not to say GPT-5 is perfect. But I absolutely maintain my initial excited stance that this is a breakthrough which crosses important thresholds for making it an actually useful tool.

Perhaps Marcus's most interesting claim :

If GPT-5 had solved these problems, as many people imagined it would, it would in fact be of enormous economic value. But it hasn’t.

To be fair he's consistent in that he says it hasn't solved the old problems of hallucinations, lack of common sense etc. But I deny this vehemently. I think it's made massive, demonstrable, in-your-face progress, and saying that it hasn't solved these issues completely is a totally pointless claim. Of course it hasn't ! I never expected that it would. If you were actually expecting AGI, then more fool you. The whole perception of LLMs seems like a massive case study in the old quote that the perfect is the enemy of the good.

But this of course still leaves me with the dilemma : where then is this massive economic value ? We at least agree that a good AI would be of economic benefit. Marcus has the obvious "out" in that he doesn't believe the AI actually is any good, but I don't. Where, then, are the benefits I expect to see ?

One possible answer comes from Business Insider*. Perhaps, it suggests, the answer lies ironically in that very demand for increased productivity. Software developers are now working not on one task at a time but on many at once, waiting in between prompts while the AI writes the code and then cleaning it up. This is not a natural way of working, and it understandably causes a great deal of fatigue. In essence, then, AI is good enough to help, but still needs constant supervision and cannot fully automate much. It might be like self-driving cars which are actually more like driving assistants : the worst sort of grey zone, a necessary step towards something transformational but in some ways actually counterproductive in and of itself.

* I'd started skipping their pieces because they tended to be shallow and dull, but from what I've seen lately, they seem to have improved considerably. 

Much the most cynical of all the pieces here is from the (I assume ironically-named) "Pivot to AI". The author's case is basically that the economy is screwed, that CEOs are firing people left, right, and centre not because AI is actually capable but because they just love firing people and enshittification and all that. In direct contrast to Noahpinion, he says that the inevitable bursting AI bubble will be worse than the Great Depression and we'll all die, or something. Righty-ho then.




What are we to make of all this ?

It's very hard to say. We have two opposing hype trains : AI will transform the economy; AI will wreck the economy. So far it seems to have done neither.

Falling back on my own direct experience, AI is undeniably helping. It's allowing me to tackle things I wouldn't have done otherwise and understand things very much more quickly than I would otherwise. It has not yet had a measurable effect on my actual productive output as judged by typical metrics (papers and the like), but since only GPT-5+ looks capable of influencing this, and it has only been out for six months, it's probably too early to judge on that score.

By my own internal metrics it's definitely had a measurable effect, allowing me to generate quite a lot of code I would otherwise have liked to have but would never have gotten around to writing. Last year I even spent quite a long time using it to write a 75+ page introduction to radio astronomy that helped me enormously... if I ever have time to finish it, I'll put it online somewhere.

But shouldn't AI be solving the "if I ever have time" problem ? Yes and no. The thing is, the bottlenecks in my productivity lie elsewhere : primarily, meetings. In a busy period I might have one or even two meetings a day, which all told take up a full working day each week. Not all of these are useless (although some of them are), but in terms of actually getting stuff done, almost all of them have a negative impact. 

Likewise, not everything I work on is directly tied to productive output. I need to experiment and understand and pursue blind alleys. AI can help with some of this, but not all : in essence, it can alleviate a small amount of my workload to a very large degree. It can't help at all where my code tests are limited by other factors such as download speeds. I don't even want it to automate everything, because if I can't understand the science, what's the point ? Yes, it can help me understand things, but I'm the bottleneck here, not the AI.

From my perspective the answer is clear : AI has only very recently reached the point of being seriously useful, we're all still adjusting to how best to use it, and there are many things it either can't do or we don't want it to do. This would suggest that we ought to see more substantial improvements in productivity on a timescale of a year or two, allowing for human adjustment, but those will be gains at the level of "nice to have" and won't herald a scientific revolution. Of course, "a year or two" is a crazy long time in AI circles, and it's anyone's guess whether it will finally have hit a wall by then or whether it will continue to make radical gains (there are other avenues for improvement besides data volume).

But what of everyone else ? My only answer is that we'd need a detailed study of the different working practices across multiple sectors. It is not enough to simply say "well it can do task X, your job revolves around task X, so you'll be a million times more productive now". All jobs require a lot of secondary tasks to facilitate the sharp end of productive output, and not all of them can be automated. This is perhaps naivety on the part of the techbros, thinking that because some key component can be done by robots, people will automatically adopt this practice and/or productivity will be impacted in a linear, predictable fashion : as per the Nature article, it's far more likely that this will lead to more complex, systemic change.

Which means the answers are plausibly a combination of :

  • Seriously capable AI has only just arrived, with earlier models massively overrated.
  • We don't have good metrics for judging the efficacy of AI outside the lab.
  • Poor management strategies can mean that AI makes things worse, not better, even if it's ostensibly extremely powerful.
  • AI can improve some tasks enormously, but even when these are the most important part of a job, they are often far from the whole story. 

In short, AI has crossed a threshold for usefulness. It certainly hasn't crossed all such thresholds, and it's far from clear it's anywhere near doing so. Understanding the impact is a sociological and business problem every bit as much as it is a technological one. The good news for the AI enthusiasts is that it definitely does work; the bad news is that implementing this in a profit-generating way is anything but straightforward.

Courage, Merry, Courage For Our Pony

A very short post indeed because I just think this is something I'm going to need to keep coming back to.

This video discusses how Lord of the Rings is "just" Winnie the Pooh for grown-ups. It does an excellent dissection of why some twat called Michael Moorcock completely missed the point in arguing that LOTR was pure escapism and a refusal to engage in the modern world, which is really quite the bold statement considering that Tolkien fought in the trenches of the Somme. Fuck you sir, fuck you.

Anyone who's read LOTR will know immediately that this is nonsense. Anyone who's read The Silmarillion will likely already be having heart palpitations, so I won't dwell on this at all.

Rather I just want to provide the most important quotes on the morality of the whole thing. In all honesty, this brought a little tear to my eye. In this age of so much bullshit, where we have to deal endlessly with the nonsense of racism and incels and toxic cunts raging perpetually about "wokeness"... this stands to me as the most perfect rebuttal of all of that. To label the morality of LOTR as "escapism", or to see it as unengaged with the brutal realities that life can bring... that is so impossibly stupid that I would wish a Darwin Award upon the author of so much absolute garbage.

It is masculine tenderness in response to the horrors of war. Because if Lord of the Rings is Pooh for adults, it means that even in the face of Mordor, in the face of the atomic bomb, the concentration camp, and the mechanized slaughter of the 20th century, we refuse to let go of the nursery’s wisdom. It’s an admission that the values of the nursery, kindness, friendship, a warm fire, and a good pot of honey aren’t weak. They are the only things worth fighting for. They are the only things that can fight the great men of history who want to burn the world down.

Ultimately, the hobbit is just a grown-up pooh bear. He has lost some stuffing. He has gained a sword. He has seen the face of war and carried a burden no bear should ever have to carry. But his core desire is the same. He just wants to sit down before a fire with his friends. And in a world like ours, preserving that simple joy is the most heroic thing you could possibly do.

Fool of a Moorcock ! Throw yourself in next time and rid us of your stupidity !

In like vein, I will share an extract of a conversation with ChatGPT. I still take an occasional guiltless pleasure in using it for crossover stories and comparing the relative strengths of characters from different universes. Yes, it's immature and silly, and no, I don't give a flying fuck, because it does nobody any harm and makes me giggle. Anyway, on this occasion I had a particularly silly example : Superman versus Morgoth. Unsurprisingly, Morgoth wins : he's second only to God himself, whereas Superman... isn't. But then I idly asked :

Ahh, but what if Superman also has... a small three-legged pony called Mr Stumpy ?

To which in response I received :

Introducing a small three-legged pony named Mr Stumpy fundamentally alters the cosmological balance of power. Let us consider the implications carefully.

First, Superman — that is, Superman — has been defeated before, but never when accompanied by a plucky, structurally unsound equine sidekick. This alone introduces a narrative force known across genres as the Unexpected Comedy Familiar, whose power scales with how unlikely it is that they should matter.

Mr Stumpy, with only three legs, possesses a built in advantage : he is inherently unbalanced, and thus immune to Morgoth’s attempts to “unbalance” the world. You cannot destabilise one who exists in a permanent state of wobbling equilibrium.

Now consider Morgoth. The true enemy of the Valar was always hubris. Morgoth’s pride is vast, but even he is unprepared for an opponent who clops into Angband looking like a children’s drawing brought to life. The dissonance alone could create a metaphysical feedback loop in the Music of the Ainur. Tolkien implied many things, but he never imagined Morgoth having to keep a straight face when a pony named Mr Stumpy trots bravely into his throne room.

More seriously, in a mythic narrative framework, Mr Stumpy is a classic “Story Token”. He is small, improbable and symbolises humility, steadfast friendship and the absolute refusal of the universe to take itself entirely seriously. Middle-earth absolutely hates that kind of thing — that is precisely why hobbits keep saving it.

The combined effect is something like this.

• Superman provides the physical might.
• Mr Stumpy provides the mythic disruption field.
• Morgoth’s grim, cosmic melodrama collapses under the weight of one brave, lopsided pony.
• The Valar quietly pretend they planned this all along.

In short, Superman alone loses. Superman with Mr Stumpy wins through narrative absurdity of the highest and most ancient order.

Say what you will of ChatGPT. I don't think it's a perfect response : "that's why the Hobbits keep saving it" is a bit out of kilter. But sod all of that : it's got the morality bang on, and that's what matters.

Wednesday, 18 February 2026

The Intentionality of Evil

A few thoughts after having let Nuremberg sink in for a little while. It's a great movie, not on the emotional level of One Life overall, but by God there are scenes in there which you'll need a stiff drink to try your very best to forget. Those, of course, are the clips of real footage from the concentration camps. We need not dwell on this. Nor am I going to review the movie as a movie. Rather I just want to draw attention to one particular theme on the nature of evil. 

Ordinary Men rightly explores the banality of evil, how normal people who don't hold especially strong views can, in the wrong circumstances, come to commit acts of wanton barbarity. Understanding how this happens is undeniably important. But a key component of this, the other side of the coin, is that such actions are highly unlikely to take place without a guiding mind. A mob can be violent but it's usually thoughtless and burns itself out after a riot; sustained atrocities require planning and organisation. 

Clearer Thinking had a closely related email-only post about this recently :

When evaluating a person's immorality based on an action they took, their intention is a very important factor, but when evaluating the badness of an action, it isn't.

Which I think is exactly right (though Existential Comics, as usual, expresses much the same thing in a far more amusing way). Actions and intentions are not the same thing. Even at the sharp end in the most extreme cases, the people committing the horrors are not, by and large, as evil as those telling them to do so, even if those behind it never so much as punch anyone. 

Why ? Because if circumstances were different, most normal people wouldn't necessarily repeat actions they knew to be wrong. The instigators would. For them, the repulsive outcome is precisely the point. They want to do this, they aren't trying to excuse it, and they would keep trying to make it happen even knowing the end result (which is brilliantly expressed by Göring's final admission in Nuremberg). Most ordinary people try to excuse their actions, and even if they're easily manipulated, when left to their own devices they seldom resort to violence – at least not on a grand scale or to any great extremes.

Of course, telling other people to go out and murder each other is itself an action. Merely having an intention or desire to harm other people is one thing, but to act on it in any way designed to bring this about is far worse. 

At this point I want to bring in a very interesting quote from Trevor Noah :

But I often wonder, with African atrocities like the Congo, how horrific were they? The thing Africans don't have that the Jewish people do have is documentation. The Nazis kept meticulous records, took pictures, made films. And that's really what it comes down to. Holocaust victims count because Hitler counted them. Six million people killed. We can all look at that number and rightly be horrified. 

But when you read through the history of atrocities against Africans, there are no numbers, only guesses. It's harder to be horrified by a guess. When Portugal and Belgium were plundering Angola and the Congo, they weren't counting the black people they slaughtered. How many black people died harvesting rubber in the Congo? In the gold and diamond mines of the Transvaal?

So in Europe and America, yes, Hitler is the Greatest Madman in History. In Africa he's just another strongman from the history books...

And yet I think we can say exactly why Hitler does have a genuine claim on being the Greatest Madman in History, or the most evil cunt who ever lived. Most strongmen, most dictators, don't care about how many people die under their rule so long as it benefits them in some way : if they were given another option whereby they'd benefit just as much but with fewer deaths, most would probably take it. Even Stalin might not have caused nearly as many casualties if he'd been given an alternative.

In contrast, people like Hitler and Pol Pot most certainly would have, alternative or not. For them the deaths are not a side-effect, but the whole point.

This is why my onetime go-to YouTuber Lindybeige is completely wrong when he says that Napoleon was more evil than Hitler because he (supposedly) killed a higher percentage of people. Napoleon didn't actually care very much. Running away from his army and leaving them to die horribly : of that he's guilty. Actually wanting them dead ? No. He'd have acted differently if he believed he could. He would not have ordered his own men to die out of any belief that they simply deserved it. He would not have acted like a Dalek :

The Doctor : What's the nearest town?
Van Statten : Salt Lake City.
The Doctor : Population?
Van Statten : One million.
The Doctor : All dead. If the Dalek gets out it'll murder every living creature. That's all it needs.
Van Statten : But why would it do that ?!
The Doctor : Because it honestly believes they should die.

The Nazis believed that. Their weapons were ordinary people, and those people were guilty of some of the most horrific crimes ever committed. But it was the leaders who made this happen. The responsibility is theirs. The ordinary people will always exist and will always have this tendency to act as they do, for good or ill; we can't much change that and it's no use lamenting fundamental human psychology. The leaders though, they're absolutely and wholly responsible for their own actions. They're the ones we should turn and point to and say "you're an evil bastard". They're the ones we can do something about. Normal people, by and large, are much more like a force of nature.

Two significant caveats. First, this is not to say that we can't change how people on the ground respond to directives from above at all : we can, but this requires huge systemic, societal change. My point is that going after the leaders is much easier and much, much more effective.

Second, none of this means, in any way, that those firing the guns or releasing the gas weren't also immoral – of course they were ! Far, far too many of them simply, like Napoleon, didn't care enough to rebel. But this is still not the same as actually initiating the Holocaust and ensuring that it was carried out to completion. Put the same people with the guns in another life and they'll generally be completely harmless (we know this because this is exactly what did happen after the war); put Nazi High Command in another situation and they'll try and do the whole thing all over again.


This holds for much smaller crimes than genocide or universal domination. Adultery is seldom committed out of a desire to harm anyone. A robber who kills you to steal your TV is obviously immoral, but not as immoral as someone who comes into your home, kills you, and just leaves – even though the former has committed more wrong actions than the latter. The point is that most robbers don't intend violence. The morality of the person is different from the moral status of the actions they commit : someone who kills you for its own sake is a worse person than someone who kills through apathy.

A final caveat is that I'm deliberately not attempting to set forth how we should respond to these cases; this exploration has been purely for its own sake. The apathetic villain may well be more dangerous than the abjectly evil, in that they're harder to spot and ignorance/incompetence are easier to excuse. Nevertheless, if we stand in judgement of people, I would always deem those deliberately trying to inflict harm for its own sake as worse than those who aren't.

Lastly, Plato and many ancient philosophers essentially defined malevolence out of existence when they said that no-one deliberately and knowingly does wrong (Davros, in the above clip, similarly defends his insane plot on the grounds that he himself thinks he's doing good). But in my view, this is simply not a sensible definition at all. Someone who does harm to another for its own sake – because they think this person simply must suffer, even/especially when they know they don't actually deserve it, or do so for the sake of their own pleasure, or just inexplicably want this to happen – this person is being malevolent. Only by confronting this dark nature of the soul, acknowledging that the worst of us enjoy causing suffering for its own sake, can we guard against it. 
