
staticshock · 2026-04-26T20:36:04 1777235764

The eloquence with which this point gets (repeatedly) made is continuing to improve each next time I read it. However, I still feel like we haven't nailed it. That is, we are not yet at the "aphorism" stage of the discourse (e.g. "the medium is the message", "you ship your org chart", "9 mothers can't make a baby in a month"), in which the most pointed version of this critique packs a punch in just a few words that resonate with the majority of people. That kind of epistemological chiseling takes years, if not decades. And AI certainly won't do it for us, because we don't know how to RL meaning-making.

Edit: 9 babies → 9 mothers

bla3 · 2026-04-26T21:42:13 1777239733

> "can't make 9 babies in a month"

It's "9 women can't make a baby in one month".

bluefirebrand · 2026-04-26T23:00:53 1777244453

In fairness, 9 women can't make 9 babies in a month either

gerdesj · 2026-04-26T23:36:20 1777246580

No idea why you were dv'd.

It still takes roughly nine months to make a human baby, regardless of how many women or babies are involved!

fragmede · 2026-04-27T01:43:41 1777254221

You're assuming all women in your cohort start not pregnant. However, given a random sampling of women across the entire human race, if you have approximately 14,000 women, statistics says you'll have a baby in a month. That is to say, the chances of one of those women being 8 months pregnant reaches close enough to 1, given about 14,000 randomly selected women.
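For anyone who wants to check that kind of estimate, here is a rough sketch of the arithmetic. The per-woman rate `p` and the 99.9% target are made-up inputs for illustration, not demographic figures, and the function names are hypothetical. The answer depends entirely on what you plug in.

```python
import math

def prob_at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n independently sampled women is in the target window."""
    return 1.0 - (1.0 - p) ** n

def sample_size_for(p: float, target: float) -> int:
    """Smallest n for which prob_at_least_one(p, n) reaches the target probability."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))

# ASSUMPTION: 1 in 2,000 randomly sampled women is about eight months pregnant.
p = 1 / 2000
# ASSUMPTION: "close enough to 1" means 99.9%.
n = sample_size_for(p, 0.999)
print(n, prob_at_least_one(p, n))
```

Halving `p` roughly doubles the required sample, so the headline number is only as good as the rate behind it.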

Also, you can get a baby tonight if you steal one from the maternity ward.

The real question is how LLMs turn The Mythical Man-Month on its head. If we accept AI-generated code, can an agentic AI swarm make software faster simply by parallelizing, in a way that 9 women can't make a baby in 1 month, because they're an AI, not human, and communicate in a different way?

The pitfall of AI coding is that previously every shiny tangent that was a distraction, is now a rabbit hole to be leaped into for an afternoon, if you feel like it. It's like that ancient Chinese curse, may you live in interesting times. Everybody can recreate an MVP of Twitter in a weekend now when previously that was just a claim a certain type of people made.

b00ty4breakfast · 2026-04-27T01:55:09 1777254909

that's still one woman per pregnancy, it's not 14k women collaborating on a single pregnancy.

bluefirebrand · 2026-04-27T02:03:58 1777255438

> You're assuming all women in your cohort start not pregnant

As far as I know, all women everywhere start not pregnant

fragmede · 2026-04-27T02:06:37 1777255597

Tribbles, on the other hand...

bluefirebrand · 2026-04-27T00:19:19 1777249159

Sometimes HN doesn't like jokes, which is okay. I didn't really contribute much to discussion, so I probably deserve some downvotes. I'm ok with it.

Brajeshwar · 2026-04-27T01:21:00 1777252860

Actually, I like quite a lot of the subtle jokes on HN. They are harder to notice, fewer to find, and I don’t get them many a time. But when I get it (or someone explains it to me, perhaps out of pity), I chuckle, laugh, and laugh again. And I remember those comments.

staticshock · 2026-04-26T21:49:49 1777240189

Hah, right, I mixed it up!

ctvdev · 2026-04-26T21:33:53 1777239233

> That is, we are not yet at the "aphorism" stage of the discourse

we learn by doing

nkrisc · 2026-04-26T22:18:25 1777241905

Put differently: you get good at what you actually do, not what you think you're doing.

If you're not coding anymore, but using AI tools, you're developing skills in using those AI tools, and your code abilities will atrophy unless exercised elsewhere.

ipython · 2026-04-26T21:38:36 1777239516

I’ve also seen along those lines “there is no compression algorithm for experience” - a nice summary of the hn posts from today.

canjobear · 2026-04-27T01:05:23 1777251923

There clearly is though. You don’t remember every detail of every moment that constitutes the experience.

skybrian · 2026-04-27T00:34:01 1777250041

It seems overly pessimistic about education. Book learning isn't everything, but a physics textbook could be seen as the compression of centuries of experience.

Ronsenshi · 2026-04-27T02:05:52 1777255552

Book learning to me seems like a compression of knowledge that had to be acquired through many years of experimentation and observation. But knowledge is not an experience itself.

Take juggling for example - something that was on HN homepage last week. You can learn everything you need to know about juggling through a post or a book or an educational video. But can you juggle after all that book learning? Not at all - to be able to juggle one has to spend time practicing and no amount of reading can help meaningfully compress that process.

Muscle memory required for juggling is not a 1:1 correlation to experience, but I feel like it's close enough to it.

kristianc · 2026-04-26T23:45:25 1777247125

... or by textbooks, Stack Overflow, senior engineers, code review. How many engineers today got their start by building Minecraft mods or even MySpace?

I do think that these pieces sometimes smuggle in a nostalgic picture of how engineers "really" learn which has only ever been partly true.

embedding-shape · 2026-04-26T21:44:16 1777239856

How about "Intelligence amplification, not artificial intelligence"?

Also could be shortened to "IA, not AI", and gets even more fun when you translate it to Spanish: "AI, no IA".

alphabeta3r56 · 2026-04-27T01:59:27 1777255167

Taste/judgement cannot an AI beget

viccis · 2026-04-26T22:48:53 1777243733

>the medium is the message

If you asked 100 Americans what this aphorism means, I strongly doubt a single one could capture McLuhan's original meaning.

apsurd · 2026-04-26T23:53:41 1777247621

You're right. ive struggled to understand what exactly this means, in large part perhaps it's so often misused?

I think it means something like we're trapped in the constraints of the medium. Tweets say more about the environment of twitter than whatever message happened to be sent.

but i think im off on that, ill look this person up and find out!

rdevilla · 2026-04-27T01:46:24 1777254384

Some examples.

Firstly, Twitter has an upper bound on the complexity of thoughts it can carry due to its character limit (historically 140, now somewhat longer but still too short).

Secondly, a biased or partial platform constrains and filters the messages that are allowed to be carried on it. This was Chomsky's basic observation in Manufacturing Consent where he discussed his propaganda model and the four "filters" in front of the mass media.

Finally, social media has turned "show business [into] an ordinary daily way of survival. It's called role-playing." [0] The content and messages disseminated by online personas and influencers are not authentic; they do not even originate from a real person, but a "hyperreal" identity (to take language from Baudrillard) [0]:

    You are just an image on the air. When you don't have a physical body, you're a
    _discarnate being_ [...] and this has been one of the big effects of the electric age. It
    has deprived people of their public identity.

Emphasis mine. Influencers have been sepia-tinted by the profit orientation of the medium and their messages do not correspond to a position authentically held. You must now look and act a certain way to appease the algorithm, and by extension the audience.

If nothing else, one should at least recognize that people primarily identify through audiovisual media now, when historically due to lack of bandwidth, lack of computing and technology, etc. it was far more common for one to represent themselves through literate media - even as recently as IRC. You can come to your own conclusions on the relative merits and differences between textual vs. audiovisual media, I will not waffle on about this at length here.

The medium itself is reshaping the ways people represent, think about, and negotiate their own self-concept and identity. This is beyond whatever banal tweets (messages) about what McSandwich™ your favourite influencer ate for lunch, and it's this phenomenon that is important and worth examining - not the sandwich.

[0] Marshall McLuhan in Conversation with Mike McManus, 1977. https://www.tvo.org/transcript/155847

viccis · 2026-04-27T01:03:57 1777251837

It's confusing because "message" is not using its lay meaning, and decades of "medium" and "media" meaning drift meant that it isn't either.

For "the medium is the message", "medium" refers to any tool that acts as an extension of yourself. TV is an extension of your community, even things like light bulbs (extends your vision) are included in his meaning.

McLuhan argued that all forms of media like that carry a message that's more than just their content. "The message" in that argument refers to the message the medium itself brings rather than its content. For example, the airplane is "used for" speeding up travel over long distance, but the message of its medium itself is to "dissolve the railway form of city, politics, and association, quite independently of what the airplane is used for."

You can see it happening via online media that extend ourselves across the internet. Think of how, once easy video creation via YouTube became uniform, web comics stopped being a popular medium for comedy online. It's not like the web comics faded because they got worse; it's that they faded into a niche format because people didn't want to communicate via static images anymore. Or how, once short form videos on TikTok got big, you saw other platforms shift to copy the paradigm. McLuhan's point is that it's not just the content of those short form videos that matters; it's the message of the format itself. People's attention spans grow shorter because of the format, and before too long, we saw the tastes and expectations of the masses change. Reddit's monosite-with-subcommunities format and dopamine triggering voting feedback mechanism were its message more than any actual content posted there, and it's why traditional forums are niche and dwindling.

If you want to get a pretty good understanding of it, just read the first chapter from his book Understanding Media. It's short and relatively straightforward.

thomastjeffery · 2026-04-27T00:45:42 1777250742

Meaning is abstract. We can't express meaning: we can only signify it. An expression (sign) may contain the latent structure of meaning (the writer's intention), but that structure can only be felt through a relevant interpretation.

To maintain relevance, we must find common ground. There is no true objectivity, because every sign must be built up from an arbitrary ground. At the very least, there will be a conflict of aesthetics.

The problem with LLMs is that they avoid the ground entirely, making them entirely ignorant to meaning. The only intention an LLM has is to preserve the familiarity of expression.

So yes, this kind of AI will not accomplish any epistemology; unless of course, it is truly able to facilitate a functional system of logic, and to ground that system near the user. I'm not going to hold my breath.

I think the great mistake of "good ole fashioned AI" was to build it from a perspective of objectivity. This constrains every grammar to the "context-free" category, and situates every expression to a singular fixed ground. Nothing can be ambiguous: therefore nothing can express (or interpret) uncertainty or metaphor.

What we really need is to recreate software from a subjective perspective. That's what I've been working on for the last few years... So far, it's harder than I expected; but it feels so close.

staticshock · 2026-04-27T01:34:04 1777253644

> What we really need is to recreate software from a subjective perspective.

What does "subjective" mean here? Are you talking about just-in-time software? That is, software that users get to mold on the fly?

Jarwain · 2026-04-27T01:46:46 1777254406

LLMs are a mediocre map, but they're a great compass, telescope, navigation tools and what have ye

rdevilla · 2026-04-27T01:27:15 1777253235

> Meaning is abstract. We can't express meaning: we can only signify it. An expression (sign) may contain the latent structure of meaning (the writer's intention), but that structure can only be felt through a relevant interpretation.

I'm reminded immediately of the Enochian language which purportedly had the remarkable property of having a direct, unambiguous, 1-to-1 correspondence with the things being signified. To utter, and hear, any expression in Enochian is to directly transfer the author's intent into the listener's mind, wholly intact and unmodified:

    Every Letter signifieth the member of the substance whereof it speaketh.
    Every word signifieth the quiddity of the substance.

    - John Dee, "A true & faithful relation of what passed for many yeers between Dr. John Dee ... and some spirits," 1659 [0].

The Tower of Babel is an allegory for the weak correspondence between human natural language and the things it attempts to signify (as opposed to the supposedly strong 1-to-1 correspondence of Enochian). The tongues are confused, people use the same words to signify different referents entirely, or cannot agree on which term should be used to signify a single concept, and the society collapses. This is similar to what Orwell wrote about, and we have already implemented Orwell's vision, sociopolitically, in the early 21st century, through the culture war (nobody can define "man" or "woman" any more, sometimes the word "man" is used to refer to a "woman," etc).

LLMs just accelerate this process of severing any connection whatsoever between signified and signifier. In some ways they are maximally Babelian, in that they maximize confusion by increasing the quantity of signifiers produced while minimizing the amount of time spent ensuring that the things we want signified are being accurately represented.

Speaking more broadly, I think there is much confusion in the spheres of both psychology and religion/spirituality/mysticism in their mutual inability to "come to terms" and agree upon which words should be used to refer to particular phenomenological experiences, or come to a mutual understanding of what those words even mean (try, for instance, to faithfully recreate, in your own mind, someone's written recollection of a psychedelic experience on erowid).

[0] https://archive.org/details/truefaithfulrela00deej/page/92/m...

IceDane · 2026-04-26T22:19:21 1777241961

Outsource manual labor, not your brain.

xnx · 2026-04-26T20:44:02 1777236242

This concept won't reach that point because when you chisel too hard it crumbles. There are countless lower level tasks that typical programmers no longer learn how to do. Our capacity for knowledge is not unlimited so we offload everything we can to move to the next level of abstraction.

lsy · 2026-04-26T21:35:01 1777239301

AI coding isn’t an abstraction, though. You can’t treat a prompt like source code because it will give you a different output every time you use it. An abstraction lets you offload cognitive capacity while retaining knowledge of “what you are doing”. With AI coding either you need to carefully review outputs and you aren’t saving any cognitive capacity, or you aren’t looking at the outputs and don’t know what you’re doing, in a very literal sense.
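A toy sketch of why the same prompt can come back different each time: the next token is drawn from a probability distribution rather than looked up. The candidate strings and their scores below are invented for illustration, and `sample_token` is a hypothetical helper, not any vendor's API. Real systems have further sources of variation that this does not model.

```python
import math
import random

def sample_token(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Pick one continuation; temperature 0 means greedy (always the top score)."""
    if temperature == 0:
        return max(logits, key=logits.get)
    # Softmax weights: higher temperature flattens the distribution.
    weights = [math.exp(score / temperature) for score in logits.values()]
    return rng.choices(list(logits), weights=weights, k=1)[0]

# ASSUMPTION: made-up scores for what might follow "def sort(items):"
logits = {"return sorted(items)": 2.0, "items.sort()": 1.6, "for i in range(": 1.2}

rng = random.Random()  # unseeded, so draws differ between runs
greedy = {sample_token(logits, 0, rng) for _ in range(200)}
sampled = {sample_token(logits, 1.0, rng) for _ in range(200)}

print("distinct outputs, greedy: ", len(greedy))
print("distinct outputs, sampled:", len(sampled))
```

In this toy, greedy decoding always yields one answer while sampling yields several; the prompt alone does not pin down the output.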

Krssst · 2026-04-26T23:15:26 1777245326

Non-determinism is not as much of a problem as the lack of spec. C++ has the C++ norm, Python has its manual. One can refer to it to predict reliably how the program will behave without thinking of the generated assembly. LLMs have no spec.

lukan · 2026-04-26T22:37:11 1777243031

"You can’t treat a prompt like source code because it will give you a different output every time you use it"

But it seems we are heading there. For simple stuff, if I made a very clear spec - I can be almost sure that every time I give that prompt to an AI, it will work without error, using the same algorithms. So quality of prompt is more valuable than the generated code.

So either way, this is what I focus my thinking on right now, something that always was important and now with AI even more so - crystal clear language describing what the program should do and how.

That requires enough thinking effort.

lelanthran · 2026-04-26T22:57:42 1777244262

Didn't work for the prod data that the AI nukes in spite of prompts saying "DON'T FUCKING GUESS", just like that in all caps: https://news.ycombinator.com/item?id=47911524

What makes you think it will work for you?

lukan · 2026-04-26T23:45:47 1777247147

That I don't let agents run wild in a production environment?

habinero · 2026-04-27T01:53:05 1777254785

> if I made a very clear spec - I can be almost sure

That "almost" is doing a lot of heavy lifting here. This is just "make no mistakes" "you're holding it wrong" magical thinking.

In every project, there is always a gap between what you think you want and what you actually need. Part of the build process is working that out. You can't write better specs to solve this, because you don't know what it is yet.

On top of that, you introduce a _second_ gap of pulling a lever and seeing if you get a sip of juice or an electric shock lol. You can't really spec your way out of that one, either, because you're using a non-deterministic process.

xnx · 2026-04-26T23:50:15 1777247415

> AI coding isn’t an abstraction

Isn't it an abstraction similar to how an engineering or product manager is? Tell the (human or AI coder) what you want, and the coder writes code to fulfill your request. If it's not what you want, have them modify what they've made or start over with a new approach.

habinero · 2026-04-27T01:39:02 1777253942

No, because software engineering is more than <insert coin, receive code>. I've never had a full spec dropped on my desk lol. There's no abstraction.

Software engineering is a lot more social and communication-heavy than people think. Part of my job is to _not_ take specs at face value. You learn real quick that what people say they need and what they actually need are often miles apart. That's not arrogance, that's just how humans work.

A good product manager understands the biz needs and the consumer market and I know how to build stuff and what's worked in the past. We figure out what to build together. AIs don't think and can't do this in any effective way.

Also, if you fuck up badly enough that you make your engineers throw out code, you're gonna get fired lol

skydhash · 2026-04-27T01:16:39 1777252599

With an abstraction, you literally move your thinking up a level. So you move up a floor up the tower and no longer have to think what's happening below. The moment something leaves your floor, its course is set. If a result comes back, it's something familiar, not something from the lower floor.

A human coder can be seen as an abstraction level because they will talk to the PM in product terms, not in code. And the PM will be reviewing the product. What makes this work is that the underlying contract is that there's a very small number of iterations necessary before the product is done, and the later ones should require less time from the PM.

We've already established using an LLM tool that way does not work. You can spend a whole month doing back and forth, never looking at code, and still not have something that can be made to work. And as soon as you look at the code, you've breached the abstraction layer yourself.

IceDane · 2026-04-26T22:20:30 1777242030

It's staggering to me how many times I've heard this argument that LLMs are just the next level of abstraction. Some people are even comparing them to compilers.

girvo · 2026-04-26T22:25:22 1777242322

> Some people are even comparing them to compilers.

A lot of people are using them as such too: the amount of people talking about "my fleets of agents working on 4 different projects": they aren't reviewing that output. They say they are, but they aren't, any more than I review the LLVM IR. It makes me feel like I'm in some fantasy land: I watch Opus 4.7 get things consistently backwards at the margins, mess up, make bugs: we wouldn't accept a compiler that did any of this at this scale or level lol

habinero · 2026-04-27T01:54:57 1777254897

Right? People have put in decades of work to make them extremely reliable, they didn't magically start like that.

staticshock · 2026-04-26T21:09:56 1777237796

That's true, but I think it's beside the point. The flip side of that argument, which is equally true, goes something like, "not doing cognitive push-ups leads to cognitive atrophy."

There are skills we're losing that are probably ok to lose (e.g. spatial memory & reasoning vs GPS, mental arithmetic vs calculators), primarily because those are well bounded domains, so we understand the nature of the codependency we're signing up for. AI is an amorphous and still growing domain. It is not a specific rung in the abstraction hierarchy; it is every rung simultaneously, but at different fidelity levels.

kochikame · 2026-04-26T23:59:42 1777247982

> There are skills we're losing that are probably ok to lose (e.g. spatial memory & reasoning vs GPS, mental arithmetic vs calculators)

I'd argue these are not at all OK to lose. You live in an earthquake zone? You sure better know which way is north and where you have to walk to get back home when all the lines are down after a big one. You need to do a quick mental check if a number is roughly where it should be? You should be able to do that in your head.

There might be better examples that support your point more effectively e.g. cursive writing

staticshock · 2026-04-27T00:25:50 1777249550

Yep, there are tons. Growing food, building shelter, etc. But, for pretty much all of the skills we've allowed to atrophy in response to the advances of capitalism, technological & scientific progress, and societal changes, one COULD make the same basic argument, which is that losing that skill is detrimental to the individual, and yet here we are, not growing our own food, not building our own shelter, etc.

The arguments you make ≤ the values you actually hold ≤ the actions you take in support of those values.

I'm only interested in any such argument to the extent to which you've personally put it into practice. Otherwise, you're living proof of the argument's weakness. (To be fair, it's extremely hard to be internally consistent on this stuff! We all want better for ourselves than we have time and energy for. But that's my point: your fully subconscious emotional calculus will often undercut at least some of your loftier aspirations. Skills that don't matter anymore invariably atrophy due to the opportunity cost of keeping them honed.)

koshyjohn · 2026-04-26T23:57:50 1777247870

> "not doing cognitive push-ups leads to cognitive atrophy"

This is one of the points being made in the post, at least in reference to people who already have some mastery of their craft. If they outsource their thinking without elevating it, they aren't exercising that metaphoric muscle between their ears.

ua709 · 2026-04-26T20:55:12 1777236912

I get your point, I just wonder how accurate it is. We basically never look at the output of the compiler, so I agree that tool allows one to operate at a higher level than assembly. But I always have to wade through the output from AI so I’m not sure I got to move to the next level of abstraction. But maybe that’s just me.

willhslade · 2026-04-26T23:09:19 1777244959

Are compilers deterministic?

ua709 · 2026-04-26T23:41:17 1777246877

I'm sure someone, somewhere, once wrote one that wasn't but in general, yes they are.

The ones I use certainly are. And with a bit of training you can reason and predict how they will respond to a given input with a large degree of accuracy without being familiar with how the particular compiler under question was implemented.

Not so with the AI tools. At least with the ones I use anyway.

dbalatero · 2026-04-27T00:55:58 1777251358

Given the same compiler, I believe they would be the same between runs given the same inputs. I suppose that could not be true at the margins, but I would expect correctness out of whatever path it chose.

23df · 2026-04-26T23:59:28 1777247968

For all intents and purposes yeh. It's really about the variance in actual outcomes vs the expected. The variance is not much, is it? With LLMs that absolutely isn't the case.
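A quick way to see the "same compiler, same input, same output" property for yourself, sketched with CPython's built-in `compile` as a stand-in for a compiler. This only checks repeatability within one interpreter version and one process; it says nothing about other compilers or about builds across machines. `bytecode_digest` is a hypothetical helper name.

```python
import hashlib

SOURCE = "def add(a, b):\n    return a + b\n"

def bytecode_digest(source: str) -> str:
    """Compile the source and return a SHA-256 digest of the function's bytecode."""
    module = compile(source, "<example>", "exec")
    # The compiled module keeps the code object for `add` among its constants.
    func_code = next(c for c in module.co_consts if hasattr(c, "co_code"))
    return hashlib.sha256(func_code.co_code).hexdigest()

# Compile the same text 100 times and collect the distinct digests.
digests = {bytecode_digest(SOURCE) for _ in range(100)}
print("distinct outputs:", len(digests))
```

One distinct digest across all runs is the zero-variance behaviour being described; the contrast is with tools whose output you have to re-read every time.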

imiric · 2026-04-26T22:13:50 1777241630

The idea that a tool intended to replace all human cognitive work is the next level of abstraction is so fundamentally flawed that I'm not sure it's made in good faith anymore. The most charitable interpretation I can think of is that it's a coping mechanism for being made redundant.

Never mind the fact that these tools are nowhere near as capable as their marketing suggests. Once companies and society start hitting the brick wall of inevitable consequences of the current hype cycle, there will be a great crash, followed by industry correction. Only then will actually useful applications of this technology surface, of which there are plenty. We've seen how this plays out a few times before already.

staticshock · 2026-04-21T05:48:46 1776750526

I can't believe this article does not mention what I think is the most puzzling part of the repair: the delicate process by which the individual fibers are FUSED TOGETHER in a way that maintains near perfect total internal reflection.

tambre · 2026-04-21T07:18:59 1776755939

You mean fusion splicing? That's common knowledge to anyone that's done any professional fibre cabling and you can easily find reading on it. The specifics of subsea cables however are much more elusive so it makes sense the article focuses on that.

staticshock · 2026-04-17T20:32:24 1776457944

In a well functioning system, the incentive to make money is somewhat aligned with the incentive to create value for other people.

ryandrake · 2026-04-17T21:08:32 1776460112

This is probably your point, but we are not in a well functioning system.

staticshock · 2026-04-18T01:23:17 1776475397

Not currently, but I have faith in our collective ability to push in the direction of such systems over the long arc of history.

staticshock · 2026-04-13T04:59:23 1776056363

I do the same. Gmail gives me a single, standardized interface for opting out of emails: mark it as spam. All the various companies I've given my email to, on the other hand, give me different, either clunky or often outright broken interfaces for opting out. There's no direct financial incentive for them to invest in making ethical, robust opt-out systems.

However well meaning, collectively all those companies are still just a bunch of sociopaths. This might be a bit dark, but I think a reasonable real world analogy here is stalkers and restraining orders. A stalker isn't motivated to listen to you when you tell them to stop talking to you. That's why you get the restraining order.

staticshock · 2026-04-12T19:59:43 1776023983

They were taught not to read errors because they encountered thousands of errors (in other software) that were less helpful than that one.

Most people have an adversarial relationship with software: it is just the pile of broken glass they have to crawl through on the way to getting their task done. This understanding is reinforced and becomes more entrenched with each next paper cut.

hermitcrab · 2026-04-13T08:43:55 1776069835

I guess it is a mindset thing. Techies see something like this as a problem to solve. Non-techies often panic at the slightest variance from what they were expecting. See also: https://en.wikipedia.org/wiki/Learned_helplessness

staticshock · 2026-04-10T02:35:09 1775788509

You know the saying, "when you owe the bank a million dollars, that's your problem, but when you owe the bank a billion dollars, that's the bank's problem"?

I suspect the theory behind OpenAI is to grow to be "too big to fail" as fast as they can, because once they cross that threshold, their liquidity/solvency problems will cease to be theirs, and become everyone else's.

staticshock · 2026-04-02T17:33:01 1775151181

LLMs seem to me closer to Kahneman's System 1 than to System 2. When understood in this way, it is obvious why LLMs are bad at counting r's in "strawberries". But it also makes ZEH feel like it couldn't possibly be a useful metric, because it's a System 2 evaluation applied to a System 1 system.

derefr · 2026-04-02T23:31:01 1775172661

FYI, the LLM letter-counting problem has nothing to do with counting per se, and is instead entirely down to LLMs not getting to see your raw UTF-8 byte stream, but rather having a tokenizer intermediating between you and it, chunking your UTF-8 bytes into arbitrary, entirely-opaque-to-the-LLM token groupings.

Try it for yourself — under the most popular tokenizer vocabulary (https://tiktokenizer.vercel.app/?model=cl100k_base), "strawberry" becomes [str][aw][berry]. Or, from the model's perspective, [496, 675, 15717]. The model doesn't know anything more about how those numbers correspond to letters than you do! It never gets sat down and told "[15717] <=> [b][e][r][r][y]", with single-byte tokens on the right. (In fact, these single-byte tokens appear in the training data extremely rarely, and so the model doesn't often learn to do anything with them.)

Note that LLMs can predictably count the number of r's in "s t r a w b e r r y", because <Count the number of r's in "s t r a w b e r r y"> becomes [Count][ the][ number][ of][ r]['s][ in][ "][s][ t][ r][ a][ w][ b][ e][ r][ r][ y]["]. And that's just a matching problem — [ r] tokens for [ r] tokens, no token-correspondence-mapping needed.
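To make that concrete, here is a toy sketch. The three IDs for "strawberry" are the ones quoted above; everything else is an assumption for illustration: the space-plus-letter IDs are invented, the input is given a leading space for simplicity, and the greedy longest-match `tokenize` is a stand-in for real BPE, not how cl100k_base actually works.

```python
# IDs for the three "strawberry" pieces are the ones quoted in the comment.
VOCAB = {"str": 496, "aw": 675, "berry": 15717}
# ASSUMPTION: made-up IDs for " a", " b", ... (a space followed by one letter).
VOCAB.update({f" {ch}": 90000 + i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")})

def tokenize(text: str) -> list[int]:
    """Greedy longest-match tokenizer over VOCAB (a simplification of BPE)."""
    ids, pos = [], 0
    while pos < len(text):
        # Try the longest remaining piece first, then shrink until one matches.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos = end
                break
        else:
            raise ValueError(f"no token for {text[pos:]!r}")
    return ids

word_ids = tokenize("strawberry")
spaced_ids = tokenize(" s t r a w b e r r y")
r_id = VOCAB[" r"]

print(word_ids, "-> occurrences of the ' r' token:", word_ids.count(r_id))
print(spaced_ids, "-> occurrences of the ' r' token:", spaced_ids.count(r_id))
```

In the spaced version, counting r's is just counting one repeated ID. In the unspaced version no ID corresponds to "r" at all, so the letters would have to be recovered from what the model learned about each opaque ID.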

orbital-decay · 2026-04-03T00:19:36 1775175576

>entirely-opaque-to-the-LLM token groupings

This is clearly not the case, any modern (non-reasoning) model easily decomposes words into individual token-characters (try separating them with e.g. Braille spaces...) and does arbitrary tokenization variants if forced with a sampler. It's way deeper than tokenization, and models struggle exactly with counting items in a list, exact ordering, retrieving scattered data, etc. LLM context works a lot more like associative memory than a sequence that can be iterated over. There are also fundamental biases and specific model quirks that lead to this.

8note · 2026-04-02T18:27:56 1775154476

> When understood in this way, it is obvious why LLMs are bad at counting r's in "strawberries".

no it doesnt. it makes sense that they cant count the rs because they dont have access to the actual word, only tokens that might represent parts or the whole of the word

orbital-decay · 2026-04-02T18:59:12 1775156352

Tokenization is a simplistic explanation which is likely wrong, at least in part. They're perfectly fine reciting words character by character, using different tokenization strategies for the same word if forced to (e.g. replacing the starting space or breaking words up into basic character tokens), complex word formation in languages that heavily depend on it, etc. LLMs work with concepts rather than tokens.

im3w1l · 2026-04-02T18:30:06 1775154606

A big part of skill acquisition in humans is moving tasks from system 2 to system 1, to free up the very scarce thinking resources for ever more complex tasks, that can then in turn be internalized and handled by system 1.

staticshock · 2026-02-18T21:23:05 1771449785

here's another good one: https://terraformindustries.com/

staticshock · 2026-02-15T19:47:55 1771184875

> the death of Payton Isabella Leutner

she's alive, so "attempted murder" would be more appropriate.

i really enjoyed the "we are 3D printers for our thoughts" framing!

munificent · 2026-02-16T01:04:24 1771203864

Sorry, misremembered the detail on that one!

staticshock · 2026-02-11T18:14:53 1770833693

I think the way to see this is as the organic process of discovering hard-to-game benchmarks. The loop is:

1. People discover things LLMs can kind of do, but very poorly.

2. Frontier labs sample these discoveries and incorporate them into benchmarks to monitor internally.

3. Next generation model improves on said benchmarks, and the improvements generalize to improvements on loosely correlated real world tasks.