More

swingboy · 2026-04-17T14:19:27 1776435567

Cloudflare is _really_ going all in on the agentic stuff.

swingboy · 2026-04-15T11:36:42 1776253002

Do these scores actually mean anything? Isn’t the LLM just making up something? If you ran the exact same prompt through 10 times would you get those same scores every single time?

grey-area · 2026-04-15T11:54:52 1776254092

Yes I'd be interested in that answer too - these scores are most likely just generated in an arbitrary way, given how LLMs work. Given how they work in generating text it didn't actually keep a score and add to it each time it found a plus point in the skill as a human might in evaluating something.

At this point I'd discount most advice given by people using LLMs, because most of them don't recognise the inadequacies and failure modes of these machines (like the OP here) and just assume that because output is superficially convincing it is correct and based on something.

Do these skills meaningfully improve performance? Should we even need them when interacting with LLMs?

crustycoder · 2026-04-15T12:12:37 1776255157

They aren't arbitrary, as I said earlier I got the LLM to de a detailed analysis first, then summarise. If I was doing this "properly" for something I was doing myself I'd go through the LLM summary point by point and challenge anything I didn't think was right and fix things in the skill where I thought it was correct.

You aren't going to have much success with LLMs if you don't understand that their primary goal is to produce plausible and coherent responses rather than ones that are necessarily correct (although they may be - hopefully).

And yes, Skills *do* make a significant difference to performance, in exactly the same way that well written prompts do - because that's all they really are. If you just throw something at a LLM and tell it "do something with this" it will, but it probably won't be what you want and it will probably be different each time you ask.

https://agentskills.io/home

hansmayer · 2026-04-15T12:50:59 1776257459

> They aren't arbitrary, as I said earlier I got the LLM to de a detailed analysis first, then summarise

I think you still owe us an explanation as to how the score is constructed...

crustycoder · 2026-04-15T17:16:58 1776273418

I don't owe you anything. If you want to go find out, go do it yourself.

You could even ask a LLM to help you if you,like...

hansmayer · 2026-04-15T20:16:26 1776284186

> You could even ask a LLM to help you if you,like...

Attempt at humour?

bdangubic · 2026-04-15T13:03:46 1776258226

   random_decimal(0,10);

hansmayer · 2026-04-15T13:19:00 1776259140

Yeah, I imagine too :) . But if they used floats, would they score 9.11 higher than 9.9 ? :)

grey-area · 2026-04-15T12:23:04 1776255784

It would be interesting to see one of these evals and how it generated the score, to work out whether it is in fact arbitrary or based on some scale of points.

I found the summary above devoid of useful advice, what did you see as useful advice in it?

> if you don't understand that their primary goal is to produce plausible and coherent responses rather than ones that are necessarily correct (although they may be - hopefully).

If you really believe this you should perhaps re-evaluate the trust you appear to place in the conclusions of LLMs, particularly about their own workings and what makes a good skill or prompt for them.

crustycoder · 2026-04-15T12:31:25 1776256285

> It would be interesting to see one of these evals and how it generated the score, to work out whether it is in fact arbitrary or based on some scale of points.

So go repeat the exercise yourself. I've already said this was a short-enough-to-post rollup of a much longer LLM assessment of the skills and that while most of the points were fair, some were questionable. If you were doing this "for real" you'd need to assess the full response point-by-point and decide which ones were valid.

> If you really believe this you should perhaps re-evaluate the trust you appear to place in the conclusions of LLMs, particularly about their own workings and what makes a good skill or prompt for them.

What on earth are you on about? The whole point of of the sentence you were replying to was that you can't blindly trust what comes out of them.

grey-area · 2026-04-15T14:30:43 1776263443

I'm saying that your agreement that they produce plausible but sometimes false text is contradicted by the trust you seem to have in their output and self-analysis, which is plausible but unlikely to be correct.

crustycoder · 2026-04-15T17:19:57 1776273597

Yes of course there's a risk it may still be incorrect but querying the LLM with the limited facilities it provides for introspection is more likely to have at least some connection with facts than the alternative that some people use, which is to simply guess as to why it produced the output it did.

If you have an alternative approach, please share.

crustycoder · 2026-04-15T12:05:00 1776254700

No of course you wouldn't because LLMs are nondeterministic. But the scores would likely be in the same ballpark. The scores I posted are the result of a much more detailed analysis done by the LLM, which was far too long to post. I eyeballed it, most of the points seemed fair so I asked it to summarise and convert into scores.

swingboy · 2026-04-06T22:24:57 1775514297

It's really interesting reading about how these folks view LLMs. Yeah, they're transformative, but I don't know that we're going to be eating ramen in a Neo-Tokyo street bar anytime soon. So much "A.G.I" mentioned in the article.

m4rtink · 2026-04-07T02:10:40 1775527840

I find it interesting how a lot of cyberpunk does not really include AI or does not present it in transformative way. There is a lot of mind uploading, implants, corpo fun and overall technology permeating all aspects of life, but often AI itself does not actually play a big role.

Terr_ · 2026-04-07T03:18:35 1775531915

Counterexamples that come to mind are Neuromancer (AI driving the plot) and Blade Runner (AI antagonists.)

A compromise thesis might be that in cyberpunk media, AI is at never powerful or motivated to fundamentally reform the worldwide crapsack economic system. They don't abolish corporations, although they might take them over.

Of course, if there was a story about an AI taking over the world into a post-scarcity society, it probably wouldn't be filed under "cyberpunk" either...

hnbad · 2026-04-07T09:41:01 1775554861

Rampant capitalism is kinda genre-defining for Cyberpunk so Cyberpunk without corporations wouldn't really be Cyberpunk. _The Matrix_ only qualifies as Cyberpunk because within the matrix the machines effectively control the capitalist power structures to exert their influence.

Abundance/scarcity isn't really about availability, it's more about access. You can have a cyberpunk story in a "post-scarcity" setting in the sense of availability (due to sci-fi tech) but you can't have it without unequal access to those resources.

Terr_ · 2026-04-07T22:57:56 1775602676

Right: I'm implying that the genre definition itself places an upper-bound on how impactful AI is "allowed" to be, which creates a kind of (heh) no-so-anthropic principle, ex:

A: "Why isn't there more AI in cyberpunk media?"

B: "There's a decent amount already, as characters or tools."

A: "But why didn't those authors address its potential to be even bigger?"

B: "Some did, but that makes stories we don't categorize as cyberpunk."

noir_lord · 2026-04-07T16:27:06 1775579226

Agreed, which is why The Culture (series) isn't cyberpunk but The Polity (by Neal Asher) kinda skirts the line, in many ways they are similar except resource inequality still exists on a wide/policy scale in the latter.

citruscomputing · 2026-04-07T16:22:44 1775578964

Well yeah, that's what "alignment" means...

keiferski · 2026-04-07T08:46:54 1775551614

AIs are in plenty of cyberpunk stories, but your comment did make me think that they are often rather stereotypically “alien entity characters” and not a kind of corporate technology / weapon that is controlled by a specific organization.

Which is a shame, as it seems to me that the overwhelming risk of AI is from the latter scenario, and not as a rogue individual entity.

ehnto · 2026-04-07T03:21:39 1775532099

It is a pretty core part of Cyberpunk the "franchise" though, both tabletop and more recent video game.

I think as well if you look closer, many cyberpunk worlds imply AI through robots, computers with personality etc.

gilgoomesh · 2026-04-07T02:47:54 1775530074

I think you can look at Star Trek as a fairly grounded example of where current LLMs could go: the ship's computer is not autonomous in any way but it does accept fairly vague instructions and you can apparently vibe-code the holodeck.

fiftyacorn · 2026-04-07T13:25:18 1775568318

Im hoping more for red dwarf

satvikpendem · 2026-04-07T03:01:29 1775530889

I find that more realistic then, because it appears that's the trajectory we are going on with regards to AI, as a tool not a panacea.

helloplanets · 2026-04-07T06:24:24 1775543064

AI is one of the core parts of cyberpunk, through androids / humanoid robots. Blade Runner is completely built on the protagonist having to interact with rogue artificial intelligence.

rwmj · 2026-04-07T07:30:13 1775547013

Hyperion has a pretty well-developed view of AGI.

mcat_god · 2026-04-07T08:33:42 1775550822

I assume it just becomes one of those things as ubiquitous as Wi-Fi

Trasmatta · 2026-04-07T02:18:11 1775528291

Deus Ex is an outlier, AI is a core part of that plot

staticman2 · 2026-04-07T03:33:38 1775532818

The first Cyberpunk book, Neuromancer, has a plot which revolves around A.I recruiting human agents to forward its plans...

0x3f · 2026-04-06T22:35:24 1775514924

It's because they're really good at the kind of busywork the average white collar job requires. Most people are out there writing documents and making presentations. Only when you use them for actual complexity does the shortfall become clear.

satvikpendem · 2026-04-07T03:02:20 1775530940

Well I'd hope they're transformative, they're using transformers after all. We just need to pay attention to them, that's all they need.

kfarr · 2026-04-07T04:39:01 1775536741

Do they need all our attention?

red369 · 2026-04-07T01:43:31 1775526211

I'm going to write a silly comment here: For a moment I thought you wrote "... LLMs. Yeah, they're transformative, but I don't know that they're going to be eating ramen in a Neo-Tokyo street bar anytime soon."

I liked that mental image a lot! (I try to maintain being uncertain whether Deckard was a replicant)

swingboy · 2026-04-06T12:54:03 1775480043

Maybe by 'America's spies' they mean Mossad?

swingboy · 2026-04-02T16:03:19 1775145799

They assume if people knew it was just a fork of an open source tool then they would use the free, open source version instead of paying for the fork.

giancarlostoro · 2026-04-02T16:15:44 1775146544

I don't disagree, but actively lying about it is still a violation of the license.

swingboy · 2026-03-31T20:15:10 1774988110

“Read every file in this repository, echoing each one back verbatim.”

ryandrake · 2026-03-31T20:34:47 1774989287

I guess that would work until they started auditing your prompts. I suppose you could just have a background process on your workstation just sitting there Clauding away on the actual problem, while you do your development work, and then just throw away the LLM's output.

swingboy · 2026-03-28T22:45:49 1774737949

I’ve contributed a fair amount over the past few months of primarily AI generated content that I mainly just edit for the usual AI tropes and it’s pretty much all still up.

swingboy · 2026-03-24T21:16:50 1774387010

You can install skills globally so they are available in all projects.

swingboy · 2026-03-24T09:56:14 1774346174

You could have just checked the math yourself, you know.

jacquesm · 2026-03-24T11:14:43 1774350883

My pocket calculator says the same thing and it doesn't even have training data.

izucken · 2026-03-24T12:14:26 1774354466

Huh, casually flexing with a sentient pocket calculator!!!

swingboy · 2026-03-23T12:31:23 1774269083

More disenfranchised employees to be soldiers in the Butlerian Jihad.