
Neither of those is a logical problem.

1. This is indeed a simplification, but for any single task in your day job, it would be those tasks where you have the most experience. For example, I used to write video games; AI does a better job of game design than me, but I'm the better programmer.

2. Unimportant, as the consideration I was rejecting was performance in tasks.

As it happens, some of my other recent messages demonstrate that I agree they are low intelligence for this exact reason.

> 99% of things an LLM could supposedly "out-perform" a human at, the human would actually outperform if you provided that human with the same text resources the LLM used to conjure its answer

Could I pass a bar exam or a medical exam by reading the public internet, with no notes and just from memory, which is what a base model does?

Nope.

Could I do it with a search engine, which is what RAG-assisted LLMs do?

Perhaps.

> humans can do it easily (and do it without hallucinating

Hell no, we mess that up almost constantly.

> When LLMs fail to do math, fail to strategize, fail to process rules of abstract games, etc. there is no textbook or lecture or article you can provide to the LLM that magically makes the problem go away

I'm sure I've seen this done. I wonder if I'm hallucinating that certainty…



> Could I pass a bar exam or a medical exam by reading the public internet, with no notes and just from memory, which is what a base model does? Nope.

You're -still- ignoring the fact that these models spend millions of GPU-hours in training. I'm sure you could manage.

> Hell no we mess that up almost constantly.

"Almost constantly"? Is this satire? I'd fire any such person, and probably recommend them psychiatric treatment.

AI-hype people really think so little of human beings? I certainly hope my pilot isn't "almost constantly" hallucinating his aviation training.

> I'm sure I've seen this done. I wonder if I'm hallucinating that certainty…

Show me the conversation.



