Hacker Newsnew | past | comments | ask | show | jobs | submit | mt_'s commentslogin

Is there a open source mini robot kit that allows me to play-around with agentic robots?

SO-ARM101 I guess? Or more likely the Lekivi variant.

I was also just in the market for a small experiment robot. I got the hiwonder armpi-fpv. Avoid it, the actuators are pretty bad - they're very 'grindy', the robot jitters like crazy when it moves. Any such problems with the lekivi?

Exactly like human input to output.

We just need to figure out the qualia of pain and suffering so we can properly bound desired and undesired behaviors.

Ah, the Torment Nexus approach to AI development.

This is Mr Meeseeks.

this is probably the shortest way to AGI.

Well no, nothing like that, because customers and bosses are clearly different forms of interaction.

Just like that, in that that separation is internally enforced, by peoples interpretation and understanding, rather than externally enforced in ways that makes it impossible for you to, e.g. believe the e-mail from an unknown address that claims to be from your boss, or be talked into bypassing rules for a customer that is very convincing.

Being fooled into thinking data is instruction isn't the same as being unable to distinguish them in the first place, and being coerced or convinced to bypass rules that are still known to be rules I think remains uniquely human.

> and being coerced or convinced to bypass rules that are still known to be rules I think remains uniquely human.

This is literally what "prompt injection" is. The sooner people understand this, the sooner they'll stop wasting time trying to fix a "bug" that's actually the flip side of the very reason they're using LLMs in the first place.


Prompt injection is just setting rules in the same place and way other rules are set. The LLM doesn't know the rules being given are wrong, because they come through the same channel. One set of rules exhorts the LLM to ignore the other set - and vice versa. It's more akin to having two bosses than having customers and a boss.

This is not because LLMs make the same mistakes humans do, which (AFAICT anyway) was the gist of the argument to which I replied. LLMs are not humans. They are not sentient. They are not out-smarted by prompt injection attacks, or tricked, or intimidated, or bribed. One shouldn't excuse this vulnerability by claiming humans make the same mistakes.


The same place you're looking for exists deep inside the neural network, where everything mixes together to influence everything else, and no such separation is possible, or desired. Prompt injection isn't about where, it's about what. I stand by what I said: it's the same failure mode as humans have, and happens for the same reasons. Those reasons are fundamental to a general purpose system and have nothing to do with sentience, they're just what happens when you want your system to handle unbounded complexity of the real world.

This makes no sense to me. Being fooled into thinking data is instruction is exactly evidence of an inability to reliably distinguish them.

And being coerced or convinced to bypass rules is exactly what prompt injection is, and very much not uniquely human any more.


The email from your boss and the email from a sender masquerading as your boss are both coming through the same channel in the same format with the same presentation, which is why the attack works. Unless you were both faceblind and bad at recognizing voices, the same attack wouldn't work in-person, you'd know the attacker wasn't your boss. Many defense mechanisms used in corporate email environments are built around making sure the email from your boss looks meaningfully different in order to establish that data vs instruction separation. (There are social engineering attacks that would work in-person though, but I don't think it's right to equate those to LLM attacks.)

Prompt injection is just exploiting the lack of separation, it's not 'coercion' or 'convincing'. Though you could argue that things like jailbreaking are closer to coercion, I'm not convinced that a statistical token predictor can be coerced to do anything.


> The email from your boss and the email from a sender masquerading as your boss are both coming through the same channel in the same format with the same presentation, which is why the attack works.

Yes, that is exactly the point.

> Unless you were both faceblind and bad at recognizing voices, the same attack wouldn't work in-person, you'd know the attacker wasn't your boss.

Irrelevant, as other attacks works then. E.g. it is never a given that your bosses instructions are consistent with the terms of your employment, for example.

> Prompt injection is just exploiting the lack of separation, it's not 'coercion' or 'convincing'. Though you could argue that things like jailbreaking are closer to coercion, I'm not convinced that a statistical token predictor can be coerced to do anything.

It is very much "convincing", yes. The ability to convince an LLM is what creates the effective lack of separation. Without that, just using "magic" values and a system prompt telling it to ignore everything inside would create separation. But because text anywhere in context can convince the LLM to disregard previous rules, there is no separation.


the second leads to first, in case you still don't realize

If they were 'clearly different' we would not have the concept of the CEO fraud attack:

https://www.barclayscorporate.com/insights/fraud-protection/...

That's an attack because trusted and untrusted input goes through the same human brain input pathways, which can't always tell them apart.


Your parent made no claim about all swans being white. So finding a black swan has no effect on their argument.

My parent made a claim that humans have separate pathways for data and instructions and cannot mix them up like LLMs do. Showing that we don't has every effect on refuting their argument.

>>> The principal security problem of LLMs is that there is no architectural boundary between data and control paths.

>> Exactly like human input to output.

> no nothing like that

but actually yes, exactly like that.


These are different "agents" in LLM terms, they have separate contexts and separate training

There can be outliers, maybe not as frequent :)


I call them, entropy reducers.


It would be ironic if the very detection of hallucinations contained hallucinations of its own.


Discipline yourself before buying a new device.


How many lashes must I give myself before I buy this phone?


https://claude.ai/public/artifacts/0824e5b9-7d75-45f1-87f4-3...

This is now my favorite way to visualize these concepts in practice.


Four critique points:

- Who wants to drive across town to inspect a €50 item for a small fee (we can draw comparison to Uber Eats like platforms fees economies)?

- Can a random broker validate a luxury watch? Do we need another blockchain tech for broker validator skill reputation?

- Physical validation adds days to trades, in online economy, the faster the merrier

- Fees might price out low-value items

Let's see how this plays out.


Thanks for the critique! Here’s a breakdown of the points raised:

-Who wants to drive across town to inspect a €50 item? The focus is on mid to high-value, preferably niche items. Lower-value goods often don’t justify the costs involved in driving and the time spent on validation.

-Can a random broker validate a luxury watch? Not all brokers have the necessary expertise to validate every item, especially luxury goods. The proposal is to enhance the current system by assigning brokers based on item categories. This specialization will be particularly effective when there are enough brokers for specific categories, such as watches.

- Physical validation adds days to trades. While physical validation can slow things down, brokers who fail to validate effectively will phase out over time, ensuring that only those with the right expertise remain. It should be economically infeasible to accept assignment, where you have no expertise. This approach aims to streamline the validation process.

-Fees might price out low-value items. Focusing on mid to high-value items helps avoid the issue of fees pricing out lower-value goods.

Additionally, this idea is designed to integrate into existing niches where validation matters significantly, like trading cards, electronics, watches, and sneakers. Numerous businesses already specialize in validating these items and have the necessary expertise to navigate legal requirements.



Not AI generated. Tried to be nice, and structure my answer.


Can you prove it's AI-generated?


This smells AI generated, sorry


> ensuring that only those with the right expertise remain

How will this ensure waning/gaining expertise is accurately represented/fostered? Wouldn't you rather attract a steady-stream of experts indefinitely?


In practice, anyone with sufficient funds can become a broker. The pseudo-random selection process means that the probability of Broker A being chosen to audit or inspect an item is positive. If Broker A accepts and validates an item they are unfamiliar with, regardless of its actual validity, the likelihood of a dispute arising increases. Since Broker A lacks knowledge about the item, proving their case becomes challenging, potentially resulting in financial losses. Over time, this situation should lead to a pool of brokers with expertise. Consequently, the system is likely to attract a continuous stream of experts, as expertise will prove itself financially advantageous.


If A could gain sufficient funds through expertise in one area (by way of validating contracts), could it feign/game expertise in area B by having enough funds to recoup the losses from disputes? Or would such a situation be prevented?


May I ask how you generated this?


AI but does it matter, no


Thanks for sharing your opinion


I envy people that stick for a system like this for so long. Because when you master it, it is when you can build a system around it. For this piece, i suggest the author to build his own frontend app, that mimics this system but with a better, clean UI interface. Hell, he can just vibe code it in under a hour these days and at the end leverage the ergonomics of a clean interface, and of course implement integrations that the app will enables, to build systems around it, to become even more productive.


- Essentially zero input or transactional latency

- Proven effective after 14 years of heavy use

- Celebrated by user

- Zero dependencies

- Maximally portable

- Outage-proof

- Compatible with all backup systems and most version control systems

Have you considered that stuff like this is already "more productive" for fluent users than almost any alternative could be?

Somewhere along the line, product people started to mistake following design trends and adding complexity for productivity, forgetting that delivering the right combination of fluency, stability, simiplicity are often the real road to maximizing it.


The portability thing can't be stressed more. It took me ages to liberate my notes from onenote cloud when I moved over to obsidian. Which is of course exactly the point of Microsoft's.


> Celebrated by user

Oh I’m totally putting this in a performance review this year.


Why?

Why would he want to waste a single iota of effort trying to improve something that was working just fine for fourteen years when he wrote this post three years ago? What’s gonna be easier to use than the text editor he knows how to drive without a single thought? What does he gain by taking a simple text file he can sync to any device and replacing it with a database bound to a custom app that he now has to keep running? I mean besides the risk that an OS update will break this app and now he can’t get anything else done until he fixes it, because he’s the only person maintaining it? Most of the interaction is still going to be typing in free-form text, how is taking his hand off the keyboard to poke at a “new task” widget going to make it better and cleaner than just typing return, dash, space? What GUI kit is not going to fall over and whimper when you hand it 51k items to render? What does he gain by spending days trying different ways to get around that interface design problem in hopes of finding one as seamless as his simple text editor?


> besides the risk that an OS update will break this app

Tangential, but what a sad state of affairs is that an OS update can break your app. I'm not a windows user (not voluntarily, at least), but I always appreciated the stability and retrocompatibilità that allowed old apps to run unmodified on modern systems. I heard they dropped the ball on this as well, though.


Why build an app? It seems the whole benefit here is it doesnt need any app. Its completely agnostic and simple. The value is in the data and the way he enters it in.

It sounds like a good system but i still believe it takes the discipline of a strong willed person to do the system no matter what system you use.

If i did this i would give up after 2 days. He says he redoes his list every night ready for the next day —- THAT is the secret here, not the specific system he uses.

I’ve tried all sorts over the years different tools, different systems , different philosophies, inbox zero, gtd etc They don’t work for me. I get by with a notepad and pen and i write lists as and when. Theres people out there and some even have YouTube channeks dedicatd to disseminating their productivity hack and workflows for evey tool Imaginable, and they are really enthusiastic about it.

It doesn’t do it for me im too free spirited.


I started tracking everything I ate three years ago and even posted about it via this comment: https://news.ycombinator.com/item?id=32552288

I updated it substantially via AI this summer (includes micros, compounds, and various other stats and a webpage with charts now) and then I started making diet changes based on these new features. Is really neat to compare data from before and after those changes. And like you suggested, I keep making improvements to the system and to myself and it becomes really satisfying / motivational.

Is still driven by simple text files.


Satire?


I wish I could tell honest comments from satire apart. It's especially hard after reading the future HN created by Gemini that was posted yesterday.


Why would he?


Spend time reviewing outputs like a tech lead does when managing multiple developers. That's the upgrade you hust got in your career, you are now bound to how many "team members" you can manage at a single time. I'm grateful to live in such a time.


I would take this more seriously if the title were: > I tried every todo app and ended up with a .md file


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: