While it feels unlikely that a simple "write this spec from this code" + "write this code from this spec" loop would actually trigger this kind of hiding behaviour, an LLM trained to accurately reproduce code from such a loop definitely would be capable of hiding code details within the spec - and you can't reasonably prove that the frontier LLMs have not been trained to do so.
Eh, they're still keeping the impending switch to PDT, just ditching the future switch back to PST (and all future changes). That should give around 7-8 months for a new timezone file update to percolate.
On the other hand, Korean ditched the ideographic Hanja (Chinese-character) writing system, which was too difficult to learn, in favour of the much simpler phonetic Hangul alphabet. In Japan, the classical Kanji (Chinese-character) writing system was considered a prestige script and was customarily not taught to ordinary folks; the Hiragana script evolved as an easy-to-learn alternative, (initially) used heavily by women, who were often not taught Kanji.
Of course, Chinese characters are not pictographic and haven't been for a few thousand years, but they are still largely ideographic.
I would be very interested to know what string is being blocked here, and what the rest of its critical rules are. Maybe some hex-encoding or other obfuscation could be used to coax the rest of the system prompt out of the model? I wonder if the next tokens here are consumed by the middleware (to execute tools?).
The website notes that you can measure lag with an “expensive” high-speed camera setup.
My favorite trick, which I’ve used frequently (including in scientific publications on lag!), is to use the slo-mo cam on a smartphone. Phones will usually record at anywhere from 120 to 240 fps. Set up the camera so it can see both your input (e.g. a side view of you pushing a button) and the display, record a video, and then pull it into a media player that supports frame-by-frame playback. You can then count the frames elapsed between the button press (pressing it down far enough to electrically actuate it) and the corresponding reaction on screen. This gives you a cheap and easy setup capable of measuring latency down to ~4 ms granularity, and a few repeated measurements give you a very accurate picture. Keep in mind that latency is a range (a statistical distribution), not a single number, so you need repeated measurements to understand the shape of the distribution.
If you’re developing a game, you can add a prominent frame counter on screen to be captured on the video, and add the frame counter to your log output. Then you can match up the video with your game’s events, after accounting for display latency.
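To turn those frame counts into numbers, here's a minimal sketch of the arithmetic; the frame pairs and the 240 fps rate are made-up example values you'd replace with what you read off your own frame-by-frame playback.

```python
# Sketch: convert frame counts from a slow-mo video into latency stats.
# Assumes you've noted, for each trial, the frame where the button
# visibly actuates and the frame where the screen first reacts.
# Frame numbers and the 240 fps rate are hypothetical example values.

from statistics import mean, quantiles

FPS = 240  # recording rate of the slow-mo camera

# (button_frame, reaction_frame) pairs read off frame-by-frame playback
trials = [(12, 25), (140, 151), (301, 315), (480, 492), (660, 674)]

latencies_ms = [(react - press) * 1000 / FPS for press, react in trials]

print(f"samples: {sorted(latencies_ms)}")
print(f"mean:    {mean(latencies_ms):.1f} ms")
lo, med, hi = quantiles(latencies_ms, n=4)
print(f"median:  {med:.1f} ms (IQR {lo:.1f}-{hi:.1f} ms)")
```

Reporting quartiles rather than a single mean reflects the point above: you care about the shape of the distribution, not just its center.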
I don't know how scientifically valid this is (I hope very), but when a friend told me my USB hub/switcher would be introducing a lot of input lag, I bought a USB-to-Ethernet adaptor and did a few thousand pings to the router, first connected directly to the motherboard and then via the switcher. Unsurprisingly, there was no measurable difference in latency (I had to use a third-party tool because Windows ping won't report sub-millisecond times by default).
I am aware that admitting to using Windows in these hallowed halls is a terrible sin, but the anecdote was too relevant to pass up and that's an important detail for anybody looking to repro.
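For anyone repeating this, here's a sketch of the comparison step. The sample lists are synthetic stand-ins; in practice you'd collect them with a ping tool that reports microseconds, since plain Windows ping rounds to whole milliseconds.

```python
# Sketch: compare two sets of round-trip samples (direct vs. via the
# hub) at sub-millisecond resolution. Sample values are hypothetical.

from statistics import mean, median, stdev

def summarize(label, samples_us):
    print(f"{label}: median {median(samples_us):.0f} us, "
          f"mean {mean(samples_us):.0f} us, stdev {stdev(samples_us):.0f} us")

direct_us = [412, 398, 405, 420, 401, 415, 399, 408]   # hypothetical
via_hub_us = [418, 402, 411, 425, 404, 419, 400, 412]  # hypothetical

summarize("direct ", direct_us)
summarize("via hub", via_hub_us)
print(f"median difference: {median(via_hub_us) - median(direct_us):.0f} us")
```

With a few thousand real samples per path, comparing the medians (and spreads) tells you whether the hub adds anything detectable.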
So, the complaints about USB from PS/2 users centred on the switch from interrupt-driven input to polling at a relatively low rate on vastly slower CPUs. Nowadays CPUs are much faster and more parallel, and polling rates are in the 1-8 kHz range as opposed to 100-400 Hz.
I don't see any pro gamers carrying around PS/2 devices; they've even moved to wireless, so the differences are likely meaningless these days.
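The worst-case delay polling can add is one polling interval, i.e. 1/rate; a quick back-of-the-envelope comparison across old and modern rates:

```python
# Worst-case added latency from USB polling is one polling interval.
# 125 Hz was the classic USB HID default; 1-8 kHz is common today.

for rate_hz in (125, 500, 1000, 8000):
    interval_ms = 1000 / rate_hz
    print(f"{rate_hz:>5} Hz polling -> up to {interval_ms:.3f} ms added latency")
```

At 8 kHz the worst case is 0.125 ms, well below anything a human can perceive, which supports the point that the PS/2-vs-USB difference no longer matters.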
> It was impossible to work out who, or where, Lucy was.
Lucy is a pseudonym. They were trying to get Facebook to tell them who the girl was through facial recognition. There’s no reason to expect a priori that the offender would be in any registry.
A new-ish feature of modern browsers is the ability to link directly to a chunk of text within a document; that text can even be optionally highlighted on page load to make it obvious. You could configure the LLM to output those text anchor links directly, making it possible to verify the quotes (and their context!) just by clicking on the links provided.
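Concretely, the feature is the URL "text fragment" directive (`#:~:text=...`). Here's a minimal sketch of generating such links; the page URL and quote are made-up examples.

```python
# Sketch: build a "text fragment" deep link (#:~:text=...) that modern
# browsers scroll to and highlight on load. URL and quote are made up.

from urllib.parse import quote

def text_fragment_link(page_url, quote_text):
    # Percent-encode the quote; '-' and ',' are special in the fragment
    # directive syntax, so encode them too.
    encoded = quote(quote_text, safe="").replace("-", "%2D")
    return f"{page_url}#:~:text={encoded}"

url = text_fragment_link("https://example.com/article",
                         "the exact sentence being cited")
print(url)
```

An LLM emitting links in this shape would let readers jump straight to the quoted sentence in its original context.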
Somewhere on an HN thread I saw someone claiming that they "solved" security problems in their vibe-coded app by adding a "security expert" agent to their workflow.
All I could think was, "good luck" and I certainly hope their app never processes anything important...
Found a problem? Slap another agent on top to fix it. It’s hilarious to see how far the pendulum has swung away from “thinking from first principles”, even as a buzzword. Just engineer, dammit…