Hacker News | seertaak's comments

The irony of an IP scraper on an absolutely breathtaking, epic scale getting its secret sauce "scraped" - because the whole app is vibe coded (and the vibe coders appear to be oblivious to things like code obfuscation cuz move fast!)...

And so now the copycats can ofc claim this is totally not a copy at all, it's actually Opus. No license violation, no siree!

It's fucking hilarious is what it is, it's just too much.


The code is obfuscated, but they accidentally shipped the map file, i.e. the key to de-obfuscating it.
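
For anyone curious how a shipped .map file undoes obfuscation: a version-3 source map can embed the original sources verbatim in its `sourcesContent` field. A minimal sketch — the map below is a made-up stand-in, not the actual app's:

```python
import json

# Toy version-3 source map standing in for an accidentally shipped .map file.
# The field names (version, sources, sourcesContent, mappings) come from the
# source map spec; the contents here are invented for illustration.
raw_map = """{
  "version": 3,
  "sources": ["src/app.ts"],
  "sourcesContent": ["export const secretSauce = () => 42;\\n"],
  "mappings": "AAAA"
}"""

source_map = json.loads(raw_map)

# sourcesContent carries the pre-obfuscation source verbatim, which is why
# shipping the map next to the minified bundle hands over the "secret sauce".
for path, content in zip(source_map["sources"], source_map["sourcesContent"]):
    print(f"--- {path} ---")
    print(content)
```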


Interesting article; I've always been fascinated and intimidated by FPGA programming - it's one of the few remaining "dark arts" of software engineering.

> VHDL’s delta cycle algorithm is its crown jewel. It gives you built-in determinism. Let us treasure it - Verilog doesn’t have anything like it. At the same time, you will agree with me that there is nothing too complicated about the concept.

I agree with you -- and thank you!


> The deal was always simple: search engines had permission to crawl sites because they were going to be sending users to those sites. If they're hitting your site half a million times for every one user they send to your site, all they're giving you is higher costs.

Agree 100%.


Can you share what aspects of the design you (and Stroustrup) aren't happy with? Stroustrup has a tendency of being proven right, with a 1-3 decade lag.

Certainly we can say that Bjarne will insist he was right decades later. We can't necessarily guess - at the time - what it is he will have "always" believed decades later though.

You made me laugh!...Bjarne indeed can't be accused of being a modest man. And by some accounts, he's quite a political animal.

But in fairness, when was D&E first published? He argued for auto there, long before its acceptance. He argued for implicit template instantiation, too - thank god the "everything-must-be-explicit" curmudgeons were vanquished there as well.

He's got a pretty good batting average - certainly better than Herb Sutter.


Well, that's not always true. Initializer lists are a glaring example. So are integer promotions, among other things.

Integer promotion? Stroustrup pleads C source compat - without it, C++ would have been stillborn.

Initializer lists suck mainly because of C source compat constraints, too. In fact, most things that suck in C++ came from B via C.


I'll have a stab at this. I'll start with an attempt at justifying the remark that an agent which is a good coder will be good at other tasks.

1. Coding is, as a technical endeavour, relatively difficult (similarly for mathematics). So a model which performs well on this task can be expected to easily handle also-technical-but-slightly-easier tasks, like understanding (musical) harmony theory or counterpoint -- for much the same reason that human programmers/mathematicians/scientists don't struggle to understand those "easier" theories.

2. Reinforcement learning augments a base model's ability to excel at something else that's "difficult", namely to "look ahead" and plan multiple steps in advance. That's literally how the training algorithm works: generating multiple paths at once, and rewarding intermediate steps in those paths which succeed in attaining the goal. And that skill, too, is extremely useful in other domains. An AI agent which learns to break a problem into sub-problems and then tackle each in turn methodically -- it stands to reason that it can apply that to, say, a business plan.

Note: 1 & 2 are not independent, nor are frontier models' excellence in these domains magical: it ultimately boils down to the availability of massive datasets (in particular for coding) and totally objective metrics (in the case of mathematics: solved math problems). That's the key ingredient for reinforcement learning to be so effective.

So: the skills are transferable because they're difficult, and require lots of planning. That models are so good at them is a fluke, and in a parallel world where humans created git repo after git repo of business plans, it might be that which we lean on to teach a reinforcement learning algorithm how to "reason" and "plan".

Now let's turn our attention to the "synergies" aspect, which I agree with. Let's say your agentic model, which is already excellent at reasoning and planning, acquires a new or improved capability which allows it to search the domain space, calculate, etc. much better than before -- this capability can now bear upon the plan, or be factored into the plan. For example, the model might be able to say "I don't need to worry about this particular subproblem for now; I can rely on my 'mathematica' capability to deal with it when I absolutely need to."

Or to put it differently: monkeys, like humans, are able to use (rudimentary) tools. They'll take a rock, and use it to crack open a coconut (or whatever). But a human being, with far superior reasoning and planning abilities, takes that tool, and uses it to make an even better tool -- and the result after many iterations of this process is civilization as we know it, while monkeys are still stuck trying to crack open nuts with rocks.


In my experience, an indicator that my interlocutor is (possibly) lying/BSing, when challenged for an explanation for something they did, is that they provide a list of reasons. The person who's telling the truth just gives one.

Is that really true? I would have expected that AI companies nowadays are doing RL on git histories, not just on the HEAD.

I also expected this. Please run some experiments -- maybe other models are different.

Claude definitely does

This -- and obviously David Ng's article -- are absolutely fascinating pieces of work.

I have a few (very naive) questions:

There is a widespread intuition, encapsulated in the very terms "feed-forward networks" and "deep neural networks", that computation in such networks is akin to a circuit wired in series. My "observation" is that residual layers offer an "escape hatch" from this, allowing layers (or sets of layers) to operate in parallel (and of course, something in between).

So here are my dumb questions:

1. Is my intuition about residual networks, at least in principle, allowing for in parallel layers, correct? Or am I missing something fundamental? Let's say the intuition is correct -- is it possible to measure the degree to which a layer operates in series or in parallel?

2. The formula for residual layers (at least to my mind) reminds me of an Ornstein-Uhlenbeck time series process. If so, can we measure the degree of mean-reversion of a/several layer(s)? For me, this makes intuitive sense -- the goal of avoiding vanishing gradients feels similar to the goal of stationarity in time series processes.

3. Let's take as an article of faith the central idea of a tripartite network: input->latent-space block => reasoning block => latent-space->output block. Ng's intuition, iiuc, is that the reasoning block is, more or less, wired in series. Intuitively, it feels like that is what it ought to be (i.e., a chain of calculations), though I'll add -- again hand-wavingly -- that OP's efforts appear to cast doubt on this conjecture. Are the two "translation" blocks wired "more" in parallel, then?

4. So what both Ng and OP did was to "tape together" the ostensibly reasoning layers -- in different ways, but that's essentially it. Another thing you could do is to treat the input and output translation blocks as fixed. You now train a totally new model on a much smaller corpus of training data, only instead of feeding the input directly to your new model you feed it translated training data (similarly, your targets are now the activations at the entrance to the reasoning->output block). Let's assume it's exactly the same architecture in the middle as the standard network, only it's initialized to random weights as per usual. Surely you should be able to pre-train that 6-layer reasoning network much, much faster. Has anyone tried this?

5. Having thus partitioned a very deep architecture into three distinct parts, there's no reason why you can't experiment with making the reasoning block wider or narrower. Has anyone tried that?

6. Another fun idea is to map a given input through the input block and read the pre-reasoning activations. You now let that vector be a random variable and do a random walk through reasoning input space, and use this to "augment" your corpus of training data. Reasonable idea or bullshit?
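
On question 1, a toy way to see the "parallel" reading, plus a made-up score for it. Nothing here is a standard metric -- the single tanh layer and the norm ratio are my own assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

def branch(x, W):
    # the residual branch f(x); a single tanh layer, chosen for illustration
    return np.tanh(W @ x)

def residual_layer(x, W):
    # y = x + f(x): the identity path carries x through untouched, so the
    # branch reads as an additive contribution "in parallel" with it
    return x + branch(x, W)

def branch_ratio(x, W):
    # made-up "parallelism" score: how much the branch moves x relative to
    # the identity path (near 0 => the layer is mostly a pass-through)
    return np.linalg.norm(branch(x, W)) / np.linalg.norm(x)

x = rng.normal(size=d)
W_timid = 0.01 * rng.normal(size=(d, d))  # branch barely perturbs x
W_bold = 2.0 * rng.normal(size=(d, d))    # branch dominates the update

print(branch_ratio(x, W_timid))  # small: layer acts almost purely "in parallel"
print(branch_ratio(x, W_bold))   # larger: layer behaves more "in series"
```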
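
On question 2: a residual update x_{l+1} = x_l + f(x_l) does have the shape of an Euler step of an Ornstein-Uhlenbeck process, x_{t+1} = x_t + theta*(mu - x_t) + noise. One crude estimator of a layer's "mean reversion" would be to regress the branch output f(x) on (mu - x) and read off the slope. Below I fake the (x, f(x)) pairs with a synthetic, exactly-OU-like layer plus noise; for a real network you'd collect them from a trained layer's activations:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 2000
theta_true, mu = 0.3, 0.0

# Synthetic branch outputs from a perfectly OU-like layer, f(x) = theta*(mu - x),
# plus a little noise so the fit is not trivially exact.
X = rng.normal(size=(n, d))
F = theta_true * (mu - X) + 0.05 * rng.normal(size=(n, d))

# Least-squares slope of F against (mu - X): theta_hat ~= theta_true.
drift = mu - X
theta_hat = np.sum(F * drift) / np.sum(drift ** 2)
print(theta_hat)  # close to 0.3
```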
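
On question 4, a numpy cartoon of the training scheme. Everything here is a stand-in: a random linear map plays the frozen input-translation block, and the "reasoning block" being trained is a single linear layer rather than six transformer layers:

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_lat, n = 8, 4, 512

# Stand-in for the frozen input-translation block (in reality, pretrained
# weights you never update) and for the mapping the middle should learn.
E = rng.normal(size=(d_lat, d_in))
M_true = rng.normal(size=(d_lat, d_lat))

X = rng.normal(size=(n, d_in))
H = X @ E.T        # "translated" training inputs, fed to the new middle block
T = H @ M_true.T   # targets: activations at the entrance of the output block

# Train only the middle block on (H, T); E is never touched by gradients.
M = np.zeros((d_lat, d_lat))
lr = 0.03
for _ in range(5000):
    G = (H @ M.T - T).T @ H / n   # gradient of 0.5 * mean squared error
    M -= lr * G

rel_err = np.mean((H @ M.T - T) ** 2) / np.mean(T ** 2)
print(rel_err)  # should be tiny
```

The point of the cartoon: the middle block trains against a small, already-translated dataset, never touching the frozen translation weights.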

Please remember, I'm only just (and belatedly) trying to wrap my head around how transformer architectures work -- I'm still waiting for my copy of "Build a Large Language Model (from scratch)"! I hope these questions aren't totally daft!


A UI design tool for TUIs -- made with Electron?... fun times!


I give it a month before someone launches a TUI-TUI.


You can run it as a web app, no need for Electron.

Just `bun run dev`


That is concerning.

