Hacker News | __mharrison__'s comments

This is cool. It's not for everyone and probably very heavy.

But I love the hacker feel of it.


I would personally pay money not to have this thing.

It's wonderful and I love that someone else loves it. The care put into it is fantastic. Vive la différence.

(https://en.wiktionary.org/wiki/vive_la_diff%C3%A9rence for those who may not recognize that phrase.)


How much money are you willing to pay? If sufficient I won't commission one and have it sent to your house. @AnthonyDavidAdams on venmo!

Cool, I've been doing a lot of "coding" (and other typing tasks) recently by tapping a button on my Stream Deck. It starts recording me until I tap it again. At which point, it transcribes the recording and plops it into the paste buffer.

The button next to it pastes when I press it. If I press it again, it hits the enter command.

You can get a lot done with two buttons.
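A rough sketch of that two-button flow as a tiny state machine, for anyone who wants to wire up something similar. Everything here is hypothetical: the `transcribe` callable stands in for whatever speech-to-text you use, a plain string stands in for recorded audio, and nothing talks to a real Stream Deck or clipboard.

```python
from typing import Callable, List


class TwoButtonDictation:
    """Toy model of the flow: button 1 records/transcribes, button 2 pastes/enters."""

    def __init__(self, transcribe: Callable[[str], str]) -> None:
        self.transcribe = transcribe  # hypothetical speech-to-text callable
        self.recording = False
        self.clipboard = ""
        self.pasted = False
        self.events: List[str] = []   # what would be sent to the focused app

    def press_record(self, audio: str = "") -> None:
        if not self.recording:
            # First tap: start recording (audio is ignored here).
            self.recording = True
        else:
            # Second tap: stop, transcribe, and put the text in the paste buffer.
            self.recording = False
            self.clipboard = self.transcribe(audio)
            self.pasted = False

    def press_paste(self) -> None:
        if not self.pasted:
            # First tap: paste the buffer.
            self.events.append("PASTE:" + self.clipboard)
            self.pasted = True
        else:
            # Second tap: send Enter.
            self.events.append("ENTER")
            self.pasted = False


deck = TwoButtonDictation(transcribe=lambda audio: audio.upper())
deck.press_record()
deck.press_record("fix the bug")
deck.press_paste()
deck.press_paste()
print(deck.events)
```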


This is exactly what I am building right now, Stream Deck with two buttons too (push to talk and enter)! It's a sweet little pet project, and has been a blast to build so far. Excited to finally add it to my workflow once it's working well.

Bought my ergodoxen from them when they were just starting out.

They even sent me a gift box because my blog post about the keyboard had driven so much traffic. It had a CST mouse in it (among other things).

Still using the mouse.

Nowadays you can buy awesome small batch keyboards from small vendors.


Do you have a recommended small vendor? I am on the lookout for some new switches!

Just bought a lily 58 from these guys. Highly recommended https://typeractive.xyz/pages/build#lily58_choc

For the past month, I've been claiming that $20/mo codex is the best deal in AI.

Now I'm going to have to find the new best deal.


Check out z.ai coder plan. The $27/mo plan is roughly the same usage as the 20x $200 Claude plan. I have both and Claude is a little better, but GLM 5.1 is much better value.

Agreed, I use Z.ai and the usage is fantastic. I'd only temper that recommendation by noting that it's often unreliable. Perhaps a few times per week it's unresponsive. Maybe more often it seems to become flaky.

It's very variable, though. Recently I'm noticing it's more reliable, but there was a patch where it was nearly unusable some days.

I guess I won't complain for the price and YMMV.


Agreed. They had a rough patch around the 4.7 to 5 upgrade. New architecture required hardware migration. The 5 to 5.1 upgrade was much smoother (same architecture, new weights). As you say, a little rough around the edges, but still great value. A trick I learned is that it's max 2 parallel requests per user. You can put a billion tokens a month through it, but you need to manage your parallelism.
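One way to manage that parallelism on the client side is a semaphore around each request. This is only a sketch: `call_model` is a hypothetical async function standing in for whatever client you actually use, and `fake_model` exists purely so the example runs without a network.

```python
import asyncio


async def run_limited(prompts, call_model, max_parallel=2):
    """Run call_model over prompts with at most max_parallel requests in flight."""
    gate = asyncio.Semaphore(max_parallel)

    async def one(prompt):
        async with gate:  # waits here while max_parallel calls are already running
            return await call_model(prompt)

    # gather returns results in the same order as the prompts
    return await asyncio.gather(*(one(p) for p in prompts))


async def fake_model(prompt):
    # Stand-in for a real API call (assumption, not a real client).
    await asyncio.sleep(0.01)
    return prompt.upper()


results = asyncio.run(run_limited(["a", "b", "c"], fake_model))
print(results)
```

The rest of your code can queue as many prompts as it likes; only the semaphore decides how many hit the provider at once.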

If you're ok with a model provider that goes down all the time and has such a poor inference engine setup that once you get past 50k tokens you're going to get stuck in endless reasoning loops.

GH Copilot is still the best deal, while it lasts

I feel they will go token-based at some point. Currently, if you only use it with precise prompts and not random suggestions, and switch between models 5.4 and 5.4 mini depending on the work, it is the best deal.

Yeah, it's really good. Probably going to be the next best deal until they cut back.

I need to try the command line version.


> I need to try the command line version.

Is there any other?


Already paying for Google photo storage, AI pro for an extra $7 is a steal with anti-gravity.

I bought one of the google AI packages that came with a pile of drive storage and Gemini access.

Unfortunately gemini as a coding agent is a steaming useless pile. They have no right selling it, cheap open weight Chinese models are better at this point.

It's not stupid; it's just incompetent at tool use and makes bad mistakes. It constantly gets itself into weird dysfunctional loops when doing basic things like editing files.

I'm not sure what GOOG employees are using internally, but I hope they're not being saddled with Gemini 3.1. It's miles behind.


Are you using gemini CLI or antigravity? The former is not really comparable to the latter in terms of quality. I wouldn't say antigravity is as good as the competition but it's pretty close. Miles behind is overstating it.

Gemini CLI but also used the Gemini models via opencode. They're terrible at CLI tool use. Like I said, just editing text files, they fall over rapidly, constantly making mistakes and then mistakes fixing their mistakes.

Antigravity wants me to switch IDEs, and I'm not going to do that.


This lines up with my experience. Antigravity doesn't have this shortcoming though. I think the agent harness matters equally to the model. Gemini CLI and opencode aren't very great harnesses in my opinion.

I too dislike having my choice of ide forced on me. Hopefully that situation improves, but antigravity demonstrates that Gemini isn't necessarily behind by that much.


Gemini 3.1 is a good coding agent. We've been totally spoiled now. Also, if you use Antigravity you can burn up Opus 4.6 credits off your Goog account instead, before you have to switch to Gem 3.1.

Good luck sticking within limits, I have been burning up my baseline limits insanely fast within a few prompts, a marked change from a few weeks ago.

There's a few complaints online about the same happening to multiple users.

Otherwise anti-gravity has been great.


I use the free Chat AIs all the time; Claude, ChatGPT, Gemini, Grok, Mistral.

In the last month they have all clamped down quite heavily. I used to be able to deep-dive into a subject, or fix a small Python project, multiple times per day on the free Web UIs.

Claude, this morning, modified a small Python project for me and that single act exhausted all my free usage for the day. In the past I could do multiple projects per day without issue.

Same with ChatGPT. Gemini at least doesn't go full on "You can use this again at 1100AM", but it does fall back to a model that works very poorly.

Grok and Mistral I don't really use that much, but Grok's coding isn't that bad. The problem is that it is not such a good application for deep-diving a topic, because it will perform a web search before answering anything, which makes it slow.

Mistral tends to run out of steam very quickly in a conversation. Never tried code on it though.


I use a quota monitor and grind out code on Gemini 3 flash. Only go to sonnet or pro if there are issues flash can't deal with or I have a critical architecture I need nailed on the first try.

I still review every line generated.

Gemini 3.1 pro on the web interface still works if my problems are scoped to a single module or two and my better model quotas are exhausted in the IDE.

For $7 over what I was already paying for storage, primarily using flash is still a good development experience for me.


That's only good for the web based UI. If you want Gemini API access, which is what this article is about, then you must go the AIStudio route, and pricing is API usage based. It does have a free usage tier and new signups can get $300 in free credits for the paid tier, so I think it's still a good deal, just not as good as using the subscriptions would be.

No? Isn't the article about Codex, which is roughly equivalent to "Gemini CLI" and Google's Antigravity? Google's subscriptions include quotas for both of those, albeit the $20 monthly "Pro" plan has had its "Pro" model quota slashed in the last few weeks. You still get a large number of "Gemini 3 Flash" queries, which has been good enough for the projects I've toyed with in Antigravity.

I guess that's true but I find Google's models better than their public tooling. The Pro subscription includes "Gemini Code Assist and Gemini CLI" but the Gemini Code Assist plugin for IntelliJ which is my daily driver is broken most of the time to the degree that it's completely unusable. Sometimes you can't even type in the input box.

The only way I can do serious development with Gemini models is with other tooling (Cline, etc) that requires API based access which isn't available as part of the subscription.


I agree. Gemini models are held back by their segmentation of usage between multiple products, combined with their awful harnesses and tooling. Gemini cli, antigravity, Gemini code assist, Jules.... The list goes on. Each of these products has only a small limit and they must share usage.

It gets worse than that though. Most harnesses that are made to handle codex and Claude cannot handle Gemini 3.1 correctly. Google has trained Gemini 3.1 to return different json keys than most harnesses expect resulting in awful results and failure. (Based on me perusing multiple harness GitHub issues after Gemini 3.1 came out)


Google is by far the best deal for AI, they give you so many 'buckets' of usage for a variety of products, and they seem to keep adding them.

If you aggressively use all buckets Google is incredibly generous. In theory for one AI pro subscription you can get what is a ridiculous return on investment in a family plan.

You could probably be charging google literally thousands if all 6 members were spamming video and image generation and antigravity.


The family sharing is the real hack lol. I don't think any other provider does that.

What has actually changed? It's unclear how much you can do right now, unless they've already switched you to the new plan and you're speaking from experience.

We are exiting a hype cycle, well into the adoption curve. Subscriptions were never going to last.

My next step is going to be evaluating open and local models to see if they are sufficiently close to par with frontier models.

My hope is that the end of seat based pricing comes with this tech cycle. I was looking for a document signing provider that doesn't charge a monthly fee; I only need a few docs a year.


I'm developing software in this area right now, so I try a lot of the new models. They're not even close for coding tasks. It basically comes down to 26b parameters vs 1T parameters / quantisation / smaller context sizes; there's no comparison. However, for agentic work, tool calling, text summarisation, local LLMs can be quite capable. Workloads that run as background tasks where you're not concerned about TTFB, cold starts, tok/s etc., this is where local AI is useful.

If you have an M processor then I would recommend that you ditch Ollama because it performs slowly. We get double or triple tok/s using omlx or vmlx, respectively, but vmlx doesn't have extensive support for some models like gpt-oss.


Kimi K2.5 (as an example) is an open model with 1T params. I don't see a reason it has to be local for most use cases- the fact that it's open is what's important.

That is just idealism. Being "open" doesn't get you any advantage in the real world. You're not going to meaningfully compete in the new economy using "lesser" models. The economy does not care about principles or ethics. No one is going to build a long term business that provides actual value on open models. They can try. They can hype. And they can swindle and grift and scalp some profit before they become irrelevant. But it will not last.

Why? Because what was built with an open model can be sneezed into existence by a frontier model run via first party API with the best practice configurations the providers publish in usage guides that no one seems to know exist.

The difference between the best frontier model (gpt-5.4-xhigh or opus 4.6) and the best open model is vast.

But that is only obvious when your use case is actually pushing the frontier.

If you're building a crud app, or the modern equivalent of a TODO app, even a lemon can produce that nowadays so you will assume open has caught up to closed because your use case never required frontier intelligence.


A model with open weights gives you a huge advantage in the real world.

You can run it on your own hardware, with perfectly predictable costs and predictable quality, without having to worry about how many tokens you use, or whether your subscription limits will be reached at the most inconvenient moment, forcing you to wait until they are reset, or whether the token price will be increased, or your subscription limits will be decreased, or whether your AI provider will swap the model for a worse one, and so on.

Moreover, no matter how good a "frontier model" may be, it can still produce worse results than a worse model when the programmer who manages it does not also have "frontier intelligence". When liberated from the constraints of a paid API, you may be able to use an AI coding assistant in much more efficient ways, exactly like when time-sharing access to powerful mainframes was replaced with the unconstrained use of personal computers.

When I was very young, I went through the transition from remotely using a mainframe to using my own computer. I certainly do not want to return to that straitjacket style of work.


The vision has been that the open and/or small models, while 8-16 months behind, would eventually reach sufficient capabilities. In this vision, not only do we have freedom of compute, we also get less electricity usage. I suspect long-term the frontier mega models will mainly be used for distillation, like we see from Gemini 3 to Gemma 4.

first session with gemma4:31b looks pretty good, like it may actually be up to coding tasks like gemini-3-flash levels

you can tell gemma4 comes from gemini-3


I recently experimented creating a Python library from scratch with Codex. After I was done, I took the PRD and Task list that was generated and fed them to opencode with Qwen 3.5 running locally.

Opencode was able to create the library as well. It just took about 2x longer.


Which version of Qwen 3.5 did you use?

which quant as well

Not at my computer now, either 27 or 35b not quantized.

Next week I will be trying qwopus 27b.


I'd rather have the metadata from click and typing events and use that to create a davinci project...

I've written over a dozen books.

Have used asciidoc, HTML, word, latex, and rst.

Markdown is the least painful of all. It's not perfect but the others are worse.

(My custom stack uses markdown (or Jupyter notebooks converted to markdown). Pandoc plus some custom filters creates typst (for PDF) or epub.)


I did a book in rst and liked that it had cool admonition, import, glossary, and index features that made it better than markdown for me. Still hate the heading conventions.

I have a custom pandoc filter for callouts and index entries. None of the simple lightweight markup languages has complete support for writing a real book. Writing custom rst code is a pain (and no one else in the world uses it). (I say this as a 25-year Python veteran and as a docutils committer!)
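For readers who haven't written one, here is roughly what a pandoc JSON filter for callouts can look like. It is an illustrative sketch, not the filter described above: the `NOTE:` marker and the `callout` class are made-up conventions, and the demo AST is hand-written and much smaller than what pandoc actually emits. A real filter would read the AST with `json.load(sys.stdin)` and write the result to stdout.

```python
import json


def wrap_callouts(doc, marker="NOTE:", css_class="callout"):
    """Wrap top-level paragraphs that start with `marker` in a Div with a class."""
    blocks = []
    for block in doc["blocks"]:
        inlines = block.get("c", []) if block.get("t") == "Para" else []
        first = inlines[0] if inlines else {}
        if first.get("t") == "Str" and first.get("c") == marker:
            attr = ["", [css_class], []]  # [identifier, classes, key-value pairs]
            block = {"t": "Div", "c": [attr, [block]]}
        blocks.append(block)
    return {**doc, "blocks": blocks}


# Tiny hand-written AST standing in for what pandoc would send on stdin.
demo = {
    "pandoc-api-version": [1, 23],  # illustrative value
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [{"t": "Str", "c": "NOTE:"}, {"t": "Space"},
                            {"t": "Str", "c": "careful"}]},
        {"t": "Para", "c": [{"t": "Str", "c": "Plain"}]},
    ],
}
converted = wrap_callouts(demo)
print(json.dumps(converted["blocks"][0]["c"][0]))
```

The output writer (typst, epub, and so on) then decides how a Div with that class gets rendered.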

My sweet spot is something that runs on less than 128gb.

(I have a DGX Spark, and MBP w/ 128gb).


I just spent yesterday applying Karpathy's autoresearch on an ML problem.

I teach ML for a living and was amazed with what the tokens gave back to me after many rounds of experiments. If Kaggle was still a thing, AI would generally beat it.

The challenge I've seen is that most data science/ml modeling work is quite weak. Folks don't even know the basic tools well. Not sure if giving AI to them will really open up many doors to them.

As always experts love minions of juniors doing their deeds. Non-experts get to wade through slop.


I agree AI could probably do a decent job on Kaggle problems. Of course, almost no DS job is building models with well-defined objectives and perfect data. The DS and MLE folks I work with mostly spend their time reframing ill-posed product requests into ML systems that can be maintained and improved with feedback loops.

A _huge_ part of a DS is saying "No" to bad ideas posed by non-experts. The issue with LLMs is all they ever say is "Yes" and "Wow, that's such a great idea!"


Yeah, once you move onto legitimate business evaluation metrics (where Precision@k or Recall@k don't actually fit your business model without modification), GPTs just seem to suffer without context, and hey, knowing the context is part of what gives a data scientist his value.

Data scientists are like in-house lawyers in that respect.
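For reference, the textbook versions of those two metrics look like the sketch below. The item IDs are made up, and the point of the comment above stands: a real business metric usually needs these modified (weights, costs, different notions of "relevant") rather than used as-is.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(item in relevant for item in ranked[:k])
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = sum(item in relevant for item in ranked[:k])
    return hits / len(relevant)


ranked = ["a", "b", "c", "d", "e"]  # model's ranking, best first (made-up IDs)
relevant = {"a", "c", "f"}          # ground-truth relevant items
print(precision_at_k(ranked, relevant, 4))
print(recall_at_k(ranked, relevant, 4))
```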

Karpathy's autoresearch is just automated overfitting, no?

Is Kaggle no longer a thing?

My understanding was that rubber ducking was using a different portion of your brain by speaking the words.

The same discovery often happens when you explain a problem to a coworker and midway through the explanation you say "nvm, I know what I did wrong"


Love it! Let those pip users find the compromised packages for us uv users.

Until everyone waits 7 days to install everything so the compromise is discovered on the 8th day.

End result will be everyone runs COBOL only.


Or just scan all GitHub repos, find their .toml definition. Calculate the median and then add 7 days to that. That way you are always behind.

Or Forth with scientific library, bound to the constraints. Put some HTTP library on top and some easy HTML interface from a browser with no JS/CSS3 support at all. It will look rusty but unexploitable.

Enterprise computing with custom software will make a comeback to avoid these pitfalls. I despise OpenJDK/Mono because of patents, but at least they come with complete defaults, and a 'normal' install is more than enough to ship a workable application for almost every OS. Ah, well, smartphones. Serious work is never done with these tools, even with high end tablets. Maybe commercials/salespeople and that's it.

It's either that... or promoting reproducible environments with Guix everywhere. Your own Guix container, isolated, importing Pip/CPAN/CTAN/NPM/OPAM and who knows what else into a manifest file and ready to ship anywhere, either as a Guix package, a Docker container (Guix can do that), a single DEB/RPM, an AppImage ready to launch on any modern GNU/Linux with a desktop and a lot more.


> Or Forth with scientific library, bound to the constraints. Put some HTTP library on top and some easy HTML interface from a browser with no JS/CSS3 support at all. It will look rusty but unexploitable.

Let this be a lesson to you youngsters that nothing is unexploitable.

Forth has no standard library for interfacing with SQLite or any other database. You're either using 8th or the C ABI. Therefore, you'll most likely be concatenating SQL queries. Are you disciplined enough to make that properly secure? Do you know all the intricacies?
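To make the concatenation risk concrete, here is the classic failure shown in Python with the standard `sqlite3` module (Python rather than Forth only because it is easy to run; the table and input are made up). Without bound parameters, you are stuck doing the quoting yourself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "a1"), ("bob", "b2")])

user_input = "nobody' OR '1'='1"  # hostile input

# Unsafe: the input is spliced into the SQL text and changes the query's logic.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: a placeholder keeps the input as data, never as SQL.
safe = conn.execute("SELECT name FROM users WHERE name = ?",
                    (user_input,)).fetchall()

print(len(leaked), len(safe))
```

The concatenated query matches every row, while the parameterized one matches none.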


Gforth might have them for sure (SQLite is small and supported by even jimtcl). Also, there's Factor, a Forth-inspired language.

I'm already ahead of you. I'm using `exclude-newer = "8 days"`

But not all projects exploited in a supply chain attack get exploited on the same day.

So when project A gets pwned on day 1 and then, following the attack, project B gets pwned on day 3, if users wait 7 days to upgrade, then that leaves two days for the maintainers of project B to fix the mess: everybody will have noticed on the 8th day that package A was exploited, and that leaves time for project B (and the other projects depending on either A or B) to adapt / fix the mess.

As a sidenote, during the first 7 days it could also happen that the maintainers of project A notice the shenanigans.
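The cooldown idea in this subthread boils down to "ignore anything uploaded in the last N days". A minimal sketch of that selection rule follows. It is not uv's resolver: the version numbers and dates are invented, and it picks by upload time rather than by proper version ordering.

```python
from datetime import datetime, timedelta, timezone


def newest_allowed(releases, now, cooldown_days=7):
    """Return the most recently uploaded version that is older than the cooldown.

    `releases` maps version string -> upload time. Returns None if every
    release is still inside the cooldown window.
    """
    cutoff = now - timedelta(days=cooldown_days)
    eligible = {v: t for v, t in releases.items() if t <= cutoff}
    return max(eligible, key=eligible.get) if eligible else None


now = datetime(2025, 1, 20, tzinfo=timezone.utc)
releases = {  # hypothetical upload history
    "1.0": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "1.1": datetime(2025, 1, 10, tzinfo=timezone.utc),
    "1.2": datetime(2025, 1, 18, tzinfo=timezone.utc),  # 2 days old, maybe compromised
}
print(newest_allowed(releases, now))
```

With a 7-day cooldown the 2-day-old release is skipped, which is exactly the window in which others get to discover a compromise first.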


:-) That might not even be enough as I hear (but haven't verified) that Claude does a pretty good job of making sense out of legacy COBOL code!
