Hacker News | past | comments | ask | show | jobs | submit | cmovq's comments

The Coca-Cola Company still makes advertisements, even though everyone already knows about Coke. You have to keep your name at the top of your target audience's mind.


Vulkan is a pain for different reasons. Easier to install, sure, but you need a few hundred lines of code to set up shader compilation and resources, and you'll need extensions to deal with GPU addresses the way you can with CUDA.


Ah yes, but those hundred lines of code are basically free to produce now with LLMs...


What about the extensions? Are they widely supported?


That is always one check away: https://vulkan.gpuinfo.org/listextensions.php


VK_KHR_buffer_device_address has 91.3% support

and

VK_KHR_variable_pointers has 98.66% support

looks good to me


Except it’s actually called “cooked mode” [1] and predates the use of the slang.

[1]: https://en.wikipedia.org/wiki/Terminal_mode

https://www.linusakesson.net/programming/tty/


"cooked mode" refers to physical teletypes, though. In the POSIX spec[1] it's called "canonical mode", same for the other specifications (if they're mentioned at all, I don't think the ANSI specification mentions either term).

https://en.wikipedia.org/wiki/POSIX_terminal_interface#Canon...


Thank you for this correction. I'll update the readme!


Software wants to be installed in C:\Program Files so that other software can't modify its installation without admin permissions. Of course, to do that your installer needs to be run as administrator, which makes the whole thing rather silly.


The fundamental issue is that installers shouldn’t exist

There’s no need to have an executable program just to essentially unzip some files to disk


>There’s no need to have an executable program just to essentially unzip some files to disk

What if you need to install some registry keys? What about installing shared dependencies (redistributables)? What if you want granny to install your app and left to her own devices, it'll end up in some random folder in the downloads folder?


Software installed through the Windows Store seems immutable enough even though it lives in the user's AppData.

At least the system prevented me from seeing or modifying the files the last time I tried. I did not try very hard, admittedly, but by contrast modifying something in C:\Program Files is just one UAC confirmation away.


> The OS then provides a native API to return a user's age bracket (not full date-of-birth)

Call the API every day, when the age bracket changes you can infer the date-of-birth.
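As a toy sketch of that inference (the `bracketApi` function here is hypothetical, standing in for whatever the OS would expose): poll once per day, and the day the returned bracket flips is the exact birthday.

```java
import java.time.LocalDate;
import java.util.function.Function;

public class BracketProbe {
    // Poll a hypothetical age-bracket API once per "day"; the day the
    // returned bracket changes reveals the user's exact birthday.
    static LocalDate findBirthday(Function<LocalDate, String> bracketApi,
                                  LocalDate start, int maxDays) {
        String previous = bracketApi.apply(start);
        for (int i = 1; i <= maxDays; i++) {
            LocalDate day = start.plusDays(i);
            if (!bracketApi.apply(day).equals(previous)) {
                return day; // bracket flipped: this is the birthday
            }
        }
        return null; // no bracket boundary crossed in the window
    }
}
```

Rate-limiting or fuzzing the transition date would be needed to actually prevent this.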


Even the function names are identical :/


The difference in size seems to be mainly due to the missing documentation before each function.


I mean, it kind of is, considering that's comparable to a 5070, which has 672 GB/s. Benefit of NVIDIA being the only one using GDDR7 for now, I guess.


7800 XT has 624 GB/s as well, and can be found for $400 used. 16 GB of course.


I've heard ROCm is still a crapshoot though. Is that true?


If you stick with your OS/package manager-distributed version, installation isn't painful anymore (provided that version approximately overlaps with your generation of GPU). It's okay for inference, and okay for training if you don't stray too far beyond plain torch. If you want to run code from a paper or other more esoteric stuff you're still going to have a bad time.

I don't have an Intel dGPU, but I suspect the situation there is even worse. I mean you go to the torch homepage: https://pytorch.org/get-started/locally/ and Intel isn't even mentioned. (It's here though: https://docs.pytorch.org/docs/stable/notes/get_start_xpu.htm...)


I've always assumed minifiers were a kind of lossless compression. I guess this optimization makes it lossy? Even if we can't tell the difference between oklch(0.659432 0.304219 234.75238) and oklch(.659 .304 234.752) they're still different colors.


The whole context of the article is answering your question: a "different color" is a specification-laden notion, and according to the spec the answer is no.


When you're using a programming language that naturally steers you to write slow code you can't only blame the programmer.

I was listening to someone say they write fast code in Java by avoiding allocations with a PoolAllocator that would "cache" small objects with poolAllocator.alloc(), poolAllocator.release(). So just manual memory management with extra steps. At that point why not use a better language for the task?
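For reference, the pattern being described is roughly this; a minimal free-list sketch, with class and method names just mirroring what was said, not any real library:

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Minimal free-list object pool of the kind described above: alloc() reuses
// a released instance when one is available, otherwise creates a new one.
public class PoolAllocator<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    public PoolAllocator(Supplier<T> factory) {
        this.factory = factory;
    }

    public T alloc() {
        T obj = free.poll();
        return (obj != null) ? obj : factory.get();
    }

    public void release(T obj) {
        free.push(obj); // caller is responsible for resetting object state
    }
}
```

Which is exactly the point: it's malloc/free with extra steps, complete with the same use-after-release bugs, just without the GC being able to help you.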


I decompiled Project Zomboid (written in Java) a while back, because I was curious about the performance issues I was having with the game. (Very laggy on my 10 year old laptop, while looking like The Sims 1.) I figured, best case scenario I find some easy bottlenecks and I can patch in a fix.

Well, the whole thing was standard Java OOP, except they also had a bunch of functional programming stuff on top of that. I can relate to that -- I think they were university students when they started, and I definitely had an OOP and FP phase. But then they just... kept it, 10+ years later.

So while it's true that you can write C in any language... those kinds of folks don't tend to use Java in the first place ;)

--

(Except Notch? Well, his code looks like C, not sure if it's actually fast! I really enjoyed his 4 kilobyte java games back in the day, I think he published the source for each one too.)

EDIT: Found it!

https://web.archive.org/web/20120317121029/http://www.mojang...

Edit 2: This one has a download, still works!

https://web.archive.org/web/20120301015921/http://www.mojang...


What is the ending of your story!? Did you find and fix some bottlenecks?


TBH, I do not see how Java as a language steers anyone toward one of those footguns. E.g. knowledge of algorithmic complexity is foundational, and StringBuilder is junior-level basic knowledge.


How would you handle validating numeric input in a hot path then? All of the solutions proposed in #5 are incomplete or broken, and it stems from the fact that Java's language design over-uses exceptions for error handling in places where an optional value would be much safer and faster.
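For illustration, the exception-free shape the standard library lacks looks something like this (a sketch, ASCII digits only; note that even the newer `Integer.parseInt(CharSequence, int, int, int)` overload still throws on bad input):

```java
import java.util.OptionalInt;

public class FastParse {
    // Exception-free int parsing for hot paths: returns empty instead of
    // throwing NumberFormatException on invalid input.
    static OptionalInt tryParseInt(String s) {
        if (s == null || s.isEmpty()) return OptionalInt.empty();
        int i = 0, sign = 1;
        char c = s.charAt(0);
        if (c == '+' || c == '-') {
            if (s.length() == 1) return OptionalInt.empty();
            sign = (c == '-') ? -1 : 1;
            i = 1;
        }
        long value = 0; // accumulate in long to catch int overflow
        for (; i < s.length(); i++) {
            c = s.charAt(i);
            if (c < '0' || c > '9') return OptionalInt.empty();
            value = value * 10 + (c - '0');
            if (value > 2147483648L) return OptionalInt.empty();
        }
        long signed = sign * value;
        if (signed < Integer.MIN_VALUE || signed > Integer.MAX_VALUE) {
            return OptionalInt.empty();
        }
        return OptionalInt.of((int) signed);
    }
}
```

No stack trace construction, no try/catch in the caller, and the invalid case costs roughly the same as the valid one.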


Normally, in 100% of cases, with parseInt/parseDouble etc. Getting NumberFormatException so frequently on a hot path that it impacts performance means you aren't solving a number-parsing problem, you are solving a type-guessing problem, which is out of scope for the standard library and requires a custom parser.


Okay, but this contradicts your original statement that "Java doesn't steer anyone to use these [footguns]". Every language has a way to parse integers, and most developers do not need a custom parser. Only in Java does that suddenly become a performance footgun.


It does not. If you need to parse a number, you use the standard library and you will be fine. The described case, with a huge impact on a hot path, is a demonstration of why using your brain is important. The developer who gets into this mess is the one who will find ways to suffocate his code with performance bottlenecks in a thousand other ways. It's not a language or library problem.


Yes, parseInt et al. work very fast for good inputs. What percentage of your inputs are invalid numbers, and why?


> What percentage of your inputs are invalid numbers, and why?

This is the wrong question to ask in this context. The right question is when exceptional flow actually becomes a performance bottleneck. Because, obviously, in a desktop or even a server app, validating a single user input won't cause any trouble even if 99% of inputs are wrong. It may become a problem with bulk processing, but then, and I have to repeat myself here, it is no longer a number-parsing problem, it's a problem of not understanding what your input is.


> Java's language design over-uses exceptions for error handling

No, library authors' design over-uses exceptions. Also refer to people using exceptions to generate 404 http responses in web systems - hey, there's an easy DDOS... This can include some of Java's standard libraries, although nothing springs to mind.

Exceptions are not meant for mainstream happy-path execution; they mean that something is broken. Countless times I have had to deal with garbage-infested logs where one programmer is using exceptions for rudimentary validation and another is dumping the resulting stack traces left and right as if the world is coming to an end.

It is a problem, but it's an abuse problem, not a standard usage problem.


I agree with you that the root problem is that the library author's design over-uses exceptions. But when the library in question is the standard library and the operation is as basic as Integer.parseInt, then I think it's fair to criticize that as a language issue, because the standard library sets the standard for what is idiomatic and performant in a language.


There is nothing wrong with Integer.parseInt(). It blows up if you give it invalid input. That's standard idiomatic behavior.

It might be helpful to have Integer.validateInt(String), but currently it's up to author to do that themselves.


The problem with comments like these is that guessing which "better language" a commentator has in mind is always an exercise left to the reader. And that tends to be by design: it's great for potshots and punditry, because it means not having to make a concrete commitment to anything that might similarly be confronted and torn apart in the replies. Like if the "better language" alluded to is C (and it generally is): the language whose standard library "steers" you toward quadratic string operations, because the default/natural way to get a string's length is O(n).


You might have an application for which speed is not important most of the time. Only one or two processes might require allocation-free code. For such a case, why would you burden all of the other code with the additional complexity? Calling out to a different language then may come with baggage you'd rather avoid.

A project might also grow into these requirements. I can easily imagine that something wasn't problematic for a long time but suddenly emerged as an issue over time. At that point you wouldn't want to migrate the whole codebase to a better language anymore.


I saw something like that being suggested when working in Java with GIS data containing many points as class instances; the object overhead of storing XYZ doubles is quite crazy. The optimization was to build a global double array and use "pointers" (indices) to get and set the numbers in the array.

Even JavaScript is much better for this, much, much better.
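A sketch of that flat-array trick (names are made up): point i lives at slots 3i..3i+2 of a single double[], so millions of points cost no object headers beyond the one array.

```java
// Flat storage for 3D points as described above: one double[] instead of
// millions of Point3D objects, addressed by "pointers" (integer indices).
public class Point3Store {
    private final double[] data;

    public Point3Store(int capacity) {
        data = new double[capacity * 3];
    }

    public void set(int i, double x, double y, double z) {
        int base = 3 * i;
        data[base] = x;
        data[base + 1] = y;
        data[base + 2] = z;
    }

    public double x(int i) { return data[3 * i]; }
    public double y(int i) { return data[3 * i + 1]; }
    public double z(int i) { return data[3 * i + 2]; }
}
```

It also gives you locality: neighboring points sit next to each other in memory instead of being scattered across the heap.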


JavaScript has the exact same issue - objects are on the heap and require allocation and pointer dereferencing. For huge collections of numbers, arrays might be better.

But JS has another problem: there's no way to force a number to be unboxed (no primitive vs boxed types), so the array of doubles might very well be an array of pointers to numbers[1].

But with hidden class optimizations, an object might be able to store a float directly in a field, so the array of objects will have one box per (x, y, z), while an array of "numbers" might have one box per number, so 3x as many. My guess, without benchmarking, is that JS is much worse than Java then, because the "optimization" will end up being worse.

[1]: Most JS engines have an optimization for small ints, called SMIs, that use pointer tagging to support either an int or a references, but I don't think they typically do this optimization for floats.


You're describing array of structs vs. struct of arrays. Even in JavaScript, you would have to manually do the latter.


V8 automatically optimizes objects with the same shape into efficient structs, making array of objects much more efficient than in Java.

And the manually managed array acts more like system memory; it isn't necessarily densely packed, so you could have only points 0 and 2 and the data would be: [1, 2, 3, x, x, x, 3, 2, 1] (3D points).

So I am not describing a struct of arrays.


"Efficient structs" in v8 are just inefficient Java classes. https://www.dashlane.com/blog/how-is-data-stored-in-v8-js-en...

The optimization you discussed for GIS data is called struct of arrays. JavaScript does not do that automatically for you. You would have to do the same thing manually in JavaScript to avoid per-triple object overhead.


Hidden class optimizations just make JavaScript objects behave a little more like Java class instances, where the VM knows where to find each field, rather than having to look it up like a map.

It doesn't make JS faster than Java, it makes it almost as fast in some cases.


I can only say what I observed in testing, and that's that having millions of instances of a class like Point3D{x, y, z} in JS uses significantly less memory than in Java (this was tested on Android, not sure if relevant). It was quite some time ago so I don't remember the details.


Well, Android is not running the JVM; it runs Android Runtime (ART, formerly Dalvik) bytecode. In general, depending on when it was, that runtime is much, much simpler and does a lot more at compile time at the expense of less JIT optimization.

It's also many versions behind the Java API (depending on when it happened).

So your data point is basically completely irrelevant.


This point gets raised every single time managed languages and low latency development come up together. The trade off is running "fast" all of the time, even when you don't have to, vs running slow most of the time and tinkering when you need to go fast.

I've spent a fair few years developing lowish (10-20us wire to wire) latency trading systems and the majority of the code does not need to go fast. It's just wasted effort, a debugging headache, and technical debt. So the natural trade off is a bit of pain to make the hot path fast through spans, unsafe code, pre-allocated object pools, etc and in return you get to use a safe and easy programming language everywhere else.

In C# low latency dev is not even that painful, as there are a lot of tools available specifically for this purpose by the runtime.


Bad idea. I've made a pool allocator before, but that was for expensive network objects and expensive objects dealing with JNI.

Doing it to avoid memory pressure generally means you simply have a bad algorithm that needs to be tweaked. It's very rarely the right solution.


Not sure why you are downvoted. Depending on how it's used, it could actually be detrimental to performance.

The JVM may optimize many short-lived objects better than a pool of objects with longer, less predictable lifetimes.


This is the second time this week on HN that I've seen people suggesting object pools to solve memory pressure problems.

I generally think it's because people aren't experienced with diagnosing and fixing memory pressure. It's one of the things I do pretty frequently for my day job. I'm fortunate enough to be the "performance" guy at work :).

It'll always depend on what the real issue is, but generally speaking the problem to solve isn't reinventing garbage collection, but rather to eliminate the reason for the allocation.

For example, a pretty common issue I've seen is copying a collection to do transformations. Switching to streams, combining transformation operations, or in an extreme case, I've found passing around a consumer object was the way to avoid a string of collection allocations.

Even the case where small allocations end up killing performance, for example like the autoboxing example of the OP, often the solution is to either make something mutable that isn't, or to switch to primitives (Valhalla can't come soon enough).

Heck, sometimes even an object cache is the right solution. I've had good success reducing the size of objects on the heap by creating things like `Map<String, String>` and then doing a `map.computeIfAbsent(str, Function.identity());` (Yes, I know about string interning, no I don't want these added to the global intern cache).
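i.e. a tiny local dedup cache along these lines (sketch):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Local string dedup cache: repeated equal strings collapse to a single
// canonical instance, without touching the global intern table.
public class LocalDedup {
    private final Map<String, String> cache = new HashMap<>();

    public String canonical(String s) {
        return cache.computeIfAbsent(s, Function.identity());
    }
}
```

When the cache itself goes out of scope, everything it pinned becomes collectable again, which is exactly what the global intern table doesn't give you.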

Regardless, the first step is profiling (JFRs and heap dumps) to see where memory is spent and what is dominating the allocation rate. That's a first step that people often skip and jump straight to fixing what they think is broken.


Java doesn't steer you into object pools. I wrote Java code for 20 years and never used a cache to avoid allocating objects, and never saw a colleague use one. The person you were talking to doesn't know what he's doing.


> So just manual memory management with extra steps

This is actually the perfect situation: you are allowed to do it carefully and manually for 1% of code on the hot path, but you don't have to worry about it for the 99% of the code that's not.


> At that point why not use a better language for the task?

Such as?


This comment assumes game companies throw away all their code and start from scratch on their next title. Which is completely untrue, games are built on decades old code, like most software. There is absolutely a need for maintainable code.


I think that there might be a distinction on indie vs big publishers (like EA) - I would expect someone like EA to enforce coding standards, but they do so whilst living off the income of previous hits.

