
CoreAudio was developed alongside xnu / IOKit for Mac OS X, so it’s kind of all of it. Apple had the opportunity to start fresh with a built-in super low-latency audio subsystem at the turn of the century, and they took it.


Please take anything this fella writes with several grains of salt.


Please could you elaborate?


I mean, that goes without saying for anything I read on the damn internet. With that said, I'm not seeing a comparable informational source anywhere that even gives this the time of day. Feel free to bombard me with new resources.


Genuine question: are GitHub workflows stable enough to be used for benchmarking? Like, is CPU time-quantum scheduling guaranteed to be the same from run to run?


No, it’s sloppy benchmarking


Why are you surprised? Java always suffers from abstraction penalty for running on a VM. You should be surprised (and skeptical) if Java ever beats C++ on any benchmark.


The only "abstraction penalty" of "running on a VM" (by which I think you mean using a JIT compiler) is the warmup time of waiting for the JIT.


The true penalty of Java is that product types have to be heap-allocated, as there is no mechanism for stack-allocated product types.


You're right that Java lacks inline types (although it's getting them really soon, now), but the main cost of that isn't stack allocation (heap allocations in Java don't cost much more than stack allocations), but cache misses due to objects not being inlined in arrays.


P.S.

Even for flattened types, the "abstraction penalty", or, more precisely, its converse, the "concreteness penalty", in Java will be low, as you don't directly pick when an object is flattened. Instead, you declare whether a class cares about identity or not, and if not, the compiler will transparently choose whether and when to flatten the object, depending on how it's used.
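
To sketch what that declaration looks like (this uses the draft syntax from the Valhalla JEP 401 preview, which may still change before release):

    // Draft Valhalla syntax (JEP 401, preview; subject to change).
    // Declaring this a value class gives up identity: no synchronizing
    // on instances, and == compares state rather than references.
    value class Complex {
        private final double re;
        private final double im;
        Complex(double re, double im) { this.re = re; this.im = im; }
        double re() { return re; }
        double im() { return im; }
    }

A Complex[] can then be laid out as a flat sequence of (re, im) pairs rather than an array of references, which is exactly the cache-miss fix described above.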


> product types have to be heap-allocated

Conceptually, that’s true, but a compiler is free to do things differently. For example, if escape analysis shows that an object allocated in a block never escapes the block, the optimizer can replace the object by local variables, one for each field in the object.

And that’s not theoretical. https://www.bettercodebytes.com/allocation-elimination-when-..., https://medium.com/@souvanik.saha/are-java-objects-always-cr... show that it (sometimes) does.


It's a statement of our times that this is getting downvoted. JIT is so underrated.


in my opinion, this assertion suffers from the "sufficiently smart compiler" fallacy somewhat.

https://wiki.c2.com/?SufficientlySmartCompiler


No, Java's existing compiler is very good, and it generates code as good as you could want. There is definitely still a cost from objects not being inlined in arrays yet (this will change soon) that impacts some programs, but in practice Java performs more or less the same as C++.

In this case, however, it appears that the Java program may have been configured in a suboptimal way. I don't know how much of an impact it has here, but it can be very big.


Even benchmarks that allow for jit warmup consistently show java roughly half the speed of c/c++/rust. Is there something they are doing wrong? I've seen people write some really unusual java to eliminate all runtime allocations, but that was about latency, not throughput.


> Is there something they are doing wrong?

Yes. The most common issues are heap misconfiguration (which is more important in Java than any compiler configuration in other languages) and that the benchmarks don't simulate realistic workloads in terms of both memory usage and concurrency. Another big issue is that the effort put into the program is not the same. Low-level languages do allow you to get better performance than Java if you put in significant extra work to get it. Java aims to be "the fastest" for a "normal" amount of effort at the expense of losing some control that could translate to better performance in exchange for significantly more work, not only at initial development time, but especially during evolution/maintenance.

E.g. I know of a project at one of the world's top 5 software companies where they wanted to migrate a real Java program to C++ or Rust to get better performance (it was probably Rust because there are some people out there who really want to try Rust). Unsurprisingly, they got significantly worse performance (probably because low-level languages are not good at memory management when concurrency is at play, or at concurrency in general). But they wanted the experiment to be a success, so they put in a tonne of effort - I'm talking many months - hand-optimising the code, and in the end they managed to match Java's performance or even exceed it by a bit (but admitted it was ultimately wasted effort).

If the performance of your Java program doesn't more-or-less match or even exceed the performance of a C++ (or other low level language) program then the cause is one of: 1. you've spent more effort optimising the other program, 2. you've misconfigured the Java program (probably a bad heap-size setting), or 3. the program relies on object flattening, which means the Java program will suffer from costly cache misses (until Valhalla arrives, which is expected to be very soon).
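
On cause 2, a concrete (and purely illustrative) example of what fixing a heap misconfiguration looks like; the right values are entirely workload-dependent:

    # Fixed, realistically sized heap plus an explicit GC choice (numbers are illustrative):
    java -Xms8g -Xmx8g -XX:+UseZGC -jar app.jar        # latency-oriented services
    java -Xms8g -Xmx8g -XX:+UseParallelGC -jar app.jar # throughput-oriented batch jobs

Benchmarks left on default, adaptively sized heaps can spend a very different fraction of their time in GC than a properly sized production deployment would.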


In my experience, if your C++ or Rust code does not perform as well as Java, it's probably because you are trying to write Java in C++ or Rust. Java can handle a large number of small heap-allocated objects shared between threads really well. You can't reasonably expect to meet its performance in such workloads with the rudimentary tools provided by the C++ or Rust standard library. If you want performance, you have to structure the C++/Rust program in a fundamentally different way.

I was not familiar with the term "object flattening", but apparently it just means storing data by value inside a struct. But data layout is exactly the thing you should be thinking about when you are trying to write performant code. As a first approximation, performance means taking advantage of throughput and avoiding latency, and low-level languages give you more tools for that. If you get the layout right, efficient code should be easy to write. Optimization is sometimes necessary, but it's often not very cost-effective, and it can't save you from poor design.


> it's probably because you are trying to write Java in C++ or Rust

Well, sure. In principle, we know that for every Java program there exists a C++ program that performs at least as well because HotSpot is such a program (i.e. the Java program itself can be seen as a C++ program with some data as input). The question is can you match Java's performance without significantly increasing the cost of development and especially evolution in a way that makes the tradeoff worthwhile? That is quite hard to do, and gets harder and harder the bigger the program gets.

> I was not familiar with the term "object flattening", but apparently it just means storing data by value inside a struct. But data layout is exactly the thing you should be thinking about when you are trying to write performant code.

Of course, but that's why Java is getting flattened objects.

> As a first approximation, performance means taking advantage of throughput and avoiding latency, and low-level languages give you more tools for that

Only at the margins. These benefits are small and they're getting smaller. More significant performance benefits can only be had if virtually all objects in the program have very regular lifetimes - in other words, can be allocated in arenas - which is why I think it's Zig that's particularly suited to squeezing out the last drops of performance that are still left on the table.

Other than that, there's not much left to gain in performance (at least after Java gets flattened objects), which is why the use of low-level languages has been shrinking for a couple of decades now and continues to shrink. Perhaps it would change when AI agents can actually code everything, but then they might as well be programming in machine code.

What low-level languages really give you through better hardware control is not performance, but the ability to target very restricted environments with not much memory (as one of Java's greatest performance tricks is the ability to convert RAM to CPU savings on memory management) assuming you're willing to put in the effort. They're also useful, for that reason, for things that are supposed to sit in the background, such as kernels and drivers.


> The question is can you match Java's performance without significantly increasing the cost of development and especially evolution in a way that makes the tradeoff worthwhile?

This question is mostly about the person and their way of thinking.

If you have a system optimized for frequent memory allocations, it encourages you to think in terms of small independently allocated objects. Repeat that for a decade or two, and it shapes you as a person.

If you, on the other hand, have a system that always exposes the raw bytes underlying the abstractions, it encourages you to consider the arrays of raw data you are manipulating. Repeat that long enough, and it shapes you as a person.

There are some performance gains from the latter approach. The gains are effectively free, if the approach is natural for you and appropriate to the problem at hand. Because you are processing arrays of data instead of chasing pointers, you benefit from memory locality. And because you are storing fewer pointers and have less memory management overhead, your working set is smaller.


What you're saying may (sometimes) be true, but that's not why Java's performance is hard to beat, especially as programs evolve (I was programming in C and C++ since before Java even existed).

In a low-level language, you pay a higher performance cost for a more general (abstract) construct. E.g. static vs. dynamic dispatch, or the Box/Rc/Arc progression in Rust. If a certain subroutine or object requires the more general access even once, you pay the higher price almost everywhere. In Java, the situation is opposite: You use a more general construct, and the compiler picks an appropriate implementation per use site. E.g. dispatch is always logically dynamic, but if at a specific use site the compiler sees that the target is known, then the call will be inlined (C++ compilers sometimes do that, too, but not nearly to the same extent; that's because a JIT can perform speculative optimisations without proving they're correct); if a specific `new Integer...` doesn't escape, it will be "allocated" in a register, and if it does escape it will be allocated on the heap.
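
A small sketch of that dispatch story (my example):

    interface Op { long apply(long x); }

    final class Inc implements Op {
        public long apply(long x) { return x + 1; }
    }

    static long run(Op op, int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc = op.apply(acc); // logically a virtual call on every iteration
        }
        return acc;
    }

If the runtime profile shows only Inc reaching that call site, the JIT speculates on it, inlines apply() behind a cheap guard, and the loop compiles as if it had been written monomorphically.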

The problem with Java's approach is that optimisations aren't guaranteed, and sometimes an optimisation can be missed. But on average they work really well.

The problem with a low-level language is that over time, as the program evolves and features (and maintainers) are added, things tend to go in one direction: more generality. So over time, the low-level program's performance degrades and/or you have to rethink and rearchitect to get good performance back.

As to memory locality, there's no issue with Java's approach, only with a missing feature of flattening objects into arrays. This feature is now being added (also in a general way: a class can declare that it doesn't depend on identity, and the compiler then transparently decides when to flatten it and when to box it).

Anyway, this is why it's hard, even for experts, to match Java's performance without a significantly higher effort that isn't a one-time thing, but persists (in fact, gets worse) over the software's lifetime. It can be manageable and maybe worthwhile for smaller programs, but the cost, performance, or both suffer more and more with bigger programs as time goes on.


From my perspective, the problem with Java's approach is memory, not computation. For example, low-level languages treat types as convenient lies you can choose to ignore at your own peril. If it's more convenient to treat your objects as arrays of bytes/integers (maybe to make certain forms of serialization faster), or the other way around (maybe for direct access to data in a memory-mapped file), you can choose to do that. Java tends to make solutions like that harder.

Java's performance may be hard to beat in the same task. But with low-level languages, you can often beat it by doing something else due to having fewer constraints and more control over the environment.


> or the other way around (maybe for direct access to data in a memory-mapped file), you can choose to do that. Java tends to make solutions like that harder.

Not so much anymore, thanks to the new FFM API (https://openjdk.org/jeps/454). The verbose code you see is all compiler intrinsics, and thanks to Java's aggressive inlining, intrinsics can be wrapped and encapsulated in a clean API (i.e. if you use an intrinsic in method bar which you call from method foo, usually it's as if you've used the intrinsic directly in foo, even though the call to bar is virtual). So you can efficiently and safely map a data interface type to chunks of memory in a memory-mapped file.
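
A rough sketch of that mapping (the file layout here is hypothetical, but the calls are the FFM API from JEP 454, final since Java 22):

    import java.lang.foreign.Arena;
    import java.lang.foreign.MemorySegment;
    import java.lang.foreign.ValueLayout;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    static void readHeader(Path file) throws java.io.IOException {
        try (var arena = Arena.ofConfined();
             var ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map the whole file; the segment's lifetime is tied to the arena.
            MemorySegment seg = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size(), arena);
            int magic  = seg.get(ValueLayout.JAVA_INT, 0);  // native byte order
            long count = seg.get(ValueLayout.JAVA_LONG, 8); // hypothetical layout: count at offset 8
            System.out.println(magic + " " + count);
        }
    }

The get() calls are the intrinsics; wrapping them in an accessor type doesn't cost anything once the JIT has inlined through the wrapper.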

> But with low-level languages, you can often beat it by doing something else due to having fewer constraints and more control over the environment.

You can, but it's never free, rarely cheap (and the costs are paid throughout the software's lifetime), and the gains aren't all that large (on average). The question isn't "is it possible to write something faster" but "can you get sufficient gains at a justifiable cost", and that's already hard and getting harder and harder.


> Java in C++ or Rust.

This criticism always forgets that Java is how most folks used to program in C++ARM, in 100% of the 1990s GUI frameworks written in C++, and that the GoF book used C++ and Smalltalk, predating Java by a couple of years.


Has anyone done a fork of the benchmark game or plb2 to demonstrate the impacts of jit warmup and heap settings?


I don't know what plb2 is, but the benchmark game can demonstrate very little, because the benchmarks are small and uninteresting compared to real programs (I believe there's not a single one with concurrency, plus there's no measure of effort in such small programs) and they compare different algorithms against each other.

For example, what can you learn from the Java vs. C++ comparison? In 7 out of 10 benchmarks there's no clear winner (the programs in one language aren't faster than all programs in the other) and what can you generalise from the 3 where C++ wins? There just isn't much signal there in the first place.

The Techempower benchmarks explore workloads that are probably more interesting, but they also compare apples to oranges, and as with the benchmark game, the only conclusion you could conceivably generalise (in an age of optimising compilers, CPU caches, and machine-learning branch predictors, all affected by context) is that C++ (or Rust) and Java are about the same, as there are no benchmarks in which all C++ or Rust frameworks are faster than all Java ones or vice-versa, so there's no way of telling whether there is some language advantage or particular optimisation work done that helps a specific benchmark (you could try looking at variances, but given the lack of a rigorous comparison, that's probably also meaningless). The differences there are obviously within the level of noise.

Companies that care about and understand performance pick languages based on their own experience and experiments, hopefully ones that are tailored to their particular program types and workloads.


The linked article makes a specific carveout for Java, on the grounds that its SufficientlySmartCompiler is real, not hypothetical.


C++ certainly also has, and needs, a similarly sufficiently smart compiler to be compiled at all…


For the most naive code, if you're calling "new" multiple times per row, maybe Java benefits from out of band GC while C++ calls destructors and free() inline as things go out of scope?

Of course, if you're optimizing, you'll reuse buffers and objects in either language.


> maybe Java benefits from out of band GC

benchmarks game uses BenchExec to take 'care of important low-level details for accurate, precise, and reproducible measurements' ….

BenchExec uses the cgroups feature of the Linux kernel to correctly handle groups of processes and uses Linux user namespaces to create a container that restricts interference of [each program] with the benchmarking host.


I'm talking about memory management in-process; I don't think cgroups would affect that?


In the end, even Java code becomes machine code at some point (at least the hot paths).


yes, but that's just one part of the equation. machine code from compiler and/or language A is not necessarily the same as the machine code from compiler and/or language B. the reasons are, among others, contextual information, handling of undefined behavior and memory access issues.

you can compile many weakly typed high level languages to machine code and their performance will still suck.

java's language design simply prohibits some optimizations that are possible in other languages (and also enables some that aren't in others).


> java's language design simply prohibits some optimizations that are possible in other languages (and also enables some that aren't in others).

This isn't really true - at least not beyond some marginal things that are of little consequence - and in fact, Java's compiler has access to more context than pretty much any AOT compiler because it's a JIT and is allowed to speculate optimisations rather than having to prove them.


It can speculate whether an optimization is performant, not whether it is sound. I don't know enough about Java to say that it doesn't provide all the same soundness guarantees as other languages, just that it is possible for a JIT language to be hampered by this. Also, C# AOT is faster than a warmed-up C# JIT in my experience, unless the warmup takes days, which wouldn't be useful for applications like games anyway.


> Not whether it is sound.

Precisely right, but the entire point is that it doesn't need to. The optimisation is applied in such a way that when it is wrong, a signal triggers, at which point the method is "deoptimised".

That is why Java can and does aggressively optimise things that are hard for compilers to prove. If it turns out to be wrong, the method is then deoptimised.
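
A concrete (illustrative) way to watch this happen: run the following with -XX:+PrintCompilation and look for the compiled method being "made not entrant" once the second class appears.

    interface Shape { double area(); }
    record Circle(double r) implements Shape { public double area() { return Math.PI * r * r; } }
    record Square(double s) implements Shape { public double area() { return s * s; } }

    public class DeoptDemo {
        static double total(Shape[] shapes) {
            double t = 0;
            for (Shape s : shapes) t += s.area(); // speculated: only Circle ever observed here
            return t;
        }

        public static void main(String[] args) {
            Shape[] data = new Shape[100_000];
            java.util.Arrays.fill(data, new Circle(1.0));
            for (int i = 0; i < 200; i++) total(data); // warm up; compiled with a Circle-only profile
            data[0] = new Square(2.0);                 // speculation is now wrong
            System.out.println(total(data));           // guard fails -> deoptimise -> recompile
        }
    }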


But how can it know the optimization violated aliasing or rounding order or any number of usually silent ub?


There's no aliasing in the messy C sense in Java (and no pointers into the middle of objects at all). As for other optimisations, there are traps inserted to detect violation if speculation is used at all, but the main thrust of optimisation is quite simple:

The main optimisation is inlining, which, by default, is done to the depth of 15 (non-trivial) calls, even when they are virtual, i.e. dispatched dynamically, and that's the main speculation - that a specific callsite calls a specific target. Then you get a large inlined context within which you can perform optimisations that aren't speculative (but proven).

If you've seen Andrew Kelley's talk about "the vtable boundary"[1] and how it makes efficient abstraction difficult, that boundary does not exist in Java because compilation is at runtime and so the compiler can see through vtables.

But it's also important to remember that low-level languages and Java aim for different things when they say "performance". Low-level languages aim for the worst case. I.e., some things may be slower than others (e.g. dynamic vs. static dispatch) but when you can use the faster construct, you are guaranteed a certain optimisation. Java aims to optimise something that's more like the "average case" performance, i.e. when you write a program with all the most natural and general constructs, it will be the fastest for that level of effort. You're not guaranteed certain optimisations, but you're not penalised for more natural, easier-to-evolve code either.

The worst-case model can get you good performance when you first write the program. But over time, as the program evolves and features are added, things usually get more general, and low level languages do have an "abstraction penalty", so performance degrades, which is costly, until at some point you may need to rearchitect everything, which is also costly.

[1]: https://youtu.be/f30PceqQWko


I mostly do dsp and control software, so number heavy. I am excited at the prospect of anything that might get me a performance boost. I tried porting a few smaller tests to java and got it to c2 some stuff, but I couldn't get it to autovectorize anything without making massive (and unintuitive) changes to the data structures. So it was still roughly 3x slower than the original in rust. I'll be trying it again though when Valhalla hits, so thanks for the heads up.


You can use the Vector API (https://openjdk.org/jeps/529) for manual vectorisation.
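
For instance, a SAXPY-style loop written against the (still-incubating) Vector API, compiled and run with --add-modules jdk.incubator.vector (my illustrative sketch):

    import jdk.incubator.vector.FloatVector;
    import jdk.incubator.vector.VectorSpecies;

    static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    // y[i] += a * x[i], vectorised explicitly rather than hoping for autovectorisation.
    static void saxpy(float a, float[] x, float[] y) {
        int i = 0;
        int upper = SPECIES.loopBound(x.length);
        for (; i < upper; i += SPECIES.length()) {
            FloatVector vx = FloatVector.fromArray(SPECIES, x, i);
            FloatVector vy = FloatVector.fromArray(SPECIES, y, i);
            vx.fma(FloatVector.broadcast(SPECIES, a), vy).intoArray(y, i);
        }
        for (; i < x.length; i++) { // scalar tail
            y[i] += a * x[i];
        }
    }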

Although there's no doubt that the lack of flattened objects is the last remaining real performance issue for Java vs. a lower-level language, and it specifically impacts programs of the kind you're writing. Valhalla will take care of that.


macOS and Windows have a much smaller set of variants, and tend to ship a single UI with everything included with the OS. Even the best desktop Linux distros will ship divergent KDE and GNOME apps.

If you want essentially perfect high-DPI support out of the box and can afford higher end displays, use macOS. It just works. I see the comments above about scaling, and to that, I say: most people will never notice. However, a Win32 app being the wrong scale? They'll notice that.

But the real display weak point of Linux right now vs Windows is HDR for gaming. That's a real shitshow and it tends to just work on Windows.


We should want open borders. Immigration is a significant net positive. But we can settle for controlled immigration with liberal limits.

H1-B is stupid on its face. You're seriously telling me that this software engineering job absolutely cannot be filled by an American? That doesn't pass the laugh test.


> H1-B is stupid on its face. You're seriously telling me that this software engineering job absolutely cannot be filled by an American? That doesn't pass the laugh test.

The job description is a senior full stack product developer fluent in all programming languages and frameworks. Salary is $70,000/year. Somehow they can never find Americans to fill those jobs. They'll go on Linkedin complaining that Americans are too lazy and don't have the right hustle culture and talk about made up concepts like work life balance when the bosses demand 100 hour work weeks without overtime pay.


That seems low. Is it a corporate strategy to set a low salary and when nobody local fills it (because it's below the competitive rate) they get to hire H1-B?


No, because H1B has pay requirements. As someone who went through the process with Amazon I can confirm that they definitely do offer you a salary that is in line with the local market. There might be lower incentive for raises down the line, but that's a conspiracy theory at best


Yes.


That's the commonly used method for more than a decade, yes.


Link the job description because I don't believe this is real.


> Salary is $70,000/year

The lowest allowed limit for such a job is around $140k in areas like Seattle.


Allowed by whom?


By law. H1b requires the wages to be greater than the prevailing wage for similar positions in the region. They are published by DoL: https://flag.dol.gov/wage-data/wage-search

For this kind of experience, you'd be looking for level 2 _minimum_ and likely level 3. For King County in WA it's right now $149240 and $180710 respectively. Level 4 wage is $212202, btw.


The H1B requirements are even higher, but also WA state law requires software developer salaries to be 3.5 x minimum wage x 52 weeks per year. Currently, that is $124k+, because minimum wage is $17.13 per hour.
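
(Worked out, assuming a 40-hour week: 3.5 × $17.13/hour ≈ $59.96/hour, and $59.96 × 40 hours × 52 weeks ≈ $124,700/year.)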

https://app.leg.wa.gov/wac/default.aspx?cite=296-128-535

https://www.lni.wa.gov/forms-publications/f700-207-000.pdf


Our competitors in another country will have no problem building those products.

Then they'll be sold in America to American consumers.

Then our industry deflates, because we can't compete on cost or labor scale / innovation.

If we put up tariffs, we get a short respite. But now our goods don't sell as well overseas in the face of competition. Our industries still shrink. Eventually they become domestically uncompetitive.

So then what? You preserved some wages for 20 years at the cost of killing the future.

I think all of these conversations are especially pertinent because AI will provide activation energy to accelerate this migration. Now is not the time to encourage offshoring.


If my job is shipped to India today why would I care that twenty years later the boss is Indian instead of American?


> If my job is shipped to India today

Immigration isn't "shipping the job to India". It's bringing the labor here and contributing to our economy. This might have a suppressive force on wages, but it lifts the overall economy and creates more opportunity and demand.

Offshoring is permanent loss. It causes whatever jobs and industry are still here to atrophy and die. The overall economy weakens. Your outlook in retirement will be bleaker.

If you have to pick between the two, it's obvious which one to pick.


> This might have a suppressive force on wages

And that's the general problem. People don't care about the overall economy when wages are going down and cost of living is going up. Myself included: I couldn't care less about the overall health of the economy. I care about being able to sustain my and my family's lifestyle, put food on the table, someday own a home, and not live paycheck to paycheck because all the jobs are paying below a living wage, etc.

I'm extremely fortunate to make the salary that I do, but I know plenty of others not so fortunate, in other fields that don't pay nearly as well as tech does, and probably never will. The answer can't be "go into tech" nor should it be "let's suppress wages so labor isn't so expensive for our domestic companies." And obviously offshoring isn't great either.

We can still import talent without suppressing wages, by not abusing the program and actually only importing for roles that truly, beyond all reasonable doubt, could not be filled by a domestic worker.


Usually the next step of this failed discourse is to explain that locals are so entitled that they don't want to do hard jobs for the minimum wage, due to decades of wage suppression done thanks to immigration.

In France, being a cook used to pay very well, now that most cooks in Paris are from India or Sri Lanka, often without a proper visa or at the minimum wage, no local wants to do this anymore (working conditions are awful).

The industry then whines loudly about "the lack of qualified (cheap) workers"


Turns out this is a difficult problem with no one good solution. Subjecting labor to a race to the bottom is probably the most efficient individual system from a capitalist standpoint, but it destroys itself just as much as your customers can no longer afford to buy most of the products made. The selfish strategy ruins the entire system if everybody does it.

Capitalism and Communism have opposite problems. Communism attempts to manage the markets from a top down approach, making it relatively easy to handle systemic problems but almost impossible to optimize for efficiency because there is far too much information that doesn't make it to the top. Capitalism by contrast pushes the decisions down to where the information is, allowing for excellent efficiency but leaving it blind to systemic problems.

So the best solution is some kind of meet in the middle approach that is complex and ugly and fosters continual arguments over where lines should be drawn.


Innovation is why American salaries in tech are so high. It funded trillion-dollar companies.

If that becomes so much of a commodity that some other countries can do it for pennies on the dollar, then yes. Salaries will deflate. But we sure aren't offshoring (nor using most H1bs) to see more innovation. Quite the opposite.

Tech isn't manufacturing, where the biggest supply line wins by default. That's why I don't expect the US to be outcompeted on talent anytime soon. If anything, its own greed will consume it.


You say "we should want open borders" then argue for something that is objectively not open borders. "Open borders" and "controlled immigration" are diametrically opposed things, regardless of whatever liberal limits you're imagining. Almost nobody is arguing for zero immigration.


Arbitrary cruelty is the point. There isn't a coherent rationale. You're going to the gulags if the subhuman pig in the mask says you are. You'll be lucky (?) if they don't just decide to execute you in the street. This is the reality on the ground right now. In a blue state in America.


> ... if the subhuman pig in the mask...

You're using exactly the kind of dehumanizing rhetoric that the administration is using in order to justify their violence and inhumane treatment of immigrants. You might need to think about that.

Other than the quoted snippet, everything you said may be true. Still, don't dehumanize your opponents.


Dehumanizing ICE will help the administration. They WANT war in the streets. That's how you solidify your power without proper democratic processes.


Would it be okay to dehumanize Nazi Gestapo officers going door–to–door searching for Jews? If not, don't dehumanize ICE.


They're dehumanizing people. How like them do you want to be?

In fighting them, don't become them.


Is there really any difference between the Nazis and the people who fought the Nazis?


First off, people have been calling police pigs since time immemorial.

Second, anybody who is abducting 5 year olds is subhuman.


This is a very silly argument.

There were several actual Unixes released based on Mach, and some of them more purely Mach than macOS/NeXT ever have been.


It is ridiculous. I skimmed through it and I'm not convinced he's trying to make the point you think he is. But if he is, he's missing that we do understand at a fundamental level how today's LLMs work. There isn't a consciousness there. They're not actually complex enough. They don't actually think. It's a text input/output machine. A powerful one with a lot of resources. But it is fundamentally spicy autocomplete, no matter how magical the results seem to a philosophy professor.

The hypothetical AI you and he are talking about would need to be an order of magnitude more complex before we can even begin asking that question. Treating today's AIs like people is delusional; whether self-delusion, or outright grift, YMMV.


> But if he is, he's missing that we do understand at a fundamental level how today's LLMs work.

No we don't? We understand practically nothing of how modern frontier systems actually function (in the sense that we would not be able to recreate even the tiniest fraction of their capabilities by conventional means). Knowing how they're trained has nothing to do with understanding their internal processes.


> I'm not convinced he's trying to make the point you think he is

What point do you think he's trying to make?

(TBH, before confidently accusing people of "delusion" or "grift" I would like to have a better argument than a sequence of 4-6 word sentences which each restate my conclusion with slightly variant phrasing. But clarifying our understanding of what Schwitzgebel is arguing might be a more productive direction.)


You should really read the book mentioned in the post you're responding to.

