kpw94's comments | Hacker News

My non-controversial theory: It's all the attention-span-shortening stuff.

- tech apps starting with infinite scroll (Facebook, 9gag, Instagram, etc.)

- media/tech shortened content: shorter tv shows, short video content, etc.

(TikTok is the "state of the art" of those 2 trends pushed to the max)

Specifically, we're getting more and more addicted to things that increase the frequency of dopamine spikes, making it increasingly difficult to get into deep-focus work.


Absolutely, we are feeding kids so many attention-span-killing things. Even as an adult I'm having a hard time with YouTube Shorts, and I cannot imagine a kid's brain having the means to deal with all that.

I am _fighting_ with elderly relatives to adjust their YouTube habits. They didn't even know autoplay is on by default. They don't even check sources; they just let the garbage in.

The parents are also using too many attention-span-killing things, which hurts the attention they give their children.

I wonder if there's research on short-form but educational content, or whether that's fundamentally impossible.

For example, I remember reading a lot of science magazines/articles growing up (granted, pop-sci, but for a kid it still teaches some things) and, as I grew older, things like The Economist.

Similarly, I also played games like Math Blaster as a kid, and I've realized I need to intentionally provide my kids with games like this that ideally teach something (the bar being greater-than-zero learning), rather than letting them play one of those infinite-runner games or whatever.

I think we're probably talking about the exact same thing, but I'm curious where the line between educational content and short-form media sits.

Thanks for sharing :)


Last time I dove into the research, I found that Math Blaster had no impact on student learning.

That doesn't line up, though. If you're 13 and meeting the level in 2012, your scores don't decline, so the levels would lag a few years. The 8-year-olds who show up and miss the mark in 2017 would indicate the infinite-scroll problem was taking a toll on them. Additionally, this would start to show in class-specific measurements (kids with access to home internet, personal devices, etc. would have worse scores). I think the argument about social media has merit when discussing children, but it seems more a social distinction than an objective indicator of academic performance.

I certainly feel several degrees dumber than I did as a teenager without that stuff.

When you say tok/s here, are you describing the prefill (prompt eval) tok/s or the output generation tok/s?

(Btw I believe the "--jinja" flag is by default true since sometime late 2025, so not needed anymore)


Here is llama-bench on the same M4:

  | model                    |       size |     params | backend    | threads |            test |                  t/s |
  | ------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
  | qwen35 27B Q4_K_M        |  15.65 GiB |    26.90 B | BLAS,MTL   |       4 |           pp512 |         61.31 ± 0.79 |
  | qwen35 27B Q4_K_M        |  15.65 GiB |    26.90 B | BLAS,MTL   |       4 |           tg128 |          5.52 ± 0.08 |
  | qwen35moe 35B.A3B Q3_K_M |  15.45 GiB |    34.66 B | BLAS,MTL   |       4 |           pp512 |        385.54 ± 2.70 |
  | qwen35moe 35B.A3B Q3_K_M |  15.45 GiB |    34.66 B | BLAS,MTL   |       4 |           tg128 |         26.75 ± 0.02 |
So ~60 tok/s for prefill and ~5 tok/s for output on the 27B, and roughly 5-6x both on the 35B-A3B.

If someone doesn't specifically say prefill then they always mean decode speed. I have never seen an exception. Most people just ignore prefill.

But isn't the prefill speed the bottleneck on some systems*?

Sure, it's an order of magnitude faster (10x on Apple Metal?), but there's also an order of magnitude more tokens to process, especially for tasks involving summarization of some sort.

But point taken that the parent's numbers are probably decode.

* Specifically Mac Metal, which is what the parent's numbers are about
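To make the tradeoff concrete, here's a back-of-envelope latency model (my own illustration, not from the thread): end-to-end time is prompt tokens divided by prefill speed, plus output tokens divided by decode speed. The example numbers are the M4 27B figures quoted upthread (pp512 ≈ 61 tok/s, tg128 ≈ 5.5 tok/s).

```java
// Back-of-envelope latency model (illustrative, not a benchmark):
// end-to-end time is prefill time plus decode time.
class LatencyModel {
    static double totalSeconds(int promptTokens, int outputTokens,
                               double prefillTps, double decodeTps) {
        return promptTokens / prefillTps + outputTokens / decodeTps;
    }

    public static void main(String[] args) {
        // Summarizing an 8k-token document into ~500 output tokens,
        // using the M4 27B numbers quoted upthread:
        double prefill = 8000 / 61.31; // ~130 s just to read the prompt
        double decode = 500 / 5.52;    // ~91 s to write the summary
        System.out.printf("prefill %.0f s, decode %.0f s, total %.0f s%n",
                prefill, decode, prefill + decode);
    }
}
```

Even though prefill runs ~10x faster per token, this 8k-in/500-out summarization spends more wall-clock time in prefill than in decode.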


Yes, it's definitely the bottleneck for most use cases besides chatting. It's the reason I've never bought a Mac for LLM purposes.

It's frustrating when trying to find benchmarks because almost everyone gives decode speed without mentioning prefill speed.


oMLX makes prefill effectively instantaneous on a Mac.

Storing an LRU KV cache of all your conversations, both in memory and on (plenty-fast-enough) SSD, especially including the fixed agent context every conversation starts with, means we go from "painfully slow" to "faster than using Claude" most of the time. It's kind of shocking this much perf was lying on the ground waiting to be picked up.

Open models are still dumber than leading closed models, especially for editing existing code. But I use it as essentially free "analyze this code, look for problem <x|y|z>" which Claude is happy to do for an enormous amount of consumed tokens.

But speed is no longer a problem. It's pretty awesome over here in unified memory Mac land :)
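The LRU eviction part of that idea can be sketched with Java's LinkedHashMap. This is a hypothetical illustration of the policy only: the class name and the string "blob" values are stand-ins, not oMLX's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an LRU cache keyed by conversation (or shared
// prefix), where the value stands in for an opaque KV-cache blob.
class KvCacheLru<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    KvCacheLru(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true: iteration order is LRU
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once over capacity;
        // a real system would spill it to SSD rather than drop it.
        return size() > maxEntries;
    }
}
```

The win comes from reusing the cached entry for the fixed agent preamble, so only a conversation's new tokens need prefill.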


Right, and they're not the only FAANG company we know is doing it: https://news.ycombinator.com/item?id=46318494

Some might be tempted to brush aside the point that the server Linux threat model is very different from desktop Linux (and to snarkily reply "well, it's powering a vast majority of GDP via all of AWS, Azure, etc.").

However, comparing apples to apples: what makes you say this isn't ready for government usage, when it's ready for the majority of the workforce at trillion-dollar big tech companies (aside from Microsoft and Apple, obviously)? Large employers like IBM must also be using Red Hat or some other distro.


Google, for example, uses a fork of Ubuntu. When someone decided to compromise Google employees' machines via a fake npm package, they were able to do so successfully. When they reported this to Google, Google said it was okay for employee machines to be compromised and that it was part of Google's threat model. While this may be workable for large companies, I don't think the French government is ready to handle such a security model.


> that it was part of Google's threat model

That's just PR to avoid the stock going down.


> I don't know how to force this issue as a European. There are just too many levels of abstraction between me and Brussels.

> EU moves so much faster when it comes to regulations like forcing all of us in Denmark to use timesheets, annoying lids on our bottles, and invasive surveillance laws.

Rediscovering the principle of subsidiarity from first principles...


> I'll need to investigate further but it doesn't seem promising.

That's what I meant by "waiting a few days for updates" in my other comment. At the Qwen 3.5 release, I remember a lot of complaints: "tool calling isn't working properly", etc.

That was fixed shortly after: there was some template-parsing work in llama.cpp, and unsloth pulled some models and re-uploaded better ones, improving something else I can't quite remember, better-done quantization or something...

coder543 pointed out the same is happening regarding tool calling with gemma4: https://news.ycombinator.com/item?id=47619261


The model does call tools successfully, giving sensible parameters, but it seems to not pick the right ones in the right order.

I'll try again in a few days. It's great to be able to test it just a few hours after the release, but it's the bleeding edge, as I had to pull the latest from main. And with all the supply-chain issues happening everywhere, the bleeding edge is always riskier from a security point of view.

There's always also the possibility of fine-tuning the model later to make sure it can complete the custom task correctly. But the code for doing a LoRA on gemma4 is probably not yet available. The 50% extra speed seems really tempting.


Wild differences in ELO compared to tfa's graph: https://storage.googleapis.com/gdm-deepmind-com-prod-public/...

(Comparing Q3.5-27B to G4 26B A4B and G4 31B specifically)

I'd assume Q3.5-35B-A3B would perform worse than the dense Q3.5 27B model, but the cards you pasted above somehow show that for Elo and TAU2 it's the other way around...

Very impressed by unsloth's team releasing the GGUF so quickly, if that's like the qwen 3.5, I'll wait a few more days in case they make a major update.

Overall, great news if it's at parity with or slightly better than the Qwen 3.5 open weights; I hope to see both of these evolve in the sub-32GB-RAM space. Disappointed in Mistral/Ministral being so far behind these US and Chinese models.


You're conflating lmarena ELO scores.

Qwen actually has a higher ELO there. The top Pareto frontier open models are:

  model                        |elo  |price
  qwen3.5-397b-a17b            |1449 |$1.85
  glm-4.7                      |1443 | 1.41
  deepseek-v3.2-exp-thinking   |1425 | 0.38
  deepseek-v3.2                |1424 | 0.35
  mimo-v2-flash (non-thinking) |1393 | 0.24
  gemma-3-27b-it               |1365 | 0.14
  gemma-3-12b-it               |1341 | 0.11
  gpt-oss-20b                  |1318 | 0.09
  gemma-3n-e4b-it              |1318 | 0.03
https://arena.ai/leaderboard/text?viewBy=plot

What Gemma seems to have done is dominate the extreme cheap end of the market, which IMO is probably the most important and overlooked segment.


That Pareto plot doesn't seem to include the Gemma 4 models anywhere (not just not at the frontier), likely because pricing wasn't available when the chart was generated. At least, I can't find the Gemma 4 models there. So it's not particularly relevant until it's updated for the models released today.


Gemma 4 31B has now wiped several of those models off the Pareto frontier, now that it has pricing. Gemma 4 26B A4B has an Elo but no pricing, so it still isn't on that chart. The Gemma 4 E2B/E4B models still aren't on the arena at all, but I expect them to move the Pareto frontier as well if they're ever added, based on how well they've performed in general.


> Wild differences in ELO compared to tfa's graph

Because those are two different, completely independent Elos... the one you linked is for LMArena, not Codeforces.


> Very impressed by unsloth's team releasing the GGUF so quickly, if that's like the qwen 3.5, I'll wait a few more days in case they make a major update.

Same here. I can't wait until mlx-community releases MLX optimized versions of these models as well, but happily running the GGUFs in the meantime!

Edit: And looks like some of them are up!


Absolute n00b here, very confused about the many variations; it looks like the Mac-optimized MLX versions aren't available in Ollama yet (I mostly use Claude Code with this).


The benchmarks showing the "old" Chinese Qwen models performing basically on par with this fancy new release kinda have me thinking the Google models are DOA, no? What am I missing?


That's not the kind of tech use you're describing here. You're talking about literally learning some basic computer skills (word processor, Excel, reading email, some basic website building, using a printer, and some amount of programming).

For those, obviously you need a computer, and I completely agree those are important skills to learn... but you maybe need to spend 1h/week during the last 2 years of middle school on them in the computer lab (as has been done since the 90s in many schools around the world).

But for any other course, such as math, English (or whichever the primary language is in your country), second languages, history, etc.: that's where using tech is a mistake.

A bit of tech is OK, but it cannot be "everyone does their homework and reads the lesson on an iPad/Chromebook".


I am pretty skeptical about the value of learning to build websites. I think it is too tempting for students to devote significant time to something that is not foundational knowledge and where they won't get any valuable feedback anyway.

It makes me think back to my writing assignments in grades 6-12. I spent considerable time making sure the word processor had the exact perfect font, spacing, and formatting, with cool headers, footers, footnotes, etc. Yet I wouldn't even bother to proofread the final text before handing it in. What a terrible waste of a captive audience that could have helped me refine my arguments and writing style, rather than wasting their time on careless grammatical errors.

Anyway, I do agree with the idea of incorporating Excel, and even RStudio for math and science as tools, especially if they displace Ed-tech software that adds unnecessary abstractions, or attempts to replace interaction with knowledgeable teachers. One other exception might be Anki or similar, since they might move rote memorization out of the classroom, so that more time can be spent on critical thinking.


Building websites, I agree, has little value, but using it as a way to explain the basics of how the web works is pretty valuable. The web likely isn't going anywhere for a long time, and having some basic knowledge of how it works is very useful for a lot of people. I hate the idea of more MS apps like Excel being regularly incorporated, but basic usage of something similar can definitely help build a useful tool/computer skill. Even in the early 90s we had computer labs for learning computer skills, which I think had value. But forcing tech everywhere into teaching is an issue, IMO.


The beautiful thing about programming (which also makes edtech such an appealing dream to chase) is that you get immediate feedback from the computer and don't have to wait for someone whose attention is at least semi-scarce to mark your paper.


Re: Anki. It's not as optimized, but you can do SRS with physical flashcards:

* Have something like 5 bins, numbered 1-5.

* Every day, add your new cards to bin nr. 1, shuffle, and review. Correct cards go to bin nr. 2; incorrect cards stay in bin nr. 1.

* Every other day, do the same with bins nr. 1 and 2; every fourth day, with bins nr. 1, 2, and 3; etc., except incorrect cards go in the bin below. More complex scheduling algorithms exist.

* In a classroom setting, the teacher can print out the flashcards and hand out a review schedule for the week (e.g. Monday: add these 10 new cards and review box 1; Tuesday: 10 new cards and review boxes 1 and 2; Wednesday: no new cards, review boxes 1 and 3; etc.)

* If you want to be super fancy, the flashcard publisher can add audio chips to the flashcards (or to each box set, plus a QR code on the card).
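The bin scheme above is a Leitner system. A minimal sketch of the card-movement rule (my own illustration, using the "incorrect goes one bin down" variant; bins are 0-indexed in code, so "bin nr. 1" in the text is index 0):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Minimal Leitner-system sketch: correct cards advance one bin,
// incorrect cards drop one bin (floored at the first bin).
class Leitner {
    private final List<Deque<String>> bins = new ArrayList<>();

    Leitner(int numBins) {
        for (int i = 0; i < numBins; i++) bins.add(new ArrayDeque<>());
    }

    void addCard(String card) { bins.get(0).push(card); } // new cards start in the first bin

    // Review one card from the given bin and move it based on the answer.
    void review(int bin, boolean correct) {
        String card = bins.get(bin).pop();
        int dest = correct ? Math.min(bin + 1, bins.size() - 1)
                           : Math.max(bin - 1, 0);
        bins.get(dest).push(card);
    }

    int binSize(int bin) { return bins.get(bin).size(); }
}
```

Reviewing later bins less and less often, as in the weekly schedule above, gives the roughly doubling intervals that software SRS schedulers refine further.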


Would it be a mistake to use Desmos in a math classroom, or 3Blue1Brown style animations, to build up visual intuition? Should we not teach basic numerical and statistical methods in Python? Should kids be forced to use physical copies of newspapers and journal articles instead of learning how to look things up in a database?

I'm all for going back to analog where it makes sense, but it seems wrongheaded to completely remove things that are relevant skills for most 21st century careers.


> Would it be a mistake to use Desmos in a math classroom, or 3Blue1Brown style animations, to build up visual intuition?

I don't think there's anything wrong with showing kids some videos every now and then. I still have fond memories of watching Bill Nye.

> Should we not teach basic numerical and statistical methods in Python?

No. Those should be done by hand, so kids can develop an intuition for it. The same way we don't allow kids learning multiplication and division to use calculators.


>> Should we not teach basic numerical and statistical methods in Python?

> No. Those should be done by hand, so kids can develop an intuition for it. The same way we don't allow kids learning multiplication and division to use calculators.

I would think that it would make sense to introduce Python in the same way that calculators, and later graphing calculators are introduced, and I believe (just based on hearing random anecdotes) that this is already the case in many places.

I'm a big proponent of the gradual introduction of abstraction, which my early education failed at, and something Factorio and some later schooling did get right, although the intent was rarely communicated effectively.

First, learn what and why a thing exists at a sufficiently primitive level of interaction, then once students have it locked in, introduce a new layer of complexity by making the former primitive steps faster and easier to work with, using tools. It's important that each step serves a useful purpose though. For example, I don't think there's much of a case for writing actual code by hand and grading students on missing a semicolon, but there's probably a case for working out logic and pseudocode by hand.

I don't think there's a case for hand-drawing intricate diagrams and graphs, because it builds a skill and level of intimacy with the drawing aspect that's just silly, and tests someone's drawing capability rather than their understanding of the subject, but I suppose everyone has their own opinion on that.

That last one kind of crippled me in various classes. I already knew better tools and methods existed for doing weather-pattern diagrams or topographical maps, but it was so immensely tedious and time-consuming that it totally derailed me, to the point where I'd fail uni labs despite the content not being very difficult, only because the prof wanted to teach it like the 50s.


FWIW, calculators were banned in my school. I only started to use one at university, and there it also didn't really help with anything, as the math was already more complex.


I was allowed to use calculators when I started algebra in seventh grade.

I found that calculators didn't help all that much once you got into symbolic stuff. They were useful for the final reductions, obviously, but for algebra the lion's share of the work is symbolic and at least the relatively cheap two-line TI calculator I was using couldn't do anything symbolic.

I know that there are calculators that can do Computer Algebra System stuff, and those probably should be held off on until at least calculus.


Until most kids are about 12 - 14 years old, they're learning much more basic concepts than you're describing. I don't think anyone is trying to take intro to computer science out of high schools or preventing an advanced student younger than that from the same.

I would rather a teacher have to draw a concept on a board than have each student watch an animation on their computer. Obviously, the teacher projecting the animation should be fine, but it seems like some educators and parents can't handle that and it turns into a slippery slope back to kids using devices.

So for most classrooms full of students in grades prior to high school, the answer to your list of (presumably rhetorical) questions is "Yes."


There's an in-between point my math teacher loved using: an overhead projector. Hand-drawn transparencies that could be made beforehand or on the fly, projected large so everyone could see, without hiding the teacher behind a computer; they'd still stand at the front of the class facing the students.


Sure, that would work too. I wouldn't say that's in-between but a technique that can be used without incorporating any modern technology at all.


This has been replaced by a webcam on a stick and a computer monitor.


Those are great examples. Not familiar with Desmos, but 3Blue1Brown style animations are great.

The problem is that people seem to want to go to extremes. Either go all out on doing everything in tablets or not use any technology in education at all.

It's not just work skills; it's also the better understanding that is gained from things such as the math animations you mentioned.


> The problem is that people seem to want to go to extremes. Either go all out on doing everything in tablets or not use any technology in education at all.

I think the latter is mostly a reaction to the former. I think there is a way to use technology appropriately in theory in many cases, but the administrators making these choices are largely technically illiterate and it's too tempting for the teachers implementing them to just hand control over to the students (and give themselves a break from actually teaching).


>Would it be a mistake to use Desmos in a math classroom

Maybe. Back in the day, I had classes where we had to learn the rough shapes of a number of basic functions, which built intuition that helped. This involved drawing a lot of them by hand: initially by calculating points and estimating, and later by being given an arbitrary function and graphing it. Using Desmos too early would've prevented building those skills.

Once the skills are built, using it doesn't seem a major negative.

I think of it like a calculator. Don't let kids learning basic arithmetic use a four-function calculator, but once you hit algebra, that's fine (graphing calculators still aren't, though).

Best might be to mix it up, some with and some without, but no calculator is preferable to always calculator.


Skills are less important than foundation.

And Logo or BASIC >> Python in school context IMO.


> (as it's been done since the 90s in many schools around the world)

I had a computer lab in a Catholic grade school in the mid-to-late 80s: Apple IIs, and the class was once a week, a mix of typing, Logo turtle, and of course The Oregon Trail.


What's funny is that they don't even teach basic tech literacy; they just rely on kids to figure it out!


The options from big companies to run untrusted open source code are:

1) À la Google: build everything from source. The source is mirrored/copied over from the public repo (audit/trust the source every time).

2) Only allow imports from a company-managed mirror. All imported packages need to be signed in some way.

Here only (1) would be safe. (2) would only be safe if dependencies aren't updated too aggressively, and/or if internal automated or manual scanning on version bumps catches the issue.

For small shops and individuals: kind of out of luck; the best mitigation is to pin/lock dependencies and wait long enough, hoping folks like Fibonar catch the attack...

Bazel would be one way to do (1), but realistically, if you don't have the bandwidth to build everything from source, you'd rely on external sources via rules_jvm_external, or rules_python locked to a specific pip version, so if the specific packages you depend on are affected, you're out of luck.


The autoboxing example IMO is a case of "Java isn't so fast". Why can't this be optimized behind the scenes by the compiler?

The rest of the advice is great: things compilers can't really catch but a good code reviewer should point out.


javac, for better or worse, is aggressively against doing optimizations, to the point of producing ridiculously bad code. The belief tends to be that the JIT will do a better job fixing it if it has bytecode that's as close as possible to the original code. But this only helps if a) the code ever gets JIT'd at all (rarely true for e.g. class initializers), and b) the JIT has the budget to do that optimization. Although JITs have the advantage of runtime information, they are also under immense pressure to produce any optimizations as fast as possible, so they rarely do the level of deep optimization an offline compiler does.


Why should the compiler optimize obviously dumb code? If the developer wants to create billions of heap objects, the compiler should respect that. Optimizing dumb code is what made C++ unbearable: you write one thing and the compiler generates completely different code.


The problem is rather that Java doesn't have generics over primitives or value types (structs), so you're kind of forced to box things or you can't use collections.


No. In the example they provided, the programmer wrote obviously stupid code. It has nothing to do with necessity:

    Long sum = 0L;
    for (Long value : values) {
        sum += value;
    }
I also want to highlight that there are plenty of collections utilizing primitive types. They're not generic but they do the job, so if you have a bottleneck, you can solve it.

That said, TBH I think adding autoboxing to the language was a mistake. It makes bad code look too innocent. Without autoboxing, this code would look like a mess and probably would have been caught earlier.
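For contrast, a sketch of the same loop with a primitive accumulator (my own example, assuming a `List<Long>` input): each element is unboxed once, whereas with a boxed `Long sum`, each `sum += value` typically allocates a new Long (outside the small-value cache).

```java
import java.util.List;

// Same accumulation with a primitive long accumulator. The enhanced
// for loop unboxes each element once; no Long objects are created
// for the running sum.
class SumExample {
    static long sum(List<Long> values) {
        long sum = 0L;
        for (long value : values) { // auto-unboxing per element
            sum += value;
        }
        return sum;
    }
}
```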


>They're not generic but they do the job, so if you have a bottleneck, you can solve it.

But that's the thing: in other languages you don't need a workaround to work on primitives directly.

