Hacker News | nabakin's comments

> Also, note that there's zero CUDA dependency. It runs entirely on Huawei chips.

That is a huge claim to make with no evidence.

I researched what you said, and I have found no statement to that effect in their paper[0], on huggingface[1], twitter[2], WeChat[3], or in their news release[4].

Only in a footnote, and only in the Chinese version of their news release, do they mention that they plan to reduce inference costs with the Ascend 950 supernode when it releases[5]. The only mention of Huawei in their paper is that they validated a technique to lower interconnect bandwidth on Ascend NPUs and Nvidia GPUs[6].

[0] https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

[1] https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro

[2] https://xcancel.com/deepseek_ai/status/2047516922263285776

[3] https://mp.weixin.qq.com/s/8bxXqS2R8Fx5-1TLDBiEDg

[4] https://api-docs.deepseek.com/news/news260424

[5] https://api-docs.deepseek.com/zh-cn/img/v4-price.png

[6] Page 16


Comments like this are why I go to the comments! I never would have thought to check.

And while I'm here, I want to note that I feel there's a big misunderstanding of what is and isn't demonstrated by DeepSeek. So far as I can tell, the major (and important!) innovation is reproducing near-frontier capabilities at a fraction of the cost. But it may be that iterating forward at the frontier is the costly part, a cost borne by Western companies, and that nuance seems to get lost with DeepSeek. Which is not to say that, as a matter of principle, non-Western companies aren't sometimes capable of jumping into the lead (Kimi has been super impressive), but if GPT/Claude/etc. "only" lead at the frontier with more expensive models, that's still a moat.


If you can get something almost as capable for a fiftieth of the price, in most cases you'll do that. You might still send a few tokens to the more expensive option for the exceptional, difficult cases, but that's maybe 10% of tokens at most. I don't see how it'll be possible to keep spending what Anthropic, OpenAI, Google, etc. are spending if they're only going to see the trickiest 10% of tokens.

Missed the point award

Maybe I need to spell out the step that connects them: how will those companies afford to keep "iterating forward at the frontier" when their income is probably headed for a huge crash from competition with good-enough, open models at 1/50th the price?

Iterating forward at the frontier doesn't seem like a sustainable approach if everyone else can catch up with you in 6 months.


Thank you for this due diligence. I was just reading through the technical report and couldn't find any references to the software stack or hardware mentioning Huawei either, and I came back here wondering about this comment that I had read earlier.

Not long ago the story was this:

DeepSeek’s next AI model delayed by attempt to use Chinese chips

https://www.ft.com/content/eb984646-6320-4bfe-a78d-a1da2274b...


Here's a note about running entirely on Huawei chips:

https://finance.yahoo.com/sectors/technology/articles/deepse...


> DeepSeek indicated that current service capacity for the V4 Pro series is constrained by a computing crunch, though pricing could fall after new clusters powered by Huawei's Ascend 950 chips come online in the second half of the year.

Only mention of Huawei in that article (as of now).


Did you read any part of the link you posted? Huawei is mentioned once and not in the context of the model being trained or currently running on Huawei chips.

Dammit, you found my technique of “citing” sources for papers in high school...

At least when I pulled random citations off Wikipedia I could reasonably trust whoever put it there figured it was tangentially related to what was being cited. I’m not sure I could get away with putting a literal press release that I didn’t read anywhere.

Big L for media literacy there.


They mention it uses MXFP4 quantization, which is a Blackwell capability, but it looks like this is also supported by the Ascend 950 series, according to marketing material.

DeepSeek is planning to use Huawei extensively for inference:

“Due to constraints in high-end compute capacity, the current service capacity for Pro is very limited. After the 950 supernodes are launched at scale in the second half of this year, the price of Pro is expected to be reduced significantly.”

https://x.com/jukan05/status/2047516566149816627


Yes, that's the footnote from citation [5].

I said the same thing as you and I got summarily downvoted (https://news.ycombinator.com/item?id=47888227).

That HN is quick to upvote an unsubstantiated comment (the grandparent one, perhaps because it aligns with an anti-US bias?) and downvote a fact-finding one doesn't bode well for the community as a whole. I have seen how political ideology colors everything in my home country (Malaysia), and the decline of the country is palpable; I didn't expect to find such a thing here. We are supposed to be dispassionate and rational, right?

Render to Jesus what's due to him, ditto for Caesar.


Probably because you said you used DeepSeek. People don't want to see AI in the comments and don't trust AI responses.

> When it comes to information transfer and processing, light can do things that electricity can’t. Photons — particles of light — are far zippier than electrons at working their way through circuits.

Electrons themselves don't move at the speed of light, but information transfer (i.e. communication) via electrons does happen close to the speed of light.

A subtle but important distinction that's often misunderstood, and it means computational performance gains would probably come from bandwidth, not latency.
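To put rough numbers on that latency point (the ~0.5c velocity factor and the 30 cm distance below are illustrative assumptions, not measurements):

```python
# Back-of-the-envelope: one-way propagation delay at a fraction of c.
C = 299_792_458  # speed of light in vacuum, m/s

def propagation_delay_ns(distance_m: float, velocity_factor: float) -> float:
    """One-way delay in nanoseconds over distance_m at velocity_factor * c."""
    return distance_m / (C * velocity_factor) * 1e9

# A 30 cm electrical path at ~0.5c vs. light in vacuum over the same distance:
print(f"electrical: {propagation_delay_ns(0.30, 0.5):.1f} ns")    # ~2.0 ns
print(f"vacuum light: {propagation_delay_ns(0.30, 1.0):.1f} ns")  # ~1.0 ns
```

Even at half the speed of light, the absolute delay over board-scale distances is a nanosecond or two, which is why the win from optics tends to show up as bandwidth rather than latency.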


About 0.65c for Cat6 cables; different types of cable can be slightly faster. The speed of light in standard fiber is about 0.68c, due to the refractive index (~1.47) of the core.
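For fiber, the slowdown follows directly from v = c/n; the refractive index below (~1.47, a typical value for a silica core, not a spec) is an assumption for illustration:

```python
# Sketch: light in a medium travels at v = c / n.
def speed_fraction(refractive_index: float) -> float:
    """Fraction of c at which light propagates in a medium of index n."""
    return 1 / refractive_index

print(f"silica fiber core (n ~ 1.47): {speed_fraction(1.47):.2f}c")  # ~0.68c
```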

Is that through solid-core fiber? Because hollow-core fiber also exists.

Yes. Hollow-core fiber does exist, but it's not widely used; it has only just made it out of the lab.

> but information transfer (i.e. communication) via electrons does happen close to the speed of light

Speed of light in the medium, not speed of light in vacuum.


In electric circuits, information is transmitted through the electric field, which propagates at close to the speed of light.

Nope, it's 1/2 - 2/3 the speed of light depending on the metals used.

The velocity factor is usually 0.6-0.7; I've never seen it as low as 0.5.

And it's set by the dielectric, not the conducting material.
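The dielectric point can be made concrete: for a line whose fields travel through a dielectric of relative permittivity eps_r, the velocity factor is 1/sqrt(eps_r). The permittivity values below are typical figures for common insulators, assumed for illustration:

```python
import math

def velocity_factor(eps_r: float) -> float:
    """Velocity factor of a transmission line, set by the dielectric: 1/sqrt(eps_r)."""
    return 1 / math.sqrt(eps_r)

# Solid PTFE (eps_r ~ 2.1) and solid polyethylene (eps_r ~ 2.25):
print(f"PTFE: {velocity_factor(2.1):.2f}c")           # ~0.69c
print(f"polyethylene: {velocity_factor(2.25):.2f}c")  # ~0.67c
```

Foamed dielectrics have a lower eps_r, which is how some cables reach velocity factors above 0.8.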


You're both wrong. It's true that the first whisper of movement travels at near the speed of light, but the flow takes longer to stabilize (which you WILL need to wait for in electrical chips) than that headline "speed of electricity" suggests.

Oh, and also: currently the idea behind on-chip lasers is interconnects that don't have this limitation. For example, PCIe is looking to build optical interconnects, which would do the equivalent of bringing every GPU 10x closer to the memory.

Optical computation would require light to switch optical transistors on and off, which doesn't seem to be possible with this technology. This is optical computation only in the sense of producing light beams according to formulas.


Why do you need to wait for it to stabilize? You can keep changing the voltage at one end of the connection even with megabits of data currently in transit, without waiting for it to stabilize. Yes, you'll need to do impedance matching. Yes, that's a solved problem: transmission lines.
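The "megabits in transit" point is just the bandwidth-delay product; the link parameters below are illustrative assumptions:

```python
# Sketch: bits "in flight" on a link = bit rate x one-way propagation delay.
C = 299_792_458  # speed of light in vacuum, m/s

def bits_in_flight(bit_rate_bps: float, distance_m: float,
                   velocity_factor: float) -> float:
    """Bandwidth-delay product: bits on the wire at any instant."""
    return bit_rate_bps * distance_m / (C * velocity_factor)

# 10 Gb/s over 100 m of copper at ~0.65c: thousands of bits in transit at once.
print(round(bits_in_flight(10e9, 100, 0.65)), "bits in transit")
```

None of those bits require the line to "stabilize" before the next one is launched; the transmitter keeps driving new symbols behind them.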

Looking at the discussion below this comment, I'd just add this video by AlphaPhoenix:

https://www.youtube.com/watch?v=2Vrhk5OjBP8

Good discussion in the comments there as well.


Faster than TensorRT-LLM on Blackwell? Or do you not consider TensorRT-LLM open source because some dependencies are closed source?


I reviewed the TensorRT-LLM commit history from the past few days and couldn't find any updates regarding Gemma 4 support. By contrast, here is the reference for MAX: https://github.com/modular/modular/commit/57728b23befed8f3b4...


If OP meant they have the fastest implementation of Gemma 4 on Blackwell at the moment, I guess that is technically true. I doubt that will hold up when TensorRT-LLM finishes their implementation though.


How is the sglang performance on Blackwell for this model?


Dunno but there's a PR for it. Probably also more performant than Modular.


It's referring to the LMSYS Leaderboard / LMArena / Arena.ai[0]. It's very well known in the LLM community for being one of the few sources of human evaluation data.

[0] https://arena.ai/leaderboard/chat


Public benchmarks can be trivially faked. Lmarena is a bit harder to fake and is human-evaluated.

I agree it's misleading for them to hyper-focus on one metric, but public benchmarks are far from the only thing that matters. I place more weight on Lmarena scores and private benchmarks.


Concentrating on LMArena cost Meta many hundreds of billions of dollars, and lots of people their jobs, with the Llama 4 disaster.


LMArena is so easy to game that it ceased to be a relevant metric over a year ago. People are not reliable validators beyond "yeah, that looks good to me"; nobody checks whether the facts are correct.


Alibaba maintains its own separate version of LMArena, where the prompts are fixed and you simply judge the outputs:

https://aiarena.alibaba-inc.com/corpora/arena/leaderboard


I agree; LMArena died for me with the Llama 4 debacle. And not only the gamed scores, but seeing with shock and horror the answers people found good. It does test something, though: the general "vibe" and how human, friendly, and knowledgeable a model _seems_ to be.


It's easy to game and human evaluation data has its trade-offs, but it's way easier to fake public benchmark results. I wish we had a source of high quality private benchmark results across a vast number of models like Lmarena. Having high quality human evaluation data would be a plus too.


Well there was this one [0] which is a black box but hasn't really been kept up to date with newer releases. Arguably we'd need lots of these since each one could be biased towards some use case or sell its test set to someone with more VC money than sense.

[0] https://oobabooga.github.io/benchmark.html


I know ARC-AGI-2 has a private test set and they have a good number of results[0], but it's not a conventional benchmark.

Looking around, SWE-Rebench seems to have decent protection against training-data leaks[1]. Kagi has one that is fully private[2]. There's one on Hugging Face that claims to be fully private[3]. SimpleBench[4]. HLE apparently has a private test set[5]. LiveBench[6]. Scale has some private benchmarks, though not many models tested[7]. vals.ai[8]. FrontierMath[9]. Terminal Bench Pro[10]. AA-Omniscience[11].

So I guess we do have some decent private benchmarks out there.

[0] https://arcprize.org/leaderboard

[1] https://swe-rebench.com/about

[2] https://help.kagi.com/kagi/ai/llm-benchmark.html

[3] https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard

[4] https://simple-bench.com/

[5] https://agi.safe.ai/

[6] https://livebench.ai/

[7] https://labs.scale.com/leaderboard

[8] https://www.vals.ai/about

[9] https://epoch.ai/frontiermath/

[10] https://github.com/alibaba/terminal-bench-pro

[11] https://artificialanalysis.ai/articles/aa-omniscience-knowle...


And nowadays we have Debian running in a VM on Android[1].

[1] https://www.zdnet.com/article/how-to-use-the-new-linux-termi...


I would consider it reasonable if this were 4x for both TTFT and throughput, but it seems it's only for TTFT.


And right to repair


TIL Europe still has some presence in the Americas. Thought all of that was gone with the Monroe Doctrine


The Monroe Doctrine was about preventing colonial powers from enacting NEW efforts to reach into the Americas, not about getting rid of previous control.

"The occasion has been judged proper for asserting, as a principle in which the rights and interests of the United States are involved, that the American continents, by the free and independent condition which they have assumed and maintain, are henceforth not to be considered as subjects FOR FUTURE COLONIZATION by any European powers." (emphasis mine)

https://usinfo.org/PUBS/LivingDoc_e/monroe.htm


France's longest land border is the one it shares with Brazil.



Yeah, you can visit the EU by… sailing a ways Northeast(ish) from Maine, until you’re just south of (a part of) Canada. And by going to the Caribbean. And South America.

Mostly France and the Netherlands.


Ty this is great


I understand they are similar, but I think this post adds new information to the situation. Regardless, appreciate your help moderating the site.

