GrinningFool's comments

128GB (112 GB avail) Strix AI 395+ Radeon 8060S (gfx1151)

llama-* version 8889 w/ rocm support ; nightly rocm

llama.cpp/build/bin/llama-batched-bench -hf unsloth/Qwen3.6-27B-GGUF:UD-Q8_K_XL -npp 1000,2000,4000,8000,16000,32000 -ntg 128 -npl 1 -c 34000

    |    PP |     TG |    B |   N_KV |   T_PP s | S_PP t/s |   T_TG s | S_TG t/s |      T s |    S t/s |
    |-------|--------|------|--------|----------|----------|----------|----------|----------|----------|
    |  1000 |    128 |    1 |   1128 |    2.776 |   360.22 |   20.192 |     6.34 |   22.968 |    49.11 |
    |  2000 |    128 |    1 |   2128 |    5.778 |   346.12 |   20.211 |     6.33 |   25.990 |    81.88 |
    |  4000 |    128 |    1 |   4128 |   11.723 |   341.22 |   20.291 |     6.31 |   32.013 |   128.95 |
    |  8000 |    128 |    1 |   8128 |   24.223 |   330.26 |   20.399 |     6.27 |   44.622 |   182.15 |
    | 16000 |    128 |    1 |  16128 |   52.521 |   304.64 |   20.669 |     6.19 |   73.190 |   220.36 |
    | 32000 |    128 |    1 |  32128 |  120.333 |   265.93 |   21.244 |     6.03 |  141.577 |   226.93 |
More directly comparable to the results posted by genpfault (IQ4_XS):

llama.cpp/build/bin/llama-batched-bench -hf unsloth/Qwen3.6-27B-GGUF:IQ4_XS -npp 1000,2000,4000,8000,16000,32000 -ntg 128 -npl 1 -c 34000

    |    PP |     TG |    B |   N_KV |   T_PP s | S_PP t/s |   T_TG s | S_TG t/s |      T s |    S t/s |
    |-------|--------|------|--------|----------|----------|----------|----------|----------|----------|
    |  1000 |    128 |    1 |   1128 |    2.543 |   393.23 |    9.829 |    13.02 |   12.372 |    91.17 |
    |  2000 |    128 |    1 |   2128 |    5.400 |   370.36 |    9.891 |    12.94 |   15.291 |   139.17 |
    |  4000 |    128 |    1 |   4128 |   10.950 |   365.30 |    9.972 |    12.84 |   20.922 |   197.31 |
    |  8000 |    128 |    1 |   8128 |   22.762 |   351.46 |   10.118 |    12.65 |   32.880 |   247.20 |
    | 16000 |    128 |    1 |  16128 |   49.386 |   323.98 |   10.387 |    12.32 |   59.773 |   269.82 |
    | 32000 |    128 |    1 |  32128 |  114.218 |   280.16 |   10.950 |    11.69 |  125.169 |   256.68 |
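For anyone reading these tables for the first time: the derived columns appear to follow directly from the raw ones. The sketch below recomputes them for the first IQ4_XS row with awk. The formulas are inferred from the numbers in the table (and assume the prompt is not shared across sequences, which does not matter here since B is 1), so treat them as an assumption rather than a statement of how llama-batched-bench computes them internally. Small differences in the last digit are expected because the printed times are already rounded.

```shell
# First row of the IQ4_XS table above: PP TG B N_KV T_PP T_TG
row="1000 128 1 1128 2.543 9.829"

derived=$(echo "$row" | awk '{
    s_pp = $1 * $3 / $5;   # prompt-processing speed, tokens/s
    s_tg = $2 * $3 / $6;   # token-generation speed, tokens/s
    t    = $5 + $6;        # total time, seconds
    s    = $4 / t;         # overall speed, tokens/s (N_KV over total time)
    printf "%.2f %.2f %.3f %.2f", s_pp, s_tg, t, s
}')

# Prints S_PP, S_TG, T and S in that order
echo "$derived"
```

The S column mixes prompt and generation throughput, so it climbs with prompt length even while S_PP and S_TG both fall.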

Results are nearly identical running on a Strix Halo using Vulkan, llama.cpp b8884:

    $ llama-batched-bench -dev Vulkan2 -hf unsloth/Qwen3.6-27B-GGUF:IQ4_XS -npp 1000,2000,4000,8000,16000,32000 -ntg 128 -npl 1 -c 34000
    |    PP |     TG |    B |   N_KV |   T_PP s | S_PP t/s |   T_TG s | S_TG t/s |      T s |    S t/s |
    |-------|--------|------|--------|----------|----------|----------|----------|----------|----------|
    |  1000 |    128 |    1 |   1128 |    3.288 |   304.15 |    9.873 |    12.96 |   13.161 |    85.71 |
    |  2000 |    128 |    1 |   2128 |    6.415 |   311.79 |    9.883 |    12.95 |   16.297 |   130.57 |
    |  4000 |    128 |    1 |   4128 |   13.113 |   305.04 |    9.979 |    12.83 |   23.092 |   178.76 |
    |  8000 |    128 |    1 |   8128 |   27.491 |   291.01 |   10.155 |    12.61 |   37.645 |   215.91 |
    | 16000 |    128 |    1 |  16128 |   59.079 |   270.83 |   10.476 |    12.22 |   69.555 |   231.87 |
    | 32000 |    128 |    1 |  32128 |  148.625 |   215.31 |   11.084 |    11.55 |  159.709 |   201.17 |

You should try Vulkan instead of ROCm. It goes like 20% faster.

Is that based on recent experience? With "stable" ROCm, or the (IMHO better) releases from TheRock? With older or more recent hardware? The AMD landscape is rather uneven.

For this model results are identical. In my experience it can go either way by up to 10%.

In the intervening 6-12 months, they were policy. Since then he's tweet^H^H^H^H^Htruthed some new tariff policies that are currently in effect.

> The project or repo's star count _was_ a first filter in the past, a

I agree that it has been a first filter, but should it ever have been? A star only says that someone had a passing interest in a project. Not significantly different from a 'like' on a social media post.


> I made several errors then did a push -f to GitHub and blew away the git history for a half decade old repo. No data was lost, but the log of changes was. No problem I thought, I’ll just restore this from Backblaze.

`git reflog` is your friend. You can recover from almost any mistake, including force-pushed branches.
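A minimal sketch of what that recovery can look like, run in a throwaway repository so nothing real is touched. The repository, file name, and commit messages are made up for illustration, and a hard reset stands in for whatever rewrote the history. It assumes the local clone still has the old commits in its reflog, which is the normal case shortly after a bad force push.

```shell
set -eu

# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo one > file.txt && git add file.txt && git commit -q -m "first"
echo two > file.txt && git commit -q -am "second"
old_head=$(git rev-parse HEAD)

# Simulate the mistake: rewrite history so "second" is no longer on the branch.
git reset -q --hard HEAD~1

# The reflog still records where HEAD pointed before the reset.
git reflog -n 3

# HEAD@{1} is where HEAD was one move ago; point the branch back at it.
git reset -q --hard "HEAD@{1}"
recovered=$(git rev-parse HEAD)
echo "recovered: $recovered"
```

In a real repository you would pick the right entry from the `git reflog` output instead of assuming `HEAD@{1}`, and then push the restored branch back to the remote (for example with `git push --force-with-lease`).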


Before LLMs we had code generators and automation that eliminated a lot of time- and resource-consuming tasks. I think the point still holds.


> Meanwhile although there's no monster trucks on the White House lawn yet,

Not /currently/... https://www.nbcnews.com/tech/elon-musk/trump-musk-tesla-whit...


Alternatively[1], for those of us who have enough clutter: Buying it digitally means you've paid for it. The author gets their cut, and you can now seek out unencumbered formats that best serve your usage with a clear conscience.

[1] this is not legal advice...


"The Silence" in Doctor Who touches on similar themes. https://tardis.fandom.com/wiki/Silent#Amnesia_and_hypnotic_a...


The rules say we should default to assuming good faith in comments. But it's hard when I see this comment in 2026.


“A pensar male degli altri si fa peccato ma spesso ci si indovina.” — Giulio Andreotti

(it's a sin to assume bad intent, but you often get it right)

He was a very controversial Italian politician.


What would the bad faith motive even be?


$$$, one of the classic bad faith motives. Most of tech nowadays is subsidized by advertising and profiling to some degree, often quite a large degree.


Sooner or later, yes. What stops it, other than layers of imperfect process? And it's the perfect vector to exploit anyone who doesn't review and understand the generated code before running it locally.

