Hacker News
julianlam 4 days ago | on: Gemma 4 12B: A unified, encoder-free multimodal mo...
Last time I tried Gemma 4 (26B-A4B), its memory usage would balloon and consume all of my swap until my machine died.
Qwen 3.6, on the other hand, barely uses any memory at all for its KV cache.
verdverm 4 days ago
Turns out when you block people from the best and biggest hardware, they get innovative. It reminds me of the Pentium days when everyone was shipping inefficient programs because the processor would be better next year.
iknowstuff 4 days ago
we never stopped doing that!