I have an instance that does search related to my research interests, tracks news about viruses in the US and events happening around my area, and runs an "urgent" news job that uses searx for things going on around me. I used Qwen 3.5 9B and tuned some of the jobs with GPT 5.4. I recently switched to Gemma 4 and there was seemingly no major difference. I've found it useful for the digest and for finding papers without much effort.

With an NVIDIA Spark or a 128 GB+ memory machine, you can get a good speedup on the 31B model if you use the 26B MoE as a draft model. It uses more memory, but I've seen acceptance rates of around 70%+ with Q8 quantization on both models.
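
For context on what the acceptance rate measures, here is a minimal sketch of the accept/reject step in speculative decoding. The distributions below are toy stand-ins, not outputs from either model:

    # Minimal sketch of the accept/reject step in speculative decoding.
    # The toy distributions below stand in for the draft (q) and target
    # (p) models' next-token probabilities at one position.
    import numpy as np

    rng = np.random.default_rng(0)

    def speculative_accept(draft_token, q, p):
        # Accept the drafted token with probability min(1, p/q); on
        # rejection, resample from the residual max(0, p - q).
        if rng.random() < min(1.0, p[draft_token] / q[draft_token]):
            return draft_token, True
        residual = np.maximum(p - q, 0.0)
        residual /= residual.sum()
        return rng.choice(len(residual), p=residual), False

    vocab, trials, accepted = 8, 10_000, 0
    for _ in range(trials):
        q = rng.dirichlet(np.ones(vocab))
        p = 0.8 * q + 0.2 * rng.dirichlet(np.ones(vocab))  # draft ~ target
        token = rng.choice(vocab, p=q)      # token proposed by the draft
        _, ok = speculative_accept(token, q, p)
        accepted += ok
    print(f"acceptance rate: {accepted / trials:.2%}")

At 70%+ acceptance, most drafted tokens survive verification, so the larger model effectively validates several tokens per forward pass instead of generating one at a time.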

1 token ahead or 2?

It's interesting - IMO we'll soon have draft models specifically post-trained to pair with denser, more complicated models. I wouldn't be surprised if diffusion models made a comeback for this - they can draft many tokens at once, and their learning curves seem to top out at a 90%+ match with auto-regressive models, so it's quite interesting.


Flow matching is making some strides right now, too.

Unless the goal of the backdoor is to redirect traffic flows through packet inspection devices that the attacker also controls, the decoupling of the control and data planes in SDN deployments requires a more creative, intricate wiretapping solution than traditional routers do.


You can actually enable search in codex by adding the following to the config file:

    [tools]
    web_search = true


I will look into adding this in a future update.


While the examples and provided prompt lean toward code (since that's my personal use case), YAMS is fundamentally a generic content-addressed storage system.

I will attempt to run some small agents with custom prompts and report back.


I have been using it for task tracking, research, and code search. When using CLI tools, I found that the LLMs were able to find code in fewer tool calls when I stored my codebase in the tool. I had to wrangle the LLMs into using the tool instead of native rgrep or find.

I am also trying to stabilize PDF text extraction to improve knowledge retrieval for when I want to revisit a paper I read but cannot remember which one it was. Most of these use cases come from my own use of the tool, but I am trying to make it as general as possible.


This is an interesting approach! Why not offload PDF extraction to other frameworks that apply OCR (PDF -> .md)?


I may explore this when I finish the vectordb implementation I started.


I have not, but that is something I plan to do when I have time.


I am working to improve the CLI tools to make getting this information easier. In the meantime, I have stored the yams repo in yams with multiple snapshots and metadata tags, and I am seeing about 32% storage savings.


Cool. I have no idea what "stored the yams repo in yams" means. What do you mean by "block-level deduplication"? What is a block?


I stored the codebase for yams in the tool. The "blocks" are content-defined blocks/chunks, not filesystem blocks. They're variable-size chunks (typically 4-64KB) created using Rabin fingerprinting to find natural content boundaries. This enables deduplication across files that share similar content.
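
To illustrate the technique (a simplified sketch, not YAMS's actual code): a Gear-style rolling hash can stand in for Rabin fingerprinting, cutting a chunk whenever the hash's low bits are all zero, so boundaries depend on content rather than byte offsets:

    # Content-defined chunking sketch (illustrative only, not YAMS's code).
    # A Gear-style rolling hash stands in for Rabin fingerprinting; a chunk
    # boundary is cut when the hash's low 13 bits are all zero.
    import hashlib
    import random

    random.seed(42)
    GEAR = [random.getrandbits(64) for _ in range(256)]  # per-byte-value table
    MASK = (1 << 13) - 1                    # ~8 KB expected boundary spacing
    MIN_SIZE, MAX_SIZE = 4 << 10, 64 << 10  # clamp chunks to 4-64 KB

    def chunks(data: bytes):
        h, start = 0, 0
        for i, byte in enumerate(data):
            h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFFFFFFFFFF
            size = i - start + 1
            if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
                yield data[start:i + 1]
                h, start = 0, i + 1
        if start < len(data):
            yield data[start:]

    # Two near-identical "files": the insertion shifts every byte offset,
    # but content-defined boundaries realign, so most chunks deduplicate.
    a = random.randbytes(256 << 10)
    b = b"a few inserted bytes" + a
    store = {}                              # content hash -> chunk
    for data in (a, b):
        for c in chunks(data):
            store[hashlib.sha256(c).hexdigest()] = c
    raw, kept = len(a) + len(b), sum(len(c) for c in store.values())
    print(f"raw {raw} B, stored {kept} B, saved {1 - kept / raw:.0%}")

With fixed-size blocks, that insertion would shift every subsequent block and break deduplication entirely; content-defined boundaries are what make the snapshot-level savings possible.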


The graph functionality is exposed through retrieval. I may improve this later, but the idea was to maximize the quality of results when looking for stored data.


There is no built-in graph functionality, correct? But one could use existing mechanisms like metadata, or store the link between documents as a document itself?


The graph functionality is stubbed, but I will expose it in a future update. You can also use metadata and tags for similar things.
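
For example (hypothetical names, not the actual YAMS interface), a link between two documents can itself be stored as a document whose metadata names the endpoints, and "graph queries" become metadata filters:

    # Hypothetical sketch: graph edges on top of a plain content-addressed
    # store. Nothing here is YAMS's real API; `store` is a stand-in dict.
    import hashlib
    import json

    store = {}  # content hash -> (bytes, metadata)

    def put(data: bytes, metadata: dict) -> str:
        key = hashlib.sha256(data).hexdigest()
        store[key] = (data, metadata)
        return key

    paper = put(b"pdf bytes ...", {"type": "paper", "tags": ["retrieval"]})
    notes = put(b"my reading notes", {"type": "note"})

    # The link between the two documents, stored as a document itself.
    edge = {"from": notes, "to": paper, "rel": "cites"}
    put(json.dumps(edge).encode(), {"type": "edge", **edge})

    # Traversal is then just a metadata filter over stored documents.
    outgoing = [m for _, m in store.values()
                if m.get("type") == "edge" and m.get("from") == notes]
    print(outgoing)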

