Hacker News
Show HN: CS – indexless code search that understands code, comments and strings (github.com/boyter)
12 points by boyter 15 hours ago | 2 comments
I initially built cs (codespelunker) as a way to answer the question, can BM25 relevance search work without building an index?

Turns out it can, so I iterated on the idea and built it into a full CLI tool. Recently I wanted to bring its relevance ranking closer to that of tools like Sourcegraph or Zoekt, again without adding an index.

cs uses scc https://github.com/boyter/scc to understand the structure of the file on the fly. As such it can filter searches to code, comments or strings. It also applies a weighted BM25 algorithm where matches in actual code rank higher than matches in comments (by default).
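
To make the idea concrete, here is a minimal Python sketch of indexless, weighted BM25. It is not the cs implementation (cs is a separate tool and its internals are not shown here). The `WEIGHTS` values, the `rank` function, and the input shape, files already split into `(kind, text)` segments, are all assumptions for illustration. The point is that the corpus statistics BM25 needs (document frequency, average length) can be gathered in the same scan that reads the files, so nothing has to be stored between runs.

```python
import math
import re
from collections import Counter

# Hypothetical weights: a match in code counts for more than one in a
# comment or string. These numbers are illustrative, not taken from cs.
WEIGHTS = {"code": 1.0, "comment": 0.4, "string": 0.6}


def tokenize(text):
    return re.findall(r"[a-z0-9_]+", text.lower())


def rank(files, query, k1=1.2, b=0.75):
    """files: {name: [(kind, text), ...]}. Returns [(name, score)], best first."""
    terms = tokenize(query)

    # Pass 1: scan every file once, collecting weighted term counts and lengths.
    stats = {}
    for name, segments in files.items():
        tf = Counter()
        length = 0
        for kind, text in segments:
            tokens = tokenize(text)
            length += len(tokens)
            for token in tokens:
                tf[token] += WEIGHTS[kind]
        stats[name] = (tf, length)

    n = len(stats)
    avg_len = sum(length for _, length in stats.values()) / n if n else 0.0
    # Document frequency is computed from the scan itself, not from a stored index.
    df = {t: sum(1 for tf, _ in stats.values() if tf[t] > 0) for t in terms}

    # Pass 2: standard BM25 scoring using the weighted term frequencies.
    results = []
    for name, (tf, length) in stats.items():
        score = 0.0
        for t in terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * length / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        if score > 0:
            results.append((name, score))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

With these assumed weights, a file that mentions a term only in a comment scores lower than a file of similar length that uses the term in code.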

I also added a complexity gravity weight using the cyclomatic complexity output from scc as it scans. So if you're searching for a function, the implementation should rank higher than the interface.
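
A rough way to picture the gravity weight is as a multiplier on the relevance score that grows with complexity. The Python sketch below is an assumption for illustration only: the formula, the `strength` parameter, and the function names are hypothetical and not taken from cs or scc.

```python
import math


def apply_gravity(score, complexity, strength=1.0):
    # Hypothetical formula: boost the relevance score by the log of the
    # file's cyclomatic complexity, so complex files rise without one
    # very large file dominating every result.
    return score * (1.0 + strength * math.log1p(max(complexity, 0)))


def rerank(results, complexity_by_file, strength=1.0):
    """results: [(name, score)]. Returns the list re-sorted after the boost."""
    boosted = [
        (name, apply_gravity(score, complexity_by_file.get(name, 0), strength))
        for name, score in results
    ]
    return sorted(boosted, key=lambda r: r[1], reverse=True)
```

Under this sketch, an interface file with near-zero complexity keeps roughly its original score, while an implementation full of branches gets pulled upward. Setting `strength=0` leaves the original ranking untouched.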

    cs "authenticate" --gravity=brain           # Find the complex implementation, not the interface
    cs "FIXME OR TODO OR HACK" --only-comments  # Search only in comments, not code or strings
    cs "error" --only-strings                   # Find where error messages are defined
    cs "handleRequest" --only-usages            # Find every call site, skip the definition

v3.0.0 adds a new ranker, along with an interactive TUI, HTTP mode, and MCP support for use with LLMs (Claude Code/Cursor).

Since it's doing analysis and complexity math on the fly, it's slower than any grep. However, on an M1 Mac, it can scan and rank the entire 40M+ line Linux kernel in ~6 seconds.

Live demo (running over its own source code in HTTP mode): https://codespelunker.boyter.org/

GitHub: https://github.com/boyter/cs




Can you share how it compares to Serena? https://github.com/oraios/serena

Not familiar with that tool. What follows is my best guess based on what I am seeing.

Serena looks to be a precision tool. Since it uses LSP, it's able to replicate a lot of what an IDE would allow: effectively an IDE for LLMs.

cs, by contrast, is more of a discovery tool. When you're trying to find where the work actually happens, it can help you, and since there is no index involved, you can get going instantly on any codebase while index-based tools are still indexing.

You could use cs, for instance, to find where the complexity lies, and then use Serena to modify it.



