I've found Gemini 2.5 Flash is the best model in terms of speed/cost/quality. Pro is great as well, but probably not necessary for most chat-with-paper functionality.
I'll add too that building an AI layer on top of arXiv is a deep, deep rabbit hole depending on how far you want to take the project. Drop me a note if you want to chat more about my experience with it.
I’m amazed; the interface is pretty complete and slick for a day’s work. Then again, I’m not a web dev, so I’d do it a dumb way. Curious how this was made.
My site, https://www.emergentmind.com, is exactly for this. It surfaces trending AI/ML/CS papers, summarizes them, links to social commentary, lets you read and download papers, links to topics, and more. Would love any feedback you have!
Hey all, I’m Matt Mazur, the founder of Emergent Mind, a new AI research assistant that helps you learn about any computer science research topic.
You can ask it questions (“What’s the difference between DPO and PPO?”), look up topics (“Graph Convolutional Neural Networks”), learn about papers (“KAN: Kolmogorov-Arnold Networks”), and research authors (“Yann LeCun”). Behind the scenes, it will try to find the most relevant computer science papers on arXiv, then synthesize their findings to generate a detailed, research-backed answer for you, all in a matter of seconds.
Unlike tools like ChatGPT, Emergent Mind is hyper-focused on computer science. It factors in citations and social media metrics to help rank papers (including Hacker News upvotes). It provides references so you know what papers the answers came from, is always up-to-date with arXiv (about 670k comp sci papers and counting), and encourages exploration using automatically-generated follow-up questions and topic links (similar to Wikipedia).
The tool is still fairly new and there’s endless room for improvement, but we wanted to share it with you all to get any feedback to make sure our product roadmap is aligned with what folks would find most useful.
Whether you're doing research or just keeping up with AI news, EM is doing great work. We've gathered millions of data points for each paper. Going forward, we plan to improve the capabilities and expand the sources behind every paper's knowledge graph. This is just the start.
The goal is to build the first AI research assistant that combines paper knowledge with insights from researchers and communicators. Always up to date.
Yes, Emergent Mind is 100% focused on AI/ML papers from arXiv. I think it makes more sense to focus on a niche because you can tailor everything to that niche, vs creating a general research paper site which won't wind up speaking to any audience well.
For anyone curious about Emergent Mind: it surfaces trending AI/ML papers by monitoring social media (HackerNews, Reddit, X, YouTube, and GitHub) for discussions about papers, then ranks them based on the amount of engagement they're getting (similar to how HackerNews uses upvotes). Then, for all trending papers, it automatically summarizes them using GPT-4o and links to relevant discussions so you can learn more.
We're working on a bunch of new capabilities that we'll announce soon too.
If you're a fan of tldr-ai, you might also like my site, EmergentMind.com, which does something similar: it surfaces trending AI papers based on social media engagement (including HackerNews upvotes!), then summarizes those papers using GPT-4 (a bullet point summary + detailed writeup based on the actual content of the paper), and highlights discussions on HN, Reddit, YouTube, GitHub, and X about that paper.
I don't want to hijack this launch post (we definitely need more tools in this space!), just wanted to share my tool for anyone interested since it's related. Feedback welcome: matt@emergentmind.com.
Hey all - I'd like to invite you all to check out Emergent Mind, a website I built to make it easier to stay informed about important new AI/ML research.
It works by monitoring HN, X, Reddit, YouTube, and GitHub for mentions of new arXiv AI/ML papers, and then surfacing trending papers for you. It also runs each paper through GPT-4 to generate an overview (including defining all technical terms) and links to those social media discussions and other resources (GitHub, YouTube videos, references, and related papers) so you can learn more. A lot of the UI and features are inspired by my experience browsing HN for many years.
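As a rough illustration of the "monitoring for mentions" step (a minimal sketch, not the site's actual code), one way to spot arXiv papers in a post is to pull the paper ids out of any arxiv.org links. The regex, the `extract_arxiv_ids` helper, and the sample post are all assumptions made for this example, and the pattern only covers new-style ids such as 2401.02385.

```python
import re

# New-style arXiv ids look like 2401.02385, optionally followed by a version (v2).
# Illustrative assumption: old-style ids such as cs/0112017 are ignored.
ARXIV_LINK = re.compile(
    r"arxiv\.org/(?:abs|pdf|html)/(\d{4}\.\d{4,5})(?:v\d+)?",
    re.IGNORECASE,
)

def extract_arxiv_ids(text: str) -> list[str]:
    """Return the unique arXiv ids linked in `text`, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in ARXIV_LINK.finditer(text):
        seen.setdefault(match.group(1))  # dict keeps insertion order and drops repeats
    return list(seen)

# Hypothetical social media post linking the same paper twice.
post = (
    "TinyLlama is out: https://arxiv.org/abs/2401.02385v1 "
    "(PDF: https://arxiv.org/pdf/2401.02385)"
)
print(extract_arxiv_ids(post))
```

Stripping the version suffix means mentions of v1 and v2 of the same paper count toward one entry.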
OP here with a shameless plug: for anyone interested, I'm working on a site called Emergent Mind that surfaces trending AI/ML papers. This TinyLlama paper/repo is trending #1 right now and likely will be for a while due to how much attention it's getting across social media: https://www.emergentmind.com/papers/2401.02385. Emergent Mind also looks for and links to relevant discussions/resources on Reddit, X, HackerNews, GitHub, and YouTube for every new arXiv AI/ML paper. Feedback welcome!
To answer your question: an earlier version of the site focused on surfacing AI news, but that space is super competitive and I don't think Emergent Mind did a better job than the other resources out there. I tried selling it instead of just shutting it down, but ultimately decided to keep it. I recently decided to pivot to covering arXiv papers, which is a much better fit than AI news. I think there's an opportunity with it to not only help surface trending papers, but help educate people about them too using AI (the GPT-4 summaries are just a start). A lot of the future work will be focused in that direction, but I'd also love any feedback folks have on what I could add to make it more useful.
Pivoting into arXiv is a good idea. It helps you have focused prompts and templates.
A natural progression is aggregation, categorization, and related paper suggestions. Since arXiv has HTML versions of papers now, you can also consider allowing deeplinked citations directly from the LLM summaries.
A GPT-curated comments section for papers would also be nice, automatically filtering out any spam that gets past the regular Disqus filters, then scoring/hiding comments based on usefulness or insight.
For anyone interested in staying informed about important new AI/ML papers on arXiv, check out https://www.emergentmind.com, a site I'm building that should help.
Emergent Mind works by checking social media for arXiv paper mentions (HackerNews, Reddit, X, YouTube, and GitHub), then ranks the papers based on how much social media activity there has been and how long since the paper was published (similar to how HN and Reddit work, except using social media activity, not upvotes, for the ranking). Then, for each paper, it summarizes it using GPT-4, links to the social media discussions, paper references, and related papers.
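To make the ranking idea concrete, here is a minimal sketch of a time-decayed score, assuming a single combined engagement number per paper. This is not Emergent Mind's actual formula: the `Paper` fields, the `trending_score` function, and the gravity value of 1.8 (borrowed from commonly cited descriptions of HN-style ranking) are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Paper:
    paper_id: str
    engagement: float  # hypothetical combined count of mentions, comments, likes, stars
    age_hours: float   # hours since the paper was published

def trending_score(paper: Paper, gravity: float = 1.8) -> float:
    """Engagement divided by a power of age, so older papers need more activity to stay on top."""
    return paper.engagement / (paper.age_hours + 2) ** gravity

# Made-up papers: an old one with lots of activity and two fresh ones with less.
papers = [
    Paper("paper-a", engagement=500, age_hours=72),
    Paper("paper-b", engagement=120, age_hours=6),
    Paper("paper-c", engagement=40, age_hours=1),
]
ranked = sorted(papers, key=trending_score, reverse=True)
print([p.paper_id for p in ranked])
```

With these numbers the newest paper ranks first even though it has the least engagement, which is the point of the decay. A larger `gravity` makes papers fall off the front page faster.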
It's a fairly new site and I haven't shared it much yet. Would love any feedback or requests you all have for improving it.
This is exactly what I was using HN for. But, yeah, it kinda sucked compared to yours. Another thing I was trying to create was some sort of NN model that could use the Semantic Scholar h-index of authors along with the abstract text and T5 to estimate citations one year out. Just for personal use, though. That whole thing fell apart because Semantic Scholar is kinda crap at associating author links with the same author. I frequently ended up with the wrong professors, which I'd think would be easily fixable for them.
Just a note to say that factoring authors into the ranking system is high on my todo list. v1 won't be too fancy - just a hardcoded list of prominent authors whose papers warrant extra visibility. A future version will likely automate it to avoid the hardcoded list.
Also, soon-ish I'm going to add the ability for users to follow specific authors, so you can get notified when they publish new papers.
> Also, soon-ish I'm going to add the ability for users to follow specific authors, so you can get notified when they publish new papers.
If you could do it, this would be a dream. My original intent was to be able to look through only the papers citing a popular one and filter the results for ones having at least one author with a set minimum h-index. Using Google Scholar data required using SerpAPI, which has some annoying limitations.
The core goal is obviously just not to miss out on a paper that will very likely be influential while not having to comb through the mountain of irrelevant papers.
What's funny is that Microsoft Academic was the best suited, but was retired in 2021.
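The filter described above (citing papers with at least one author at or above a minimum h-index) is simple once the data is in hand. A minimal sketch, assuming the citing papers and author h-indexes have already been fetched from some source; the `Author` and `CitingPaper` types, the `filter_by_min_h_index` helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Author:
    name: str
    h_index: int

@dataclass
class CitingPaper:
    title: str
    authors: list[Author] = field(default_factory=list)

def filter_by_min_h_index(papers: list[CitingPaper], min_h_index: int) -> list[CitingPaper]:
    """Keep papers where at least one author has an h-index of min_h_index or more."""
    return [
        paper
        for paper in papers
        if any(author.h_index >= min_h_index for author in paper.authors)
    ]

# Made-up papers citing some popular paper.
citing = [
    CitingPaper("Follow-up A", [Author("Junior Researcher", 3), Author("Senior Professor", 60)]),
    CitingPaper("Follow-up B", [Author("Grad Student", 2)]),
    CitingPaper("Follow-up C", []),  # no author data available
]
kept = filter_by_min_h_index(citing, min_h_index=40)
print([paper.title for paper in kept])
```

The hard part is upstream of this: as noted earlier in the thread, the filter is only as good as the author disambiguation behind the h-index values.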
Great site, thanks for sharing. Can you explain how you're determining how many times a paper is cited? Obviously papers include a list of references, but extracting them accurately from the PDF is difficult in my experience (two column formats, ugh) - though the new HTML versions help. And even if you have a list, many authors just mention arXiv paper titles, not their ids, making identifying specific references tricky.
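For the case where a reference gives only a title, one possible approach is fuzzy title matching against a list of known papers. This is a sketch under stated assumptions, not how any particular site does it: the `normalize` and `match_reference` helpers, the 0.9 threshold, and the placeholder entry in `known` are made up for illustration, and it uses the standard library's `difflib` rather than a dedicated search index.

```python
import re
from difflib import SequenceMatcher
from typing import Optional

def normalize(title: str) -> str:
    """Lowercase and collapse punctuation and whitespace so formatting differences don't matter."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", title.lower()).split())

def match_reference(ref_title: str, known: dict[str, str], threshold: float = 0.9) -> Optional[str]:
    """Return the id in `known` (id -> title) whose title best matches, or None if below threshold."""
    target = normalize(ref_title)
    best_id, best_ratio = None, 0.0
    for paper_id, title in known.items():
        ratio = SequenceMatcher(None, target, normalize(title)).ratio()
        if ratio > best_ratio:
            best_id, best_ratio = paper_id, ratio
    return best_id if best_ratio >= threshold else None

known = {
    "2401.02385": "TinyLlama: An Open-Source Small Language Model",
    "placeholder-id": "A Made-Up Paper Title Used Only for Illustration",
}
print(match_reference("Tinyllama: an open source small language model.", known))
print(match_reference("Some unrelated work on protein folding", known))
```

A linear scan like this only works for small collections. Across hundreds of thousands of papers you would want an index, and near-duplicate titles would still need a tiebreaker such as authors or year.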
FYI I started embedding the HTML pages in an iframe on Emergent Mind when the HTML version is available: https://www.emergentmind.com/papers/2312.11444 // should make it even easier to stay informed about trending papers
Thanks! I've got a lot more planned for it too. If anyone has any feedback that doesn't make sense to share here, or if you're a researcher who is open to some questions about how you currently follow arXiv papers, drop me a note at matt@emergentmind.com.
I might add comments down the road if there's enough interest and enough traffic to warrant it. I don't want to add them just yet, only to have zero comments on everything and make the site look like a ghost town.
Keep the suggestions coming though as you use it more: matt@emergentmind.com.
I'm slowly adding older papers as I work out the kinks in the site. Down the road when the database is more comprehensive, this should definitely be possible.
Can you (or anyone experiencing similar issues) share any details about what's not working in Firefox? I tested it and all is well for me, though it's definitely possible there's an issue with some other version of it.
Emergent Mind is an AI news aggregator that I'm working on to help me and others stay more informed about the latest AI news. Today I launched this new feature that lets you adjust how the news is explained to you with the click of a button. It currently supports 5 explanation styles, though I'll likely add more in the future:
- Explain it Normally
- Like I'm 5 (ELI5)
- In 10 Words or Fewer
- Like an AI Influencer
- As a poem
My hope with Emergent Mind and this feature in general is to make it easier (and potentially more entertaining depending on the style) to not just follow, but get educated about what's happening in the world of AI.
My site, https://www.emergentmind.com, is similar, though I'm two years in :)
Regardless, thanks for sharing this!