Hacker News | cpio's comments

I got an IOException while trying to summarize http://matt-welsh.blogspot.com/2013/04/running-software-team... and a different one for http://googleblog.blogspot.com. I guess you should put more effort into your HTML parser. Try Apache Tika, perhaps.



Okay, added a fix to try to extract article text from non-feed URLs. Try http://tldrzr.herokuapp.com/tldr/?feed_url=http://matt-welsh... :)


Ah, it won't work directly for web pages (HTML). The URL is expected to be that of an RSS/Atom feed. For HTML web pages, copy-pasting the text into the textarea works.

Will try to add URL content-type detection in the next cut; summarizing non-feed URLs is next up after that.
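
For what it's worth, the content-type detection could start as something like the rough sketch below. This is not tldrzr's actual code: the function name `classify_content_type`, the two sets of media types, and the choice to treat generic XML as a feed are all assumptions made for illustration. It only looks at the value of the HTTP Content-Type header, so the caller still has to fetch the URL and pass that header in.

```python
# Hypothetical helper, not taken from tldrzr: decide how to handle a URL
# from the Content-Type header the server returned for it.

# Assumption: generic XML types are treated as feeds, since many feeds
# are served as text/xml or application/xml.
FEED_TYPES = {
    "application/rss+xml",
    "application/atom+xml",
    "application/xml",
    "text/xml",
}
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def classify_content_type(header_value):
    """Return "feed", "html", or "unknown" for a Content-Type header value."""
    # Drop parameters such as "; charset=UTF-8" and normalize case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    if media_type in FEED_TYPES:
        return "feed"
    if media_type in HTML_TYPES:
        return "html"
    return "unknown"
```

A "feed" result would go through the existing feed path, "html" through article-text extraction, and "unknown" could fall back to asking for pasted text. Servers do mislabel feeds, so sniffing the first bytes of the body would be a reasonable second check.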


You forgot Poland, also.

