cpio's comments

cpio · on April 10, 2013

I got an IOException while trying to summarize http://matt-welsh.blogspot.com/2013/04/running-software-team... and a different one for http://googleblog.blogspot.com. I guess you should put more effort into your HTML parser. Try Apache Tika, perhaps.

mohaps · on April 10, 2013

Try this URL: http://matt-welsh.blogspot.com/feeds/posts/default?alt=rss It works. The same feed URL pattern will work for the Google blog too: http://googleblog.blogspot.com/feeds/posts/default?alt=rss
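
For anyone who wants to do that rewrite automatically, here is a minimal sketch of the pattern above. The function name `blogspot_feed_url` is made up for illustration and is not part of tldrzr; it only assumes that a Blogspot blog exposes its feed at `/feeds/posts/default?alt=rss`, as in the two URLs above.

```python
from urllib.parse import urlsplit, urlunsplit


def blogspot_feed_url(page_url: str) -> str:
    """Map any URL on a *.blogspot.com blog to that blog's RSS feed URL.

    Hypothetical helper: it applies the /feeds/posts/default?alt=rss
    pattern from the comment above and does no network access.
    """
    parts = urlsplit(page_url)
    host = parts.hostname or ""
    if not host.endswith(".blogspot.com"):
        raise ValueError(f"not a Blogspot URL: {page_url!r}")
    # Drop the original path, query, and fragment; keep scheme and host.
    return urlunsplit((parts.scheme, parts.netloc,
                       "/feeds/posts/default", "alt=rss", ""))


if __name__ == "__main__":
    print(blogspot_feed_url("http://googleblog.blogspot.com"))
```

It ignores whatever post path the URL had, so a link to a single post still maps to the whole blog's feed, not to that one article.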

mohaps · on April 10, 2013

Okay, added a fix to try to extract article text from non-feed URLs. Try http://tldrzr.herokuapp.com/tldr/?feed_url=http://matt-welsh... :)

mohaps · on April 10, 2013

Ah, it won't work directly for web pages (HTML). The URL is expected to be that of an RSS/Atom feed. For HTML web pages, copy-pasting the text into the textarea works.

Will try to add URL content type detection in the next cut; summarizing non-feed URLs is next up.
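
A rough sketch of what that content type detection could look like, not how tldrzr actually does it. It assumes the `Content-Type` header and the first bytes of the response body are already in hand; `classify_content` and the three labels it returns are hypothetical names.

```python
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}
XML_TYPES = {"application/xml", "text/xml"}
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def classify_content(content_type: str, body_prefix: bytes = b"") -> str:
    """Return "feed", "html", or "unknown" for a fetched URL.

    Illustrative only: trust a specific feed MIME type, and fall back to
    sniffing the start of the body when the header is generic or missing.
    """
    # Strip parameters such as "; charset=UTF-8" before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in FEED_TYPES:
        return "feed"
    if mime in HTML_TYPES:
        return "html"
    # Generic XML or an unhelpful header: look for a feed root element.
    head = body_prefix[:2048].lower()
    if b"<rss" in head or b"<feed" in head or b"<rdf:rdf" in head:
        return "feed"
    if mime not in XML_TYPES and b"<html" in head:
        return "html"
    return "unknown"


if __name__ == "__main__":
    print(classify_content("text/xml", b'<?xml version="1.0"?><rss version="2.0">'))
```

The body sniffing is there because feeds are often served with a generic XML type, so the header alone is not enough to tell a feed from other XML.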

cpio · on Jan 27, 2011

You forgot Poland, also.