Mon 20 Feb '06
RSS Filter?
by Frank Spychalski filed under Computer, projects

There are a bunch of RSS feed aggregators like Planet, kickrss and rssmix. Mixing RSS feeds is a neat idea, but right now it’s less mixing than dumping the content of different feeds together. To be really useful to me, they would need a way to selectively remove some of the entries. I read a few feeds where the author posts his links nearly every day or adds a picture of the day (yes, I did that, too, but I’m over it :-) ), which leads to a pretty bad signal-to-noise ratio.

Last week I spent 30 minutes writing a simple RSS feed filter that removes entries containing a given string in their title. Right now this is Itch-Scratchware, but I want to make it a little more generic. Before I do that, I have some questions:

  • Is it redundant? Is something like this already available somewhere and I was just too stupid to find it?
  • I’m using PHP with MagpieRSS and RSSWriter to read and write the RSS feeds. Is there another library (in Ruby or PHP) which lets me parse a feed, remove items and write the same feed out again? Or do I have to use different readers/writers?
  • Featurewise, I’m thinking along the lines of removing an RSS item if attribute x matches a given regexp or doesn’t match, like grep and grep -v. Any other ideas?
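
A minimal sketch of the grep / grep -v idea from the last question, assuming the feed has already been parsed into simple objects. `Item`, `Rule` and `filter_items` are hypothetical names made up for this illustration; they are not part of MagpieRSS, RSSWriter or any other feed library, and reading and writing the actual feed is left out.

```ruby
# Hypothetical item type; a real feed parser would supply its own objects.
Item = Struct.new(:title, :link, :description, keyword_init: true)

# A rule names an attribute, a regexp, and whether to invert the match.
# invert: false behaves like grep (keep matches),
# invert: true behaves like grep -v (keep non-matches).
Rule = Struct.new(:attribute, :pattern, :invert, keyword_init: true) do
  # True if the item survives this rule.
  def keep?(item)
    matched = pattern.match?(item[attribute].to_s)
    invert ? !matched : matched
  end
end

# Keep only the items that pass every rule.
def filter_items(items, rules)
  items.select { |item| rules.all? { |rule| rule.keep?(item) } }
end

# Made-up sample data standing in for a parsed feed.
items = [
  Item.new(title: "links for 2006-02-19", link: "http://example.org/1", description: "daily links"),
  Item.new(title: "Picture of the day", link: "http://example.org/2", description: "a photo"),
  Item.new(title: "Writing an RSS filter", link: "http://example.org/3", description: "some code")
]

# Drop the daily link dumps and the pictures of the day.
rules = [
  Rule.new(attribute: :title, pattern: /\Alinks for /i, invert: true),
  Rule.new(attribute: :title, pattern: /picture of the day/i, invert: true)
]

kept = filter_items(items, rules)
puts kept.map(&:title)
```

Keeping the rules as plain data (attribute, regexp, invert flag) would make it easy to store them per feed later on.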

[tags]rss, filter, tools[/tags]


11 Responses to “RSS Filter?”

  1.

    One of these days Magpie will support this; I’ve got a prototype of it working. You might check out FeedTools for Ruby; it supports both parsing and generating. (I’ve only used the parsing piece.)

    kellan (February 20th, 2006 at 19:15)
  2.

    Kellan, thanks for the pointer to FeedTools. It looks just like the kind of package I was looking for.

    Frank Spychalski (February 21st, 2006 at 09:07)
  3.

    Yeah, my FeedTools library should be ideal for this kind of thing. This snippet (or something similar) should work pretty well (though I haven’t actually tried it, might need some tweaking):


    feed = FeedTools.open('http://slashdot.org/')
    feed.entries = feed.entries.reject { |entry| entry.content =~ /web 2\.0/ }
    output_xml = feed.build_xml()

    Actually, I’m in the midst of writing an aggregator based on FeedTools with this exact “itch” in mind (and a few other “itches” for good measure).

    Bob Aman (February 21st, 2006 at 21:52)
  4.

    Er, yuck. Sure enough, there’s a bug in that code. Should be:


    feed = FeedTools::Feed.open('http://slashdot.org/')

    Saw it just as soon as I hit submit. :-P

    Bob Aman (February 21st, 2006 at 21:58)
  5.

    Bob, thanks for the feedback. How long do you think until your aggregator is usable? Do you need any beta-testers?

    Frank Spychalski (February 22nd, 2006 at 10:25)
  6.

    Well, to put it simply, I’m trying to build the most powerful aggregator, bar none. I’m hoping to keep things manageable by pushing most of the power features out into plugins, but it’s already a daunting task, even with the benefits of modularity. I.e., it’ll be a while, most likely. That said, it already has all of the basic features found in most simple aggregators, and the basic stuff only took maybe 3 days to write. It’s handy having such a flexible base as FeedTools.

    Mostly I just have to make sure I keep procrastination and yak-shaving to a minimum.

    Bob Aman (March 7th, 2006 at 17:28)
  7.

    Sounds cool. As I’ve got other things to do anyway, I will let my filter rest for a while and hope that your aggregator will scratch my itch, too. I’m looking forward to trying it…

    Frank Spychalski (March 10th, 2006 at 13:36)
  8.

    I’m glad to see others thinking about this kind of thing!

    I think a lot of this is too complicated for people who are just getting into feeds, but eventually feed management services will be important, and there’s a lot of thinking to be done on how they should be implemented, and what features they should provide.

    I wrote up a “feedmix wishlist” a while back:
    http://alwaysaskwhy.com/jameselee/blog/2005/07/metafeeds-feed-mixing_aggregate_multiple_feeds_into_one.html

    James E. Lee (March 21st, 2006 at 19:35)
  9.

    [...] I started working on my RSS filter again. It’s a rewrite of my PHP test as a small Ruby on Rails app using Bob Aman’s FeedTools. All the time I was wondering whether this is really worth my time. To me the idea of filtering feeds seems obvious, but it felt strange that there were no other tools around providing the same service. [...]

  10.

    I would like to suggest a great new site that organizes your RSS feeds. It employs a Bayesian filter for RSS feeds, which you can train on what you like and what you don’t like. It’s free, try it at http://www.filteredrss.com.

    web man (July 14th, 2008 at 21:37)
  11.

    hi web man,

    the idea sounds neat, but this

    HTTP Status 500 -
    type Status report
    message
    description The server encountered an internal error () that prevented it from fulfilling this request.
    Sun Java System Application Server 9.1_02

    sounds like you need more debugging…

    Additionally, from your whatis page: “FilteredRSS is completely free. We will have some advertising on the RSS training pages. But your RSS feeds will be void of any inserted advertising”. I need to go to a training page? Why don’t you add training links? Or rewrite the URLs so that every time I follow a link, it is counted as a vote for this entry?

    Frank Spychalski (July 15th, 2008 at 09:36)
