Sep 18, 2009

Lazyfeed + RSSCloud + PubSubHubBub = Real Real-time


Hi guys,


I am happy to announce that, as of today, Lazyfeed is fully real-time, with both RSSCloud and PubSubHubBub protocol integration.

What does this mean? It means you will now be notified of posts from blogs that support these protocols immediately after the author publishes them. For example, if you saved the topic "Music", as soon as anyone (among the more than 1.1 million blogs we index) writes a post about "Music", Lazyfeed will alert you. It is now really like an instant messenger.

The blogs that support these protocols include:
We have been testing this internally for a while and learned a few things that may be helpful for the community, so I will go ahead and share them here. Hopefully we can get the conversation going and help the community move faster. So here they are:

1. RSS vs. Atom
Basically, RSSCloud is a protocol based on RSS--it's included in the original RSS spec--and PubSubHubBub is a protocol based on Atom. For example, an RSS feed from a wordpress.com blog provides RSSCloud information through the <cloud> tag, but an Atom feed from the same blog cannot, because the Atom spec doesn't include a <cloud> element. Atom obviously wasn't designed with PubSubHubBub in mind, so it doesn't have a designated tag the way RSS has one for RSSCloud. PubSubHubBub uses <link rel="hub"> instead, which leads to the issue I will mention below.
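To make the difference concrete, here is a minimal sketch of how an aggregator might look for each kind of information. It is not Lazyfeed's actual code. The two sample feeds, their hostnames, and the function names find_rsscloud and find_hubs are made up for illustration; only the <cloud> element and the Atom <link rel="hub"> convention come from the protocols themselves.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical sample feeds, for illustration only.
RSS_SAMPLE = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <cloud domain="cloud.example.com" port="80" path="/notify"
           registerProcedure="" protocol="http-post" />
  </channel>
</rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <link rel="hub" href="http://hub.example.com/" />
</feed>"""


def find_rsscloud(xml_text):
    """Return the attributes of the RSS <cloud> element, or None if absent."""
    root = ET.fromstring(xml_text)
    cloud = root.find("./channel/cloud")
    return dict(cloud.attrib) if cloud is not None else None


def find_hubs(xml_text):
    """Return the href of every Atom-namespace <link rel="hub"> in the feed."""
    root = ET.fromstring(xml_text)
    return [
        link.get("href")
        for link in root.iter("{%s}link" % ATOM_NS)
        if link.get("rel") == "hub"
    ]


print(find_rsscloud(RSS_SAMPLE))   # cloud attributes from the RSS feed
print(find_rsscloud(ATOM_SAMPLE))  # None: Atom has no <cloud> element
print(find_hubs(ATOM_SAMPLE))      # hub URLs from the Atom feed
```

An aggregator would typically run both checks on every feed it fetches, since it cannot know in advance which protocol, if any, a given feed advertises.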

2. Relationship with Feedburner: it's complicated.
Feedburner, a feed management service run by Google, provides feeds in RSS format but still supports PubSubHubBub. This is made possible by the <atom:link> tag--to be precise, in Feedburner's case it's actually <atom10:link>. RSS doesn't have a designated tag for PubSubHubBub the way it does for RSSCloud, but you can specify the hub inside the <atom10:link> tag (reference). This is a nice solution for PubSubHubBub in that it makes the protocol more flexible, but it causes some confusion.
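One practical consequence is that a parser should match the hub link by its namespace URI rather than by the literal prefix, since atom:link and atom10:link are the same element as long as both prefixes are bound to the Atom namespace. The following is a small sketch of that idea using Python's standard library. The sample feed is a made-up, stripped-down RSS document, not an actual Feedburner response.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical RSS feed that advertises a hub through an Atom-namespace link.
RSS_WITH_HUB = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom"
                 rel="hub" href="http://pubsubhubbub.appspot.com" />
  </channel>
</rss>"""

root = ET.fromstring(RSS_WITH_HUB)

# ElementTree expands prefixes to "{namespace-uri}localname", so this lookup
# works whether the feed says atom:link, atom10:link, or any other prefix.
hub_links = [
    el.get("href")
    for el in root.iter("{%s}link" % ATOM_NS)
    if el.get("rel") == "hub"
]

print(hub_links)
```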

Typepad uses http://hubbub.api.typepad.com/ as its hub, while Blogger and Feedburner use http://pubsubhubbub.appspot.com/. Now, the problem occurs when a Typepad blog also uses Feedburner. For example, let's take a look at the official Everything TypePad blog (http://everything.typepad.com). Its feed includes the following hub-related information:

[1] <link rel="hub" href="http://hubbub.api.typepad.com/" />
[2] <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com" />

The feed URL for this blog is [A] http://everything.typepad.com/blog/atom.xml, which gets forwarded to [B] http://feeds.feedburner.com/TypePadNews on Feedburner. The only combinations that work here are [1]-[A] and [2]-[B]. In other words, subscribing fails if we access hub [1] with feed URL [B], and likewise hub [2] with feed URL [A].

We managed to handle this by using only the valid combinations ([1]-[A] or [2]-[B]), but I don't think this burden should be put on the aggregator's side, as the variations will only increase as more parties implement these protocols. One way to solve this problem would be for the PubSubHubBub protocol to clearly define the relation between a hub and a feed URL; another would be to allow a hub to accept any of the URLs under which a feed is published.
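As a rough illustration of what "using only the valid combinations" can look like, the sketch below builds one subscription request per known-good hub and feed URL pair. It is not our production code: the pairs are simply hard-coded from the example above, the callback URL and the function name build_subscriptions are hypothetical, and the form parameters (hub.mode, hub.topic, hub.callback, hub.verify) follow my reading of the early PubSubHubBub drafts, so please check them against the spec version your hub implements. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Valid (hub, feed URL) pairs from the TypePad example: [1]-[A] and [2]-[B].
VALID_PAIRS = [
    ("http://hubbub.api.typepad.com/",
     "http://everything.typepad.com/blog/atom.xml"),
    ("http://pubsubhubbub.appspot.com",
     "http://feeds.feedburner.com/TypePadNews"),
]

# Hypothetical endpoint where the aggregator would receive notifications.
CALLBACK_URL = "http://aggregator.example.com/push-callback"


def build_subscriptions(pairs, callback):
    """Return a list of (hub_url, form_body) tuples, one per valid pair."""
    requests = []
    for hub, topic in pairs:
        body = urlencode({
            "hub.mode": "subscribe",
            "hub.topic": topic,        # the feed URL this hub knows about
            "hub.callback": callback,
            "hub.verify": "sync",      # assumption: early-draft parameter
        })
        requests.append((hub, body))
    return requests


for hub, body in build_subscriptions(VALID_PAIRS, CALLBACK_URL):
    # Each body would be POSTed to its hub as
    # application/x-www-form-urlencoded.
    print(hub, body)
```

The point of keeping the hub and the feed URL together as a pair is that neither value is usable on its own: the same hub URL is fine with one feed URL and rejected with the other.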

3. Is Feedburner itself Real-time?
I don't know exactly how Feedburner works behind the scenes, but it seems that Feedburner itself is not real-time. Right now, content from Feedburner-hosted blogs is not being collected in real-time; the feed appears to be updated several minutes after the actual post is published. I am not sure we are correct on this, but it is what we have observed so far. To make this truly real-time, there should be a way for a blog to notify the Feedburner server itself in real-time, perhaps through either of these two real-time feed protocols, resulting in a relay of two real-time notifications. Again, this part is uncertain, and I am speculating based on our observations, so if anyone from Feedburner is reading this, your clarification would be helpful.

Those are my two cents. The points above apply to both PubSubHubBub and RSSCloud--although the issue is only visible with PubSubHubBub for now--and I hope sharing these observations will contribute in some way to the real-time syndication community.

Lastly, I want to thank Dave Winer for inventing RSS Cloud, and Brad Fitzpatrick and Brett Slatkin for PubSubHubBub, and people at SixApart, Feedburner, and Automattic for amplifying this effort. The world has become a better and faster place.