Please be aware that you are viewing our bleeding edge unstable documentation. Unless you wanted to view the bleeding edge (and possibly unstable) documentation, we recommend you use our stable docs.

Go to Ably's stable canonical documentation »

I know what I'm doing, let me see the bleeding edge docs »

You are viewing our bleeding edge unstable documentation. We recommend you use our stable documentation »
Fork me on GitHub

WebSub - A deep dive

What is WebSub?

In a sentence: WebSub is an open protocol for distributed pub/sub communication. WebSub started life as PubSubHubbub, refined in the W3C Social Web Working Group, and subsequently published as a W3C Recommendation. It was initially an extension of the Atom and RSS protocols, serving the purpose of realtime event notifications from any data type. WebSub’s ‘USP’ is that it saves the client from having to fetch content over HTTP. Instead, WebSub will push the changes to the client, based on their subscriptions.

WebSub is being widely adopted to replace server polling (a relatively bandwidth heavy way of implementing realtime functionality), working within community-driven standards. A key advantage is that WebSub brings together publishers of and subscribers to realtime data, allowing content to be published and updated in a secure and trustable way. This makes it possible to avoid periodic requests to networks. Websub’s secure communication environment is provided through a Publisher-Hub-Subscriber data transfer chain. WebSub is a W3C proposed recommendation, in existence since April 2017. It is based on Publish/Subscribe pattern and on Topics.

According to W3C WebSub: “WebSub provides a common mechanism for communication between publishers of any kind of web content and their subscribers, based on HTTP webhooks. Subscription requests are relayed through hubs, which validate and verify requests. Hubs then distribute new and updated content to subscribers as and when it becomes available.”

How WebSub works?

WebSub uses a publish-subscribe pattern. It can work with any data type over HTTP without any server polling.

The main three entities in this pattern are Publishers, Hubs, and Subscribers.

  1. Publisher
    The publisher is the producer party which generates the content and publishes it to the Internet.
  2. Hub
    Hub acts as an intermediary party between publisher and subscriber. It accepts the content from the publishers and pushes them to subscribers, based on what they are subscribed to.
  3. Subscriber
    Subscriber is the consumer party in WebSub pattern, They find the Hubs advertised by Publisher’s topic and subscribe to them to get notifications whenever the Publisher publishes new content to the Topic they are subscribed to.

Publishers publish new content or update changes to the Hubs, referencing HTTP headers which contain the Topic’s information. Now, the Subscribers who are subscribing to these topics via Hubs will get served with the relevant content securely through the HTTP Post call from Hubs. Subscribers will have securely shared the HTTP end-point to Hubs at the time of subscription. The process for this is as below.


how WebSub works

Thus the providers, hubs and the subscribers jointly generate the WebSub nexus. The above diagram shows the visual information flow of the WebSub pattern. The steps below show how Subscribers and Hubs actually share the information securely with each other.

1. The subscriber sends a subscription request to Hub
To subscribe to the topic, the subscriber sends a form-encoded POST request to the Hub with the following parameters:

Here is the code snippet of subscribe request in Nodejs.

var request = require("request");

var options = {
 method: 'POST',
 url: 'https://pubsubhubbub.superfeedr.com',
 headers:
  {
    'Content-Type': 'application/x-www-form-urlencoded'
  },
 form:
  {
    'hub.mode': 'subscribe',
    'hub.topic': 'http//topic.feeder.com/subscribe/to',
    'hub.callback': 'http://subscriber.server.com/your/callback'
} };

request(options, function (error, response, body) {
 if (error) throw new Error(error);
 console.log(body);
});

It is a good idea to use a unique URL for each subscription so you know what feed is updated when you receive a notification. The Hub will reply with either
`202 Accepted` or
`400 Bad Request`.

If there were any problems with the request, if the subscriber attempts to subscribe to a topic that does not exist, for example, the Hub will return 400 Bad Request and a plain text description of the error. If the Hub accepted the subscription request, it will return 202 Accepted and will then attempt to verify the subscription request.

2. The Hub verifies the subscription request (with built-in security)
After sending the subscription request, the Hub will make a separate request back to the subscriber’s callback URL for the verification purpose. It is necessary to avoid arbitrary subscriptions by the attackers. It’ll be a simple GET request with below query params.

To confirm the subscription, Subscribers will need to reply with 200 OK and a request body of exactly the string provided in the “hub.challenge” parameter in the request. The response body should not contain anything else and is not form-encoded, just the plain string.

To reject the request, reply with 404 Not Found and an empty body. Any response other than 200 OK will indicate to the Hub that the subscription is rejected and you will not receive notification of new content.

To unsubscribe the subscription just follow the step 1 & 2 with hub.mode = “unsubscribe”

Now the subscription is created and the subscriber will get the contents through the callback URL provided at the time of subscription via POST method.

WebSub – use case

WebSub is useful in applications or implementations where content changes frequently and frequent updates are required. As such, it forms a basic part of internet infrastructure, especially the ever-increasing portion that plays out in realtime. Areas where WebSub is particularly useful include news aggregator platforms (CNN), stock exchanges, air traffic networks, analytics platforms, RSS like feeds, weather information providers, etc. In all of these examples fetching content via polling would not be good practice. WebSub is useful as Hubs distribute the latest status to all the subscribed nodes as soon as hub receives the information from the publishers. Using WebSub, subscribers can rest assured that they are always in sync with the most up-to-date data.

Straight out – an area where WebSub is not useful is where you have private or sensitive information. Hubs push the contents to all the subscribers who have subscribed to the topics and publishers share their content with Hubs, so here publishers and subscribers are completely unknown to each other. Private information shared by publishers may reach hundreds of unknown subscribers.

Big names using WebSub

Following are the companies list who are using WebSub for their services.

Publishers
Big name publishers relying on WebSub to make sure published content is received by people who want to read it include: Google (which uses WebSub for most of its content services); CNN; LiveJournal; MySpace; which uses WebSub to control its news feed function), and Wordpress (the world’s largest open source blogging platform, which publishes all their blogs and articles on WebSub).

Subscribers
Examples of companies relying on WebSub subscription systems include: InFeeds.com (an online directory for important stories and ideas shared by people, run on the basis of Google Reader shared items); Netvibes (a personal dashboard to manage and review media from a variety of different channels, using WebSub to get the most relevant media content from the maximum channels).

Hub
Hubs are the center party in WebSub networks, both the Publishers and Subscribers communicate with the Hubs to publish and subscribe the topics’ content.
Here are the list of some popular Hubs.

  1. Superfeedr provides feeds in realtime, It performs as an open Hub in WebSub network, pubsubhubbub.superfeedr.com
  2. Pubsubhubbub.appspot.com
  3. http://phubb.cweiske.de/ Open source PHP hub
  4. https://wordpress.org/plugins/pubsubhubbub/ Wordpress plugin for Hub

WebSub – challenges

The solution can be more technologically challenging on the hub side than on the data providers’ side. With the Hub the central atom of the whole WebSub system, Hubs need to be better prepared to address more complex technical implementation. If the hub goes down the WebSub network stops functioning.

Generally, the development of in-house WebSub system is complex to architect and demands high maintenance. Development considerations for added data distribution functionality include connection state recovery, stability, high performance, and cost-effectiveness. Scalability is another factor when traffic is unpredictable. The network must be scalable to handle the sudden high traffic.

These considerations are helpful to bear in mind when taking the decision to develop in-house solution or buy/use existing solutions.

Read more about engineering event-driven systems on the Ably Engineering blog.

Ably Realtime provides cloud infrastructure and APIs to help developers simplify complex realtime engineering. We make it easy to power and scale realtime features in apps, or distribute data streams to third-party developers as realtime APIs. Create a free Ably account to get started.

References and further reading – links to relevant repos, short, relevant code samples


Back to top