Archive for the category ‘PunchTab’

OnNewComment callback for Disqus 2012

Saturday, August 18, 2012

PunchTab supports Disqus comments, awarding points and badges when users post a message. Their recommended solution for this is:

function disqus_config() {
    this.callbacks.onNewComment = [function() {
        trackComment();
    }];
}

This solution only works if disqus_config is defined before Disqus loads. In our case, we have no such guarantee, so we ended up using a more reliable solution that works in both cases:

if (window.DISQUS.bind) {
    window.DISQUS.bind('comment.onCreate', function() {
        PT.event.trigger("main.comment", {});
    });
}

But with the new Disqus 2012 product, this stopped working. They still support the onNewComment callback, which works if PunchTab loads before Disqus, but the bind solution broke when PunchTab loads after. We finally found a solution:

var app;
if (window.DISQUS.App) {
    app = window.DISQUS.App.get(0);
    if (app) {
        app.bind('posts.create', function(message) {
            PT.event.trigger("main.comment", message);
        });
    }
}

This is not documented so it can break at any time. We’re in touch with the Disqus team to see if there is a better way to achieve this.

Cross-domain events in JavaScript

Monday, July 30, 2012

As I was telling you in my previous post, we’re currently rewriting most of our JavaScript code at PunchTab. Now that it’s stable and released for some of our publishers (including this blog), I’m going to take the time to explain the awesome features we’ve built.

The context

As I was saying earlier, we’re dealing with the same technical challenges as Facebook or Google. Our publishers install our JavaScript code snippet on their websites, as you would do for a Like button, and it loads our code, which then opens some iframes. For many reasons (like connecting the user through Facebook), we have to communicate between the publisher site and our iframes. To achieve this, we’ve been using easyXDM, a cross-domain messaging library. But even so, every time we’ve needed to add a feature using cross-domain communication, it has been painful. That’s why, when we started to talk about our new framework, we decided that cross-domain communication should be one of its core features.

Cross-domain

Our basic usage of easyXDM was the following:

var socket = new easyXDM.Socket({
    remote: 'iframe_you_want_to_open.html',
    ...,
    onReady: function(){
        socket.postMessage('first message to the iframe');
    },
    onMessage: function(message, origin){
        doSomethingWith(message);
    }
});
On the other side (in the iframe), the loaded page had almost the same code, except for the remote option.

After several months, we faced some difficulties. First, everybody was defining their own format for the messages, making them harder to understand and to process. Then we had sockets created multiple times in different places for the same iframe, with different handling, making it hard to track where a message was coming from and who was receiving it. We didn’t want to deal with this communication layer anymore; it was not part of our business logic. So one day, someone said: “what if we could use events everywhere?”

Events

Before coming back to cross-domain, I have to explain our event layer. If you’ve used jQuery, you probably know how to use events like $('.button').click(callback);. Since the beginning, we’ve wanted our new framework to be event-driven: it makes it far easier to write independent modules that interact with each other by triggering and binding events. So the first piece was to be able to do this:

PT.event.trigger('myEvent', message); // send an event of type 'myEvent' with a message object

PT.event.bind('myEvent', function(message){ alert(message) }); // when an event of type 'myEvent' is triggered, execute the callback

That part was easy: it’s only a few lines of code. And while we were dealing with events, we also decided to solve their most annoying problem: their volatility.

Indeed, in the previous example, if I execute the code in that order, nothing happens, because we bind the callback after the event has been triggered, so we miss it. In some cases, that doesn’t matter. For a click, if the user clicks before we are ready to bind the click event, missing it is fine, since the user may click again. But take a look back at my previous post and imagine we replace our Facebook init code with this (which is what we actually did):

var previous_fbAsyncInit = window.fbAsyncInit;
if ((window.fbAsyncInit && window.fbAsyncInit.hasRun) ||
    (!window.fbAsyncInit &&
    window.FB !== undefined &&
    FB.Event !== undefined)) {
    PT.event.trigger('facebook.ready');
} else {
    window.fbAsyncInit = function () {
        if (previous_fbAsyncInit) {
            previous_fbAsyncInit();
        }
        PT.event.trigger('facebook.ready');
    };
}

This is better, since all our features relying on Facebook can simply bind on the facebook.ready event. But to be safe, they all have to be bound before this piece of code executes. What we really want is to execute our callback when facebook.ready happens, or execute it immediately if facebook.ready already happened in the past.

That’s why we introduced PT.event.persistent('facebook.ready'). This triggers and stores the event. Then, on a PT.event.bind(), if the event has already happened, the callback is executed immediately. We typically use it for events like dom.ready, user.connected, twitter.ready, google.ready, etc. No need to pay attention to the order you bind one-time events in anymore.
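To make the idea concrete, here is a minimal sketch of such an event layer with persistent events. The names mirror ours, but the implementation is hypothetical and much simpler than our real one:

```javascript
// Minimal event bus with persistent ("sticky") events.
// A simplified sketch, not the actual PunchTab implementation.
var PT = {};
PT.event = (function () {
    var handlers = {};   // event name -> list of callbacks
    var persisted = {};  // event name -> last message, for sticky events

    return {
        bind: function (name, callback) {
            (handlers[name] = handlers[name] || []).push(callback);
            // If the event already fired as a persistent event,
            // execute the callback immediately.
            if (persisted.hasOwnProperty(name)) {
                callback(persisted[name]);
            }
        },
        trigger: function (name, message) {
            var list = handlers[name] || [];
            for (var i = 0; i < list.length; i++) {
                list[i](message);
            }
        },
        persistent: function (name, message) {
            persisted[name] = message;
            this.trigger(name, message);
        }
    };
})();
```

With this, PT.event.bind('facebook.ready', cb) runs cb immediately if the event was already fired through PT.event.persistent, and waits for it otherwise.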

Propagating events

Ok, now we have an awesome event system. What if we could send all events through easyXDM to the other iframes? That’s what we’ve built by default into the PT.event.trigger function. Whenever we trigger an event, we loop over every iframe we’ve opened through easyXDM and send it the event, serialized as JSON. You can then trigger and bind on events anywhere and treat all the iframes as one single place. Even if you open an iframe later, we send it the history of persistent events that have already been triggered, so it works the same way.
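Conceptually, the propagation layer just wraps the local trigger so every event is also serialized and forwarded through each open socket. Here is a hypothetical sketch (our real code sits on top of easyXDM sockets; localTrigger stands in for the in-page dispatch):

```javascript
// Sketch: broadcast every event to all cross-domain sockets as JSON,
// and replay incoming messages as local events on the other side.
function makeBroadcaster(localTrigger) {
    var sockets = []; // easyXDM-like objects exposing postMessage(string)

    return {
        addSocket: function (socket) { sockets.push(socket); },

        // Used instead of the plain local trigger.
        trigger: function (name, message) {
            localTrigger(name, message); // fire locally first
            var payload = JSON.stringify({ name: name, message: message });
            for (var i = 0; i < sockets.length; i++) {
                sockets[i].postMessage(payload);
            }
        },

        // Wired to each socket's onMessage handler on the receiving side.
        onMessage: function (payload) {
            var event = JSON.parse(payload);
            localTrigger(event.name, event.message);
        }
    };
}
```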

So now, we just open sockets for cross-domain communication and that’s it. All the work is then done with events, which makes our life infinitely easier.

To sum up, have a look at this example which sends a message to an iframe through an event:

The code on this side (for an explanation of ptReady, see the previous post):

<input type="text" id="example" value="Example" />
<input type="submit" id="send" value="Send" />
<div id="iframe-container"></div>
<script type="text/javascript">
window.ptReady = window.ptReady || [];
ptReady.push(function(){
    document.getElementById('send').onclick = function(){
        PT.event.trigger(
            'message.sent',
            document.getElementById('example').value
        );
        return false;
    };
    PT.xdm.socket('iframe', {
        remote: 'http://static.alfbox.net/iframe.html',
        container: 'iframe-container'
    });
});
</script>

The code in the iframe:

<script type="text/javascript">
    PT.event.bind(
        'message.sent',
        function(message){ document.body.innerHTML = message}
    );
    PT.xdm.socket('parent');
</script>

Callbacks for 3rd-party JavaScript

Friday, July 6, 2012

At PunchTab, we’re dealing with the same technical challenges as Facebook or Twitter: we provide a 3rd-party JavaScript snippet that publishers install on their websites.

One of the features we have to provide is a callback that we or our publishers can use to execute code once the PunchTab JavaScript has been loaded and executed.

The Facebook approach

Currently, we’re using the Facebook approach: if the publisher has defined a specific function in the global namespace, we execute it when we’re done with our own functions. Where Facebook uses fbAsyncInit, we use ptAsyncInit. The publisher may define this function as follows:

window.ptAsyncInit = function(){alert('PunchTab is ready')};

And at the end of our JavaScript, we simply add:

if (window.ptAsyncInit !== undefined) {
    window.ptAsyncInit();
}

It’s convenient for basic use cases, but not when multiple 3rd-party libraries are involved.

Indeed, a publisher may load Facebook and PunchTab asynchronously. As we register some callbacks with Facebook, we have to use fbAsyncInit, but we never know which will load first: them or us. Here is the code we use to run our Facebook-dependent code:

var previous_fbAsyncInit = window.fbAsyncInit;
if ((window.fbAsyncInit && window.fbAsyncInit.hasRun) ||
    (!window.fbAsyncInit && window.FB !== undefined && FB.Event !== undefined)) {
    // execute our code relying on FB
} else {
    window.fbAsyncInit = function () {
        if (previous_fbAsyncInit) {
            previous_fbAsyncInit();
        }
        // execute our code relying on FB
    };
}

The issue here is that we first have to detect whether Facebook is already loaded, which is tricky. If it is, we just execute our code directly. If it isn’t, we redefine the window.fbAsyncInit function to call our code once Facebook is ready. Notice that our fbAsyncInit is a monkey patch that executes the previous fbAsyncInit if it exists.

As you can see, the global callback is not convenient for two reasons:

  • It is difficult to have multiple callbacks
  • It doesn’t handle the case where Facebook loads before the other application

The Twitter approach

To achieve the same effect with Twitter, you have to use the snippet on this page. Instead of using a public callback, it defines the global twttr object directly in the snippet (if it doesn’t already exist) and adds a useful ready() function to it:

return window.twttr || (t = { _e: [], ready: function(f){ t._e.push(f) } });

You can then call twttr.ready(your_callback) as many times as you want; each callback is pushed into a queue (_e) that is executed when Twitter loads. If Twitter is already loaded, it overwrites the function to execute the callback directly (smart!), which solves the two previous issues.

The future PunchTab approach

We’re currently rewriting our JavaScript SDK, and we’re going to switch to the Twitter approach, but in an even simpler way. Twitter does it in a slightly trickier way to avoid leaking into the global namespace; we chose to use a plain global variable, ptReady, to achieve the same thing.

You will be able to do the following:

window.ptReady = window.ptReady || [];

which defines a plain JavaScript array if it doesn’t already exist. Then, you just add this:

ptReady.push(your_callback);

If PunchTab is not ready yet, the queue will be processed when we are. If we are already loaded, we redefine the push function to execute the callback directly. Kaboom!
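For illustration, here is a sketch of both sides of that queue (simplified: the global window object is omitted, and processQueue is a hypothetical name for what our loader does):

```javascript
// Publisher side: define the queue if the SDK hasn't loaded yet,
// then push callbacks onto it.
var ptReady = ptReady || [];
ptReady.push(function () { console.log('PunchTab is ready'); });

// SDK side, once everything is loaded: drain the queue, then replace
// push() so any later callback runs immediately.
function processQueue(queue) {
    for (var i = 0; i < queue.length; i++) {
        queue[i]();
    }
    queue.length = 0;
    queue.push = function (callback) { callback(); };
}
processQueue(ptReady);

// Any push() from now on executes right away.
ptReady.push(function () { console.log('late callback runs immediately'); });
```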

Redis, far more than a cache engine

Wednesday, June 13, 2012

So many changes since the last post, and maybe the best one is that my English should have improved slightly, since I’ve been living in San Francisco for more than a year now, working at PunchTab.

First of all, to set the context: PunchTab provides a loyalty program for website publishers. After they install a code snippet on their website, their users earn points for their visits, likes, tweets, +1’s and comments. These points can then be redeemed for prizes like Starbucks coupons. Long story short, I’ve installed the PunchTab loyalty program on this blog so you can see it in action.

As a company, PunchTab explicitly focuses on building a great product, and performance is a big part of that experience. Redis has always helped us improve the performance of parts of our loyalty program.

But after using it as a basic cache engine, our needs have pushed us to use it more extensively. So this post will present the main features of Redis and our use cases for them. The power of Redis lies in its data structures; you can find full documentation for them in the command reference.

Keys

To follow the Redis documentation, I’m gonna start with key commands. All the data you store is referenced by keys. You can set or remove an expiration time on each key, search for keys using glob-style patterns, delete them, etc. All the classic stuff for a cache engine.

Strings

The first data structure we used is the string. That’s what Django uses when you set Redis as the cache backend, and it’s what cache engines commonly rely on. We use it to cache Django views and to store results that are expensive to compute. Nothing new on this side.

Where it gets interesting is that a string can be treated as an integer or a float. Redis provides the INCR and INCRBY commands to increment these values (and DECR and DECRBY to decrement them). Even more interesting, they return the new value. You can then build really efficient counters, and that’s what we use to synchronize parallel tasks while building our leaderboard.
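For example, a redis-cli session for such a counter might look like this (the key name is made up for illustration):

```
redis> SET tasks:done 0
OK
redis> INCR tasks:done
(integer) 1
redis> INCRBY tasks:done 5
(integer) 6
```

Because the new value comes back atomically, two parallel workers can both INCR the same key and each knows exactly which count it produced.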

Hashes

If you want to store an object or dictionary that you will always access as a whole, you should use a hash. It is more efficient than storing each attribute under a different key. You can think of a hash as a dictionary of strings (so you can, for instance, increment a particular field). You can get the whole stored dictionary for a key with HGETALL, or just a particular field with HGET. That’s what we use in our leaderboard to store the last activity of each user (the field) for each publisher (the key) as a JSON-serialized string (the value). We could have used the publisher/user pair as a key and avoided the serialization, but the goal was to keep the number of keys from growing.
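Illustrated with redis-cli (the key layout mimics the one described above; the names are made up):

```
redis> HSET lastactivity:publisher42 user:7 '{"action": "like", "points": 5}'
(integer) 1
redis> HGET lastactivity:publisher42 user:7
"{\"action\": \"like\", \"points\": 5}"
redis> HGETALL lastactivity:publisher42
1) "user:7"
2) "{\"action\": \"like\", \"points\": 5}"
```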

Sets

A set is an unordered list where each value is unique. You can blindly add an element with SADD: if it already exists, it will not be duplicated. You can then get the content of a set with SMEMBERS or, even better, check whether an element is in a set with SISMEMBER. Finally, you have the usual set operations like union, intersection and difference. We use this to store the list of opted-out users. Indeed, a publisher can remove users from his loyalty program by opting them out, and we needed a really fast way to check whether a user is opted out while storing an activity. So instead of querying a SQL database, or fetching an entire cached list and then doing something like “if user in opted_out_users” on the Django side, we save time by asking Redis directly with SISMEMBER.
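In redis-cli terms, with a hypothetical key name and annotated results:

```
redis> SADD optout:publisher42 user:7
(integer) 1
redis> SADD optout:publisher42 user:7
(integer) 0        # already there, not duplicated
redis> SISMEMBER optout:publisher42 user:7
(integer) 1        # opted out
redis> SISMEMBER optout:publisher42 user:8
(integer) 0        # not opted out
```

Both SADD and SISMEMBER are O(1), which is why the check is cheap enough to run on every stored activity.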

Sorted sets

They’re called ZSETs in Redis. We started using them when we needed to optimize the leaderboard, and actually, ZSETs are leaderboards. A ZSET contains unique members like a SET, with a score associated with each member. For each member, you can set (ZADD) or increment (ZINCRBY) the score. You can then retrieve a member’s score (ZSCORE) and rank (ZRANK), or get a range of members with their scores and relative ranks (ZRANGE). It becomes easy to get a top 10 or the members surrounding a particular user: everything we needed for our leaderboard.
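Put together, a tiny leaderboard session could look like this (key and member names are made up; ZREVRANGE and ZREVRANK return the highest scores first):

```
redis> ZADD leaderboard:publisher42 100 alice
(integer) 1
redis> ZADD leaderboard:publisher42 80 bob
(integer) 1
redis> ZINCRBY leaderboard:publisher42 30 bob
"110"
redis> ZREVRANGE leaderboard:publisher42 0 9 WITHSCORES
1) "bob"
2) "110"
3) "alice"
4) "100"
redis> ZREVRANK leaderboard:publisher42 alice
(integer) 1
```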

Monitor

I almost forgot a really useful command to see what’s happening: MONITOR. Type it in your Redis client and you will see every command run on your server. Perfect while developing, or to peek at what’s happening in production for a short time.

Conclusion

I have only talked about what we use at PunchTab, but there are other interesting features, like the Publish/Subscribe pattern, which I tried out quickly to implement a chat system with socket.io and gevent during a tutorial at PyCon this year.

The last two important things I want to highlight are the in-memory model and complexity. As I like to say, complexity may be the most important thing I learned at school, because the biggest problems I’ve encountered so far were related to it. Fortunately, each Redis command is documented with its complexity. To me, it shows that the developers really know their business, and it gives you a good idea of how your Redis will scale, or how to architect it (like logically splitting your data across different servers).

Concerning the in-memory model, you have to know that Redis stores everything in memory, syncing to disk to survive restarts and crashes. To keep it efficient, you obviously cannot let it swap, so always be careful to use the best-suited data structure, both to be fast and to consume the least memory. For instance, a hash with 20 fields is more memory-efficient than 20 separate keys.

Just have a look at the documentation to see that Redis is far more than a cache engine: it’s an in-memory database really well suited to specific purposes, like leaderboards.