Anatomy of a WP8 app. Part 1: MPNS

10 September 2013

In this post, we take a look at how to work with raw push notifications in a Windows Phone 8 app with Azure Mobile Services. We go beyond the normal example code to examine some of the real-world issues you're likely to encounter.

Background

The background to the information presented here comes from experience gained using the Microsoft Push Notification Service (MPNS) as part of the development of roundup, our latest Windows Phone 8 app. In a nutshell, the app allows one person (the "inviter") to share their location with others ("invitees"). Inviter and invitees can see each other on the app's map. Location data is exchanged between devices using the MPNS, via Azure Mobile Services.

When I started the design and development of roundup in early May 2013, I knew immediately that the MPNS would be a key technology, which I would use to broadcast location and other information updates between devices. Of course, you always come away at the end of a project knowing more than when you started. That's inevitable. With MPNS, it all seemed rather straightforward at first. Certainly, all the documentation and sample code suggest that you won't encounter many problems. However, my use of MPNS to send raw notifications from an Azure Mobile Services (AMS) service highlighted quite a few areas where things aren't quite as simple as they at first seem.

So, my purpose in writing up my experiences with MPNS is to provide a useful reference for other developers who are tackling similar development tasks.

This post will cover the following topics:

Overview of MPNS. What it is, and what problems it solves

MPNS solves the problems of discoverability, inter-device communication and update polling.

MPNS exists because, as a developer, you can't (easily) send data directly from one device to another. To start with, you'd need some mechanism for devices to register their IP addresses with a cloud-based service, which starts to sound like the MPNS. It provides a standardized way of pushing data from the cloud to a device. Without the MPNS, apps would have to periodically poll a cloud service to request updated data (which would have serious implications for battery life, cellular data usage, etc.).

In order to make use of MPNS, the developer writes a short piece of code that requests a communication Channel URI from the MPNS. It then gives that URI to the app's cloud service (e.g. AMS, or a web service). Whenever the cloud service wants to push data to the device, it makes a request to the MPNS and provides the device's channel URI. The MPNS handles the actual sending of the notification (e.g. a tile update, a toast message, or a raw notification in the form of a JSON object).
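For context, here is a rough sketch of what such a request might look like if the cloud service built it by hand rather than going through AMS. The buildRawNotificationRequest helper and the channel URI are hypothetical. Only the X-NotificationClass header is shown (as I understand it, a value of 3 asks for immediate delivery of a raw notification); a real request would need further headers. Nothing is actually sent here; the function only describes the request.

```javascript
// Sketch only: describes (but does not send) the HTTP request a cloud service
// might make to the MPNS for a raw notification.
// buildRawNotificationRequest is a hypothetical helper, not part of any SDK.
function buildRawNotificationRequest(channelUri, payload) {
    // Raw payloads are app-defined; here we assume a JSON string
    var body = typeof payload === "string" ? payload : JSON.stringify(payload);

    return {
        method: "POST",
        url: channelUri,  // The channel URI the device registered earlier
        headers: {
            // 3 = raw notification, deliver immediately
            "X-NotificationClass": "3"
        },
        body: body
    };
}

// The channel URI below is made up for illustration
var rawRequest = buildRawNotificationRequest(
    "https://example.invalid/channel/abc123",
    { SessionId: 7 });
console.log(rawRequest.method + " " + rawRequest.url);
```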

The following diagram shows the arrangement:

The following references provide a good background on using MPNS with AMS:

Using MPNS in your app

So, the app code essentially has two tasks:

  1. Get the channel URI and register it with the cloud service
  2. Handle incoming notifications

The following code shows how to tackle the first of these tasks:

public void Connect()
{
    try
    {
        _httpNotificationChannel = HttpNotificationChannel.Find(ChannelName);
        if(_httpNotificationChannel == null)
        {
            // Request a new channel uri from the MPNS
            _httpNotificationChannel = new HttpNotificationChannel(ChannelName);
            SubscribeToChannelEvents();
            _httpNotificationChannel.Open();  
        }
        else
        {
            // The channel is already open, just re-register for events
            ChannelUri = _httpNotificationChannel.ChannelUri; 
            SubscribeToChannelEvents();
        }
    }
    catch(Exception ex)
    {
        // Log the error
    }
}

private void SubscribeToChannelEvents()
{
    _httpNotificationChannel.ChannelUriUpdated += OnChannelUriUpdated;
    _httpNotificationChannel.HttpNotificationReceived += OnHttpNotificationReceived;
    _httpNotificationChannel.ErrorOccurred += OnChannelErrorOccurred;
    _httpNotificationChannel.ConnectionStatusChanged += OnConnectionStatusChanged;
}

public void OnChannelUriUpdated(object sender, NotificationChannelUriEventArgs args)
{
    // This handler is called when the MPNS sends us our channel uri
    try
    {
        ChannelUri = args.ChannelUri;  // Save the channel uri in a property
    }
    catch(Exception ex)
    {
        // Log the error                                
    }
}

Once we have the channel URI, we can send it to our cloud service (in this case, AMS) as follows:

public async Task<bool> UpdateChannelUriAsync(int sessionId, string channel)
{
    // In this example, the AMS table we will update is named "Session"
    var session = new Session
    {
        id = sessionId,
        Timestamp = DateTime.UtcNow,
        Channel = channel,  // The new channel uri
        RequestMessageId = (int)RoundUpRequestMessage.UpdateInviterChannelUri
    };

    try
    {
        // Update the Channel column in the Session table (where the row id = sessionId)
        // Note that _mobileService = new MobileServiceClient("service uri", "key");
        await _mobileService.GetTable<Session>().UpdateAsync(session);
        return true;
    }
    catch(MobileServiceInvalidOperationException ex)
    {
        // Error handling
        return false;  // Without this, not all code paths return a value
    }
}

In order to use the AMS client API, simply add the WindowsAzure.MobileServices package to your WP8 project using NuGet. To do this, type the following from the Package Manager Console window in Visual Studio:

install-package WindowsAzure.MobileServices

Using MPNS with AMS

Using the MPNS with AMS is (generally) fairly straightforward. The easiest mechanism is to use the "CRUD" scripts that run whenever a Create, Read, Update or Delete operation is requested on a table. You can also hook into the MPNS if you have a custom API set up for your AMS service.

For example, here's a simplified extract from the update script that runs on AMS when the roundup app updates geographic location information for a device. In this example, we want to push the updated location data to another device (we'll refer to the two devices as the "invitee" and the "inviter").

function update(item, user, request) {

    // Update the Invitee table with the new location information
    
    request.execute({
        success: function() {

            // Notify the Inviter that an Invitee changed location
            // Here we create a JSON object which will form the raw
            // MPNS notification payload
                        
            var payloadMsg = "{" +
                "SessionId:" + item.sid.toString() + "," +
                "InviteeId:" + item.id.toString() + "," +
                "Latitude:" + item.Latitude.toString() + "," +
                "Longitude:" + item.Longitude.toString() + "}";

            // Send a raw notification to the inviter using the built-in push.mpns object.
            // The InviterChannelUri var holds the MPNS channel uri for the inviter.
            // This is read from the "Session" table (code not included for clarity)
            
            push.mpns.sendRaw(InviterChannelUri, { payload: payloadMsg }, {
                success: function() {

                    request.respond();  // Success
                    return;
                },
                error: function() {
                    
                    // Error handling
                }
            });
        }          
    });
}
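One thing worth noting about the script above: building the payload by string concatenation produces property names without quotes. Lenient parsers may accept that, but strict JSON requires quoted names. A minimal alternative sketch, assuming the same item fields as the update script, is to let JSON.stringify build the string. The buildLocationPayload helper and the sample values are hypothetical.

```javascript
// Hypothetical helper: builds the raw notification payload as strict JSON.
// The property names match those used in the update script above.
function buildLocationPayload(item) {
    return JSON.stringify({
        SessionId: item.sid,
        InviteeId: item.id,
        Latitude: item.Latitude,
        Longitude: item.Longitude
    });
}

// Sample values, made up for illustration
var payloadMsg = buildLocationPayload({ sid: 7, id: 42, Latitude: 51.5, Longitude: -0.12 });
console.log(payloadMsg);
// {"SessionId":7,"InviteeId":42,"Latitude":51.5,"Longitude":-0.12}
```

The resulting string can be passed to push.mpns.sendRaw as the payload in exactly the same way as the concatenated version.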

Waiting for a Channel URI

So far the code and techniques I've discussed are fairly standard. At this point we'll start looking a little more deeply, and consider some of the not-so-obvious issues the developer is likely to encounter.

One of the first MPNS "funnies" I noticed was that the amount of time you have to wait to receive the channel URI back varies from almost nothing to 30 seconds or so, sometimes longer. Clearly, in some apps this won't be a problem. But in roundup it's pretty critical. I ended up raising an event (in the class I created to encapsulate the MPNS) when the channel URI is available. I then handle that event in the View Model.

Receiving MPNS notifications

When the MPNS sends a notification to a device we receive it by handling the HttpNotificationChannel.HttpNotificationReceived event.

The following code shows a simplified example of how roundup handles incoming notifications:

public void OnHttpNotificationReceived(
    object sender, 
    HttpNotificationEventArgs httpNotificationEventArgs)
{
    try
    {
        // Read the notification as a stream of bytes

        byte[] bytes;
        using(var stream = httpNotificationEventArgs.Notification.Body)
        {
            bytes = new byte[stream.Length];
            stream.Read(bytes, 0, (int) stream.Length);
        }
        
        // Decode a string representation of the notification

        var msgText = Encoding.UTF8.GetString(bytes, 0, bytes.Length);
        
        // Deserialize the notification. It's a JSON representation of a RoundUpNotification
        // object (see the format of the JSON object sent in the AMS script)
        // We use the Newtonsoft.Json NuGet package to deserialize

        var notification = JsonConvert.DeserializeObject<RoundUpNotification>(msgText);
        
        // Update our list of received notifications

        Notifications.Add(notification);
        
        // Fire our PushNotification event (allows View Model to process the notification)

        OnPushNotification(new RoundUpNotificationEventArgs(notification));
    }
    catch(Exception ex)
    {
        // Log the error                                    
    }
}

An important point to note is that the HttpNotificationChannel.HttpNotificationReceived event arrives on a non-UI thread. This means that if you need to update any UI components as a result of the notification, you need to synchronize access to the UI thread as follows:

Deployment.Current.Dispatcher.BeginInvoke(() =>
{
    // Update the UI
});

How to handle MPNS as part of the WP8 lifecycle

Can I use HttpNotificationChannel.Find(ChannelName) to re-use my previous channel?

I noticed that if the app successfully got an MPNS channel URI, was deactivated/tombstoned and then reactivated, using HttpNotificationChannel.Find() to attempt to re-use the previous channel always failed. This wasn't a major problem, because the code given earlier simply creates a new channel request. However, I was curious as to why this was happening. It turns out that this re-use of a channel URI only works for tile and toast notifications, not for raw notifications like the ones I was using.

To quote an MSDN article I found on the topic: A push channel only persists after an app exits if it has been bound to Tile or toast notifications so that these notifications can still be received even though the app is not running. When using raw notifications, the push channel is removed when the app is not running.

What happens to notifications when a device is deactivated/tombstoned?

This is funny, although when I realized the implications of my misreading of the MPNS documentation, I didn't laugh very loudly at the time. After a quick reading of various MSDN sources, I was confident that the MPNS would handle the nasty details involved in delivering notifications to devices, whether they were actively running or deactivated/tombstoned. I could swear that I read somewhere that if the MPNS attempted to deliver a notification to an app that was not active, it would keep trying to deliver for a period of time.

I soon realized that this is not true. The truth (certainly for raw notifications) is that if an app/device is not active/available (even if only momentarily) when the MPNS attempts to deliver a notification, that notification is discarded and no further attempts are made to deliver the notification.

To solve this particular issue I made the following changes to the app:

  • Keep an AMS table of notifications that are sent (or, more accurately, requested to be sent) via the MPNS
  • When the app receives an MPNS notification, it adds it to a list of received notifications (and persists it to local, isolated storage)
  • When the app is activated, it requests from AMS the list of notifications sent (it's a short/small list) then compares what was sent with what it actually received. It then requests any "missing" notifications to be re-sent

This actually works very well, as the amount of data travelling across the wire is very small, and AMS read operations seem to reliably respond in a very timely manner.
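The comparison step is simple enough to sketch. The version below is written in JavaScript to match the AMS scripts (in roundup the comparison happens in the app itself), and it assumes each notification carries a unique id, which is an assumption made for illustration. The findMissingNotificationIds helper and the sample ids are hypothetical.

```javascript
// Hypothetical helper: returns the ids that were sent but never received.
// Assumes every notification has a unique id.
function findMissingNotificationIds(sentIds, receivedIds) {
    // Index the received ids for quick lookup
    var received = {};
    receivedIds.forEach(function (id) { received[id] = true; });

    // Keep only the sent ids that have no matching received entry
    return sentIds.filter(function (id) { return !received[id]; });
}

// Sample ids, made up for illustration
var missing = findMissingNotificationIds([101, 102, 103, 104], [101, 103]);
console.log(missing); // [ 102, 104 ]
```

The ids returned are the ones the app would then ask AMS to re-send.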

How to handle MPNS disconnections

I noticed this during initial testing. The MPNS (almost randomly, or so it seems) will disconnect your app's channel URI. Sometimes this can happen within a minute or two, sometimes (more usually) after about 20 minutes (which sounds much more like a standard "session timeout" value). The only way you can know this has happened is by handling the HttpNotificationChannel.ConnectionStatusChanged event:

private void OnConnectionStatusChanged(object sender, NotificationChannelConnectionEventArgs args)
{
    switch(args.ConnectionStatus)
    {
        case ChannelConnectionStatus.Connected:

            break;

        case ChannelConnectionStatus.Disconnected:

            // At this point you can either immediately request a new channel uri, 
            // or wait until you really need it, then request it (which is what I do)

            break;
    }
}

As the comments in the above code suggest, when MPNS disconnects your channel uri, you can either immediately request a new channel uri (see the Connect() method shown above), or you can wait until you really need an active channel, and then request it.

How to handle MPNS errors in client code

Handling exceptions in WP8 client code almost always has the following general format:

try
{
    await _mobileService.GetTable<Entity>().OperationAsync(entity instance);
}
catch(MobileServiceInvalidOperationException ex)
{
    // Handle exceptions raised in the AMS tier
    // Custom errors will be in ex.Message (see "How to handle MPNS errors in AMS scripts" below)

    Logger.Log(string.Format("{0} {1}", ex.Response.StatusCode, ex.Message));
}
catch(Exception ex)
{
    // Handle other exceptions
}

The dreaded MPNS "lock-out"

I encountered this problem once, on one of the two physical devices I used to test roundup. What happens is that, for reasons unknown, the MPNS stops responding to requests for a new channel uri from a particular device, and raises the HttpNotificationChannel.ErrorOccurred event. Other devices (and emulators) are unaffected.

The following code shows how I handle the error, along with comments:

private void OnChannelErrorOccurred(object sender, NotificationChannelErrorEventArgs args)
{
    Logger.Log(string.Format("Channel Error! {0}, {1}, {2} ",
        args.ErrorType, 
        args.Message, 
        args.ErrorCode));

    // Raise the ChannelError event so the view model can deal with it
    // (show a support message re date/time, battery, etc.)

    OnChannelError(args);

    // Notes on ChannelErrorType.ChannelOpenFailed (error code -2129589901).
    //
    // I saw this error during development on a Lumia 520 device 
    // (but not on a 920 device or the emulators). It happens after calling Open() 
    // on the HttpNotificationChannel in our Connect() method. It seems that 
    // particular devices suddenly get this problem (it's like their MPNS
    // connection status gets "locked" somehow). In my case, the only cure was  
    // resetting to factory defaults, then restoring settings from backup.
    // It's possible also that (according to Microsoft staff) this problem can be
    // caused by not having the right date/time set, low battery or having the
    // battery saver turned on. 
}

How to handle MPNS errors in AMS scripts

In AMS scripts you return errors (which surface as MobileServiceInvalidOperationException exceptions in client code) by modifying the HTTP response through a call to request.respond.

You should always return a recognized HTTP status code (e.g. 400 == Bad Request, 404 == not found, etc.). Optionally, you may return a custom error code or message by adding a parameter to request.respond, as shown below (note that AMS defines a global statusCodes object, which means you don't have to use literal values for response codes):

function insert(item, user, request) {

    // Raise an error if the channel URI is null or empty
    // (checking .length alone would throw if Channel were null)

    if (!item.Channel || item.Channel.length === 0) {
        request.respond(statusCodes.BAD_REQUEST, "ERR_CHANNEL_URI_NULL");
        return;
    }
:
:
}

In the above example, the "ERR_CHANNEL_URI_NULL" string will be available in MobileServiceInvalidOperationException.Message in your WP8 client code.

Distinguishing between types of MPNS errors

You should be aware that the AMS push.mpns object can raise a number of error conditions, some of which are "real" errors, and some that aren't. The following simplified snippet from one of the roundup AMS update scripts, shows the general format of how I handle MPNS errors:

function update(item, user, request) {

    :
    :

    var payloadMsg = "{" +
        "SessionId:" + item.sid.toString() + "," +
        "InviteeId:" + item.id.toString() + "," +
        :
        :
                                        
    push.mpns.sendRaw(row.Channel, { payload: payloadMsg }, {
                                            
        success: function() {

            request.respond();  // Notification sent OK (probably!)
        },
        error: function(err) {

            // Is it a real error (rather than a non-fatal one, which can happen when the device is off-line)?

            if (isRealMpnsError(err)) {

                logMpnsResult(err, "Update notification failed");

                // If using the throttled (500 notifications/device/app/day limit)
                // version of MPNS, was the limit exceeded?

                if (notificationLimitExceeded(err)) {
                    request.respond(statusCodes.BAD_REQUEST, "ERR_NOTIFICATION_LIMIT_EXCEEDED");
                    return;
                }

                // Additional error handling here as required
                if(err.statusCode == 412 ) {
                :
                :

                // Report success, it was a minor MPNS error (device deactivated, no network, etc.)

                request.respond();  
                :
                :
}

function logMpnsResult(result, message) {
    console.error(
        "MPNS failure: " + message +
        ", statusCode = " + result.statusCode +
        ", notificationStatus = " + result.notificationStatus +
        ", deviceConnectionStatus = " + result.deviceConnectionStatus +
        ", subscriptionStatus = " + result.subscriptionStatus);
}

function isRealMpnsError(mpnsResult) {

    // If the status code of an mpns operation is one of the following values we 
    // consider it not to be an error, for the reason given:
    //
    // 200 : Not an error. Op succeeded
    // 404 : Not found. Device is off-line (e.g. no network or the client app is 
    //     : not the active foreground app)
    // 412 : Precondition failed. An internal MPNS error.
    //     : Keep trying to send future messages
    // 503 : Service Unavailable. MPNS is not available. Try again in future
    //
    // The following status codes ARE errors. Scripts should check for these and 
    // send the client the appropriate custom error code:
    //
    // 400 : Bad request (ERR_BAD_REQUEST). Something was malformed
    // 401 : Unauthorized (ERR_UNAUTHORIZED). MPNS certificate errors
    // 405 : Not allowed (ERR_NOT_ALLOWED). Bad HTTP verb used
    // 406 : Not acceptable (ERR_NOTIFICATION_LIMIT_EXCEEDED). 
    //     : The device + app combination has exceeded the 500 notification limit

    if (mpnsResult.statusCode == 200 ||
        mpnsResult.statusCode == 404 ||
        mpnsResult.statusCode == 412 ||
        mpnsResult.statusCode == 503) return false;

    return true;
}

function notificationLimitExceeded(mpnsResult) {

    if (mpnsResult.statusCode == 406) return true;

    return false;
}
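The status codes listed in the comments above could also be kept in a small lookup table, so that classifying a result and choosing the custom error code happen in one place. This is only a sketch of that idea, not the code roundup uses. The customErrorCodeFor helper is hypothetical, as is the ERR_MPNS_UNKNOWN fallback for status codes not listed above.

```javascript
// Status codes treated as non-fatal (see the comments in isRealMpnsError)
var NON_FATAL_STATUS_CODES = [200, 404, 412, 503];

// Status codes treated as real errors, with their custom error codes
var CUSTOM_ERROR_CODES = {
    400: "ERR_BAD_REQUEST",
    401: "ERR_UNAUTHORIZED",
    405: "ERR_NOT_ALLOWED",
    406: "ERR_NOTIFICATION_LIMIT_EXCEEDED"
};

// Hypothetical helper: returns null for a non-fatal result, otherwise the
// custom error code to send back to the client.
function customErrorCodeFor(statusCode) {
    var code = Number(statusCode);  // Tolerate a status code passed as a string

    if (NON_FATAL_STATUS_CODES.indexOf(code) !== -1) return null;

    // ERR_MPNS_UNKNOWN is a made-up fallback for unlisted status codes
    return CUSTOM_ERROR_CODES[code] || "ERR_MPNS_UNKNOWN";
}

console.log(customErrorCodeFor(404)); // null
console.log(customErrorCodeFor(406)); // ERR_NOTIFICATION_LIMIT_EXCEEDED
```

In an error callback, a non-null result could be passed straight to request.respond(statusCodes.BAD_REQUEST, ...), while a null result would be reported as success.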

Did MPNS really send my notification?

Finally, a problem that I noticed during testing of roundup is that occasionally the MPNS will fail to deliver a notification. No errors are raised in either the client or AMS tier, the device is definitely active and has a good network connection, but the notification fails to arrive.

It seems relatively rare, but I was definitely able to spot it happening on a number of occasions. There's not much to be done about this. My roundup app handles it by (as noted earlier) keeping track of sent/received notifications, and requesting a re-send of anything that was missed or never arrived.

Conclusion

Looking back over what I've written, I'm aware that I've rather focused on errors and other "boundary-conditions" you're likely to encounter when using the MPNS and Azure Mobile Services with raw notifications. However, I must say that, in general, my experience with using the MPNS from AMS has been very positive. Once I understood some of the nuances of using it, I found that MPNS is actually pretty simple to work with.

Using Azure Mobile Services has been a really great experience. It's fast, reliable and very straightforward to work with, both in app client code, and in the cloud CRUD scripts. Initially, I nearly decided to create my own ASP.NET REST API-based service to handle my app's cloud requirements. However, I definitely made the right decision by going with an AMS-based solution. Having CRUD operation-attached scripts is such a neat idea, and it enables the developer to get something up-and-running really quickly. With the recent addition of scheduled tasks, AMS provides everything I need.