Well, as most of you know, last week RIM suffered not 1 but 2 major outages at their NOC (Network Operations Center). These outages were felt so widely because BlackBerrys are designed with security as the paramount concern. BlackBerrys are, in my opinion, one of the most secure devices to use when dealing with sensitive emails and data transmissions (the US government and military use them, so I am guessing they are secure enough for you and me). The reason BlackBerrys are so secure is RIM’s security model, which routes all traffic through RIM’s NOC using encryption. This way any intercepted traffic is useless and is only decipherable by RIM’s servers and services. The problem is that if the NOC suffers an outage, every service that depends on it is down until the outage can be corrected.
How does this affect you? Well, you can’t get email, Live Messenger, Facebook or whatever service went down until it’s been corrected and brought back online! What!!!! Are you kidding me? (That was the reaction of some consumers to the outage last week.) As RIM starts to draw in a larger consumer base that is not business oriented and cares less about security, this reaction is almost expected. Most business users like myself dealt with the outage and carried on when service was back up. I live, breathe, and die (not yet) by my device. Like most of you I combined all my devices into one for convenience and I carry it with me at all times, so don’t get me wrong, I feel your pain when these services go down.
So the question was posed by Kyle at BlackBerry Cool (http://www.blackberrycool.com/2009/12/28/is-rim-in-need-of-more-redundancy-to-prevent-outages/): does RIM need more redundancy or more NOCs to prevent these outages? That got my brain turning, and you all know what happens when my brain starts thinking (no, smoke does not come from my ears): it comes up with an idea. So to that question from Kyle I answer... it depends on what caused the outage.
With the last 2 outages caused by a software programming issue with BlackBerry Messenger, I would say that no matter how many distributed NOCs or redundant servers or services you have, it would not have stopped this issue (unless this is just the story RIM gave the public because someone tripped over the cord to the network gear and didn’t see it for 8 hours). From what I have read, I believe that no matter how many NOCs you have, the same problem would have hit all of them, leaving all of them affected until it could be corrected.
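To make that point concrete, here is a toy model of why extra NOCs help with a hardware failure but not with a software bug that ships everywhere. It is only a sketch under my own assumptions: the version labels and the idea that every NOC runs the same release are made up for illustration and are not details RIM has published.

```python
def noc_is_up(software_version, hardware_ok, buggy_versions):
    """A NOC serves traffic only if its hardware works and its software is not a broken release."""
    return hardware_ok and software_version not in buggy_versions


def service_available(nocs, buggy_versions):
    """The service survives as long as at least one NOC is still up."""
    return any(noc_is_up(version, hw_ok, buggy_versions) for version, hw_ok in nocs)


# Three redundant NOCs, all running the same (hypothetical) release label.
nocs = [("release-A", True), ("release-A", True), ("release-A", True)]

# Case 1: one NOC loses hardware, no software bug. Redundancy saves the day.
one_down = [("release-A", False), ("release-A", True), ("release-A", True)]
survives_hardware_failure = service_available(one_down, buggy_versions=set())

# Case 2: the shared release turns out to be buggy. Every NOC fails together.
survives_software_bug = service_available(nocs, buggy_versions={"release-A"})
```

In this model `survives_hardware_failure` comes out `True` and `survives_software_bug` comes out `False`, which is the whole argument: redundancy only protects you from failures that do not hit every copy at once.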
Well, what is the solution then? One of the best solutions... Give the power to the people! Say what!?! Are you crazy, give people power over the NOC? Nope, give people control over how the BlackBerry communicates during an outage. If an outage occurred and consumers were given the choice to switch to an insecure network, most would say “hell yeah, like I care, I’m just planning a movie with my wife over IM+”. This of course would be preceded by an “Are you sure you want to switch to an insecure network?” prompt, just to make sure users know what they are about to do. Users could then continue on their merry way, but of course they would not be able to use the affected services, in this case BlackBerry Messenger or anything else that requires the NOC to operate. Once the NOC corrected the problem, a push update would go out to all BlackBerrys, switch them back to the secure RIM NOC connection, and notify users that the services were back up. I know you business readers are screaming “security!” in your minds right now. Don’t worry, you would be given the option through BES to block this feature on business devices. Those users would be forced to wait for the outage to be corrected before using services again.
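For the coders in the audience, the flow I am describing could be sketched roughly like this. This is my own illustration of the idea, not anything RIM or BES actually exposes: the `Device` class, the `bes_blocks_fallback` flag, and the method names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SECURE_NOC = "secure_noc"    # normal path: encrypted traffic through RIM's NOC
    DIRECT_INSECURE = "direct"   # hypothetical fallback that bypasses the NOC


@dataclass
class Device:
    bes_blocks_fallback: bool = False  # hypothetical BES policy for business devices
    route: Route = Route.SECURE_NOC
    noc_up: bool = True

    def on_noc_outage(self, user_confirms: bool) -> Route:
        """Handle an outage; user_confirms is the answer to the 'Are you sure?' prompt."""
        self.noc_up = False
        if self.bes_blocks_fallback:
            return self.route  # business device: no choice offered, just wait
        if user_confirms:
            self.route = Route.DIRECT_INSECURE
        return self.route

    def can_use(self, service_needs_noc: bool) -> bool:
        """On the fallback route, only services that do not need the NOC work."""
        if self.route is Route.SECURE_NOC:
            return self.noc_up
        return not service_needs_noc

    def on_noc_restored_push(self) -> Route:
        """The push update switches the device back to the secure route."""
        self.noc_up = True
        self.route = Route.SECURE_NOC
        return self.route


consumer = Device()
consumer.on_noc_outage(user_confirms=True)
im_plus_works = consumer.can_use(service_needs_noc=False)  # third-party IM keeps going
bbm_works = consumer.can_use(service_needs_noc=True)       # BlackBerry Messenger stays down
```

A device created with `Device(bes_blocks_fallback=True)` never leaves the secure route, so everything on it stays down until `on_noc_restored_push()` is called, which is exactly the trade-off a business would be choosing.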
From working at major NOCs myself, I know outages can come from upstream providers, network issues, or failed servers or services on those servers. I can assure you there is redundancy in RIM’s infrastructure and services, or else RIM could not have made it this far. I’d bet the farm on it!
© Caspan 2010