RSHTOTF! – Turn off the NOCs

Well, as most of you know, last week RIM suffered not one but two major outages at their NOC (Network Operations Center). This happened because BlackBerrys are designed with security in mind as the top priority. BlackBerrys are, in my opinion, one of the most secure devices to use when dealing with sensitive emails and data transmissions (the US government and military use them, so I am guessing they are secure enough for you and me). The reason BlackBerrys are so secure is RIM’s security model, which routes all traffic through RIM’s NOC using encryption. This way all traffic is useless if intercepted and is only decipherable by RIM’s servers and services. The problem is that if the NOC suffers an outage, every service supplied by that NOC or its servers is down until the outage is corrected.

How does this affect you? Well, you can’t get email, Live Messenger, Facebook or whatever service went down until it’s been corrected and brought back online! What!!!! Are you kidding me? (That was the reaction of some consumers to the outage last week.) As RIM starts to draw in a larger consumer base that is not business oriented and cares less about security, this reaction is almost expected. Most business users like myself dealt with the outage and carried on when it was back up. I live, breathe, and die (not yet) by my device. Like most of you, I combined all my devices into one for convenience and I carry it with me at all times, so don’t get me wrong: I feel your pain when these services go down.

So the question was proposed by Kyle at BlackBerry Cool: does RIM need more redundancy or more NOCs to prevent these outages? That got my brain turning, and you all know what happens when my brain starts thinking (no, smoke does not come from my ears): it comes up with an idea. So to that question from Kyle I answer….. it depends on what caused the outage.

With the last two outages caused by a software programming issue with BlackBerry Messenger, I would say that no matter how many distributed NOCs or redundant servers or services you have, it would not have stopped this issue (unless this is the story RIM gave the public because someone tripped over the cord to the network gear and didn’t see it for 8 hours). From what I have read, I believe the same problem would have happened at every NOC, leaving all of them affected until it could be corrected.

Well, what is the solution then? One of the best solutions…. Give the power to the people! Say what!?! Are you crazy, give people power over the NOC? Nope, give people control over how the BlackBerry communicates during an outage. If an outage occurred and consumers were given the choice to switch to an insecure network, most would say “hell yeah, like I care, I’m just planning a movie with my wife over IM+”. This of course would be preceded by an “Are you sure you want to switch to an insecure network?” prompt, just to make sure the user knows what they are about to do. Users could then continue on their merry way but of course would not be able to use the affected services, in this case BlackBerry Messenger or anything else that requires the NOC to operate. Once the NOC corrected the problem, a push update would go out to all BlackBerrys, switch them back to the secure RIM NOC connection and notify them that the services were back up. I know you business readers are screaming security in your minds right now. Don’t worry, you would be given the option through BES to block this feature on business devices. These users would be forced to wait for the outage to be corrected before using services again.
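For the code-minded, here is a rough Python sketch of how that fallback could behave. To be clear, this is my own illustration and not anything RIM ships: the `Device` class, the `bes_blocks_fallback` flag, the `Route` names and the `NOC_ONLY_SERVICES` set are all made up to show the idea.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SECURE_NOC = "secure RIM NOC"
    DIRECT_INSECURE = "direct, insecure network"


# Hypothetical list of services that cannot work without the NOC.
NOC_ONLY_SERVICES = {"BlackBerry Messenger"}


@dataclass
class Device:
    # Hypothetical BES policy flag: True means IT has blocked the fallback.
    bes_blocks_fallback: bool = False
    route: Route = Route.SECURE_NOC

    def on_noc_outage(self, user_confirms: bool) -> Route:
        """Offer the fallback; switch only if policy and the user both allow it."""
        if self.bes_blocks_fallback:
            return self.route  # business device: stay put and wait for the NOC
        if user_confirms:  # answer to "Are you sure you want to switch?"
            self.route = Route.DIRECT_INSECURE
        return self.route

    def on_noc_restored(self) -> Route:
        """Stand-in for the push update that moves the device back to the NOC."""
        self.route = Route.SECURE_NOC
        return self.route

    def can_use(self, service: str, noc_up: bool) -> bool:
        """Report whether a service is reachable on the current route."""
        if service in NOC_ONLY_SERVICES:
            # NOC-only services need both a healthy NOC and the secure route.
            return noc_up and self.route is Route.SECURE_NOC
        if self.route is Route.SECURE_NOC:
            return noc_up  # everything on the secure route depends on the NOC
        return True  # direct route bypasses the NOC entirely
```

In this sketch a consumer device that confirms the prompt during an outage can reach IM+ over the direct route but still not BlackBerry Messenger, a BES-locked device stays on the secure route and waits, and `on_noc_restored()` puts everyone back on the NOC.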

From working at major NOCs myself, I know outages can come from upstream providers, network issues, failed servers or services on those servers. I can assure you there is redundancy in RIM’s infrastructure and services, or else RIM could not have made it this far. I bet the farm on it!

© Caspan 2010