BlackBerry outage suggests IT infrastructure weakness

BlackBerry users are once again getting their CrackBerry fix after a network outage caused widespread service disruptions across North America. But according to one industry analyst, if Research in Motion wants to maintain its market lead, it had better re-evaluate its centralized infrastructure model.

Reports of service interruptions began Tuesday night at 8:00 pm EST, with users posting to RIM message boards complaining of service outages. Not all users were impacted though, and service appeared to be restored by Wednesday morning, as a backlog of messages was slowly worked through.

RIM declined a request for an interview, releasing a statement once service had been restored that confirmed the “service interruption” and noted voice service had not been impacted.

“Root cause is currently under review, but service for most customers was restored overnight and RIM is closely monitoring systems in order to maintain normal service levels,” said the statement.

The impact across Canada appears to have been mixed, with not all users losing service. Telus spokesperson Julie Smithers confirmed Telus clients across Canada were impacted by the outage.

“It looks like not all Telus clients were impacted, but we don’t have an idea of how many were at this point,” said Smithers, noting the carrier did not receive a lot of complaints from users. “I don’t have the exact call volume but it didn’t look like a huge increase in calls.”

The federal government is a large BlackBerry user, but a spokesperson for Public Works and Government Services Canada said it had received no reports of disruptions. Mark Marino, director of research and development with Toronto-based St. Joseph Print, also said his 20 to 30 BlackBerry users hadn’t reported any issues.

Not everyone was as lucky, however. David Maynor is thinking about dropping his BlackBerry service because of RIM’s poor response to the situation. He couldn’t get information on the outage from the BlackBerry Web site, or by calling his carrier’s support line, where “wait times were insane.” Instead, he had to turn to online BlackBerry discussion forums.

“I’m actually really mad about it. I’m mad enough to switch to another service,” said Maynor, chief technology officer with Errata Security Inc. in Atlanta. “Everyone makes mistakes but their cardinal sin is that they didn’t inform their users.”

It’s the concerns of users like Maynor that RIM is going to have to address, said Carmi Levy, senior analyst with London, Ont.-based Info-Tech Research Group. Levy said RIM has followed a centralized infrastructure model, and this outage may force the company to re-evaluate that strategy.

RIM operates two Network Operations Centres (NOCs) to serve its global customers, both located in Waterloo. Companies that provide BlackBerry service connect their mail servers to a BlackBerry Enterprise Solution server located on their premises, which in turn is linked to one of RIM’s NOCs. As a result, all RIM e-mail traffic is routed through Canada.

While the cause of the outage is still unknown, Levy said it appears clear RIM suffered a catastrophic failure in the NOC that handles messaging for the Western hemisphere, with whatever redundancy that was in place failing to maintain an acceptable quality of service.

The centralized infrastructure model allowed RIM to control costs as it grew. But with RIM adding 1.02 million subscribers in the quarter ending March 3rd, for a worldwide subscriber base of 8 million users, Levy said it’s time for the company to consider a more distributed architecture.

“I would think that they will ask themselves some very hard questions about whether this is the optimal infrastructure plan going forward,” said Levy. “I would expect there to be some concrete changes to that strategy, to send the message both to their user community and the investment community that they are making changes to avoid a repeat of this in the future.”
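The single-point-of-failure concern Levy describes can be illustrated with a small sketch. It is not a model of RIM’s actual systems: the NOC names, the region-to-NOC mapping, and the `route` function are all hypothetical, and the sketch ignores capacity, which matters in practice because a surviving NOC must be able to absorb the extra load.

```python
from dataclasses import dataclass


@dataclass
class Noc:
    """A hypothetical network operations centre that is either up or down."""
    name: str
    healthy: bool = True


def route(region, primary_by_region, nocs, allow_failover):
    """Return the name of the NOC that would carry a message, or None if none can."""
    primary = nocs[primary_by_region[region]]
    if primary.healthy:
        return primary.name
    if allow_failover:
        # Fall back to any other healthy NOC (capacity is not modelled here).
        for noc in nocs.values():
            if noc.healthy:
                return noc.name
    return None


def demo():
    # Assumed layout: two NOCs, each the primary for one hemisphere.
    nocs = {"noc_a": Noc("noc_a"), "noc_b": Noc("noc_b")}
    primary_by_region = {"western": "noc_a", "eastern": "noc_b"}

    # Simulate a failure of the NOC serving the western hemisphere.
    nocs["noc_a"].healthy = False

    return {
        "no_failover": route("western", primary_by_region, nocs, False),
        "with_failover": route("western", primary_by_region, nocs, True),
    }


if __name__ == "__main__":
    print(demo())
```

Without failover, western-hemisphere messages have nowhere to go once their primary NOC is down; with failover, they are carried by the remaining NOC, provided it has the spare capacity to take them.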

Craig Read heads the Toronto Wireless User Group, and while he has a BlackBerry he wasn’t impacted by the outage. He agreed, though, that RIM needs to have stronger parallel and redundancy systems in place.

“As their network keeps growing and they’re adding more users they’ll have to increase the capacity of their network and their servers,” said Read, noting RIM will also be facing market pressure from newer players like Nokia, Motorola and Palm. “As you grow a company it’s normal to have infrastructure problems and some QoS issues, and I’m sure the managers at RIM are quite intelligent enough to address those issues going forward.”
— With files from IDG News Service

Jeff Jedras
Jeff Jedras is a technology journalist with IT World Canada and a member of the IT Business team. He began his career in technology journalism in the late 1990s, covering the Ottawa technology sector for Silicon Valley North and the Ottawa Business Journal. He later covered the technology scene in Vancouver before joining IT World Canada in Toronto in 2005, covering enterprise IT for ComputerWorld Canada and the channel for Computer Dealer News. His writing has also appeared in the Vancouver Sun & the Ottawa Citizen.
