Rogers lacked protections, redundancies that may have prevented 2022 outage: report

An independent report into the 2022 Rogers outage says the company lacked several protections and redundancies that could have either prevented the outage or ended it sooner.

The report delivered to the Canadian Radio-Television and Telecommunications Commission says that since the outage, the telecom company has implemented the changes needed to address the cause of the outage and improve network resiliency and reliability.

In a separate letter posted to its website Thursday, the CRTC confirmed that Rogers has also implemented all the report’s additional recommendations.

“We said we would fix this – we completed a full review of our networks, strengthened our network resiliency, implemented all the report recommendations, and today our networks are recognized as the most reliable by global benchmarking leaders,” said Rogers spokeswoman Sarah Schmidt in a statement.

The outage, which began in the early morning of July 8, 2022, lasted more than 24 hours and affected more than 12 million customers.

A configuration error during a network upgrade caused a flood of data to the core network routers, which crashed, according to the executive summary of the report by Xona Partners Inc. posted online Thursday.

The network failure could have been prevented if the core network routers had been configured with an overload limit, the report said.

Once the outage occurred, the report says it was prolonged by several factors.

The Rogers network operation centre and other critical remote infrastructure sites did not have redundant connectivity from other service providers, the report said, limiting access to critical equipment during the outage. Staff had to be physically dispatched to remote sites in order to access the affected routers, delaying recovery efforts.

In addition, Rogers staff didn’t have backup connectivity from alternative service providers, so they couldn’t communicate with each other until the company sent SIM cards from other service providers to its remote sites.

The report said that staff also didn’t initially have access to information like the routers’ error logs and were unable to pinpoint the root cause of the outage for around 14 hours. There had also been multiple configuration changes made that day. These two factors contributed to the root cause being initially misdiagnosed, the report said.

The measures taken by Rogers since the outage include addressing the critical deficiencies exposed by the outage, separating the IP core for its wireless and wireline networks, and improving the processes for change management and incident management, the report said.

The report made seven recommendations of additional measures Rogers could take to improve its network resiliency.

Among the recommendations, which Rogers has since implemented, are that the company test emergency roaming with other mobile network operators, develop a detailed root cause analysis for future outages, and expand the scope of incident management drills.

Rogers sent a letter to the CRTC on Jan. 17 outlining how it responded to the report’s recommendations of additional measures.

In the CRTC’s letter confirming those additional measures were implemented, the commission said that by July 4 next year, Rogers must report on whether the measures continue to address reliability issues, and on progress made in separating wireline and wireless core networks.

Rogers is partnering with Cisco in its work to split and build a new dedicated IP core, separating the two networks, said Schmidt. The company has also introduced new change controls that will limit the effects of “customer-impacting events,” she said, as well as “AI-based predictive simulation capabilities to strengthen our testing and monitoring.”

The report also included recommendations for all telecom network operators based on the “important lessons learned” from the outage. These include implementing router overload protection in the IP core and distribution networks; providing backup connectivity for the network operation centre, critical remote sites and critical staff; and simulating network failure and outage scenarios to uncover deficiencies.
