Facebook explained reasons for the global failure

Yesterday, Facebook, Instagram and WhatsApp were unavailable around the world for more than five hours. After fixing the problems, representatives of the social network explained the reasons for the global outage.

The failure was caused by a BGP routing issue. All services are now operating normally.

Amid the access problems, rumours of a hack and a colossal data leak began to spread online: the company had allegedly been hacked and the information of 1.5 billion Facebook users had been leaked. These rumours turned out to be false.

Crash

On October 4, at about 6 pm Moscow time, Facebook, Instagram and WhatsApp went offline around the world. Apps did not work, and browsers showed a DNS error when trying to connect to the sites. Attempts to connect directly to Facebook’s DNS servers also failed.
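The symptom described here, a DNS error before any connection is even attempted, can be told apart from other failures with a small check. The following is a minimal Python sketch, not a tool used by anyone in this story: the function name `diagnose` and the injectable `resolver` parameter are assumptions made for illustration, and only the standard `socket.getaddrinfo` call is real.

```python
import socket


def diagnose(hostname, port=443, resolver=socket.getaddrinfo):
    """Classify a lookup as a DNS failure or a successful resolution.

    `resolver` defaults to the standard library's getaddrinfo; it is a
    parameter only so the function can be exercised without a network.
    """
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # The name could not be resolved at all: this is the stage at
        # which browsers reported a DNS error during the outage.
        return ("dns_failure", str(exc))
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    addresses = sorted({info[4][0] for info in results})
    return ("resolved", addresses)
```

Calling `diagnose("example.com")` on a working connection would return `"resolved"` with a list of addresses, while a name whose DNS servers are unreachable would return `"dns_failure"` together with the resolver's error message.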

At first, the problem seemed to be related to DNS, but it later turned out that the situation was somewhat worse.

As experts including Giorgio Bonfiglio, head of Amazon AWS Technical Support, explained, Facebook’s routing prefixes suddenly disappeared from BGP routing tables, making it impossible to connect to any services hosted on those IP addresses.
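To illustrate why withdrawn prefixes make addresses unreachable, here is a toy Python model of a routing table with longest-prefix matching. It is a deliberately simplified sketch, not real BGP: the `RoutingTable` class is hypothetical, and the prefix and AS number come from ranges reserved for documentation rather than from Facebook’s actual network.

```python
import ipaddress


class RoutingTable:
    """Toy routing table: maps announced prefixes to a next-hop label."""

    def __init__(self):
        self._routes = {}  # ip_network -> next-hop label

    def announce(self, prefix, next_hop):
        # A network becomes reachable once its prefix is announced.
        self._routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        # Withdrawing a prefix removes the route; unknown prefixes are ignored.
        self._routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        # Longest-prefix match: the most specific covering prefix wins.
        addr = ipaddress.ip_address(address)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None  # no route: the address is unreachable
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]


table = RoutingTable()
table.announce("192.0.2.0/24", "AS64500")
print(table.lookup("192.0.2.53"))  # AS64500

table.withdraw("192.0.2.0/24")
print(table.lookup("192.0.2.53"))  # None
```

Once the only covering prefix is withdrawn, the lookup has nothing to match, which mirrors how servers can be running normally and still be unreachable from the rest of the Internet.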

As it turned out later, when the social networks started working again, the experts were right. Facebook issued an official statement saying that the outage was caused by an error made while changing the configuration of its backbone routers.

“Our engineering teams found that configuration changes on the backbone routers that coordinate network traffic between our data centres caused problems that interrupted communications. This disruption to network traffic had a cascading effect on our data centres, making our services unavailable,” wrote Santosh Janardhan, VP of Engineering and Infrastructure at Facebook.

The company also reported that the configuration issues affected its internal systems and tools, making it even more difficult to diagnose the problem and recover from it. It is worth noting that yesterday numerous anonymous sources in the media and on social networks reported that Facebook employees could not quickly get into their own data centres and access critical equipment, since the failure had caused real chaos inside the company itself.

To help readers understand what happened, Bleeping Computer explained that BGP (Border Gateway Protocol) is the routing protocol on which the entire Internet operates: it allows devices on one side of the world to connect to devices on the other using routes (prefixes).

To make it easier to understand: “BGP is similar to the ‘mail system’ of the Internet, facilitating the transfer of traffic from one (autonomous) system of networks to another. When a network wants to be seen on the Internet, they must communicate their routes or prefixes to the rest of the world. If these prefixes are removed, no one on the Internet knows how to connect to [Facebook] servers,” said Lawrence Abrams, head and founder of Bleeping Computer.

“Because Facebook configured its entire organization to use a domain registrar and DNS servers hosted on their own routing prefix, when the prefixes were removed, no one could connect to those IP addresses and the services running on them.”

Facebook developers have already apologized for what happened:

Anyone affected by our platform disruptions today: sorry. We know that billions of people and businesses around the world depend on our products and services and must stay connected. We appreciate your patience.

Interesting consequences

  • Pavel Durov said that amid the global shutdown of Facebook, Instagram and WhatsApp, Telegram’s audience grew by 70 million people in one day. Durov welcomed the new users and promised that Telegram would not fail when others do.
  • According to Haystack analysts, developer activity increased significantly during the five-hour outage: the number of pull requests rose by 32%.

Fake leak

During the global shutdown of Facebook and the company’s other services, real panic arose online. Many media outlets reported that the failure was no accident: the company had allegedly been hacked, and the personal data of one and a half billion users of the social network was now being sold on the darknet.

A huge dump (about 600 TB) that did recently appear on the RAID forum allegedly contains names, email addresses, phone numbers, IDs, genders and user locations.

The catch is that this dump went on sale at the end of September, and the data was apparently collected by scraping (that is, by collecting and aggregating data that was already public). Such databases appear on the black market regularly. Moreover, as Vice Motherboard noted, other members of the hacking forum have already accused the seller of fraud.

“Scammer. Sends only [data sample] 20 users. No more. Doesn’t accept escrow (moderator). But he expects you to believe in the [reality] of these 20 samples and send him $5,000. Instead of 1.5 billion, I think it has data from 150 users for social engineering,” writes one of the forum participants.

“Hahahaha 600 TB of Mark Zucker’s burger selfies :D,” laughs another RAID user.

Researchers at PrivacyAffairs report that the seller denies these allegations and continues to claim that the data is genuine, but, as many researchers and information security journalists note, there is little faith in this.

Let me remind you that I also reported that the information of 533 million Facebook users had leaked to the public.

By Vladimir Krasnogolovy

Vladimir is a technical specialist who loves giving qualified advice and tips on GridinSoft's products. He's available 24/7 to assist you with any question regarding internet security.