On the evening of Monday, October 4, thousands of users of Facebook and of its Instagram and WhatsApp services complained about outages. Reports of failures began to appear on the Downdetector portal around 6:30 p.m. Moscow time. Many users were unable to load the social networks' sites, send messages, log in or access the messenger.
The scale of the failure was global: about 126 thousand people reported problems with Facebook, about 100 thousand with Instagram and more than 35 thousand with WhatsApp. These figures cover only those who reported failures on the Downdetector portal; in reality, the number of people unable to use the company's services was higher.
The failure was later called Facebook's largest in 13 years. The company last experienced a longer outage in 2008, when its services were unavailable for a day.
As The New York Times journalist Ryan Mac found out, the problem was not limited to the platforms listed above. He noted that Facebook's internal tools and communications platforms, including Workplace, had also stopped working.
2. The cause of the failure was a malfunction in the servers
Facebook said on Twitter that it was aware of the problem. "We are working to return to normal as soon as possible, and we apologize for the inconvenience," the social network wrote.
The company has not yet named the exact cause of the outage. Nevertheless, some experts believe that the problems could have been caused by malfunctions in the DNS servers used by Facebook.
The DNS entries that systems use to find Facebook.com and Instagram.com may have been deleted from the global routing tables, explained Doug Madory, director of Internet analysis at Kentik Inc. "If Facebook's DNS records are gone, no one can find it," NPR quoted the expert as saying.
Igor Bederov, an expert at NTI's SafeNet engineering center, also believed the problem was caused by a DNS-related malfunction. "It lets users enter a text address instead of the site's ugly IP address. Right now access by IP address works, but access by domain name does not," Bederov said. As the sites began to recover, the cause of the outage was named as updates to the dynamic routing protocol.
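The distinction the experts draw between reaching a site by IP address and finding it by domain name can be illustrated with a short sketch. This is only a simulation under stated assumptions, not a reconstruction of what happened inside Facebook's network: the `records` table, the `fake_resolve` helper, the name `example.com` and the address `192.0.2.10` (a reserved documentation address) are all hypothetical stand-ins. Only `socket.gethostbyname` and `socket.gaierror` are real Python standard-library names.

```python
import socket


def check_site(hostname, resolve=socket.gethostbyname):
    """Try to turn a domain name into an IP address.

    Returns (True, ip) on success, or (False, message) when the lookup fails.
    """
    try:
        return True, resolve(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"DNS lookup failed: {exc}"


# Hypothetical DNS table standing in for a real resolver.
records = {"example.com": "192.0.2.10"}


def fake_resolve(hostname):
    """Simulated resolver: fails the same way a real lookup would."""
    if hostname not in records:
        raise socket.gaierror("name not found")
    return records[hostname]


before = check_site("example.com", resolve=fake_resolve)

# Simulate the record disappearing: the server at 192.0.2.10 is untouched,
# but nothing can translate the name into that address any more.
del records["example.com"]
after = check_site("example.com", resolve=fake_resolve)

print(before)
print(after)
```

Calling `check_site("facebook.com")` without the `resolve` argument would perform a real lookup through the system resolver; the simulated resolver is used here only so that the example behaves the same on every machine.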