On June 27, 2024, Cloudflare’s widely used 1.1.1.1 public DNS resolver service was disrupted by a combination of Border Gateway Protocol (BGP) hijacking and a route leak. The incident left a small percentage of users worldwide either unable to reach the service or experiencing high latency.
Incident Details
- Date of Incident: June 27, 2024
- Affected Service: Cloudflare’s 1.1.1.1 public DNS resolver
- Cause: BGP hijacking and route leak
Technical Analysis
- BGP Hijacking: A Brazilian network operator, Eletronet (AS267613), mistakenly announced a route for the 1.1.1.1/32 prefix. Because routers prefer the most specific matching prefix, networks that accepted this announcement sent traffic for 1.1.1.1 toward Eletronet instead of Cloudflare’s servers, leading to disruptions.
- Route Leak: Another Brazilian network, Nova Rede de Telecomunicações (AS262504), erroneously leaked the 1.1.1.0/24 prefix upstream. This exacerbated the issue by spreading the incorrect routing information further.
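To illustrate why a single /32 announcement can pull traffic away from a legitimate /24, the following minimal sketch models longest-prefix-match route selection. The routing table and the `best_route` helper are hypothetical and exist only for illustration; real routers also apply local policy and other BGP attributes.

```python
import ipaddress


def best_route(destination, routes):
    """Return the (network, next_hop) pair with the longest matching prefix.

    `routes` is a list of (prefix_string, next_hop_label) tuples.
    Returns None when no route covers the destination.
    """
    addr = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network, next_hop))
    if not matches:
        return None
    # The most specific route (largest prefix length) wins, whoever announced it.
    return max(matches, key=lambda match: match[0].prefixlen)


# Hypothetical table for a router that accepted both announcements.
routes = [
    ("1.1.1.0/24", "AS13335 (Cloudflare)"),
    ("1.1.1.1/32", "AS267613 (hijacked route)"),
]

network, next_hop = best_route("1.1.1.1", routes)
print(network, "->", next_hop)
```

In this model the /32 is selected for 1.1.1.1, while a neighbouring address such as 1.1.1.2 still follows the /24.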
Impact
- Unreachability: The combined BGP hijacking and route leak caused immediate unreachability for 1.1.1.1 from over 300 networks across 70 countries.
- User Impact: The overall impact on users was minimal, with less than 1% of users in the UK and Germany affected. However, some users experienced complete unreachability, while others faced high latency.
- Affected Regions: Intermittent access issues were particularly visible in traffic from Germany and the United States.
Security Implications
- BGP Vulnerabilities: The incident highlights the vulnerabilities in the global BGP ecosystem, which remains susceptible to both hijacking and route leaks.
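- Importance of RPKI: Resource Public Key Infrastructure (RPKI) is a security framework that lets networks verify whether an autonomous system is authorized to originate a prefix, which helps prevent BGP hijacking. It is effective only where receiving networks validate routes and reject invalid ones. While Cloudflare had implemented RPKI, the complexity of this incident, involving both hijacking and leaking, resulted in a partial outage.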
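The origin check that RPKI enables can be sketched as follows. This is a simplified model in the spirit of RFC 6811 route origin validation, not a real validator. The `Roa` class, the `validate_origin` function, and the ROA contents (prefix 1.1.1.0/24, maximum length 24, origin AS13335) are assumptions chosen for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Roa:
    """A simplified Route Origin Authorization record."""
    prefix: str
    max_length: int
    origin_asn: int


def validate_origin(prefix, origin_asn, roas):
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Check the IP version first: subnet_of raises TypeError on a mismatch.
        if announced.version != roa_net.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.origin_asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matching none of them: reject-worthy.
    return "invalid" if covered else "not-found"


# Illustrative ROA set (assumed values, not taken from a live repository).
roas = [Roa(prefix="1.1.1.0/24", max_length=24, origin_asn=13335)]

print(validate_origin("1.1.1.1/32", 267613, roas))  # hijacked more-specific
print(validate_origin("1.1.1.0/24", 13335, roas))   # expected origin and length
```

Under these assumptions the /32 from AS267613 is classified as invalid, so a network enforcing origin validation would drop it. A leaked 1.1.1.0/24 that still carries the legitimate origin AS would pass this check, which suggests why origin validation alone does not stop route leaks.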
Mitigation and Recommendations
- Enhanced BGP Monitoring: Regularly monitor BGP announcements and routing tables to detect and respond to suspicious activities promptly.
- RPKI Implementation: Ensure full implementation of RPKI across all networks to prevent unauthorized BGP announcements.
- Incident Response: Develop and maintain robust incident response plans to quickly address BGP-related disruptions.
- Collaboration with Network Operators: Foster collaboration among network operators globally to improve the security and stability of BGP routing.
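As a rough illustration of the monitoring recommendation, the sketch below checks observed announcements against one expected prefix and origin. The `check_announcement` function and the sample observations are hypothetical, and the AS numbers 64500 and 64501 are documentation placeholders. A production monitor would consume a live BGP feed instead of a hard-coded list.

```python
import ipaddress


def check_announcement(prefix, as_path, expected_prefix, expected_origin):
    """Return a list of alert strings for one observed announcement.

    `as_path` is a non-empty list of AS numbers; the last entry is the origin.
    Announcements that do not overlap the expected prefix produce no alerts.
    """
    observed = ipaddress.ip_network(prefix)
    expected = ipaddress.ip_network(expected_prefix)
    if observed.version != expected.version or not observed.overlaps(expected):
        return []
    alerts = []
    origin = as_path[-1]
    if origin != expected_origin:
        alerts.append("unexpected origin AS%d" % origin)
    if observed.prefixlen > expected.prefixlen:
        alerts.append("more-specific prefix %s" % observed)
    return alerts


# Hypothetical observations: (prefix, AS path as seen by a collector).
observations = [
    ("1.1.1.0/24", [64500, 13335]),
    ("1.1.1.1/32", [64500, 267613]),
    ("192.0.2.0/24", [64500, 64501]),
]

for prefix, as_path in observations:
    for alert in check_announcement(prefix, as_path, "1.1.1.0/24", 13335):
        print(prefix, as_path, alert)
```

This check flags the hijack-style announcement but would not flag a leak that preserves the legitimate origin. Detecting leaks generally requires comparing the full AS path against the upstreams a prefix is expected to traverse.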
The disruption of Cloudflare’s 1.1.1.1 DNS service due to BGP hijacking and a route leak underscores the critical need for enhanced BGP security measures. While the overall user impact was limited, the incident highlights persistent vulnerabilities within the global internet routing infrastructure. By adopting comprehensive security frameworks like RPKI and improving BGP monitoring, the resilience of internet services can be significantly strengthened.