Internet Border Gateway Protocol (BGP)

June 04, 2017

This article explores the Internet Border Gateway Protocol (BGP), a standardized exterior gateway protocol designed for exchanging routing and reachability information among different Autonomous Systems (ASes) or Internet Service Providers (ISPs) on the Internet. Below, we detail the importance, capabilities, challenges, and solutions associated with this protocol.

2017 06 04

1. The Border Gateway Protocol and its Functions

In January 1989, at the 12th Internet Engineering Task Force (IETF) meeting, Len Bosack, Kirk Lougheed, and Yakov Rekhter created BGP with the design goal of developing a protocol that could offer policy control, loop detection, and the scalability needed to support hundreds of thousands of networks through address aggregation techniques.

BGP serves as an inter-Autonomous System routing protocol, facilitating connections between ISPs. For example, Hutchison and China Mobile exchange Network Layer Reachability Information (NLRI). In an environment where the Internet lacks centralized control, these entities must exchange NLRI to integrate their autonomous networks. Each controls its own equipment and uses different intra-autonomous system routing protocols; they need to cooperate to exchange information about IP addresses associated with their customers.

The primary function of a BGP-speaking system has evolved to address this engineering and research problem: enabling information exchange between autonomous networks without centralized control. Packets sent to a service provider require table look-ups to determine their next destination, which could be on a completely different network on the other side of China. BGP serves as the foundational architecture for the global TCP/IP Internet.

Another key role of BGP is managing commercial issues. For instance, China Mobile might not want Hutchison to send excessive traffic, as it would incur additional costs. Different protocols operate within these autonomous networks, and the "best route" may differ depending on contracts and policies. BGP allows for flexibility in defining what constitutes the best route for different parties.

2. The Operations of BGP

The current version of BGP is Version 4, published as RFC 4271 in 2006. Unlike pure distance vector or link-state algorithms, BGP employs a path vector algorithm. It uses path information stored in the AS_PATH attribute to avoid traditional routing issues. Routing tables are traversed to reach the target network, providing loop avoidance. BGP also supports address aggregation, thereby significantly reducing the size of Core Internet Routing Tables.

When one Internet path fails, BGP offers network stability, enabling routers to quickly adapt and reroute packets. Each BGP router maintains a standard routing table used in conjunction with the Routing Information Base (RIB), continually updated as changes occur.

BGP updates routing table information only when changes occur. It lacks an automatic discovery mechanism, meaning peer connections must be established manually. The protocol uses an incremental update strategy to conserve bandwidth and processing power, relying on TCP for reliable transport.

3. Examples to Illustrate How ASes Can Learn About Internet Reachability

Consider a scenario with five ASes identified by unique 32-bit Autonomous System Numbers (ASNs), as shown below:

6700c 1dm0stkotj5qm6tebl lf9a

BGP enables routers within these ASes to learn multiple paths via internal and external BGP speakers. It selects the best path and installs it in the RIB. When a customer in the AS104 network wants to send data to the AS100 network, BGP helps routers within AS104 decide which path to take, updating reachability information accordingly.

BGP also provides for the management of trust and distrust among different service providers and is outlined in RFC 4271. It allows networks with common routing policies to be uniquely identified and is widely used in Internet backbones.

BGP makes best-path decisions based on current reachability, hop counts, and other path attributes. It can be configured to communicate an organization's routing preferences and has a mechanism for defining arbitrary tags, known as communities, to control route advertisement behavior by mutual agreement among peers.

4. BGP Packet Formats and Field Functions

BGP messages are transmitted over TCP connections. A message undergoes processing only after it is completely received. The maximum message size is 4096 octets, whereas the smallest permissible message consists of a 19-octet header without any data. Below, we highlight the functions of some of the fields:

4.1 Message Header Format

Marker: This 16-octet field is included for compatibility and must be set to all ones.

Length: This 2-octet unsigned integer represents the total length of the message, including the header, in octets. It helps in locating the Marker field of the next message in the TCP stream. The field value must always be greater than 19 and less than 4096. Padding with extra data after the message is prohibited; thus, the field must contain the smallest required value.

Type: This 1-octet unsigned integer specifies the message's type code. The type codes are: 1 — Open, 2 — Update, 3 — Notification, 4 — Keepalive.

4.2 Open Message Format

After establishing a TCP connection, both sides send an Open message as the first message. If the Open message is accepted, a Keepalive message confirming the Open is sent in response.

Version: This 1-octet unsigned integer indicates the protocol version number of the message.

My Autonomous System: This 2-octet unsigned integer specifies the sender's AS number.

Hold Time: This 2-octet unsigned integer suggests a value for the Hold Timer in seconds. Upon receiving an Open message, a BGP speaker calculates the hold timer by taking the lesser of its configured hold time and the received hold time. This time must be either 0 or at least 3 seconds. Connections may be rejected based on this time value.

BGP Identifier: This 4-octet unsigned integer identifies the sender's BGP Identifier. The value is determined at startup and remains consistent across all local interfaces and BGP peers.

Opt Param Len: This 1-octet unsigned integer shows the total length of the Optional Parameters field in octets. A zero value indicates that no Optional Parameters are present.

Optional Parameters (variable): This field contains a list of optional parameters, each encoded as follows:

  • Parameter Type: 1-octet field identifying individual parameters.
  • Parameter Length: 1-octet field specifying the length of the Parameter Value field in octets.
  • Parameter Value (variable): Interpreted based on the Parameter Type field's value.

The Open message's minimum length, including the header, is 29 octets.

4.3 Update Message Format

This format is used to exchange routing information between BGP peers, helping to build a graph that represents the relationships among various Autonomous Systems (AS). It identifies and eliminates routing loops and other anomalies in inter-AS routing.

An Update message serves to advertise feasible routes with common path attributes or to withdraw multiple unfeasible routes. It may both advertise a feasible route and withdraw multiple unfeasible routes simultaneously.

Withdrawn Routes Length (2 octets): Indicates the total length of the Withdrawn Routes fields; a value of 0 implies no routes are being withdrawn.

Withdrawn Routes (variable): Contains a list of IP address prefixes of the routes being withdrawn.

Length (1 octet): Specifies the length, in bits, of the IP address prefix; a value of 0 matches all IP addresses.

Prefix (variable): Contains an IP address prefix and the minimum number of trailing bits needed to align the field's end on an octet boundary.

Total Path Attributes Length (2 octets): Indicates the total length of the Path Attributes fields in octets. A value of zero signifies that neither the NLRI nor the Path Attribute fields are present.

Path Attributes (variable): A triplet consisting of <attribute type, attribute length, attribute value>. The attribute type is a 2-octet field that includes:

  • Attr. Flags: Various bits are used for different purposes, such as optional bit, transitive bit, partial bit, and Extended Length bit.
  • Attr. Type Code: Codes like Origin, AS_PATH, NEXT_HOP, MULTI_EXIT_DISC, LOCAL_PREF, ATOMIC_AGGREGATE, and AGGREGATOR specify different types of path attributes.

Network Layer Reachability Information (variable): Contains a list of IP address prefixes. Its length is not explicitly encoded but can be calculated using the formula:

( \text{Updated message length} - 23 - \text{Total Path Attributes Length} - \text{Withdrawn Routes Length} )

  • "Updated message length" is the value encoded in the fixed-size BGP header.
  • "Total Path Attributes Length" and "Withdrawn Routes Length" are variable parts of the update message.
  • 23 is the combined length of the fixed-size BGP header, the Total Path Attribute Length field, and the Withdrawn Routes Length field.

The reachability information is encoded as one or more 2-tuples, each having:

Length (1 octet): Indicates the length, in bits, of the IP address prefix. A value of 0 matches all IP addresses, with the prefix itself consisting of zero octets.

4. Packet Formats in BGP and Highlighting Functions of Some Fields

BGP messages are sent over TCP connections. A message is processed only after it has been entirely received. The maximum message size is 4096 octets, while the smallest permissible message consists of a 19-octet header without a data portion. Below, we highlight the functions of some fields:

4.1 Message Header Format

  • Marker: A 16-octet field included for compatibility, which must be set to all ones.

  • Length: A 2-octet unsigned integer that indicates the total length of the message, including the header, in octets. This helps locate the Marker field of the next message in the TCP stream. The value must always be greater than 19 and smaller than 4096. Padding with extra data after the message is not allowed.

  • Type: A 1-octet unsigned integer indicating the message's type code. The type codes are: 1—Open, 2—Update, 3—Notification, 4—Keepalive.

4.2 Open Message Format

After establishing a TCP connection, the first message each side sends is an Open message. If the Open message is acceptable, a Keepalive message confirming the Open is sent in return.

  • Version: A 1-octet unsigned integer that indicates the message's protocol version number.

  • My Autonomous System: A 2-octet unsigned integer indicating the sender's AS number.

  • Hold Time: A 2-octet unsigned integer indicating the proposed value for the Hold Timer in seconds. Upon receiving an Open message, a BGP speaker must calculate the value of the Hold Timer using the lesser of its configured hold time and the hold time received. The Hold Timer value must be either 0 or at least 3 seconds.

  • BGP Identifier: A 4-octet unsigned integer indicating the sender's BGP Identifier, set to an IP address assigned to that BGP speaker.

  • Opt Param Len: A 1-octet unsigned integer indicating the total length of the Optional Parameters field in octets.

  • Optional Parameters (variable): A list of optional parameters, each encoded as a triplet: Parameter Type, Parameter Length, Parameter Value.

The minimum length of the Open message, including the header, is 29 octets.

4.3 Update Message Format

This message type transfers routing information between BGP peers. Update messages can advertise feasible routes or withdraw multiple unfeasible routes. An Update message can simultaneously advertise a feasible route and withdraw multiple unfeasible routes.

  • Withdrawn Routes Length (2 octets): Indicates the total length of Withdrawn Routes fields. A value of 0 means no routes are being withdrawn.

  • Withdrawn Routes (variable): Contains a list of IP address prefixes for routes being withdrawn.

  • Length (1 octet): Indicates the length in bits of the IP address prefix. A 0 means a prefix matching all IP addresses.

  • Prefix (variable): Contains an IP address prefix, followed by enough trailing bits to ensure the field ends on an octet boundary.

  • Total Path Attributes Length (2 octets): Indicates the total length of the Path Attributes fields in octets. A value of zero means that neither the NLRI nor the Path Attribute field is present.

  • Path Attributes (variable): A triple consisting of attribute type, attribute length, and attribute value.

Additional details are provided for individual attribute types like Attr. Flags, Attr. Type Code, and Network Layer Reachability Information.

The minimum length of an Update message is 23 octets: 19 for the fixed header + 2 for the withdrawn routes length + 2 for the total path attribute length.

4.4 Keepalive Message Format

BGP doesn't use any TCP-based keep-alive mechanisms to determine if peers are reachable. Instead, Keepalive messages are exchanged frequently enough to prevent the Hold Timer from expiring. The maximum reasonable time between Keepalive messages is one-third of the Hold Time interval. Keepalive messages should not be sent more often than once per second.

4.5 Notification Message Format

A Notification message is sent when an error is detected. The BGP connection is immediately closed after sending this message. The Notification message includes the following fields:

  • Error Code (1-octet): Indicates the type of error.

  • Error Subcode (1-octet): Provides additional information about the error.

  • Data (variable): Used for diagnosing the reason for the notification.

The minimum length of a Notification message is 21 octets.

5. Instability Problems in BGP and Proposed Solutions

Instability is defined as rapid changes in network reachability and topology information. Various issues such as software bugs, TCP attacks, or congestion can lead to loss of service, wasteful utilization of network resources, and degraded performance for Quality of Service (QoS)-demanding applications.

One classic problem in BGP is known as the "black-hole phenomenon." An incorrect manual configuration can cause a BGP router to improperly announce routes through its Autonomous System (AS), leading other BGP routers to update their routing tables accordingly. This results in a massive amount of traffic being forwarded to that AS, causing significant packet loss and ultimately, network congestion.

Another symptom of instability is the disappearance of an existing route, termed "flapping" if the route reappears shortly thereafter. Flapping occurs when a router sends a routing update and then withdraws it shortly afterward. This forces peer routers to propagate and then withdraw updates, affecting the performance of the network and potentially causing transient loss of connectivity.

Internal congestion within an AS can also lead to instability by causing the TCP connections between two BGP routers to time out.

A robust BGP implementation should ensure that instability in a subset of routes does not affect the router's advertisements or forwarding of stable routes. Instability should not be caused by peers with varying levels of stability or different processing speeds. The impact of unstable peers on the network's convergence time should be limited.

One proposed solution is route flap damping. This prevents heavy processing loads on routers, which could otherwise delay updates. Route flaps are exponentially decayed to mitigate denial-of-service attacks.

6. Security Concerns in BGP and Enhancements

BGP is susceptible to various attacks due to the lack of message integrity and authentication. Communications between BGP peers are vulnerable to both active and passive wiretapping. Unauthorized access to a router can result in the alteration of its software, configuration information, and routing databases, transforming the router into a hostile entity.

Another significant vulnerability stems from the underlying transport protocol, TCP. BGP is susceptible to the same types of attacks that plague TCP, such as SYN flooding, which can exhaust server resources like memory and bandwidth.

Attackers can also disrupt TCP connections to impersonate legitimate peer routers. Since the RFC-defined mechanism does not provide peer-entity authentication, these connections may be susceptible to replay attacks, leading to the delivery of spoofed BGP messages.

Attackers could also generate false route flaps to cause a victim’s prefix to be damped. To mitigate this, parameters should be adjusted to more conservative values, reducing risk and partially countering false flap attacks.

To improve security further, each protected peer should have a unique key for communication. Using the same key for multiple peers increases the risk of compromising one router and adversely affecting others.

Lastly, keys used for MAC computation should be rotated periodically, ideally every 90 days, to minimize the risks associated with key compromise or successful cryptanalytic attacks. Additionally, keys should be selected to be difficult for attackers to guess.


Profile picture

Victor Leung, who blog about business, technology and personal development. Happy to connect on LinkedIn