Friday, January 27, 2012

BGP Attributes

BGP routers send Update messages about destination networks. BGP Update messages contain information about BGP metrics, which are called path attributes – the characteristics of a route. The following are some terms defining how BGP attributes are implemented:
 An attribute is either well-known or optional, mandatory or discretionary, and transitive or non-transitive. An attribute may also be partial.
 Not all combinations of these characteristics are valid.
BGP path attributes fall into one of the following 4 categories:
- Well-known mandatory
- Well-known discretionary
- Optional transitive
- Optional non-transitive
 Only Optional Transitive attributes may be marked as partial.

A well-known attribute is an attribute that must be recognized by all BGP implementations. These attributes are propagated throughout BGP routers. Well-known attributes are either mandatory, which must be included in all BGP Update messages; or are discretionary, which may or may not be sent in a specific Update message. BGP routers must recognize and act upon the information in discretionary attributes when they are present in the Update messages. Well-known attributes are always transitive, therefore the Transitive bit is always set to 1.

Attributes that are not well-known are called optional attributes, which are either transitive or non-transitive. An optional attribute does not need to be supported by all BGP implementations; it could be a private attribute. If it is supported, it might be propagated throughout BGP routers.
An optional transitive attribute that is not supported by a BGP router should still be propagated to other BGP routers unchanged (preserved), and the attribute is marked as partial [1]; an optional non-transitive attribute is dropped by a BGP router that does not recognize or support it, and the attribute is not propagated to other BGP routers. New optional transitive attributes may be attached to the path by the originator or by any other AS in the path.
[1] A router that does not understand an optional transitive attribute but relays it sets the Partial bit to inform downstream routers that the attribute was not processed at every previous hop.
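The four categories correspond to the high-order bits of the flags octet that precedes every path attribute. The following is a minimal sketch, assuming the RFC 4271 bit layout (Optional 0x80, Transitive 0x40, Partial 0x20); the function name classify_flags is made up for illustration.

```python
OPTIONAL = 0x80    # 0 = well-known, 1 = optional
TRANSITIVE = 0x40  # always 1 for well-known attributes
PARTIAL = 0x20     # only allowed on optional transitive attributes


def classify_flags(flags: int) -> str:
    """Return the attribute category implied by a flags octet."""
    optional = bool(flags & OPTIONAL)
    transitive = bool(flags & TRANSITIVE)
    partial = bool(flags & PARTIAL)

    if not optional:
        if not transitive or partial:
            raise ValueError("invalid flags for a well-known attribute")
        # Mandatory vs. discretionary depends on the type code,
        # not on the flags, so it cannot be told apart here.
        return "well-known"
    if transitive:
        return "optional transitive (partial)" if partial else "optional transitive"
    if partial:
        raise ValueError("Partial bit set on a non-transitive attribute")
    return "optional non-transitive"


print(classify_flags(0xC0))  # flags of an optional transitive attribute
```

Invalid combinations raise an error, which mirrors the statement that not all combinations of these characteristics are valid.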

BGP has defined the following path attributes:
- Well-known mandatory attributes:
> ORIGIN (Type Code 1)
> AS_PATH (Type Code 2)
> NEXT_HOP (Type Code 3)
- Well-known discretionary attributes:
> LOCAL_PREF (Type Code 5)
> ATOMIC_AGGREGATE (Type Code 6)
- Optional transitive attributes:
> AGGREGATOR (Type Code 7)
> COMMUNITY (Type Code 8; originally Cisco-specific, now standardized in RFC 1997)
- Optional non-transitive attributes:
> MULTI_EXIT_DISC – MED (Type Code 4)
> ORIGINATOR_ID (Type Code 9; originally Cisco-specific, now standardized in RFC 4456)
> CLUSTER_LIST (Type Code 10; originally Cisco-specific, now standardized in RFC 4456)
The following sections discuss all BGP attributes above, except the ATOMIC_AGGREGATE and AGGREGATOR attributes.

Cisco has also defined the administrative weight BGP attribute – WEIGHT, which is a parameter that applies only to paths within an individual router and is not communicated to other routers.


The ORIGIN Attribute

The BGP ORIGIN attribute is a well-known mandatory attribute. It specifies the origin of a routing update or path information. When BGP has multiple routes, it uses the ORIGIN attribute as a factor in determining the preferred route. The ORIGIN can be one of the following 3 values:
IGP. The NLRI was learned via an interior routing protocol in the originating AS. BGP routes that originate within the AS and are advertised using network statements are assigned an origin of IGP. It is indicated with an i in the BGP table.
EGP. The NLRI was learned via EGP – Exterior Gateway Protocol. It is indicated with an e in the BGP table. EGP is considered a historic routing protocol and is not supported on the current Internet as it performs only classful routing and does not support CIDR.
Incomplete. The origin of the route is unknown or the NLRI was learned via some other means. An incomplete origin is indicated with a ? in the BGP table. Incomplete does not imply that the route is faulty, but rather that the information for determining the origin of the route is incomplete. This usually occurs when a route is redistributed into BGP, in which case there is no way to determine the original source of the route.


The AS_PATH Attribute

The BGP AS_PATH attribute is a well-known mandatory attribute. It lists the ASNs of the inter-AS path to reach a destination, with the most recent AS at the beginning of the list and the originating AS at the end. When a BGP router advertises a route to an EBGP peer, it prepends (puts at the beginning of the list) its own ASN to the AS_PATH of the route. No ASN is added when the route is advertised between IBGP peers within the same AS.

BGP AS_PATH Attribute

RT1 advertises 192.168.1.0/24 in AS 65001. When the route traverses through AS 65002, RT2 prepends its own ASN to it. When the route reaches RT3, it has 2 ASNs attached to it. From the perspective of RT3, the path to reach 192.168.1.0/24 is (65002, 65001).
RT3#sh ip bgp
BGP table version is 4, local router ID is 192.168.3.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.1.0      23.23.23.2                             0 65002 65001 i
*> 192.168.2.0      23.23.23.2               0             0 65002 i
*> 192.168.3.0      0.0.0.0                  0         32768 i
RT3#

The same concept applies for 192.168.2.0/24 and 192.168.3.0/24. The path to 192.168.3.0/24 from RT1 is (65002, 65003); RT2 traverses the path (65001) to reach 192.168.1.0/24 and the path (65003) to reach 192.168.3.0/24.

BGP routers use the AS_PATH attribute for loop avoidance. A BGP router does not accept a route received from an EBGP peer in which the AS_PATH attribute includes its own ASN. ASNs are prepended only when routes are advertised between EBGP peers. The AS_PATH attribute remains unchanged when routes are advertised between IBGP peers.
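The prepend-on-EBGP rule and the loop check can be sketched in a few lines of Python. This is only an illustration using the ASNs from the example above; the function names are hypothetical.

```python
def advertise(as_path, local_asn, ebgp):
    """Return the AS_PATH as it is sent to a peer."""
    # The local ASN is prepended on EBGP sessions only;
    # towards IBGP peers the AS_PATH is passed on unchanged.
    return [local_asn] + list(as_path) if ebgp else list(as_path)


def accept_from_ebgp(as_path, local_asn):
    """Loop check: reject a path that already contains the local ASN."""
    return local_asn not in as_path


# RT1 (AS 65001) originates the route, RT2 (AS 65002) passes it on to RT3.
at_rt2 = advertise([], 65001, ebgp=True)
at_rt3 = advertise(at_rt2, 65002, ebgp=True)
print(at_rt3)                           # the path as seen by RT3
print(accept_from_ebgp(at_rt3, 65003))  # RT3 accepts it
print(accept_from_ebgp(at_rt3, 65001))  # AS 65001 would reject it
```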

BGP routes are considered most desirable when they traverse the fewest possible ASes. As prefixes are advertised from AS to AS, the AS_PATH gets longer and longer. The shorter the AS_PATH, the more desirable it is. Usually, having multiple instances of the same ASN in the list does not make sense and defeats the purpose of the AS_PATH attribute. However, the following scenario shows a case in which adding multiple instances of an ASN to the AS_PATH is useful.

AS Path Prepending

Remember that outgoing BGP route advertisements directly influence incoming traffic. Normally, traffic originating from the Internet and destined to 192.168.1.0/24 passes through AS 300, as the AS_PATH of that route is shorter. But what if the link via AS 200 is the preferred path for incoming traffic for AS 100? The bandwidth of the links along the (400, 200, 100) path may be much higher than that of the links along the (300, 100) path; or AS 200 may be the primary provider and AS 300 only the backup provider, so that outgoing traffic is sent via AS 200 and it is desirable to receive incoming traffic through the same path.

AS 100 can influence its incoming traffic by changing the AS_PATH of its advertised route. By adding multiple instances of its own ASN to the AS_PATH of the route advertised to AS 300, AS 100 can make routers in the Internet think that the (400, 200, 100) path is the shorter path. The procedure of adding extra ASNs to the AS_PATH attribute is called AS Path Prepending. The extra ASNs are inserted at the beginning of the AS_PATH; for example, a route that AS 100 originates and prepends twice is advertised with an AS_PATH of (100, 100, 100).
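The effect of prepending on path length can be illustrated with the AS numbers of this scenario. The sketch below assumes that AS 100 prepends two extra copies of its ASN towards AS 300 (the count is an arbitrary choice) and that a remote router looks only at AS_PATH length; the helper names are hypothetical.

```python
def prepend(as_path, asn, times):
    """Insert `times` extra copies of `asn` at the beginning of the path."""
    return [asn] * times + list(as_path)


def shortest(paths):
    """Pick the path with the fewest ASNs (ties go to the first one listed)."""
    return min(paths, key=len)


via_as200 = [400, 200, 100]
via_as300 = [300, 100]
print(shortest([via_as200, via_as300]))   # without prepending: via AS 300

# AS 100 advertises (100, 100, 100) to AS 300, which adds its own ASN.
via_as300_prepended = [300] + prepend([100], 100, 2)
print(shortest([via_as200, via_as300_prepended]))  # now: via AS 200
```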


The NEXT_HOP Attribute

The BGP NEXT_HOP attribute is a well-known mandatory attribute. It specifies the next-hop IP address to be used to reach a destination. BGP, like IGPs, is a hop-by-hop routing protocol. However, unlike IGPs, BGP routes AS-by-AS, not router-by-router; and therefore the default next-hop is always toward the next AS. The next-hop address for a network from another AS is the IP address of an entry point of the next AS along the path to that destination network.

The IP address of the BGP NEXT_HOP attribute is not always the address of a neighboring router. The following rules apply when determining the IP address in the BGP NEXT_HOP attribute:
 If the advertising router and receiving router are in different ASes (EBGP peers), the NEXT_HOP is the IP address of the advertising EBGP peer's interface, not its BGP Router ID.
 If the advertising router and receiving router are in the same AS (IBGP peers), and the NLRI of the update refers to a destination within the same AS, the NEXT_HOP is the IP address of the IBGP peer that advertised the route. It is very important to understand the dependency of IBGP upon the IGP for the recursive route lookup process.
 If the advertising router and the receiving router are in the same AS (IBGP peers), and the NLRI of the update refers to a destination in a different AS, the NEXT_HOP is the IP address of the external peer from which the route was learned.

BGP NEXT_HOP Attribute

For EBGP, the NEXT_HOP is the IP address of the EBGP neighbor that advertised the update. RT1 advertises 192.168.1.0/24 to RT2 with a NEXT_HOP attribute of 12.12.12.1; and RT2 advertises 192.168.2.0/24 to RT1 with a NEXT_HOP attribute of 12.12.12.2. Therefore, RT1 uses 12.12.12.2 as the next-hop address to reach 192.168.2.0/24, and RT2 uses 12.12.12.1 as the next-hop address to reach 192.168.1.0/24.

For IBGP, the NEXT_HOP attribute advertised by EBGP is preserved when carried into IBGP. Due to this IBGP NEXT_HOP rule, RT2 advertises 192.168.1.0/24 to its IBGP peer RT3 with a NEXT_HOP of 12.12.12.1, the IP address of RT1 E0/1; not 23.23.23.2 as we might expect. Therefore, it is very important to make sure that RT3 is able to reach the 12.12.12.0/24 subnet, either via a static route or an IGP; otherwise, it will drop packets destined to 192.168.1.0/24, as it is unable to get to the next-hop address for the destination.
In fact, without an IGP running on RT2 and RT3 for RT2 to advertise the 12.12.12.0/24 subnet to RT3, the route to 192.168.1.0/24 does exist in the BGP table but not in the IP routing table on RT3.

IBGP routers often perform recursive route lookups to find out how to reach the BGP next-hop addresses using IGP entries in the IP routing table. RT3 learns a BGP update about network 192.168.1.0/24 from the route source 23.23.23.2 (RT2), with a next-hop of 12.12.12.1 (RT1). RT3 installs the route to 192.168.1.0/24 in the routing table with a next-hop of 12.12.12.1. Assuming that RT2 advertises network 12.12.12.0/24 to RT3 using its IGP, RT3 also installs the route to 12.12.12.0/24 in its IP routing table with a next-hop of 23.23.23.2. Remember that an IGP uses the source IP address of a routing update as the next-hop address, while BGP uses the NEXT_HOP attribute to indicate the next-hop address of a network. When RT3 forwards a packet to 192.168.1.1, it looks up the network in the routing table and finds a BGP route with a next-hop of 12.12.12.1. RT3 then performs a recursive route lookup for 12.12.12.1 and finds an IGP route to the 12.12.12.0/24 network with a next-hop of 23.23.23.2. Eventually RT3 forwards the packet destined to 192.168.1.1 to 23.23.23.2 (RT2).
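The recursive lookup on RT3 can be modelled roughly as follows. This is a simplified sketch, not how a router implements it: the routing table is a plain dictionary, a value of None marks a directly connected network, and the function name resolve is invented for the example.

```python
import ipaddress


def resolve(table, destination, max_depth=8):
    """Follow next hops until one lies on a directly connected network."""
    hop = destination
    for _ in range(max_depth):
        addr = ipaddress.ip_address(hop)
        matches = [net for net in table if addr in net]
        if not matches:
            return None  # next hop unreachable: the route cannot be used
        best = max(matches, key=lambda net: net.prefixlen)  # longest match
        if table[best] is None:
            return hop   # directly connected, so this is the forwarding hop
        hop = table[best]
    return None


rt3 = {
    ipaddress.ip_network("192.168.1.0/24"): "12.12.12.1",  # BGP route
    ipaddress.ip_network("12.12.12.0/24"): "23.23.23.2",   # IGP route
    ipaddress.ip_network("23.23.23.0/24"): None,           # connected
}
print(resolve(rt3, "192.168.1.1"))  # resolves to RT2
```

Removing the 12.12.12.0/24 entry makes resolve return None, which corresponds to the BGP route not being installed in the IP routing table.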

Instead of advertising the DMZ link (the shared network between ASes) through an IGP, the next-hop-self configuration can be implemented on AS border BGP routers to override the NEXT_HOP attribute with the IP addresses of the AS border routers. IBGP peers within the AS would then forward packets destined to external networks through the AS border routers.

When BGP routers reside on a common subnet, they set the NEXT_HOP to an appropriate address upon route advertisement to avoid inserting additional hops into the forwarding path. This feature is known as third-party next-hop and does not require any special configuration; it is often used in route server setups in ISP environments. It is impractical for a single router of a large backbone ISP to peer with every router of the other backbone ASes at the public peering points of a major Internet Exchange Point (IXP). Instead, route servers and route server clients are set up on a shared L2 broadcast medium with a common subnet, e.g. Ethernet. Route server clients from different ASes require only a single EBGP peering with the route server; the route server receives the routing information from its clients and relays it back to them. The route server clients (BGP routers from different ASes) then forward packets destined to the other ASes directly, using the appropriate next-hop addresses. Additionally, the BGP routers of an AS may peer with BGP routers in different ASes and then exchange external routes via IBGP.

BGP NEXT_HOP Attribute over Multi-Access Network

RT1 and RT2 are EBGP peers; RT2 and RT3 are OSPF neighbors in AS 65002.
RT2 can reach 192.168.3.0/24 via 123.123.123.3. RT2 sets 123.123.123.3 instead of its own IP address 123.123.123.2 as the NEXT_HOP when it advertises the route to RT1 via EBGP. As the routers reside on a common subnet, it is more efficient for RT1 to use RT3 as the next-hop to reach 192.168.3.0/24, rather than having an extra hop through RT2.

However, problems might occur when routers reside on a non-broadcast multi-access (NBMA) medium. For example, RT1, RT2, and RT3 in the figure below are connected via Frame Relay. RT2 can reach 192.168.3.0/24 via 123.123.123.3. RT2 sets 123.123.123.3 instead of its own IP address 123.123.123.2 as the NEXT_HOP when it advertises the route to RT1 via EBGP. Routing between 192.168.1.0/24 and 192.168.3.0/24 fails if RT1 and RT3 are unable to communicate directly, e.g. because the Frame Relay static mapping configuration is missing.

BGP NEXT_HOP Attribute over NBMA Medium

The solution to this problem is to implement the next-hop-self option on RT2 to override the third-party next-hop feature and advertise RT2 itself as the next-hop for routes that it advertises to RT1.


The LOCAL_PREF Attribute

The BGP Local Preference (LOCAL_PREF) attribute is a well-known discretionary attribute. It directs the routers within an AS towards the preferred path to exit the AS for a particular route. Local preference is an attribute that is configured and exchanged only between IBGP peers. The path with the highest local preference is preferred. The default local preference value is 100.

BGP LOCAL_PREF Attribute

RT1 and RT2 are IBGP peers. AS 65001 can reach 192.168.1.0/24 via multiple paths. The local preferences for 192.168.1.0/24 are set to 150 and 200 on RT1 and RT2 respectively. The local preference information is exchanged within AS 65001; all traffic in AS 65001 destined to 192.168.1.0/24 is forwarded through RT2 as the exit point for AS 65001.


The MULTI_EXIT_DISC (MED) Attribute

The BGP Multiple Exit Discriminator (MULTI_EXIT_DISC, MED) attribute is an optional non-transitive attribute. It is also called the metric in Cisco IOS. The MED attribute is known as the INTER_AS_METRIC attribute in BGP-2 and BGP-3.

The LOCAL_PREF attribute affects the outbound traffic leaving an AS, while the MED attribute influences inbound traffic entering an AS. The MED attribute informs EBGP peers of the preferred path to enter an AS when the AS has multiple ingress points.
Note: The MED is used as a suggestion to an external AS regarding the preferred entry point into the local AS, because the external AS that is receiving the MED might be using other BGP attributes for route selection.

The lowest MED value is preferred, as the MED is considered a metric; the lowest metric means the shortest distance and is therefore preferred.

Unlike local preference, the MED is carried in EBGP updates and exchanged between ASes. MEDs are carried into an AS and used there, but are not passed beyond the receiving AS; MEDs are therefore used only to influence traffic between 2 directly connected ASes. AS path prepending must be used to influence route preferences beyond the neighboring AS.

BGP MULTI_EXIT_DISC (MED) Attribute

RT1 and RT2 set the MED attribute to 200 and 150 respectively. When RT3 receives the route to 192.168.1.0/24 from RT1 and RT2, it selects RT2 as the best next-hop to enter into AS 65001.

Note that IBGP is used within the receiving AS to carry the MEDs between IBGP peers, so that all of them can see and select the preferred path towards the neighboring AS.

By default, MEDs are compared only for paths received from EBGP peers in the same neighboring AS; MEDs are not compared for paths to the same destination that are received from different ASes. The bgp always-compare-med BGP router subcommand configures a BGP router to compare the MEDs of paths received from EBGP peers in different ASes.
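The difference between the default behaviour and bgp always-compare-med can be sketched as below. This is a deliberately simplified model that looks at the MED step in isolation and ignores every other path attribute; the function name, the tuple layout, and AS 65009 in the second example are assumptions made for illustration.

```python
def med_survivors(paths, always_compare_med=False):
    """paths: list of (name, neighbor_asn, med) tuples.

    A path is eliminated when a comparable path has a lower MED.
    Paths are comparable if they come from the same neighboring AS,
    or unconditionally when always_compare_med is set.
    """
    survivors = []
    for name, asn, med in paths:
        beaten = any(
            other_med < med and (always_compare_med or other_asn == asn)
            for _, other_asn, other_med in paths
        )
        if not beaten:
            survivors.append(name)
    return survivors


# Both paths come from AS 65001, as in the figure: the lower MED wins.
print(med_survivors([("RT1", 65001, 200), ("RT2", 65001, 150)]))

# Paths from different ASes are compared only with always_compare_med.
mixed = [("A", 65001, 200), ("B", 65009, 150)]
print(med_survivors(mixed))
print(med_survivors(mixed, always_compare_med=True))
```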


The COMMUNITY Attribute

The BGP COMMUNITY attribute is an optional transitive attribute of variable length. It is designed to simplify routing policy enforcements. Originally a Cisco-specific attribute, it is now standardized in RFC 1997 – BGP Communities Attribute.

A community identifies a group of prefixes that share some common properties. BGP communities provide a way to filter incoming and outgoing routes. A BGP router can tag routes using communities, and allow other BGP routers to make decisions based upon the tag. Any BGP router can tag routes in both incoming and outgoing updates, and even upon redistribution. A BGP router may also modify the COMMUNITY of a received route according to the local policy.

Ex: An ISP may assign a particular COMMUNITY attribute upon all its customer routes. The ISP can then set the LOCAL_PREF and MED attributes based on the COMMUNITY value rather than on each individual prefix. This significantly simplifies the configuration of the routers.

When a router receives updates with communities but does not understand the concept of communities, it can just forward the updates to other routers. However, if the router understands them, it must be configured to propagate them; otherwise, communities are dropped by default. Communities are not restricted within an AS; they have no boundaries and can be propagated across ASes.

The BGP COMMUNITY attribute is a set of 4-byte (32-bit) values. RFC 1997 specifies the format as AA:NN, where AA is the ASN and NN is an administratively defined identifier. Communities are shown as decimal numbers by default. The ip bgp-community new-format global configuration command changes the display from the Cisco default format to the RFC 1997 format.
The best practice dictates that the AA should be the ASN of the network defining the community.

Ex: A route from AS 123 has a COMMUNITY identifier of 321. The COMMUNITY attribute in the AA:NN format is 123:321 and is represented in hex as a concatenation of the 2 numbers: 0x007B0141, where 123 = 0x007B and 321 = 0x0141. The RFCs use the hex representation, but Cisco IOS represents COMMUNITY attribute values in decimal.
In this case, 123:321 is represented as 8061249, the decimal equivalent of 0x007B0141.
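The conversion between the AA:NN format and the single 32-bit number is a 16-bit shift. The following sketch reproduces the 123:321 example; the function names are hypothetical.

```python
def encode_community(aa, nn):
    """Pack AA:NN into one 32-bit value (AA in the high 16 bits)."""
    if not (0 <= aa <= 0xFFFF and 0 <= nn <= 0xFFFF):
        raise ValueError("AA and NN must each fit in 16 bits")
    return (aa << 16) | nn


def decode_community(value):
    """Unpack a 32-bit community value into the AA:NN notation."""
    return f"{value >> 16}:{value & 0xFFFF}"


value = encode_community(123, 321)
print(value)                    # decimal form, as shown by default
print(f"0x{value:08X}")         # hex form
print(decode_community(value))  # back to AA:NN
```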

The COMMUNITY attribute values from 0 (0x00000000) through 65,535 (0x0000FFFF) and from 4,294,901,760 (0xFFFF0000) through 4,294,967,295 (0xFFFFFFFF) are reserved.
Out of the reserved ranges, RFC 1997 has defined the following well-known communities that have global significance:
INTERNET (0x00000000). All routes belong to this community by default. Received routes belonging to this community can be advertised freely.
NO_EXPORT (0xFFFFFF01 or 4,294,967,041). Received routes carrying this community attribute value cannot be advertised to EBGP peers – BGP speakers outside of the local AS; or outside a BGP confederation if a BGP confederation is configured.
NO_ADVERTISE (0xFFFFFF02 or 4,294,967,042). Received routes carrying this community attribute value cannot be advertised at all, to either EBGP or IBGP peers.
NO_EXPORT_SUBCONFED or LOCAL_AS (0xFFFFFF03 or 4,294,967,043). Received routes carrying this community attribute value cannot be advertised to EBGP peers, including peers in other member autonomous systems within a BGP confederation.
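The advertisement restrictions of these well-known communities can be summarized as a small decision function. This is only a sketch: the peer labels 'ibgp', 'confed-ebgp' (a peer in another member AS of the same confederation), and 'ebgp' are labels chosen for this example, not BGP terminology from the text.

```python
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02
NO_EXPORT_SUBCONFED = 0xFFFFFF03  # also called LOCAL_AS


def may_advertise(communities, peer):
    """Decide whether a received route may be advertised to a peer.

    peer is 'ibgp', 'confed-ebgp', or 'ebgp' (labels made up here).
    """
    if NO_ADVERTISE in communities:
        return False  # not advertised to any peer at all
    if NO_EXPORT_SUBCONFED in communities and peer in ("ebgp", "confed-ebgp"):
        return False  # stays inside the local (member) AS
    if NO_EXPORT in communities and peer == "ebgp":
        return False  # stays inside the AS or the confederation
    return True


print(may_advertise({NO_EXPORT}, "ibgp"))
print(may_advertise({NO_EXPORT}, "ebgp"))
```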

A prefix can be a member of more than one community, hence have multiple community attributes. A BGP router can act based on one, some, or all of the attributes; it has the option of adding and/or modifying community attributes before propagating routes on to other internal or external peers.


The ORIGINATOR_ID and CLUSTER_LIST Attributes

The ORIGINATOR_ID and CLUSTER_LIST attributes are optional non-transitive attributes. They are used to detect and prevent routing loops when implementing route reflection.

The 4-byte ORIGINATOR_ID is created by a route reflector to indicate the BGP Router ID of the originating router in the local AS. When the originator sees its own Router ID as the ORIGINATOR_ID of a received path, it knows that a loop has occurred, and ignores the route. Note: The originator is not necessarily the route reflector! It is actually the AS edge BGP router that received the EBGP route and advertised the route into the local AS through IBGP.

A cluster identifies the routers involved in route reflection; and a Cluster ID identifies a cluster. The CLUSTER_LIST lists a sequence of Cluster IDs which indicates the clusters that an update has passed through – the route reflection path. When a route reflector sees its local Cluster ID in the CLUSTER_LIST of a received path, it knows that a loop has occurred, and ignores the route.
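Both loop checks fit into one short function. The sketch below is an illustration with made-up Router IDs and Cluster IDs; the function name is hypothetical, and a router that is not a route reflector is modelled by passing cluster_id=None.

```python
def accept_reflected(router_id, cluster_id, originator_id, cluster_list):
    """Apply the ORIGINATOR_ID and CLUSTER_LIST loop checks."""
    if originator_id == router_id:
        return False  # the route came back to the router that originated it
    if cluster_id is not None and cluster_id in cluster_list:
        return False  # the route already passed through this cluster
    return True


# A route reflector in cluster 1.1.1.1 receives a route that has
# already been reflected by its own cluster.
print(accept_reflected("10.0.0.9", "1.1.1.1", "10.0.0.1", ["2.2.2.2", "1.1.1.1"]))
```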


The WEIGHT Attribute

The administrative weight attribute is a Cisco-proprietary parameter that is assigned upon paths for the path-selection process. The weight values for different paths are configured locally and effective on a router on a per-neighbor or per-route basis and are not propagated to other peers. Note that caution must be taken when manipulating weight as it only impacts path selection on the local router and hence may result in routing loops.

The WEIGHT attribute value ranges from 0 to 65,535 (inclusive). Paths that are originated by the local router have a weight of 32,768 by default, and all other paths have a weight of 0 by default. Paths with the highest weight are preferred when there are multiple paths to the same destination.

The LOCAL_PREF attribute is used when multiple routers provide multiple exit points. The WEIGHT attribute is used when a single router has multiple exit points out of an AS.

BGP WEIGHT Attribute

RT1 and RT3 learn about 172.16.0.0/16 from AS 65002 and propagate the update to RT4. RT4 has 2 paths to reach 172.16.0.0/16 and must decide which path is preferred. RT4 sets the weight for updates coming from RT1 and RT3 to 150 and 200 respectively. RT4 will then use RT3 as the next-hop to reach 172.16.0.0/16.
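RT4's decision can be expressed as picking the path with the highest weight. This is a minimal sketch that considers the weight alone and uses the default values stated above; the function names are hypothetical.

```python
def default_weight(locally_originated):
    """Default weight: 32768 for locally originated paths, otherwise 0."""
    return 32768 if locally_originated else 0


def best_by_weight(paths):
    """paths: list of (next_hop, weight) tuples; highest weight wins."""
    for _, weight in paths:
        if not 0 <= weight <= 65535:
            raise ValueError("weight must be between 0 and 65535")
    return max(paths, key=lambda path: path[1])[0]


# Weights configured on RT4 for updates from RT1 and RT3.
print(best_by_weight([("RT1", 150), ("RT3", 200)]))
```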
