Wednesday, February 15, 2012

BGP Path Manipulation

Manipulating path selection criteria affects the inbound and outbound traffic policies of an AS. Unlike IGPs, BGP was never designed to choose the fastest routing path. It was designed to let administrators steer traffic by policy, for example to shift load toward or away from a particular link. The figure below shows a common situation that can result when BGP is implemented without any policy manipulation.

BGP Network without Policy Manipulation

With the default configuration for BGP path selection, bandwidth utilization can be uneven. RT1 in AS 65001 is using 60% of its outbound bandwidth to RT3 in AS 65002, but RT2 is using only 20% of its outbound bandwidth to RT4. No manipulation is needed if this utilization is acceptable to the organization. However, when a load that averages 60% bursts above 100% of the bandwidth, it can result in packet loss, higher latency, and higher CPU utilization. It makes sense to divert some of the traffic to another link that can reach the same destinations and is underutilized. Manipulating the local preference attribute can change the outbound path selection from AS 65001.

Deciding which traffic to manipulate starts with a traffic analysis of the outbound traffic: examine the most heavily visited IP addresses, web pages, and domain names using network management accounting records or firewall accounting information. Once this is known, route maps can set the local preference for particular destinations, so that the BGP routers within the AS forward packets for those destinations through different edge routers and different paths.

After some path manipulation has been performed, the outbound utilization for RT2 may increase from 20% to 40%, and the outbound utilization for RT1 may decrease from 60% to 40%. This change makes the outbound utilization on both links more balanced.

Just as there was an outbound utilization issue from AS 65001, a similar problem can occur for inbound utilization. When the inbound utilization towards RT2 is much higher than the inbound utilization towards RT1, the BGP MED attribute can manipulate how traffic enters AS 65001. MED is considered a recommendation, because it is evaluated later in the BGP path selection process than the local preference. The receiving AS can override it by manipulating other attributes that are considered before the MED is evaluated.

If an outbound or inbound load averages less than 50%, path manipulation might not be needed. However, as soon as a link starts to spike up to its capacity for an extended period of time, either more bandwidth is needed or path manipulation should be considered.

Without path manipulation, the most common criterion for path selection is the shortest AS path.
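
As a rough illustration of the selection order this post relies on, the Python sketch below compares paths on only four attributes: weight, local preference, AS path length, and MED. It is a simplification, not the full decision process: the real algorithm has more steps (origin, EBGP over IBGP, IGP metric, path age, router ID), and MED is normally compared only between paths from the same neighboring AS. The `Path` class and the `best_path` function are names invented for this example; the two sample paths are the ones RT2 holds for 172.16.4.0/24 in the tables later in this post.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Path:
    next_hop: str
    as_path: List[int]
    weight: int = 0        # default weight for learned routes
    local_pref: int = 100  # default local preference
    med: int = 0           # default MED


def preference_key(p: Path) -> Tuple[int, int, int, int]:
    # A smaller tuple means a better path. Weight and local preference
    # are negated because the highest value wins for those attributes.
    return (-p.weight, -p.local_pref, len(p.as_path), p.med)


def best_path(paths: List[Path]) -> Path:
    return min(paths, key=preference_key)


via_ext1 = Path("10.10.10.2", [65001, 65003, 65004])
via_rt3 = Path("11.11.11.2", [65002, 65004])

# With default attributes, the shorter AS path wins.
print(best_path([via_ext1, via_rt3]).next_hop)  # 11.11.11.2

# A higher local preference is compared before AS path length.
via_ext1.local_pref = 200
print(best_path([via_ext1, via_rt3]).next_hop)  # 10.10.10.2
```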


Manipulating the Local Preference Attribute

The local preference attribute is used only within an AS between IBGP routers to determine the preferred path to leave the AS to an outside destination. Local preference is not advertised to EBGP neighbors, and the BGP specification requires a router to ignore a local preference attribute received from an EBGP neighbor. The local preference for the routes advertised to IBGP neighbors is set to 100 by default; the highest value is preferred when comparing the local preferences for different routes.

The bgp default local-preference BGP router subcommand defines the local preference value for all EBGP routes received by the router on which this command is configured. The value ranges from 0 to 4,294,967,295, inclusive. The value is advertised to all IBGP neighbors.

Manipulating the default local preference can have an immediate and dramatic effect upon the traffic flow leaving an AS. Thorough traffic analysis should be performed to understand its effects prior to implementing it, as it can cause a particular outbound link to be overutilized while leaving other outbound links underutilized! Route maps are often the better tool, because they give a higher local preference only to certain destinations.

BGP Local Preference

The output below shows the BGP routing table on RT1 before manipulating the local preference attribute.
RT1 acts as a route reflector. All learned routes have a weight of 0 and a default local preference of 100. The neighbor next-hop-self command is not implemented on RT2 and RT3.
Note: RT2 and RT3 advertise to RT1 only their best paths to destinations that reside in other ASes.
RT1#sh ip bgp
BGP table version is 6, local router ID is 192.168.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i172.16.1.0/24    10.10.10.2               0    100      0 65001 i
*>i172.16.2.0/24    11.11.11.2               0    100      0 65002 i
*>i172.16.3.0/24    10.10.10.2               0    100      0 65001 65003 i
*>i172.16.4.0/24    11.11.11.2               0    100      0 65002 65004 i
*> 192.168.1.0      0.0.0.0                  0         32768 i
RT1#

There are 2 loop-free paths available for AS 65000 to reach 172.16.3.0/24 and 172.16.4.0/24.
RT2#sh ip bgp
BGP table version is 8, local router ID is 12.12.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 172.16.1.0/24    10.10.10.2               0             0 65001 i
*>i172.16.2.0/24    11.11.11.2               0    100      0 65002 i
*> 172.16.3.0/24    10.10.10.2                             0 65001 65003 i
*  172.16.4.0/24    10.10.10.2                             0 65001 65003 65004 i
*>i                 11.11.11.2               0    100      0 65002 65004 i
r>i192.168.1.0      12.12.12.1               0    100      0 i
RT2#
======================================================================
RT3#sh ip bgp
BGP table version is 8, local router ID is 13.13.13.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i172.16.1.0/24    10.10.10.2               0    100      0 65001 i
*> 172.16.2.0/24    11.11.11.2               0             0 65002 i
*  172.16.3.0/24    11.11.11.2                             0 65002 65004 65003 i
*>i                 10.10.10.2               0    100      0 65001 65003 i
*> 172.16.4.0/24    11.11.11.2                             0 65002 65004 i
r>i192.168.1.0      13.13.13.1               0    100      0 i
RT3#

The following route map configuration is implemented on RT2 to set the local preference for 172.16.4.0/24 to 200 (higher than the default of 100), so that all BGP routers in AS 65000 forward packets destined to 172.16.4.0/24 through RT2 > EXT1 instead of RT3 > EXT2.
!
router bgp 65000
 no synchronization
 neighbor 10.10.10.2 remote-as 65001
 neighbor 10.10.10.2 route-map SET-LOCAL_PREF in
 neighbor 12.12.12.1 remote-as 65000
 no auto-summary
!
route-map SET-LOCAL_PREF permit 10
 match ip address 1
 set local-preference 200
!
route-map SET-LOCAL_PREF permit 20
!
access-list 1 permit 172.16.4.0 0.0.0.255
!

The 1st route map permit statement matches 172.16.4.0/24 and sets its local preference. The 2nd route map permit statement, which has no match or set clauses, is similar to a permit any statement in an access list: since it has no match conditions, all other networks are permitted with their current settings without any manipulation.
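
The way the two permit statements are evaluated can be sketched in Python. This is a toy model under stated assumptions, not how IOS is implemented: routes are plain dictionaries, access list 1 is approximated by a containment check against 172.16.4.0/24, and `apply_route_map`, `matches_acl_1`, and `SET_LOCAL_PREF` are names made up for this example.

```python
import ipaddress

# Stand-in for: access-list 1 permit 172.16.4.0 0.0.0.255
ACL_1 = ipaddress.ip_network("172.16.4.0/24")


def matches_acl_1(route):
    network = ipaddress.ip_network(route["prefix"])
    return network.network_address in ACL_1


def apply_route_map(route, clauses):
    """Return the (possibly modified) route, or None if it is denied."""
    for match, set_values in clauses:
        # A clause without a match condition matches every route.
        if match is None or match(route):
            return {**route, **set_values}
    return None  # implicit deny when no clause matches


SET_LOCAL_PREF = [
    (matches_acl_1, {"local_pref": 200}),  # permit 10
    (None, {}),                            # permit 20: no match, no set
]

route_4 = {"prefix": "172.16.4.0/24", "local_pref": 100}
route_3 = {"prefix": "172.16.3.0/24", "local_pref": 100}
print(apply_route_map(route_4, SET_LOCAL_PREF))
# {'prefix': '172.16.4.0/24', 'local_pref': 200}
print(apply_route_map(route_3, SET_LOCAL_PREF))
# {'prefix': '172.16.3.0/24', 'local_pref': 100}
```

Dropping the second clause from `SET_LOCAL_PREF` would make the function return `None` for 172.16.3.0/24, which mirrors why the empty permit statement is needed to keep the other routes.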

The route map is applied as an inbound route map for EXT1, so that RT2 processes the routes received from EXT1 through the route map and sets the local preference accordingly. Implementing it as an outbound route map on RT2 towards RT1 does not work when RT2 has already selected the path through RT3 as the best path: RT2 advertises only its best path, so it never gets the chance to advertise the non-best path through EXT1.

The output below shows the BGP routing tables on all the BGP routers in AS 65000: RT1, RT2, and RT3. RT2 is now selected over RT3 as the exit point of the AS towards 172.16.4.0/24.
RT1#sh ip bgp
BGP table version is 7, local router ID is 192.168.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i172.16.1.0/24    10.10.10.2               0    100      0 65001 i
*>i172.16.2.0/24    11.11.11.2               0    100      0 65002 i
*>i172.16.3.0/24    10.10.10.2               0    100      0 65001 65003 i
*>i172.16.4.0/24    10.10.10.2               0    200      0 65001 65003 65004 i
*> 192.168.1.0      0.0.0.0                  0         32768 i
RT1#
================================================================================
RT2#sh ip bgp
BGP table version is 8, local router ID is 12.12.12.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 172.16.1.0/24    10.10.10.2               0             0 65001 i
*>i172.16.2.0/24    11.11.11.2               0    100      0 65002 i
*> 172.16.3.0/24    10.10.10.2                             0 65001 65003 i
*> 172.16.4.0/24    10.10.10.2                    200      0 65001 65003 65004 i
r>i192.168.1.0      12.12.12.1               0    100      0 i
RT2#
================================================================================
RT3#sh ip bgp
BGP table version is 8, local router ID is 13.13.13.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i172.16.1.0/24    10.10.10.2               0    100      0 65001 i
*> 172.16.2.0/24    11.11.11.2               0             0 65002 i
*  172.16.3.0/24    11.11.11.2                             0 65002 65004 65003 i
*>i                 10.10.10.2               0    100      0 65001 65003 i
*>i172.16.4.0/24    10.10.10.2               0    200      0 65001 65003 65004 i
*                   11.11.11.2                             0 65002 65004 i
r>i192.168.1.0      13.13.13.1               0    100      0 i
RT3#


Manipulating the MED Attribute

The BGP Multi-Exit Discriminator (MULTI_EXIT_DISC, MED) attribute is used by an AS that tries to influence the entry point into the AS from another AS when multiple paths exist between the 2 neighboring ASes. Since MED is evaluated quite late in the BGP best path selection process, it can be overridden by other attributes, in which case it has no influence on the outcome.

The MED for the routes originated from an AS and advertised to EBGP peers is set to 0 by default; the lowest value is preferred when comparing the MEDs for different routes.

The default-metric {metric} BGP router subcommand defines the metric value for routes redistributed into BGP through the redistribute router subcommand. The value ranges from 1 to 4294967295, inclusive. This value is actually the MED that is evaluated during the BGP best path selection process. This metric is not set if the received route already has a MED value. MEDs are carried into an AS and used there, but are not passed beyond the receiving AS, which means that MEDs are used only to influence traffic between 2 directly connected ASes.

As with the default local preference, manipulating the default metric can have an immediate and dramatic effect upon the traffic flow entering an AS. Thorough traffic analysis should be performed to understand its effects prior to implementing it, as it can cause a particular inbound link to be overutilized while leaving other inbound links underutilized! Route maps are often the better tool, because they give a higher or lower MED only to certain routes.

When an EBGP peer receives an update without a MED attribute, it must decide how to interpret it. Cisco IOS treats routes without the MED attribute as having a MED of 0, making a route that lacks the MED the most preferred. RFC 4271 specifies the same behavior, but an earlier IETF draft treated a missing MED as having a value of infinity, making a route that lacks the MED the least preferred. Implement the bgp bestpath med missing-as-worst BGP router subcommand to configure a Cisco IOS BGP router to follow that interpretation, in which routes without the MED attribute are treated as having the highest possible MED.
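
The effect of the two interpretations of a missing MED can be shown with a small Python sketch. The names `effective_med` and `preferred` are invented here, `None` stands for "no MED attribute", and `MAX_MED` is simply the top of the 32-bit range used as a stand-in for "worst"; the exact number a given IOS release substitutes is not something this sketch tries to reproduce.

```python
from typing import List, Optional

MAX_MED = 2**32 - 1  # largest value a 32-bit MED field can hold (stand-in for "worst")


def effective_med(med: Optional[int], missing_as_worst: bool = False) -> int:
    """MED used for comparison; None means the attribute is absent."""
    if med is not None:
        return med
    return MAX_MED if missing_as_worst else 0


def preferred(meds: List[Optional[int]], missing_as_worst: bool = False) -> int:
    """Return the index of the path with the lowest effective MED."""
    return min(range(len(meds)),
               key=lambda i: effective_med(meds[i], missing_as_worst))


meds = [50, None]  # path 0 carries MED 50, path 1 carries no MED
print(preferred(meds))                         # 1: missing MED counts as 0
print(preferred(meds, missing_as_worst=True))  # 0: missing MED counts as worst
```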

BGP MED

The intention of the route maps above is to designate RT2 as the preferred entry point to reach 192.168.1.0/24, and RT3 as the preferred entry point to reach 192.168.2.0/24 and 192.168.3.0/24. The networks will still be reachable through each router in case of a link or router failure.

MEDs are set outbound, when advertising to an EBGP neighbor. The 2nd route map permit statement, which has no match clause but only a set clause, is the permit any statement for the route map: all networks that reach this section of the route map are permitted and set to a MED of 200. Had the MED not been set to 200, those routes would have been advertised with the default MED of 0. Since 0 is less than 100, they would have become the preferred paths to reach the networks that reside in AS 65001.
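
The intended outcome of that policy can be checked with a short Python sketch. It only restates the MED values described above (100 for the preferred prefixes, 200 from the catch-all clause); the table `OUTBOUND_MED` and the function names are hypothetical, and the sketch assumes nothing earlier in the selection process breaks the tie first.

```python
DEFAULT_MED = 200  # set by the catch-all 2nd route map permit statement

# MED each edge router advertises per prefix; unlisted prefixes get DEFAULT_MED.
OUTBOUND_MED = {
    "RT2": {"192.168.1.0/24": 100},
    "RT3": {"192.168.2.0/24": 100, "192.168.3.0/24": 100},
}


def advertised_med(router: str, prefix: str) -> int:
    return OUTBOUND_MED[router].get(prefix, DEFAULT_MED)


def preferred_entry(prefix: str) -> str:
    # The neighboring AS prefers the advertisement with the lowest MED.
    return min(OUTBOUND_MED, key=lambda router: advertised_med(router, prefix))


for prefix in ("192.168.1.0/24", "192.168.2.0/24", "192.168.3.0/24"):
    print(prefix, "->", preferred_entry(prefix))
# 192.168.1.0/24 -> RT2
# 192.168.2.0/24 -> RT3
# 192.168.3.0/24 -> RT3
```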

The output below shows the BGP routing table on EXT3 before manipulating the MED attribute.
EXT3#sh ip bgp
BGP table version is 4, local router ID is 23.23.23.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i192.168.1.0      11.11.11.1               0    100      0 65001 i
*>i                 10.10.10.1               0    100      0 65001 i
* i192.168.2.0      11.11.11.1               0    100      0 65001 i
*>i                 10.10.10.1               0    100      0 65001 i
*>i192.168.3.0      10.10.10.1               0    100      0 65001 i
* i                 11.11.11.1               0    100      0 65001 i
EXT3#

The three outputs below show how the BGP routing table on EXT3 evolves to maintain only the best paths after the path manipulation configuration is implemented on RT2 and RT3.
EXT3#sh ip bgp
BGP table version is 1, local router ID is 23.23.23.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i192.168.1.0      10.10.10.1             100    100      0 65001 i
* i                 11.11.11.1             200    100      0 65001 i
* i192.168.2.0      10.10.10.1             200    100      0 65001 i
* i                 11.11.11.1             100    100      0 65001 i
* i192.168.3.0      10.10.10.1             200    100      0 65001 i
* i                 11.11.11.1             100    100      0 65001 i
EXT3#
EXT3#sh ip bgp
BGP table version is 4, local router ID is 23.23.23.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i192.168.1.0      10.10.10.1             100    100      0 65001 i
* i                 11.11.11.1             200    100      0 65001 i
* i192.168.2.0      10.10.10.1             200    100      0 65001 i
*>i                 11.11.11.1             100    100      0 65001 i
* i192.168.3.0      10.10.10.1             200    100      0 65001 i
*>i                 11.11.11.1             100    100      0 65001 i
EXT3#
EXT3#sh ip bgp
BGP table version is 4, local router ID is 23.23.23.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i192.168.1.0      10.10.10.1             100    100      0 65001 i
*>i192.168.2.0      11.11.11.1             100    100      0 65001 i
*>i192.168.3.0      11.11.11.1             100    100      0 65001 i
EXT3#


Manipulating the WEIGHT Attribute

The neighbor {ip-addr | peer-group-name} weight {weight} BGP router subcommand assigns a weight to all the routes learned through the specified neighbor. The value ranges from 0 to 65535, inclusive. Routes learned through a BGP peer have a default weight of 0, while local routes originated by the router itself have a default weight of 32768. WEIGHT is a Cisco-specific attribute that is local to the router and is never advertised to neighbors; the path with the highest weight is preferred.

The set weight {weight} route map action clause can also be used to manipulate the WEIGHT attribute for individual routes that are matched by access lists, prefix lists, and various attributes. The weights assigned using this route map configuration command override the weights assigned using the neighbor weight command.
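
The precedence between the defaults, the neighbor weight command, and a route map set weight clause can be summarized in a few lines of Python. This is a sketch of the rules stated above only; `effective_weight` and its parameters are names invented for the example.

```python
from typing import Optional

LOCAL_ORIGIN_WEIGHT = 32768   # default for routes originated by the router
LEARNED_DEFAULT_WEIGHT = 0    # default for routes learned from a peer


def effective_weight(locally_originated: bool = False,
                     neighbor_weight: Optional[int] = None,
                     route_map_weight: Optional[int] = None) -> int:
    """Weight of a route; a route map value overrides a neighbor value."""
    if locally_originated:
        return LOCAL_ORIGIN_WEIGHT
    for candidate in (route_map_weight, neighbor_weight):
        if candidate is not None:
            if not 0 <= candidate <= 65535:
                raise ValueError("weight must be between 0 and 65535")
            return candidate
    return LEARNED_DEFAULT_WEIGHT


print(effective_weight())                                           # 0
print(effective_weight(neighbor_weight=200))                        # 200
print(effective_weight(neighbor_weight=200, route_map_weight=500))  # 500
print(effective_weight(locally_originated=True))                    # 32768
```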


AS Path Prepending

AS path prepending manipulates the BGP AS_PATH attribute by inserting additional ASNs into EBGP updates, which makes the path look longer and therefore less preferred. Cisco IOS supports inbound and outbound AS path prepending on EBGP sessions. AS path prepending does not work on IBGP sessions.

Outbound AS path prepending can be used as the last resort mechanism to influence the BGP routing policies in BGP multihoming scenarios where all other methods, e.g. manipulating the local preference and MED attributes through BGP communities, cannot be implemented because the upstream ISPs do not support them.

AS Path Prepending

The output below shows the BGP routing table on RT1. Since both EBGP routes learned from RT2 and RT3 have the same AS path length, whichever route was received first (the oldest path) is selected as the best path and installed into the IP routing table.
RT1#sh ip bgp
BGP table version is 4, local router ID is 192.168.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.1.0      0.0.0.0                  0         32768 i
*  192.168.2.0      13.13.13.3                             0 300 400 i
*>                  12.12.12.2                             0 200 400 i
*  192.168.3.0      13.13.13.3                             0 300 400 i
*>                  12.12.12.2                             0 200 400 i
RT1#

The output below shows the BGP routing table on RT1, a router in an AS behind the upstream ISP routers for AS 400 (RT2 and RT3), after AS 400 implemented AS path prepending to influence the inbound traffic flow towards its internal networks, 192.168.2.0/24 and 192.168.3.0/24. Propagating the MED attribute across the whole network is impractical here, because MEDs are not passed beyond the directly connected AS. Note that local policy routing is also implemented on RT4 and RT5, mainly to influence the traffic flow for the outbound traffic originated from AS 400 (and from the routers themselves).
RT1#sh ip bgp
BGP table version is 5, local router ID is 192.168.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.1.0      0.0.0.0                  0         32768 i
*  192.168.2.0      13.13.13.3                             0 300 400 400 i
*>                  12.12.12.2                             0 200 400 i
*> 192.168.3.0      13.13.13.3                             0 300 400 i
*                   12.12.12.2                             0 200 400 400 i
RT1#

With AS path prepending and local policy routing implemented in the sample network scenario, the inbound and outbound traffic for 192.168.2.0/24 and 192.168.3.0/24 flows through the RT2-RT4 and RT3-RT5 links respectively.

Do not simply prepend any ASN when implementing AS path prepending; use only an ASN that is already in the AS path, e.g. the most recently added ASN or the local ASN.
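
The effect of prepending on path selection can be reproduced with a minimal Python sketch. It models an AS path as a list of ASNs and compares only path length, which is an assumption that holds when weight and local preference are equal, as in the RT1 tables above. The `prepend` helper is a name made up for this example, and it enforces the guideline of reusing an ASN that is already in the path.

```python
from typing import List


def prepend(as_path: List[int], asn: int, count: int = 1) -> List[int]:
    """Return a new AS path with `asn` inserted `count` times at the front."""
    if asn not in as_path:
        raise ValueError("prepend only an ASN that is already in the AS path")
    return [asn] * count + as_path


# Path for 192.168.2.0 as AS 400 would advertise it without prepending.
normal = [400]
padded = prepend(normal, 400)  # [400, 400], advertised towards AS 300

# Each upstream AS adds its own ASN before passing the route to RT1.
via_as200 = [200] + normal     # [200, 400]
via_as300 = [300] + padded     # [300, 400, 400]

best = min((via_as200, via_as300), key=len)
print(best)  # [200, 400]
```

Without the prepend, both paths have a length of 2 and the choice falls through to later tie-breakers such as the oldest path.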
