Tuesday, October 11, 2011

BGP Confederations

Confederation is another method to handle the scalability issue of full-mesh IBGP within an AS. It is based on the concept that an AS can be broken into multiple smaller sub-ASes. All the IBGP rules still apply inside each sub-AS, e.g., all the BGP routers must be fully meshed. EBGP must run between the sub-ASes as they have different ASNs. The ASNs could be chosen from the private AS range (64512 to 65534; 65535 is reserved) to avoid the hassle of applying for formal ASNs. Although EBGP is used to exchange routes between sub-ASes of a confederation, routing inside a confederation behaves like IBGP routing in a single AS: the NEXT_HOP, LOCAL_PREF, and MED information is preserved across the sub-ASes. To the outside world, a confederation still looks and behaves like a single AS or routing domain.
Note: Some texts refer to Sub-AS as Member-AS.
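
To make the scalability argument concrete, here is a minimal Python sketch that counts BGP sessions with and without a confederation. The function names and the 12-router example are illustrative assumptions, not figures taken from the topology in this post.

```python
def full_mesh_sessions(routers):
    """IBGP sessions needed to fully mesh `routers` BGP speakers: n(n-1)/2."""
    return routers * (routers - 1) // 2


def confederation_sessions(sub_as_sizes, inter_sub_as_sessions):
    """Full mesh inside every sub-AS plus the EBGP sessions between sub-ASes."""
    inside = sum(full_mesh_sessions(n) for n in sub_as_sizes)
    return inside + inter_sub_as_sessions


# Hypothetical AS with 12 BGP routers: kept as one AS, or split into
# 3 sub-ASes of 4 routers joined by 2 sub-AS-to-sub-AS sessions.
single_as = full_mesh_sessions(12)
confederation = confederation_sessions([4, 4, 4], 2)
print(single_as, confederation)
```

The saving grows with the size of the AS, since the full-mesh term is quadratic in the number of routers while each sub-AS only meshes its own members.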

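Without confederations, EBGP routes are preferred over IBGP routes in the BGP best path selection algorithm. Confederations introduce a new type of EBGP route between sub-ASes: the confederation external, or EIBGP, route. The preference order is commonly given as:
EBGP routes to outside the confederation > EIBGP routes > IBGP routes
Cisco IOS, however, treats confederation external and confederation internal routes alike as internal routes, as the route selection outputs at the end of this post show.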

Confederations can easily detect routing loops inside the whole AS because EBGP is run between the sub-ASes. The AS path list is the loop avoidance mechanism: it detects routing updates that leave a sub-AS and attempt to reenter the same sub-AS. Such an update is detected when the sub-AS sees its own sub-AS number listed in the AS path of the update.
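
The loop check can be sketched in a few lines of Python. This is a simplified model, not router code: the AS path is assumed to be a list of (segment type, ASN list) pairs, the function name is hypothetical, and only the AS_CONFED_SEQUENCE segment type used for sub-ASes is inspected.

```python
def reenters_sub_as(as_path, local_sub_as):
    """Return True if local_sub_as already appears in a confederation segment."""
    return any(
        seg_type == "AS_CONFED_SEQUENCE" and local_sub_as in asns
        for seg_type, asns in as_path
    )


# Hypothetical update whose path would be displayed as "(65002 65001) 1".
update = [("AS_CONFED_SEQUENCE", [65002, 65001]), ("AS_SEQUENCE", [1])]

print(reenters_sub_as(update, 65001))  # a router in sub-AS 65001 rejects it
print(reenters_sub_as(update, 65003))  # a router in another sub-AS accepts it
```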

The main drawbacks of confederations are that migrating from a non-confederation to a confederation design requires major reconfiguration of the routers and a major change to the logical topology, and that confederations do not provide the capabilities for implementing policies between sub-ASes, as the whole AS is still considered one entity. Additionally, since sub-ASes do not influence the overall AS path length (all paths inside a confederation have exactly the same AS path length), suboptimal routing through a confederation could occur unless the policies are tuned manually, for example with the local preference attribute.

Choosing and connecting the sub-ASes randomly inside a confederation will lead to problems. Unnecessary processing might occur as a sub-AS can end up receiving duplicate information from several sub-ASes along the same path. Additionally, suboptimal routing can occur as all paths inside the confederation have exactly the same AS path length. Experience has proven that a centralized confederation architecture, in which all sub-ASes exchange information with each other through a central backbone sub-AS and each remaining sub-AS interacts with only one other sub-AS (the backbone), produces uniform AS path lengths and route exchange within the confederation and results in the most optimal routing behavior.

Centralized Confederation Architecture


AS 3 is divided into 2 smaller sub-ASes – AS 65001 and AS 65002. OSPF is used as the IGP in each sub-AS. The OSPF in AS 65001 is running independently of the OSPF in AS 65002, which means that the OSPF area numbers used in AS 65001 can be reused in AS 65002 – an IGP in a sub-AS runs independently of IGPs in other sub-ASes.

RT3 has all its interfaces in OSPF area 1. RT3 is running EBGP with RT1 in AS 1 and IBGP with RT4 in AS 65001. Take note that RT3 advertises the DMZ link to AS 1 (13.13.13.0/24) through OSPF, eliminating the need for the next-hop-self command towards RT4, and defines interface Ethernet0/0 to AS 1 as a passive interface to avoid forming an OSPF neighborship. RT3 uses the bgp confederation identifier 3 BGP router subcommand to present itself to RT1 as being part of confederation 3. The AS Confederation Identifier is the externally visible ASN; it is the ASN used in OPEN messages sent to peers outside the confederation and advertised in the AS_PATH attribute.

RT4 is the sub-AS 65001 border router that is running EIBGP with RT7 in sub-AS 65002. RT4 is also running IBGP with RT3. RT4 is an OSPF area border router that has its interfaces in areas 0 and 1. RT4 has disabled OSPF processing on the link to RT7 using passive interface, so only EIBGP is running on that link. RT4 uses the bgp confederation peers 65002 BGP router subcommand to preserve all the attributes, e.g., local preference and next hop, when traversing the EIBGP session to AS 65002, which makes the confederation EBGP session with sub-AS 65002 look like an IBGP session. RT4 uses the next-hop-self command to set the next-hop address of routes advertised from RT4 to RT7 to RT4's IP address, 47.47.47.4. Without this command, the EBGP route from AS 1 (192.168.1.0/24) would be sent from RT4 to RT7 with the external next hop 13.13.13.1, which may be unreachable for routers in sub-AS 65002, depending upon the network design and implementation.
Tips: When implementing a confederation, include several extra internal AS numbers in the bgp confederation peers BGP router subcommand to ease the addition of new sub-ASes in the future without having to reconfigure all BGP routers throughout the network.

RT7 is also an area border router in areas 0 and 1. Areas 0 and 1 in AS 65002 are totally independent of areas 0 and 1 in AS 65001. The IGPs are shielded from each other by EBGP. Full-mesh IBGP sessions are implemented between RT5, RT6, and RT7 in sub-AS 65002.

RT6 is a border router for confederation 3. RT6 is running EBGP with RT2 in AS 2 and full-mesh IBGP with RT5 and RT7 in sub-AS 65002. RT6 has all its interfaces in OSPF area 0. Note that RT6 is not running OSPF on the external link to AS 2, and therefore RT6 must set the next hop of the EBGP routes it receives to itself before propagating them to RT5 and RT7.

The AS path regular expression below matches all the routes that originated locally within the confederation.
RT3#sh ip bgp regexp ^(\([0-9]*\))*$
BGP table version is 12, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 172.16.3.0/24    0.0.0.0                  0         32768 i
*>i172.16.4.0/24    34.34.34.4               0    100      0 i
*>i172.16.5.0/24    47.47.47.7               0    100      0 (65002) i
*>i172.16.6.0/24    47.47.47.7               0    100      0 (65002) i
*>i172.16.7.0/24    47.47.47.7               0    100      0 (65002) i
RT3#
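
The behavior of that regular expression can be explored offline with Python's re module. This is only an approximation, since Python's regex engine is not the IOS one, and the sample path strings (without the trailing origin code) are assumptions chosen for illustration.

```python
import re

# Same pattern as in the show command above: zero or more "(digits)" groups.
LOCAL_TO_CONFED = re.compile(r"^(\([0-9]*\))*$")

paths = ["", "(65002)", "(65002) 2", "1 2", "(65001 65002)"]
matches = {path: bool(LOCAL_TO_CONFED.match(path)) for path in paths}

for path, matched in matches.items():
    print(repr(path), matched)
```

The empty path and "(65002)" match, while any path containing an AS outside the confederation does not. The last sample hints at a limitation: a route that has crossed two sub-ASes is displayed with a space inside the parentheses, which this pattern does not allow for. That case does not arise in the two-sub-AS topology used here.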

The output below shows that RT1 sees each confederation route via 2 paths, one via AS 2 and one via AS 3.
Note that all the sub-ASes are hidden: RT1 and RT2 in AS 1 and AS 2 have no visibility into the sub-ASes inside confederation 3.
RT1#sh ip bgp
BGP table version is 13, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  172.16.3.0/24    12.12.12.2                             0 2 3 i
*>                  13.13.13.3               0             0 3 i
*  172.16.4.0/24    12.12.12.2                             0 2 3 i
*>                  13.13.13.3                             0 3 i
*  172.16.5.0/24    12.12.12.2                             0 2 3 i
*>                  13.13.13.3                             0 3 i
*  172.16.6.0/24    12.12.12.2                             0 2 3 i
*>                  13.13.13.3                             0 3 i
*  172.16.7.0/24    12.12.12.2                             0 2 3 i
*>                  13.13.13.3                             0 3 i
*> 192.168.1.0      0.0.0.0                  0         32768 i
*> 192.168.2.0      12.12.12.2               0             0 2 i
RT1#
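
How the sub-ASes disappear at the confederation border can be sketched as a string transformation on the displayed AS path. This is a simplified illustration under assumptions: the function name is hypothetical, paths are plain strings as printed by show ip bgp, and AS sets are not handled.

```python
import re


def path_seen_outside(internal_path, confed_id):
    """Drop the parenthesized sub-AS segments and prepend the confederation identifier."""
    external_part = re.sub(r"\([^)]*\)", " ", internal_path).split()
    return " ".join([str(confed_id)] + external_part)


# 172.16.5.0/24 has the path "(65002)" inside the confederation.
print(path_seen_outside("(65002)", 3))
# A locally originated route such as 172.16.3.0/24 has an empty path.
print(path_seen_outside("", 3))
```

Both calls return "3", which is the path RT1 shows for those prefixes on the direct link to RT3.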

Below shows the BGP table on RT3. Note that all the sub-ASes are indicated between parentheses. Any path taken between sub-ASes has an AS path length of 0. Notice how prefix 192.168.2.0/24 is learned via 2 paths: one internal via (65002) 2, and the other external via 1 2. The AS path length of the internal route via (65002) 2 is considered shorter, as the sub-ASes are not counted when determining the AS path length.
RT3#sh ip bgp
BGP table version is 9, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 172.16.3.0/24    0.0.0.0                  0         32768 i
*>i172.16.4.0/24    34.34.34.4               0    100      0 i
*>i172.16.5.0/24    47.47.47.7               0    100      0 (65002) i
*>i172.16.6.0/24    47.47.47.7               0    100      0 (65002) i
*>i172.16.7.0/24    47.47.47.7               0    100      0 (65002) i
*> 192.168.1.0      13.13.13.1               0             0 1 i
*>i192.168.2.0      47.47.47.7               0    100      0 (65002) 2 i
*                   13.13.13.1                             0 1 2 i
RT3#
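
The AS path length comparison for 192.168.2.0/24 can be reproduced with a short Python sketch. It assumes that weight and local preference tie, so that AS path length is the deciding step; the function name is hypothetical and AS sets are not handled.

```python
import re


def as_path_length(path):
    """Count the ASNs in a displayed AS path, ignoring parenthesized sub-ASes."""
    outside_confed = re.sub(r"\([^)]*\)", " ", path)
    return len(outside_confed.split())


# The two paths RT3 holds for 192.168.2.0/24.
candidates = ["(65002) 2", "1 2"]
for path in candidates:
    print(path, as_path_length(path))

shortest = min(candidates, key=as_path_length)
print("shortest:", shortest)
```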

Below shows how RT7 considers all routes coming from sub-AS 65001 as confederation external routes – confed-external.
RT7#sh ip bgp 192.168.1.0
BGP routing table entry for 192.168.1.0/24, version 5
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to peer-groups:
     SUB-AS_65002
  (65001) 1
    47.47.47.4 from 47.47.47.4 (4.4.4.4)
      Origin IGP, metric 0, localpref 100, valid, confed-external, best
RT7#
RT7#sh ip bgp 172.16.3.0
BGP routing table entry for 172.16.3.0/24, version 2
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to peer-groups:
     SUB-AS_65002
  (65001)
    47.47.47.4 from 47.47.47.4 (4.4.4.4)
      Origin IGP, metric 0, localpref 100, valid, confed-external, best
RT7#

Cisco recommends using route reflection to solve IBGP mesh scalability issues, and using confederations to run an IGP in a sub-AS independently of the IGPs in other sub-ASes in order to control the instability of large IGP routing domains. Actual deployments have proven that route reflection is more flexible to implement and maintain. Route reflection can be used in conjunction with confederations: an AS can be divided into sub-ASes that each run route reflection internally.

BGP Confederation Route Selection

Below shows the routing towards 192.168.1.0/24 on RT5 and RT7.
Note that both Confederation External and Confederation Internal routes are treated as internal routes.
RT7 selects the path through RT5 instead of RT6 because RT5 has a lower BGP Router ID than RT6.
RT7's choice of the path through RT5 is not due to it being a confed-external route. This can be verified by resetting the BGP sessions on RT4: RT5 then learns the confed-external route through RT7, but RT5 eventually selects the confed-internal route through RT4 once RT4 is up again.
RT5#sh ip bgp
BGP table version is 4, local router ID is 5.5.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 172.16.1.0/24    0.0.0.0                  0         32768 i
*> 172.16.2.0/24    57.57.57.7               0    100      0 (65002) i
*>i192.168.1.0      45.45.45.4               0    100      0 1 i
RT5#
======================================================================
RT7#sh ip bgp
BGP table version is 5, local router ID is 7.7.7.7
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 172.16.1.0/24    57.57.57.5               0    100      0 (65001) i
*> 172.16.2.0/24    0.0.0.0                  0         32768 i
*> 192.168.1.0      57.57.57.5               0    100      0 (65001) 1 i
* i                 67.67.67.6               0    100      0 1 i
RT7#
RT7#sh ip bgp 192.168.1.0
BGP routing table entry for 192.168.1.0/24, version 3
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  67.67.67.6
  (65001) 1
    57.57.57.5 from 57.57.57.5 (5.5.5.5)
      Origin IGP, metric 0, localpref 100, valid, confed-external, best
  1
    67.67.67.6 from 67.67.67.6 (6.6.6.6)
      Origin IGP, metric 0, localpref 100, valid, confed-internal
RT7#

After changing the BGP Router ID of RT4 from 4.4.4.4 to 8.8.8.8, RT5 selects the confed-external route to 192.168.1.0/24 through RT7 instead of the route through RT4.
RT5#sh ip bgp 192.168.1.0
BGP routing table entry for 192.168.1.0/24, version 6
Paths: (2 available, best #2, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  45.45.45.4
  1
    45.45.45.4 from 45.45.45.4 (8.8.8.8)
      Origin IGP, metric 0, localpref 100, valid, confed-internal
  (65002) 1
    57.57.57.7 from 57.57.57.7 (7.7.7.7)
      Origin IGP, metric 0, localpref 100, valid, confed-external, best
RT5#
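
The router ID tie-break seen in these outputs can be mimicked with a small Python sketch. It models only the final comparison and assumes that every earlier best path step ties, as the outputs above suggest; the dictionary layout and function name are hypothetical.

```python
from ipaddress import IPv4Address


def pick_by_router_id(paths):
    """Pick the path from the peer with the lowest router ID.

    The confed-internal / confed-external type is deliberately ignored.
    """
    return min(paths, key=lambda p: IPv4Address(p["router_id"]))


# RT5's two paths to 192.168.1.0/24, before and after RT4's router ID change.
rt5_before = [
    {"via": "RT4", "router_id": "4.4.4.4", "type": "confed-internal"},
    {"via": "RT7", "router_id": "7.7.7.7", "type": "confed-external"},
]
rt5_after = [
    {"via": "RT4", "router_id": "8.8.8.8", "type": "confed-internal"},
    {"via": "RT7", "router_id": "7.7.7.7", "type": "confed-external"},
]

print(pick_by_router_id(rt5_before)["via"])
print(pick_by_router_id(rt5_after)["via"])
```

Router IDs are compared as IPv4 addresses rather than as strings, so that an ID such as 10.0.0.1 would correctly sort after 9.9.9.9.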
