It is important to understand two rules of prefix advertisement:
1. When a prefix is received from an EBGP neighbor, the router must advertise that prefix to all other EBGP and IBGP neighbors.
2. When a prefix is received from an IBGP neighbor, it can be advertised ONLY to EBGP neighbors, NOT to any other IBGP neighbors.
This second rule requires a fully meshed IBGP neighbor relationship; otherwise, prefixes are not advertised to all routers in a single AS.
A full IBGP mesh works in networks where the number of routers running IBGP is small. In the network of a big ISP, however, where several hundred routers might run IBGP, maintaining n(n-1)/2 neighbor relationships (where n is the total number of IBGP routers in the AS) and exchanging routes among all of them simply will not work. The figure below shows a fully meshed IBGP network with only 12 routers running IBGP.
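To put rough numbers on that growth, the short sketch below counts the sessions a full mesh needs when every pair of routers shares exactly one IBGP session, which gives n(n-1)/2 for n routers. The function name is illustrative only.

```python
def full_mesh_sessions(n: int) -> int:
    """IBGP sessions needed to fully mesh n routers (one session per router pair)."""
    if n < 0:
        raise ValueError("router count must be non-negative")
    return n * (n - 1) // 2


for routers in (12, 500):
    print(routers, "routers need", full_mesh_sessions(routers), "IBGP sessions")
# 12 routers need 66 IBGP sessions
# 500 routers need 124750 IBGP sessions
```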
Imagine the nightmare caused by replacing the 12-router full mesh with a 500-router full mesh of IBGP. This limitation of full-mesh IBGP was the catalyst for the development of two mechanisms that address this problem:
- Route Reflection, as described in RFC 1966 (since superseded by RFC 4456)
- AS Confederations, as described in RFC 3065 (since superseded by RFC 5065)
For more detailed coverage of these mechanisms, you are encouraged to read the RFCs.
Instead of requiring a full IBGP mesh between all routers, a Route-Reflection design lets the network be arranged as a hierarchy. Networks are divided into regions, and each region can have a multiple-layer hierarchy of Core, Aggregation, and Access routers. With Route-Reflection, IBGP routing updates are propagated between levels in both directions.
The figure below replaces the fully meshed IBGP mess illustrated in the earlier figure with a Route-Reflection design. Each Access layer router connects only to its regional Aggregation routers, and these Aggregation routers connect only to Core routers. The Core routers need to be fully meshed with each other. Multiple connections exist from each router for redundancy. Routers run IBGP only with their upper-layer routers. For example, R1 peers only with R4 and R5, which peer only with R6 and R7. Core routers peer with each other and with the Aggregation routers below them in every region. This way, the Core is connected to all regions.
Each upper level acts as a Route-Reflector (RR) for the level below it, and each lower level is a Route-Reflector-Client (RRC) of the level above it. In the figure above, the Core layer routers (R6 and R7) act as RRs for the Aggregation layer routers (R4, R5, R8, and R9), which are therefore RRCs of the Core layer routers. An RR client can itself be a Route-Reflector for the routers in the layer below it. For example, Aggregation layer router R4, which is an RRC of the Core layer routers (R6 and R7), also acts as an RR for Access layer routers R1, R2, and R3, which are RRCs of the Aggregation layer routers (R4 and R5).
This is an example of hierarchical Route-Reflection. A network that has just two layers (Core and Access) has only one level of Route-Reflection. Route-Reflection is configured only on the RR(s). Route-Reflector-Clients are unaware that they are part of any reflection; therefore, no configuration is needed to make them RRCs.
The way that IBGP routing updates flow in an RR network is defined by the following rules:
1. If an update came from an EBGP neighbor, advertise that update to all neighbors (IBGP, EBGP, Route-Reflector-Client(s)).
2. If an update came from an IBGP neighbor, advertise that update to EBGP neighbors and Route-Reflector-Clients.
3. If an update came from a Route-Reflector-Client, advertise that update to other Route-Reflector-Client(s), IBGP, and EBGP neighbors, but not to the Route-Reflector-Client that sent the update.
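The three rules can be read as a lookup from "where did the update come from" to "which kinds of neighbor receive it". The following is a minimal sketch of that lookup, not router code; the `Peer` enum and `advertise_to` function are names invented for this illustration.

```python
from enum import Enum
from typing import Set


class Peer(Enum):
    EBGP = "EBGP neighbor"
    IBGP = "IBGP neighbor (nonclient)"
    RR_CLIENT = "Route-Reflector-Client"


def advertise_to(source: Peer) -> Set[Peer]:
    """Return the neighbor types an RR advertises an update to, by source type."""
    if source is Peer.EBGP:
        # Rule 1: an EBGP-learned update goes to every neighbor type.
        return {Peer.EBGP, Peer.IBGP, Peer.RR_CLIENT}
    if source is Peer.IBGP:
        # Rule 2: an IBGP-learned update goes to EBGP neighbors and clients only.
        return {Peer.EBGP, Peer.RR_CLIENT}
    if source is Peer.RR_CLIENT:
        # Rule 3: a client-learned update goes to every neighbor type;
        # the one client that sent it is skipped per neighbor, not per type.
        return {Peer.EBGP, Peer.IBGP, Peer.RR_CLIENT}
    raise ValueError(f"unknown source type: {source!r}")


for src in Peer:
    targets = sorted(p.value for p in advertise_to(src))
    print(f"from {src.value}: {', '.join(targets)}")
```

The only case that withholds anything is rule 2: an update learned from an ordinary IBGP neighbor is never passed to another ordinary IBGP neighbor, which is the original IBGP rule left intact for nonclients.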
Suppose an EBGP neighbor connected to Core Router R6 advertises an update for 18.104.22.168/24. R6 passes that update to all of its neighbors because of rule 1, and the Aggregation layer routers (R4 and R5) pass it to the Access layer routers (R1, R2, and R3) because of rule 2. The east region propagates the update in the same way. The prefix is thus propagated throughout the AS without a full mesh of IBGP.
Now, imagine that Access layer Router R1 receives the prefix 126.96.36.199/24 from its EBGP neighbor. R1 propagates it to the Aggregation layer (R4 and R5) because of rule 1. R4 and R5 reflect the update to the Access layer (R2 and R3) and to the Core layer (R6 and R7) because of rule 3. The Core layer (R6 and R7) reflects that update to the east region Aggregation layer (R8 and R9), also because of rule 3.
This way, the prefix is propagated from the lower layers to the upper layers in a hierarchical network.
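Both walk-throughs can be replayed with a small simulation. The sketch below rebuilds the topology from the router names in the text (R1 to R3 as clients of R4 and R5; R4, R5, R8, and R9 as clients of R6 and R7; R6 and R7 as ordinary IBGP peers) and leaves out the east-region Access routers, which the text does not name. As a simplifying assumption, each router keeps the first copy of the update it hears and ignores later ones; real BGP runs best-path selection instead.

```python
from collections import deque

# Assumed topology, taken from the router names used in the text: RR -> clients.
CLIENTS = {
    "R4": ["R1", "R2", "R3"],
    "R5": ["R1", "R2", "R3"],
    "R6": ["R4", "R5", "R8", "R9"],
    "R7": ["R4", "R5", "R8", "R9"],
}
NONCLIENT_PEERS = [("R6", "R7")]  # Core routers peer as ordinary IBGP neighbors


def build_neighbors():
    """Return {router: {neighbor: role}}, where role is how router sees neighbor."""
    nbrs = {}
    for rr, clients in CLIENTS.items():
        for client in clients:
            nbrs.setdefault(rr, {})[client] = "client"
            # A client treats its RR as a plain IBGP neighbor.
            nbrs.setdefault(client, {})[rr] = "nonclient"
    for a, b in NONCLIENT_PEERS:
        nbrs.setdefault(a, {})[b] = "nonclient"
        nbrs.setdefault(b, {})[a] = "nonclient"
    return nbrs


def propagate(entry_router, reflect=True):
    """Return the routers that learn a prefix received over EBGP at entry_router."""
    nbrs = build_neighbors()
    reached = {entry_router}
    queue = deque([(entry_router, None)])  # (router, learned_from); None means EBGP
    while queue:
        router, learned_from = queue.popleft()
        source = "ebgp" if learned_from is None else nbrs[router][learned_from]
        for nbr, role in nbrs[router].items():
            if nbr == learned_from:
                continue  # never send the update back to the neighbor it came from
            if source == "ebgp":
                send = True                          # rule 1
            elif source == "client":
                send = reflect                       # rule 3
            else:
                send = reflect and role == "client"  # rule 2
            if send and nbr not in reached:
                reached.add(nbr)
                queue.append((nbr, router))
    return reached


print(sorted(propagate("R6")))                 # prefix enters at the Core
print(sorted(propagate("R1")))                 # prefix enters at the Access layer
print(sorted(propagate("R1", reflect=False)))  # same sessions, no reflection
```

With reflection enabled, both entry points reach all nine routers. With `reflect=False`, the same sessions follow the plain IBGP rule and the update from R1 stops at R4 and R5, which is why a partial mesh without Route-Reflection does not work.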
Hierarchical Route-Reflection networks make more sense when they are viewed as a group of RRs and their clients. The following definitions cover a few key concepts for understanding hierarchical Route-Reflection.
Cluster — A set of one or more RRs and their clients.
Originator_ID attribute — The RID of the router that originated the route in the local AS or that first received it from an EBGP neighbor. The RR creates the Originator_ID attribute when it reflects the route.
Cluster-ID — A 4-byte integer representing the cluster. If the cluster-ID is not configured, the RID of the RR is taken as the cluster-ID. Configure the cluster-ID using the following Cisco IOS Software command:
router bgp 109
bgp cluster-id x.x.x.x
When two RRs are configured with the same cluster-ID, they are part of the same cluster.
Cluster_list attribute — A list of cluster-IDs representing the series of clusters that an update has traversed. When an RR reflects an update, for example from one of its clients to a nonclient (an upper-level RR or another IBGP neighbor), the RR adds its local cluster-ID to the list before sending the update.
When an RR receives an update with its own cluster-ID in the cluster list, the RR drops that update, assuming that the update has looped.
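The cluster-list check can be sketched in a few lines. This is an illustration of the bookkeeping only, under assumed names: the `Update` class, the `reflect` function, the documentation prefix 192.0.2.0/24, and the 10.0.0.x identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Update:
    prefix: str
    originator_id: Optional[str] = None   # set by the first RR that reflects the route
    cluster_list: Tuple[str, ...] = ()    # most recently traversed cluster first


def reflect(update: Update, cluster_id: str, sender_rid: str) -> Optional[Update]:
    """Return the update as an RR would reflect it, or None if it has looped."""
    if cluster_id in update.cluster_list:
        return None  # our own cluster-ID is already in the list: drop the update
    # Keep an existing Originator_ID; otherwise record the router we learned it from.
    originator = update.originator_id or sender_rid
    return Update(update.prefix, originator, (cluster_id,) + update.cluster_list)


# Hypothetical walk: a client (RID 10.0.0.1) sends a route to an RR in cluster
# 10.0.0.4, which reflects it up to an RR in cluster 10.0.0.6.
u0 = Update("192.0.2.0/24")
u1 = reflect(u0, cluster_id="10.0.0.4", sender_rid="10.0.0.1")
u2 = reflect(u1, cluster_id="10.0.0.6", sender_rid="10.0.0.4")
print(u2.originator_id, u2.cluster_list)

# If the update ever returns to cluster 10.0.0.4, that RR drops it.
print(reflect(u2, cluster_id="10.0.0.4", sender_rid="10.0.0.6"))  # None
```

Two RRs configured with the same cluster-ID would both reject `u1` here, which is the trade-off behind placing redundant RRs in one cluster.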
The figure below shows the cluster definitions as configured on all RRs.
Route-Reflection solves the full IBGP mesh problem elegantly and lets BGP networks grow into much larger IBGP deployments. Almost all large BGP networks use Route-Reflection to scale their IBGP.
In an AS Confederation, an AS is divided into smaller Sub-autonomous systems, which are connected to each other through EBGP. Each Sub-AS acts as an independent BGP AS and runs normal IBGP internally. A single IGP runs across the entire AS, so each Sub-AS has IGP routing information about all the other Sub-autonomous systems. Most BGP attributes, such as LOCAL_PREF, MED, and NEXT_HOP, are preserved when updates cross from one Sub-AS to another. As an update crosses a Sub-AS boundary, the Sub-AS number is added to the AS_PATH. To the outside world, the AS running an AS Confederation appears as a single AS.
To better understand AS Confederations, you need to know how the AS_PATH attribute operates within an AS Confederation network. Just as the AS_PATH attribute carries information about the autonomous systems an update has traversed, in a Confederation it also carries Sub-AS information. As with ordinary AS_PATH loop detection, when a router running Confederation receives an update whose AS_PATH contains its own Sub-AS, the router drops that update to avoid loops. The two AS_PATH segment types associated with AS Confederations are described as follows:
AS_CONFED_SEQUENCE — An ordered list of the Sub-autonomous systems that the update has traversed within the Confederation. This is analogous to AS_SEQUENCE, as discussed in the AS_PATH attribute definition.
AS_CONFED_SET — An unordered set of Sub-autonomous systems in the AS_PATH. This can be used when a Confederation Sub-AS aggregates routes from multiple Sub-autonomous systems. In this case, the aggregated route can carry an AS_CONFED_SET in its AS_PATH; it lists all the Sub-autonomous systems involved, but their order is not maintained. This is analogous to AS_SET, as discussed in the AS_PATH attribute definition.
The figure below shows AS 109 divided into an AS Confederation of three small Sub-autonomous systems: 65001, 65002, and 65003. Each Sub-AS runs EBGP with the other Sub-autonomous systems. Notice that the Sub-autonomous systems do not have a full mesh of EBGP. This is similar to the real world of BGP, where not all EBGP speakers are fully meshed. Each Sub-AS treats the other Sub-autonomous systems as EBGP neighbors and therefore forwards all updates from one Sub-AS to the others.
R1 in Sub-AS 65003 is running EBGP with autonomous system 110, which is advertising 184.108.40.206/24 to R1. When R1 receives the update from autonomous system 110, the prefix has an AS_PATH of 110. Sub-AS 65003 propagates this path to Sub-AS 65002 with the AS_PATH attribute as (65003) 110. In BGP output, the parentheses around 65003 indicate that it is a Sub-AS of an AS Confederation. When this update leaves Sub-AS 65002, the AS_PATH looks like (65002 65003) 110. When R12 in Sub-AS 65001 advertises the prefix to the outside world, the Confederation Sub-AS numbers are stripped from the AS_PATH, and the outside world is presented with an AS_PATH of 109 110, as if there were no Confederation in AS 109.
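The AS_PATH changes in this walk-through can be traced with a small model. The sketch below is an illustration under assumed names (`AsPath`, `send_to_confed_peer`, `send_outside`, and `accepts` are invented here); it models only AS_CONFED_SEQUENCE and the ordinary AS sequence, not AS_CONFED_SET.

```python
from dataclasses import dataclass
from typing import Tuple

CONFED_ID = 109  # the AS number the outside world sees


@dataclass(frozen=True)
class AsPath:
    confed_seq: Tuple[int, ...] = ()  # AS_CONFED_SEQUENCE, shown in parentheses
    as_seq: Tuple[int, ...] = ()      # ordinary AS_SEQUENCE

    def __str__(self) -> str:
        parts = []
        if self.confed_seq:
            parts.append("(" + " ".join(str(a) for a in self.confed_seq) + ")")
        parts.extend(str(a) for a in self.as_seq)
        return " ".join(parts)


def send_to_confed_peer(path: AsPath, local_sub_as: int) -> AsPath:
    """Crossing into another Sub-AS: prepend the local Sub-AS number."""
    return AsPath((local_sub_as,) + path.confed_seq, path.as_seq)


def send_outside(path: AsPath, confed_id: int = CONFED_ID) -> AsPath:
    """Leaving the Confederation: strip Sub-AS numbers, prepend the public AS."""
    return AsPath((), (confed_id,) + path.as_seq)


def accepts(path: AsPath, local_sub_as: int) -> bool:
    """Loop check: reject a path that already contains the local Sub-AS."""
    return local_sub_as not in path.confed_seq


received = AsPath(as_seq=(110,))                     # as R1 hears it from AS 110
to_65002 = send_to_confed_peer(received, 65003)
to_65001 = send_to_confed_peer(to_65002, 65002)
print(received)                # 110
print(to_65002)                # (65003) 110
print(to_65001)                # (65002 65003) 110
print(send_outside(to_65001))  # 109 110
print(accepts(to_65001, 65003))  # False: Sub-AS 65003 would drop this update
```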
Although an AS Confederation offers a mechanism to avoid fully meshed IBGP across a large AS, IBGP must still reach every router within each Sub-AS. This presents the challenge of scaling IBGP within each Sub-AS. Each Sub-AS can either maintain a full mesh of IBGP or run Route-Reflection internally.
In the quest to eliminate fully meshed IBGP using Route-Reflection or AS Confederations, BGP operators weigh various reasons to prefer one over the other. The choice depends on, among other things, how the physical network is laid out, which method requires less configuration change, and which method makes IBGP easier to manage.