
Thursday, February 6, 2014

NTS: BGP

BGP




BGP (Border Gateway Protocol) is defined in RFC 4271.



Uses TCP port 179.

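Both routers try to open the TCP session. If the two attempts collide, the connection initiated by the router with the highest router-id is kept, so that router ends up as the TCP client.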



Best Path Selection


# | Attribute | Rule | Default | Notes | Affects traffic | Applies to route-map
1 | Prefix Length | Longest match | | Always checked | inbound & outbound |
2 | Cost Community | Lowest Cost Community Number, then lowest Cost Community ID | 2147483647 | Checked if "set extcommunity cost pre-bestpath" is configured. Skipped if "bgp bestpath cost-community ignore" is configured. | |
3 | WEIGHT | Highest Weight | 0 (learned), 32768 (locally originated) | Local to the router. Only for Cisco; not recommended for general use. | |
4 | LOCAL PREFERENCE | Highest Local Preference | 100 | Used for separating customer/peering/transit traffic. | outbound | inbound
5 | | Prefer local-sourced routes | | Everything announced through network/aggregate commands or redistribution is considered as local-sourced. | |
6 | AS-PATH | Shortest as-path | | Ignored if "bgp bestpath as-path ignore" is configured. | inbound | outbound
7 | ORIGIN | Lowest Origin Type (IGP<EGP<Incomplete) | | | |
8 | MED | Lowest Multi-Exit Discriminator (MED) | | 1st AS must be the same, unless "bgp always-compare-med" is configured. | inbound | outbound
9 | | Prefer eBGP over iBGP | | | |
10 | | Lowest IGP metric to the BGP next hop | | | |
11 | Cost Community | Lowest Cost Community Number, then lowest Cost Community ID | 2147483647 | Checked if "set extcommunity cost igp" is configured. Skipped if "bgp bestpath cost-community ignore" is configured. | |
12 | | Check for BGP Multipath | | | |
13 | | If both paths are external, prefer the path received first | | | |
14 | | Lowest BGP router-id | | | |
15 | | If the originator-id or the router-id is the same for multiple paths, prefer path with minimum cluster list length | | Only for route reflectors | |
16 | | Prefer the path with the lowest neighbor address | | | |



MED
  • bgp deterministic-med
    • compare MED when choosing routes advertised by different neighbors in the same autonomous system
    • routes from the same autonomous system are grouped together and the best entries of each group are compared
  • bgp always-compare-med
    • compare MED when choosing routes advertised by different neighbors in different autonomous systems 
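The interaction of the two commands can be sketched in Python. This is a simplified model, not the router's implementation: it assumes every earlier best-path step is a tie, and it uses a hypothetical `router_id` field as a stand-in for all the tie-breakers that come after MED.

```python
from collections import defaultdict
from typing import NamedTuple


class Path(NamedTuple):
    name: str
    neighbor_as: int  # first AS in the as-path
    med: int
    router_id: int    # stand-in for every tie-breaker after MED (assumption)


def best_with_deterministic_med(paths, always_compare_med=False):
    # deterministic-med: group the paths by neighboring AS first
    groups = defaultdict(list)
    for path in paths:
        groups[path.neighbor_as].append(path)
    # inside a group MED is always comparable, so pick the best entry per group
    winners = [min(g, key=lambda p: (p.med, p.router_id)) for g in groups.values()]
    if always_compare_med:
        # always-compare-med: MED is also compared between different ASes
        return min(winners, key=lambda p: (p.med, p.router_id))
    # otherwise MED is skipped between groups and the later steps decide
    return min(winners, key=lambda p: p.router_id)
```

With paths A (AS 1, MED 100), B (AS 2, MED 50) and C (AS 1, MED 10), C always beats A inside the AS 1 group; whether C or B wins overall then depends on "always-compare-med".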

"bgp deterministic-med" and "bgp always-compare-med" are recommended in order to always have a standard best path selection algorithm.

In order to avoid mis-interpreting a missing MED (zero vs infinity), when setting MED on a peering, it's best to set it also on all other peerings. Otherwise the command "bgp bestpath med missing-as-worst" can be used.


Common Comparisons
  • Prefix Length
  • Highest Local Preference
  • Shortest as-path
  • Lowest Multi-Exit Discriminator (MED)
  • Prefer eBGP over iBGP
  • Lowest IGP metric to the BGP next hop
  • Lowest BGP router-id
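As a rough illustration, the common comparisons can be expressed as a single sort key in Python. This is only a sketch under assumptions: all paths are for the same prefix (so prefix length is not part of the key), MED is compared between all paths as if "bgp always-compare-med" were configured, and the `Path` fields are hypothetical names.

```python
from typing import NamedTuple


class Path(NamedTuple):
    local_pref: int
    as_path: tuple
    med: int
    is_ebgp: bool
    igp_metric: int
    router_id: int


def preference_key(path):
    # min() picks the smallest tuple, so "highest wins" values are negated
    return (
        -path.local_pref,    # highest local preference
        len(path.as_path),   # shortest as-path
        path.med,            # lowest MED
        not path.is_ebgp,    # eBGP (False) sorts before iBGP (True)
        path.igp_metric,     # lowest IGP metric to the next hop
        path.router_id,      # lowest router-id
    )


def best_path(paths):
    return min(paths, key=preference_key)
```

Because tuples compare element by element, a later field is only consulted when every earlier field ties, which mirrors the step-by-step nature of the selection.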


origin

The origin attribute indicates how BGP learned about a particular route.

  • i (IGP)
    • interior to the originating AS (e.g. when the network configuration command is used to inject the route into BGP)
  • e (EGP)
    • learned via EGP (rarely seen)
  • ? (incomplete)
    • unknown or learned via some other way (e.g. redistributed into BGP)



Address Families

AFIs
  • 1 (IPv4)
  • 2 (IPv6)
  • 25 (L2VPN)

SAFIs
  • 1 (Unicast)
  • 2 (Multicast)
  • 4 (NLRI with MPLS labels)
  • 65 (VPLS)
  • 128 (VPN with MPLS labels (VRFs))
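AFI/SAFI pairs are what two neighbors advertise to each other in the Multiprotocol Extensions capability of the OPEN message. A minimal Python sketch of that encoding follows (capability code 1, length 4, then a 2-byte AFI, a reserved byte and a 1-byte SAFI); the dictionary keys are just labels chosen for this example.

```python
import struct

AFI = {"ipv4": 1, "ipv6": 2, "l2vpn": 25}
SAFI = {"unicast": 1, "multicast": 2, "labeled": 4, "vpls": 65, "mpls-vpn": 128}


def mp_capability(afi, safi):
    # code (1 byte), length (1 byte), AFI (2 bytes), reserved (1 byte), SAFI (1 byte)
    return struct.pack("!BBHBB", 1, 4, AFI[afi], 0, SAFI[safi])
```

For instance, `mp_capability("ipv4", "mpls-vpn")` produces the six bytes that announce VPNv4 support.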



Configuration

IOS
router bgp 2
 no synchronization
 bgp log-neighbor-changes
 network 5.5.5.5 mask 255.255.255.255
 neighbor 20.4.5.4 remote-as 1
 no auto-summary


IOS-XR
router bgp 1
 bgp log neighbor changes detail
 address-family ipv4 unicast
  network 4.4.4.4/32
 !
 neighbor 20.4.5.5
  remote-as 2

  address-family ipv4 unicast


In IOS-XR, if there is no loopback configured with an ipv4 address, the BGP session won't come up, until you explicitly configure the bgp router-id.

BGP is designed to refuse a session with itself because of the router-id check. You can use a per-vrf assignment of BGP router-id in order to have a VRF-to-VRF peering on the same router.

In IOS-XR, every eBGP session requires an explicit route-policy in order to allow incoming/outgoing updates. It's good practice to create one named PASS-RPL with default action "pass" and use it when first activating each eBGP session. Afterwards you can create the required route-policy and use that instead.

When told to advertise a prefix into BGP, prefer to use the "network" statement, unless told to do otherwise. Also prefer to do the "network" advertisements to another AS on routers running eBGP to that AS.

You can use the "network x.x.x.x backdoor" command in order to change the admin distance of an eBGP route (default 20) to that of iBGP (200), so that the equivalent IGP route can be preferred.



Route Aggregation

  • redistribution of static or IGP
  • aggregate-address
    • summary-only
    • suppress-map
    • unsuppress-map (per neighbor)
    • advertise-map
  • inject-map

A more specific prefix must exist in the BGP table before doing aggregation.



Communities

  • Standard Communities (32-bit)
    • Used for well-known communities and for specific communities of type $ASN:$TAG in BGP
    • send-community (IOS)
    • send-community-ebgp (IOS-XR)
  • Extended Communities (64-bit) are defined in 
    • Used in MPLS VPNs for RT and SOO 
    • send-community extended (IOS)
    • send-extended-community-ebgp (IOS-XR)

Communities are configured through community lists.

When regular expressions are required, expanded (standard or extended) community lists must be used.

Configuration
  • Standard
    • ip community-list 1 permit 100:10
    • ip community-list standard X-COMMLIST permit 100:10 
  • Expanded Standard
    • ip community-list 100 permit 100:*
    • ip community-list expanded X-COMMLIST permit 100:*
  • Extended
    • ip extcommunity-list 1 permit rt 200:20
    • ip extcommunity-list standard X-COMMLIST permit rt 200:20
  • Expanded Extended
    • ip extcommunity-list 100 permit rt 200:*
    • ip extcommunity-list expanded X-COMMLIST permit rt 200:*

In IOS, no communities are sent by default on iBGP or eBGP sessions.

In IOS-XR, all communities are sent by default on iBGP sessions, but not on eBGP sessions.


Well-known communities
  • internet
  • no-export (don't advertise to eBGP neighbor)
  • local-as (don't advertise to other confederation sub-AS)
  • no-advertise (don't advertise to any neighbor)
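A standard community is a single 32-bit number; the $ASN:$TAG notation simply splits it into two 16-bit halves. The Python sketch below shows the conversion, with the well-known values taken from the BGP communities registry ("local-as" is the Cisco keyword for NO_EXPORT_SUBCONFED). The function names are made up for this example.

```python
WELL_KNOWN = {
    "no-export": 0xFFFFFF01,
    "no-advertise": 0xFFFFFF02,
    "local-as": 0xFFFFFF03,  # NO_EXPORT_SUBCONFED
}


def encode_community(text):
    """Turn 'ASN:TAG' or a well-known name into the 32-bit value."""
    if text in WELL_KNOWN:
        return WELL_KNOWN[text]
    asn, tag = (int(part) for part in text.split(":"))
    if not (0 <= asn <= 0xFFFF and 0 <= tag <= 0xFFFF):
        raise ValueError("each half must fit in 16 bits")
    return (asn << 16) | tag


def decode_community(value):
    """Turn the 32-bit value back into 'ASN:TAG' notation."""
    return f"{value >> 16}:{value & 0xFFFF}"
```

This also explains why the well-known communities show up as 65535:x when a router is not configured to print them by name.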

Delete communities

IOS
route-map DELCOMM1-ROUTEMAP permit 10
 set comm-list 1 delete

!
route-map DELCOMM2-ROUTEMAP permit 10
 set community none

!
route-map DELCOM3-ROUTEMAP permit 10
 set extcomm-list 1 delete


IOS-XR
route-policy DELCOM1-RPL
  delete community in (*:*)

end-policy
!

route-policy DELCOM3-RPL
  delete extcommunity rt all
end-policy


Use the "additive" keyword to add communities to existing ones.





Synchronization

A BGP router with synchronization enabled does not install iBGP learned routes into its routing table, or propagate them to an eBGP peer, unless it is able to validate those routes in its IGP first. It's used to ensure that there are no black holes inside the AS caused by intermediate routers that do not run BGP.

It's disabled by default ("no synchronization"), because nowadays most networks run iBGP on every transit router or use MPLS in the core.



Route Reflectors

Route Reflectors modify iBGP split-horizon rules.

Routes learned on a RR from a RR-Client are propagated to other RR-Clients and Non-Clients
Routes learned on a RR from a Non-Client are propagated only to RR-Clients
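The two rules above can be sketched as a small Python function. It only models iBGP neighbors of a single RR, and the peer names and the `"client"` / `"non-client"` labels are assumptions made for this example.

```python
def reflect_targets(learned_from, peers):
    """Return the iBGP peers an RR sends a route to.

    peers maps a peer name to "client" or "non-client";
    learned_from is the peer the route was received from.
    """
    source_kind = peers[learned_from]
    targets = []
    for name, kind in peers.items():
        if name == learned_from:
            continue  # never send the route back to its sender
        if source_kind == "non-client" and kind == "non-client":
            continue  # normal iBGP split-horizon still applies here
        targets.append(name)
    return targets
```

With two clients and two non-clients, a route from a client reaches the other three peers, while a route from a non-client reaches only the two clients.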

RRs can be assigned per address-family.

RRs do not modify the next-hop of advertised routes by default.

RRs can be in the forwarding path or not.

Use "no bgp client-to-client reflection" on RRs, when their clients are also fully meshed.

An RR reflecting a route adds the following attributes:
  • Originator ID
    • the Router ID of the originator of the route
    • if the update comes back to the originator (so the local Router-ID is the same as the Originator-ID), the update is ignored
  • Cluster List
    • a list of Cluster IDs that an update has passed through 
    • when an RR reflects a route, the local Cluster ID is prepended to the Cluster List
    • if the update comes back to the RR (so the local Cluster-ID is contained in the prefix Cluster List) the update is ignored

Originator ID and Cluster List are used to prevent loops in RR environments.
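A minimal Python sketch of both loop-prevention checks follows. Updates are modelled as plain dictionaries with hypothetical keys, and, following RFC 4456, the Originator ID is only set if it is not already present and the Cluster ID goes at the front of the list.

```python
def reflect(update, source_router_id, local_cluster_id):
    """Return the update as an RR would reflect it."""
    out = dict(update)
    # keep an existing Originator ID; otherwise record who sent us the route
    out.setdefault("originator_id", source_router_id)
    # put the local Cluster ID at the front of the Cluster List
    out["cluster_list"] = [local_cluster_id] + list(update.get("cluster_list", []))
    return out


def accept_update(update, local_router_id, local_cluster_id):
    """Apply the Originator ID and Cluster List checks on reception."""
    if update.get("originator_id") == local_router_id:
        return False  # our own route came back to us
    if local_cluster_id in update.get("cluster_list", []):
        return False  # the update already passed through this cluster
    return True
```

The second check is also why two redundant RRs sharing one Cluster-ID discard each other's reflected routes.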

By default Cluster-ID = RR Router-ID. In case of two RRs, two different Cluster-IDs will be used. This increases memory utilization, because the same route is stored multiple times, each one with a different Cluster-ID.

You can use a common Cluster-ID in redundant RRs (in order to decrease memory utilisation, although rarely needed), only when you're sure that connectivity for RR clients won't break if the RR client loses one of its RR connections.


IOS
router bgp 100
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 update-source Loopback 0
 neighbor 2.2.2.2 route-reflector-client



IOS-XR
router bgp 100
 neighbor 2.2.2.2
  remote-as 100
  update-source Loopback0
  address-family ipv4 unicast
   route-reflector-client






Confederations

The AS is split into smaller autonomous systems in order to reduce the number of iBGP sessions.

It's common practice to use the private AS range (64512 – 65534) to denote a sub-autonomous system.

These internal ASNs are hidden and only a single external ASN is announced to eBGP neighbors.

BGP confederations modify iBGP as-path processing.

When sending:
  • updates to iBGP neighbors
    • as-path is not changed
  • updates to intra-confederation eBGP neighbors
    • the intra-confederation ASN is prepended to the as-path
  • updates to eBGP neighbors
    • the intra-confederation ASNs are removed and the external ASN is prepended to the as-path
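The three sending rules can be sketched in Python by keeping the confederation part of the as-path separate from the regular part. This is a simplified model; the function name, the peer-type labels and the two-list representation are assumptions made for this example.

```python
def advertise(as_path, confed_path, peer_type, sub_as, confed_id):
    """Return (as_path, confed_path) as sent to a neighbor.

    as_path     -- regular ASNs, e.g. [9]
    confed_path -- intra-confederation sub-ASNs collected so far
    sub_as      -- the local sub-AS (the "router bgp" number)
    confed_id   -- the external ASN ("bgp confederation identifier")
    """
    if peer_type == "ibgp":
        return list(as_path), list(confed_path)             # unchanged
    if peer_type == "confed-ebgp":
        return list(as_path), [sub_as] + list(confed_path)  # prepend sub-AS
    if peer_type == "ebgp":
        return [confed_id] + list(as_path), []              # hide the sub-ASNs
    raise ValueError(f"unknown peer type: {peer_type}")
```

With sub-AS 65100 and confederation identifier 1, a route that crossed sub-AS 65200 leaves the confederation with only "1" in front of its external as-path.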

Intra-confederation eBGP session is:
  • like eBGP session when establishing the session (ebgp-multihop)
  • like iBGP session when sending routing updates (local pref, next-hop, etc.)


IOS
router bgp INTERNAL-ASN-100
 bgp confederation identifier EXTERNAL-ASN-1
 bgp confederation peers INTERNAL-ASN-200 INTERNAL-ASN-300

 neighbor 2.2.2.2 remote-as INTERNAL-ASN-200
 neighbor 3.3.3.3 remote-as INTERNAL-ASN-300
 neighbor 9.9.9.9 remote-as EXTERNAL-ASN-9



IOS-XR
router bgp INTERNAL-ASN-100
 bgp confederation peers
  INTERNAL-ASN-200
  INTERNAL-ASN-300
 !
 bgp confederation identifier EXTERNAL-ASN-1
 !
 neighbor 2.2.2.2
  remote-as INTERNAL-ASN-200
 !
 neighbor 3.3.3.3
  remote-as INTERNAL-ASN-300
 !
 neighbor 9.9.9.9
  remote-as EXTERNAL-ASN-9


EXTERNAL-ASNs define the ASNs used for eBGP sessions between different ASNs.

INTERNAL-ASNs define the ASNs used for eBGP sessions between different sub-ASNs of the same ASN.

Example

IOS
router bgp 65100
 bgp confederation identifier 1
 bgp confederation peers 65200 65300
 neighbor 2.2.2.2 remote-as 65200
 neighbor 3.3.3.3 remote-as 65300
 neighbor 9.9.9.9 remote-as 9



IOS-XR
router bgp 65100
 bgp confederation peers
  65200
  65300
 !
 bgp confederation identifier 1
 !

 neighbor 2.2.2.2
  remote-as 65200
 !
 neighbor 3.3.3.3
  remote-as 65300
 !
 neighbor 9.9.9.9
  remote-as 9






Next-Hop

  • advertisement to eBGP peer
    • next-hop changes to self
    • use "next-hop-unchanged" to not change
  • advertisement to iBGP peer
    • next-hop doesn't change
    • use "next-hop-self" to change

You can't use the next-hop-self for setting the next-hop in reflected iBGP routes. Instead use an outbound route map.

In IOS-XR, you can use "ibgp policy out enforce-modifications" in combination with an outbound route-policy in order to force modification of the route's attributes (including next-hop) when sent to an iBGP neighbor.



keepalive & holdtime

Neighbor holdtime timers are negotiated while initially setting up the BGP session and the smaller one gets used by both neighbors. Keepalive timers are then based on that holdtime value (typically one third of it). A holdtime must be either 0 (no keepalives) or at least 3 secs.

The fastest convergence on a BGP session that can be achieved by changing the keepalive/holdtime timers is 3 sec.

In order to protect the control-plane, you can put a limit on the lowest holdtime number accepted by using the "min-holdtime" command. If the neighbor doesn't comply, then the BGP session is rejected.
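The negotiation can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements from the notes: the keepalive is taken as one third of the negotiated holdtime (the ratio suggested by RFC 4271), and the minimum holdtime is checked against the value offered by the neighbor.

```python
def negotiate_timers(local_hold, remote_hold, min_holdtime=0):
    """Return (holdtime, keepalive), or None if the session is rejected."""
    for hold in (local_hold, remote_hold):
        if hold < 0 or hold in (1, 2):
            raise ValueError("holdtime must be 0 or at least 3 seconds")
    if remote_hold < min_holdtime:
        return None  # the neighbor offers less than we are willing to accept
    hold = min(local_hold, remote_hold)  # the smaller value wins
    return hold, hold // 3
```

For example, a router configured with 180 seconds facing a neighbor that offers 90 ends up with a 90-second holdtime and 30-second keepalives.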



Mass Neighbor Configuration

In order to minimize neighbor configuration regarding the BGP session parameters you can use the following:

peer groups (IOS)

router bgp 100
 neighbor PEER-GROUP peer-group
 neighbor PEER-GROUP remote-as 100
 neighbor PEER-GROUP update-source Loopback0

!
 neighbor 1.1.1.1 peer-group PEER-GROUP

!
 address-family vpnv4
  neighbor PEER-GROUP send-community extended

  neighbor 1.1.1.1 activate

neighbor groups (IOS-XR)

router bgp 100
 neighbor-group NEI-GROUP
  remote-as 100
  update-source Loopback0

!
 neighbor 1.1.1.1

  use neighbor-group NEI-GROUP

peer session templates (IOS)

router bgp 100
 template peer-session PEER-TEMPLATE
  remote-as 100
  update-source Loopback0
!

 neighbor 1.1.1.1 inherit peer-session PEER-TEMPLATE




eBGP Peerings & TTL


IOS
router bgp 100
 neighbor 2.2.2.2 ttl-security hops X
 neighbor 3.3.3.3 ebgp-multihop Y


IOS-XR
router bgp 100
 neighbor 2.2.2.2
  ttl-security
 neighbor 3.3.3.3
  ebgp-multihop Y



IOS
R1#sh bgp nei | i TTL|Session
  Session: 2.2.2.2
Mininum incoming TTL 255-X, Outgoing TTL 255
  Session: 3.3.3.3
Mininum incoming TTL 0, Outgoing TTL Y




eBGP Multihop
It allows a neighbor connection between two external peers that do not have direct connection. You should also configure an IGP or static routing to allow the neighbors without direct connection to reach each other.


TTL Security Check 
It's a lightweight security mechanism to protect eBGP neighbor sessions from CPU utilization-based attacks (DoS attacks that flood the network with IP packets that contain forged source IP addresses).


IOS
When configured for an eBGP neighbor, the router accepts only IP packets with a TTL count that is greater or equal to maximum TTL value (255) minus the hop count that is configured locally for the relevant eBGP session. If the TTL value in the IP packet is less than the maximum TTL value (255) minus the hops configured value, the incoming packet is silently discarded.

Supports both directly connected neighbor sessions and multihop eBGP neighbor sessions

IOS-XR
When configured for a directly adjacent eBGP neighbor, the router accepts only IP packets with a TTL count that is equal to the maximum TTL value (255). If the TTL value in the IP packet is less than the maximum TTL value (255), the incoming packet is silently discarded.


TTL values according to BGP setup:
  • R1 config: neighbor R2
    • R1 sends packets to R2 with TTL=1
  • R1 config: neighbor R2 ttl-security hops X
    • R1 sends packets to R2 with TTL=255
  • R1 config: neighbor R2 ebgp-multihop X
    • R1 sends packets to R2 with TTL=X

ebgp-multihop combined with ttl-security on two eBGP routers

R1: ebgp-multihop X (<255)
R2: ttl-security hops Y (<254)

  • R1 sends packets to R2 with TTL=X
    • R2 doesn't reply back, unless the TTL on arrival is high enough
    • R2 accepts packets with TTL >= 255-Y
  • R2 sends packets to R1 with TTL=255
    • R1 replies back
    • R1 accepts packets with any TTL

General Rule

If ( X - ActualHops >= 255 - Y ) then the eBGP session can be established.
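The general rule can be checked for any combination of settings with a small Python sketch. It assumes that "ActualHops" means the number of times the TTL is decremented on the way between the two routers, and it encodes each router's setting as a hypothetical tuple such as `("ebgp-multihop", 5)`.

```python
MAX_TTL = 255


def sent_ttl(mode, value=None):
    """TTL a router puts on the BGP packets it sends."""
    if mode == "default":
        return 1
    if mode == "ttl-security":
        return MAX_TTL
    if mode == "ebgp-multihop":
        return value
    raise ValueError(f"unknown mode: {mode}")


def min_accepted_ttl(mode, value=None):
    """Lowest incoming TTL a router accepts."""
    if mode == "ttl-security":
        return MAX_TTL - value
    return 0  # default and ebgp-multihop accept any TTL


def can_establish(r1, r2, actual_hops):
    def delivered(sender, receiver):
        arriving = sent_ttl(*sender) - actual_hops
        # the packet must survive the trip and pass the receiver's check
        return arriving >= 1 and arriving >= min_accepted_ttl(*receiver)

    return delivered(r1, r2) and delivered(r2, r1)
```

For example, `can_establish(("ebgp-multihop", 5), ("ttl-security", 2), 1)` is false, because the packet arrives with TTL 4 while R2 insists on at least 253.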


Interesting Cases

R1: ebgp-multihop X
R2: ttl-security hops 254

  • R1 sends packets to R2 with TTL=X
    • R2 replies back
  • R2 sends packets to R1 with TTL=255
    • R1 replies back

R1: ebgp-multihop 255
R2: ttl-security hops Y

  • R1 sends packets to R2 with TTL=255
    • R2 replies back
    • R2 accepts packets with TTL >= 255-Y 
  • R2 sends packets to R1 with TTL=255
    • R1 replies back
    • R1 accepts packets with any TTL

If ebgp-multihop is set to 255 or ttl-security is set to 254 (i.e. when at least one of these parameters is set to its max), then the eBGP session can be established, as long as their packets can reach each other and still arrive with a TTL of at least 255-Y.


R1: ttl-security hops X
R2: ttl-security hops Y

  • R1 sends packets to R2 with TTL=255
    • R2 replies back
    • R2 accepts packets with TTL >= 255-Y
  • R2 sends packets to R1 with TTL=255
    • R1 replies back
    • R1 accepts packets with TTL >= 255-X 

If both routers use ttl-security, then the eBGP session can be established as long as their packets can reach each other and neither hop value is lower than the actual hop count.


R1: ebgp-multihop X
R2: ebgp-multihop Y

  • R1 sends packets to R2 with TTL=X
    • R2 replies back
    • R2 accepts packets with any TTL
  • R2 sends packets to R1 with TTL=Y
    • R1 replies back
    • R1 accepts packets with any TTL

If both routers use ebgp-multihop, then the eBGP session can be established regardless of the hop values used, as long as their packets can reach each other.

If loopback interfaces are used to connect single-hop eBGP peers, you must configure the "neighbor disable-connected-check" command before you can establish the eBGP peering session.



PMTUD


IOS
R2#sh bgp vpnv4 unicast all nei 19.19.19.19 | i tcp|segment
  Transport(tcp) path-mtu-discovery is enabled
Datagrams (max data segment is 1432 bytes):



If you have BGP PMTUD enabled (by default in most releases), BGP packets will be sent with DF bit set.

You can disable BGP PMTUD (either for all neighbors or for a specific neighbor) with the following commands.

IOS
router bgp 100
 no bgp transport path-mtu-discovery
 neighbor 19.19.19.19 transport path-mtu-discovery disable



If global command "ip tcp path-mtu-discovery" is disabled (default) and BGP PMTUD is disabled too, then the default MSS (536) is used for BGP neighbors.

If "ip tcp path-mtu-discovery" is enabled but BGP PMTUD is disabled, then the maximum MSS is used for BGP neighbors.

You can use "ip tcp mss X" to change the global TCP MSS.


IOS-XR
tcp path-mtu-discovery




NTS: Advanced BGP

Advanced BGP




BGP (Border Gateway Protocol) is defined in RFC 4271.
MP-BGP (Multi-Protocol BGP) is defined in RFC 4760.
Labeled BGP (BGP+Label) is defined in RFC 3107.



enforce-first-as

When enabled, updates received from an eBGP peer that does not list its ASN at the beginning of the as-path in the incoming update are denied (in order to prevent spoofing).
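The check itself is simple enough to sketch in a few lines of Python; the function and parameter names are hypothetical.

```python
def accept_ebgp_update(as_path, neighbor_as, enforce_first_as=True):
    """Accept the update only if the peer's ASN is first in the as-path."""
    if not enforce_first_as:
        return True
    return bool(as_path) and as_path[0] == neighbor_as
```

An update from a neighbor in AS 2 carrying the as-path "3 2" would therefore be denied, while "2 3" would be accepted.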

It's enabled by default.

IOS
router bgp 100
 no bgp enforce-first-as


IOS-XR
router bgp 65000
 bgp enforce-first-as disable





local-as & dual-as

When local-as is enabled for a neighbor, it allows a router to appear to be a member of a second ASN, in addition to its real ASN.

This feature can only be used for true eBGP peers (i.e. members of different confederation sub-ASs are not supported).

R4 (IOS)
router bgp 1
 network 4.4.4.4 mask 255.255.255.255
 neighbor 20.4.5.5 remote-as 2
 neighbor 20.4.5.5 local-as 11



R5 (IOS)
router bgp 2
 network 5.5.5.5 mask 255.255.255.255
 neighbor 20.4.5.4 remote-as 11



By default, the new local-as is prepended in incoming and outgoing updates.

IOS
R4#sh bgp ipv4 unicast
BGP table version is 3, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       0.0.0.0                  0         32768 i
*> 5.5.5.5/32       20.4.5.5                 0             0 11 2 i


R5#sh bgp ipv4 unicast
BGP table version is 3, local router ID is 5.5.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       20.4.5.4                 0             0 11 1 i
*> 5.5.5.5/32       0.0.0.0                  0         32768 i


Use the "no-prepend" option to avoid prepending the new local-as in the incoming updates.

R4 (IOS)
router bgp 1
 network 4.4.4.4 mask 255.255.255.255
 neighbor 20.4.5.5 remote-as 2
 neighbor 20.4.5.5 local-as 11 no-prepend


IOS
R4#sh bgp ipv4 unicast
BGP table version is 5, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       0.0.0.0                  0         32768 i
*> 5.5.5.5/32       20.4.5.5                 0             0 2 i


R5#sh bgp ipv4 unicast
BGP table version is 5, local router ID is 5.5.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       20.4.5.4                 0             0 11 1 i
*> 5.5.5.5/32       0.0.0.0                  0         32768 i



Use the "no-prepend replace-as" option to avoid prepending the real ASN in the outgoing updates.

R4 (IOS)

router bgp 1
 network 4.4.4.4 mask 255.255.255.255
 neighbor 20.4.5.5 remote-as 2
 neighbor 20.4.5.5 local-as 11 no-prepend replace-as


IOS
R4#sh bgp ipv4 unicast
BGP table version is 7, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       0.0.0.0                  0         32768 i
*> 5.5.5.5/32       20.4.5.5                 0             0 2 i


R5#sh bgp ipv4 unicast
BGP table version is 7, local router ID is 5.5.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       20.4.5.4                 0             0 11 i
*> 5.5.5.5/32       0.0.0.0                  0         32768 i



Use the "no-prepend replace-as dual-as" option to avoid prepending the new local-as in the incoming updates and the real ASN in the outgoing updates and at the same time allow eBGP connections with both the real ASN and the new local-as.

R4 (IOS)
router bgp 1
 network 4.4.4.4 mask 255.255.255.255
 neighbor 20.4.5.5 remote-as 2
 neighbor 20.4.5.5 local-as 11 no-prepend replace-as dual-as


R5 (IOS)
router bgp 2
 network 5.5.5.5 mask 255.255.255.255
 neighbor 20.4.5.4 remote-as 11



IOS
R4#sh bgp ipv4 unicast
BGP table version is 9, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       0.0.0.0                  0         32768 i
*> 5.5.5.5/32       20.4.5.5                 0             0 2 i


R5#sh bgp ipv4 unicast
BGP table version is 9, local router ID is 5.5.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       20.4.5.4                 0             0 11 i
*> 5.5.5.5/32       0.0.0.0                  0         32768 i


or

R5 (IOS)
router bgp 2
 network 5.5.5.5 mask 255.255.255.255
 neighbor 20.4.5.4 remote-as 1



R4#sh bgp ipv4 unicast
BGP table version is 11, local router ID is 4.4.4.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       0.0.0.0                  0         32768 i
*> 5.5.5.5/32       20.4.5.5                 0             0 2 i


R5#sh bgp ipv4 unicast
BGP table version is 11, local router ID is 5.5.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 4.4.4.4/32       20.4.5.4                 0             0 1 i
*> 5.5.5.5/32       0.0.0.0                  0         32768 i
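
The as-paths seen in the outputs of this section can be reproduced with a small Python model. It is only a sketch of the prepending behaviour for the R4/R5 example (real ASN 1, local-as 11), and the function names are made up.

```python
def incoming_path(received, local_as, no_prepend=False):
    """as-path stored on R4 for a route received from the local-as neighbor."""
    if no_prepend:
        return list(received)
    return [local_as] + list(received)


def outgoing_path(path, real_as, local_as, replace_as=False):
    """as-path seen by the local-as neighbor for a route sent by R4."""
    if replace_as:
        return [local_as] + list(path)        # the real ASN is hidden
    return [local_as, real_as] + list(path)   # both ASNs are visible
```

`incoming_path([2], 11)` gives "11 2" and `outgoing_path([], 1, 11)` gives "11 1", matching the default case; the "no-prepend" and "replace-as" variants give "2" and "11" respectively.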





PE-CE Routing


In order to allow VPN sites with the same ASN to talk to each other, you can use one of the following:
  • "neighbor PE allowas-in" in the CE
    • CE accepts its own ASN
  • "neighbor CE as-override" in the PE
    • PE replaces the common CE ASN with its own

eBGP sessions in IOS-XR require an in/out PASS routing policy under the appropriate address-family. Alternatively in some cases you can use "bgp unsafe-ebgp-policy" in order to bypass this.

IOS-XR
vrf VPN
 address-family ipv4 unicast
  import route-target
   100:1
  export route-target
   100:1

!
router bgp 100
 address-family ipv4 unicast
 vrf VPN

  rd 100:1
  bgp unsafe-ebgp-policy
  address-family ipv4 unicast
  neighbor 2.2.2.2
   remote-as 200
   address-family ipv4 unicast
    as-override





Labeled BGP

It's a BGP capability (negotiated between neighbors during session setup) that allows you to exchange labels together with IPv4/IPv6 unicast prefixes. It's used in Inter-AS, CsC, 6PE scenarios, and when LDP+IGP or RSVP-TE are not available for label distribution.

Configuration

IOS
router bgp 100
 address-family ipv4
  neighbor 1.1.1.1 send-label


IOS-XR
router bgp 100
 address-family ipv4 unicast
  allocate-label all
 neighbor 1.1.1.1
  address-family ipv4 labeled-unicast



You can also filter the prefixes for which to allocate labels.

Verification

R2#sh bgp ipv4 unicast neighbors 1.1.1.1 | b capabilities
  Neighbor capabilities:
    Route refresh: advertised and received(new)
    Four-octets ASN Capability: advertised and received
    Address family IPv4 Unicast: advertised and received
    ipv4 MPLS Label capability: advertised and received
    Multisession Capability: advertised and received



In IOS-XR, when you activate a new ipv4-labeled session for an existing ipv4 neighbor, you need to re-apply all settings (i.e. route-policy, send-community) from the ipv4 session to the ipv4-labeled session.



L3VPN

"send-community extended" is usually automatically enabled when activating a neighbor under the BGP VPNv4 address-family. Since RT is an extended community, without this command VPNv4 routes won't be advertised in BGP.

In order to see the VPN label to be used by the PEs, you just need to check the relevant BGP route.

R2#sh bgp vpnv4 unicast all 6.6.6.6/32
...
    5.5.5.5 (metric 4) from 5.5.5.5 (5.5.5.5)
...
      mpls labels in/out nolabel/28


In order to see the IGP/Transport label to be used by the PEs and Ps, you just need to find the label for the route's next-hop. Remember to add the "detail" keyword in order to see the whole label stack (due to possible route recursion).

R2#sh mpls forwarding-table 5.5.5.5
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id     Switched      interface
27         26         5.5.5.5/32       0             Fa0/0.23   20.2.3.3


In order to see the whole label stack (which includes both the VPN and the IGP label), you can check the relevant CEF entry (inside the VRF) on the PEs.

R2#sh ip cef vrf VPN 6.6.6.6 det
6.6.6.6/32, epoch 0, flags rib defined all labels
  recursive via 5.5.5.5 label 28
    nexthop 20.2.3.3 FastEthernet0/0.23 label 26



If you want to follow an Intra-AS L3VPN path (assuming the control-plane has been set up correctly), then you can execute the following algorithm:
  • first router (start PE)
    • Find the VPN label for the prefix
    • Find the Transport label(s) for the prefix's next-hop
  • n router
    • Follow the Transport top label swaps until there is a "Pop Label" for next router
  • n+1 router
    • Find the local VPN label for the prefix
      • If VPN label is "no label", then 
        • router is the end PE
        • VPN is locally attached
      • If VPN label is other, then 
        • ?
      • If VPN label doesn't exist, then 
        • ?

If the route is learned from IGP, the Transport label must be allocated through LDP/RSVP-TE.
If the route is learned from BGP, the Transport label must be allocated through BGP.



Dynamic L3VPN with mGRE Tunnels

If MPLS is not available in a network, you can use GRE (or other types of encapsulation) to "automatically" build dynamic tunnels in order to provide L3VPN services.

The BGP nexthop is used for tunnel endpoint discovery, but instead of adding a transport label, VPN traffic is encapsulated into GRE (having as source a local interface and as destination the neighbor PE).

The L3VPN BGP configuration (regarding VRFs and VPNv4) remains the same as in MPLS L3VPN.

Configuration Steps
  • create a new VRF for the mGRE tunnels
  • create a mGRE tunnel (with no destination) and assign the above VRF to it
  • create a default static route that forwards the above VRF traffic into the mGRE tunnel
  • activate the above VRF under BGP
  • apply an inbound route-map to all the PE sessions that sets the next-hop to be resolved in the above VRF
The same tunnels can be used for all L3VPNs between the same PEs.

IOS
vrf definition L3VPN-VRF
 rd 1:99
!

interface Tunnel 1
 tunnel mode gre multipoint l3vpn
 tunnel source loopback0
 ip vrf forwarding L3VPN-VRF
 ip address 99.99.99.1 255.255.255.255
 tunnel key 99
!
ip route vrf L3VPN-VRF 0.0.0.0 0.0.0.0 Tunnel1

!
router bgp 1
 neighbor 2.2.2.2 remote-as 1
 neighbor 2.2.2.2 update-source Loopback0
!
 address-family vpnv4
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 send-community extended
  neighbor 2.2.2.2 route-map L3VPN-ROUTEMAP in
 exit-address-family
!
 address-family ipv4 vrf L3VPN-VRF
 exit-address-family
!
route-map L3VPN-ROUTEMAP permit 10
 set ip next-hop in-vrf L3VPN-VRF



In the latest releases you can also use multipoint L2TPv3 tunnels instead of the default mGRE ones.

You can also define l3vpn encapsulation profiles for fully automatic tunnel provisioning.

IOS
l3vpn encapsulation ip L3VPN-PROFILE
 transport source loopback 0
 protocol gre key 99
!

router bgp 1
 neighbor 2.2.2.2 remote-as 1
 neighbor 2.2.2.2 update-source Loopback0
!
 address-family vpnv4
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 send-community extended
  neighbor 2.2.2.2 route-map L3VPN-ROUTEMAP in
 exit-address-family

!
route-map L3VPN-ROUTEMAP permit 10
 set ip next-hop encapsulate L3VPN-PROFILE 
        




Link Bandwidth

It is used with BGP multipath to configure load balancing over links with unequal bandwidth.

When enabled, routes learned from directly connected external neighbors are propagated through the iBGP network with the bandwidth of the source external link stored in an extended community.

The link bandwidth extended community attribute is used as a traffic sharing value relative to other paths while forwarding traffic. 

Two or more paths are designated as equal for load balancing if weight, local-preference, as-path length, MED and IGP costs are the same. 

BGP can originate the link bandwidth community only for directly connected links to eBGP neighbors.


Configuration Steps
  • "dmzlink-bw" must be enabled on all BGP routers that need to process the link bandwidth community
  • "dmzlink-bw" must be enabled on all eBGP neighborships from which the bandwidth will be acquired
  • "send-community extended" must be enabled on all iBGP peerings over which the link bandwidth community must be propagated
  • multipath must be enabled where more than one path is expected

R2 (IOS)
router bgp 1
 bgp dmzlink-bw
 neighbor 3.3.3.3 remote-as 1
 neighbor 3.3.3.3 update-source Loopback0
 neighbor 4.4.4.4 remote-as 1
 neighbor 4.4.4.4 update-source Loopback0
 maximum-paths ibgp 4



R3 (IOS)
router bgp 1
 bgp dmzlink-bw
 neighbor 2.2.2.2 remote-as 1
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 2.2.2.2 next-hop-self
 neighbor 2.2.2.2 send-community extended
 neighbor 4.4.4.4 remote-as 1
 neighbor 4.4.4.4 update-source Loopback0
 neighbor 4.4.4.4 next-hop-self
 neighbor 4.4.4.4 send-community extended
 neighbor 20.3.6.6 remote-as 2
 neighbor 20.3.6.6 dmzlink-bw
 maximum-paths 4
 maximum-paths ibgp 4
!

interface FastEthernet0/0.36
 bandwidth 36000



R4 (IOS)
router bgp 1
 bgp dmzlink-bw
 neighbor 2.2.2.2 remote-as 1
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 2.2.2.2 next-hop-self
 neighbor 2.2.2.2 send-community extended
 neighbor 3.3.3.3 remote-as 1
 neighbor 3.3.3.3 update-source Loopback0
 neighbor 3.3.3.3 next-hop-self
 neighbor 3.3.3.3 send-community extended
 neighbor 20.4.5.5 remote-as 2
 neighbor 20.4.5.5 dmzlink-bw
 neighbor 20.4.6.6 remote-as 2
 neighbor 20.4.6.6 dmzlink-bw
 maximum-paths 4
 maximum-paths ibgp 4
!

interface FastEthernet0/0.45
 bandwidth 45000
!
interface FastEthernet0/0.46
 bandwidth 46000



IOS
R2#sh bgp
BGP table version is 5, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*mi19.19.19.19/32   4.4.4.4                  2    100      0 2 i
*>i                 3.3.3.3                  2    100      0 2 i


R2#sh bgp ipv4 unicast 19.19.19.19/32
BGP routing table entry for 19.19.19.19/32, version 5
Paths: (2 available, best #2, table default)
Multipath: iBGP
  Not advertised to any peer
  2
    4.4.4.4 (metric 5) from 4.4.4.4 (4.4.4.4)
      Origin IGP, metric 2, localpref 100, valid, internal, multipath
      DMZ-Link Bw 11375 kbytes
  2
    3.3.3.3 (metric 5) from 3.3.3.3 (3.3.3.3)
      Origin IGP, metric 2, localpref 100, valid, internal, multipath, best
      DMZ-Link Bw 4500 kbytes



Although BGP multipath is enabled, the BGP selection algorithm still chooses one path as the best (based on the standard BGP selection criteria), but both paths are tagged with the "multipath" keyword and appear in the routing table for forwarding. 

R2#sh ip route 19.19.19.19
Routing entry for 19.19.19.19/32
  Known via "bgp 1", distance 200, metric 2
  Tag 2, type internal
  Last update from 3.3.3.3 00:04:36 ago
  Routing Descriptor Blocks:
  * 4.4.4.4, from 4.4.4.4, 00:04:36 ago
      Route metric is 2, traffic share count is 5
      AS Hops 1
      Route tag 2
      MPLS label: none
    3.3.3.3, from 3.3.3.3, 00:04:36 ago
      Route metric is 2, traffic share count is 2
      AS Hops 1
      Route tag 2
      MPLS label: none



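Divide the interface bandwidth (Kbps) by 8 to get the DMZ-Link Bw (KBps) shown in the "sh bgp" output. Here R3 advertises 36000 / 8 = 4500, while R4 advertises the sum of its two eBGP links, (45000 + 46000) / 8 = 11375. The traffic share counts in the routing table (5 and 2) approximate this 11375:4500 ratio.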

IOS-XR
router bgp 2
 address-family ipv4 unicast
  maximum-paths ibgp 4
  maximum-paths ebgp 4
 !
 neighbor 6.6.6.6
  dmz-link-bandwidth


The above (old-style) configuration is not recommended. In later IOS-XR releases (>4.3.2) you can set the bandwidth extcommunity in a route-policy towards the iBGP neighbor in order to achieve the same thing.



RT Constrain (RTC)

By default, the PEs filter out prefixes with unwanted RTs only after they have received them from the RR. With this feature enabled on both the PE and the RR, the PE tells the RR which RTs it actually needs, and the RR sends only the prefixes carrying those RTs.

This feature causes two exchanges to happen:
  • The PE sends an RT Constraint (RTC) NLRI to the RR
  • The RR installs an outbound route filter
The rtfilter address-family must be activated on both the RR and the PE.


IOS
router bgp 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 update-source Loopback0
 !
 address-family vpnv4
  neighbor 1.1.1.1 activate
  neighbor 1.1.1.1 send-community extended
 exit-address-family
 !
 address-family rtfilter unicast
  neighbor 1.1.1.1 activate
  neighbor 1.1.1.1 send-community extended
 exit-address-family



IOS-XR
router bgp 100
 address-family vpnv4 unicast
 !
 address-family ipv4 rt-filter
 !
 neighbor 1.1.1.1
  remote-as 100
  update-source Loopback0
  address-family vpnv4 unicast
  !
  address-family ipv4 rt-filter



It requires IOS-XR > 4.3 or IOS > 15.1.



Fast Convergence

  • Different RD per PE
  • BGP Multipath
  • BGP Best-external
  • BGP PIC
  • Two RRs (one for primary, one for secondary)

Multipath 
It allows installation of multiple BGP paths to the same destination into the IP routing table. These paths are installed in the table together with the best path for load sharing. BGP Multipath does not affect best-path selection: a router still designates one of the paths as the best path, according to the algorithm, and advertises only this best path to its neighbors.

  • eBGP multipath
    • maximum-paths x (IOS)
    • maximum-paths ebgp x (IOS-XR)
  • iBGP multipath
    • maximum-paths ibgp x (IOS, IOS-XR)
  • eiBGP multipath (under ipv4 vrf address-family)
    • maximum-paths eibgp x (IOS, IOS-XR)

In IOS-XR, you can also use the "selective" keyword in order to restrict multipath to specific neighbors (the ones with "multipath" configured).
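
A minimal IOS-XR sketch of selective multipath follows. The AS number and neighbor addresses are made up for illustration, and keyword availability depends on the release, so treat it as an outline rather than a tested configuration. Only paths learned from neighbors that have "multipath" under their address-family should be considered for multipath here.

```
router bgp 100
 address-family ipv4 unicast
  ! restrict iBGP multipath to neighbors flagged with "multipath"
  maximum-paths ibgp 4 selective
 !
 neighbor 3.3.3.3
  remote-as 100
  update-source Loopback0
  address-family ipv4 unicast
   multipath
  !
 !
 neighbor 4.4.4.4
  remote-as 100
  update-source Loopback0
  address-family ipv4 unicast
   multipath
```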

CEF load-sharing might also need to be tuned.

"bgp bestpath as-path multipath-relax" can be used to skip checking the as-path contents and check only its length.
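
As an illustration, the IOS sketch below load-shares across two eBGP neighbors in different autonomous systems, which normally would not qualify for multipath because the as-path contents differ. The neighbor addresses and AS numbers are hypothetical.

```
router bgp 100
 ! compare only as-path length, not contents, for multipath
 bgp bestpath as-path multipath-relax
 neighbor 10.0.12.2 remote-as 200
 neighbor 10.0.13.3 remote-as 300
 maximum-paths 2
```

The two paths still need equal as-path length and must tie on the other multipath criteria.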


Best-External Path

When configured, it enables advertisement of the best external path to iBGP/RR peers when the locally selected best path comes from an internal peer. That way, routers inside the AS learn about more exit paths from the AS.

Usually it's configured on the backup router.

IOS
router bgp 100
 address-family vpnv4
  bgp advertise-best-external


IOS-XR
router bgp 100
 address-family ipv4 unicast
  advertise best-external




PIC (Prefix Independent Convergence)
When configured, it installs a backup path into the forwarding table, providing prefix-independent convergence in case of a PE-CE link failure.

Core/Edge

IOS
router bgp 100
 address-family vpnv4
  bgp additional-paths install
  bgp recursion host


IOS-XR (3.9)
router bgp 100
 address-family vpnv4 unicast
  additional-paths install backup



For faster convergence you might need to remove the command "bgp recursion host".
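
A short sketch of that removal, assuming the same AS number and address-family as the IOS example above:

```
router bgp 100
 address-family vpnv4
  bgp additional-paths install
  ! host-route recursion disabled again
  no bgp recursion host
```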




QPPB (QoS Policy Propagation via BGP)

It allows you to match BGP routes based on attributes (e.g. community, as-path), mark them with an IP precedence or qos-group value (or other attributes, depending on software version), and then mark the packets whose source/destination matches those routes accordingly. Further actions (e.g. policing, queuing) can then be performed on the marked packets.


IOS
ip community-list 1 permit 100:1
!
ip as-path access-list 1 permit _200$
!
route-map QPPB-ROUTEMAP permit 10
 match community 1
 set ip precedence 2
!
route-map QPPB-ROUTEMAP permit 20
 match as-path 1
 set ip precedence 5
!
router bgp 100
 table-map QPPB-ROUTEMAP
!

interface FastEthernet0/0
 bgp-policy source ip-prec-map
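
As a sketch of the "further actions" step, the precedence value written by QPPB can be matched with a regular MQC policy on the egress interface. The class-map and policy-map names, the egress interface and the policing rate below are all hypothetical, and platform support may vary.

```
! match the packets that QPPB marked with precedence 5
class-map match-all QPPB-PREC5
 match ip precedence 5
!
policy-map QPPB-POLICE
 class QPPB-PREC5
  police 1000000 conform-action transmit exceed-action drop
!
interface FastEthernet0/1
 service-policy output QPPB-POLICE
```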



IOS-XR
route-policy QPPB-ROUTEPOLICY
  if community matches-any (100:1) then
    set qos-group 2
  endif
  if as-path originates-from '200'  then
    set qos-group 5
  endif
end-policy

!
router bgp 100
 address-family ipv4 unicast
  table-policy QPPB-ROUTEPOLICY
!
interface GigabitEthernet0/0/0/0
 ipv4 bgp policy propagation input qos-group source


IOS-XR has various limitations depending on the hardware used.



RTBH (Remotely Triggered Black Hole) routing/filtering

It allows you to quickly "block" various attacks on your edge routers, by advertising a null route from a single router to all edge routers.

Configuration Steps
  • configure null static route with dummy next-hop on your edge routers
  • configure route-map that matches a tag and sets a dummy next-hop (plus whatever else) on your rtbh router
  • configure redistribution of static routes into BGP using the above route-map on your rtbh router
  • in case of attack, configure a null static route with the appropriate tag for the destination on the rtbh router
e.g. for destination-based RTBH:

edge (IOS)
ip route 192.168.1.1 255.255.255.255 Null0

rtbh router (IOS)
router bgp 100
 redistribute static route-map RTBH-ROUTEMAP
!
route-map RTBH-ROUTEMAP
 match tag 99
 set ip next-hop 192.168.1.1
 set community no-export no-advertise additive



When an attack on 10.10.10.10 happens:

rtbh router (IOS)
ip route 10.10.10.10 255.255.255.255 Null0 tag 99


It is assumed that the rtbh router has BGP connectivity with all edge routers (either directly, or through RRs).

If you combine loose uRPF + RTBH, you can use it for blocking source IPs too.
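
A minimal sketch of source-based RTBH, reusing the route-map and tag from the example above. The edge interface and the attacker address (198.51.100.66, taken from a documentation range) are assumptions. With loose uRPF on the edge, packets whose source address resolves to Null0 fail the check and are dropped.

```
! edge (IOS): loose uRPF on the external-facing interface
interface FastEthernet0/0
 ip verify unicast source reachable-via any
!
! rtbh router (IOS): blackhole the offending source address
ip route 198.51.100.66 255.255.255.255 Null0 tag 99
```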