CCIE in 2 months - Is it possible?: MED

Thursday, February 6, 2014

NTS: L3VPN Redistribution

L3VPN Redistribution

Configuration Steps

Configure the VRFs
Configure the RDs
Configure the import/export RTs
Assign the PE=>CE interfaces to VRFs
Configure IGP/BGP between PE-CE
Configure MP-BGP between PEs
Mutually redistribute between MP-BGP and the PE-CE IGP

BGP<=>RIP

RIP=>BGP
RIP metric => BGP MED (auto)

RIP=>BGP=>RIP
RIP metric => BGP MED => RIP metric (auto)

OTHER=>BGP=>RIP
BGP X => RIP metric (manual)

If "auto" doesn't work (for whatever reason), you can try clearing the vrf routing table on the PE, or you can use one of the following to set the RIP metric manually:

redistribute bgp 100 metric transparent
redistribute bgp 100 metric X
redistribute bgp 100 route-map X

Clearing of vrf routing table might be needed every time a new prefix is redistributed.

If version 2 is to be used, then it must be defined under the ipv4 vrf address-family on the PE.

RIP metric = hops (0-16)

Configuration

IOS
router rip
address-family ipv4 vrf VPN
redistribute bgp 200
!
router bgp 200
address-family ipv4 vrf VPN
redistribute rip

IOS-XR
router rip
vrf VPN
redistribute bgp 200
!
router bgp 200
vrf VPN
address-family ipv4 unicast
redistribute rip

BGP<=>EIGRP

EIGRP=>BGP
EIGRP composite metric => BGP MED (auto)
EIGRP vector metrics => BGP Extended Cost Community (auto)

EIGRP=>BGP=>EIGRP
EIGRP composite metric => BGP MED => EIGRP composite metric (auto)
EIGRP vector metrics => BGP Extended Cost Community => EIGRP vector metrics (auto)
original internal EIGRP routes appear as internal EIGRP routes when redistributed
original external EIGRP routes appear as external EIGRP routes when redistributed

OTHER=>BGP=>EIGRP
BGP X => EIGRP metrics (manual)
original routes appear as external EIGRP routes when redistributed

If "auto" doesn't work (for whatever reason), you can try clearing the vrf routing table on the PE, or you can use one of the following to set the EIGRP metrics manually:

redistribute bgp 100 metric BW DLY REL LOAD MTU
redistribute bgp 100 route-map X
redistribute bgp 100 route-policy X
redistribute bgp 100 & default-metric BW DLY REL LOAD MTU

Clearing of vrf routing table might be needed every time a new prefix is redistributed.

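EIGRP vector metrics = bandwidth (kbps), delay (tens of microseconds), reliability, load, MTU (e.g. 1000 10 255 1 1500). These are the metric components, not the K1-K5 weights.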
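As a rough illustration of how the vector metrics turn into the composite metric, here is a minimal Python sketch. It assumes the classic (non-wide) formula with the default K values (K1=K3=1, K2=K4=K5=0), and it reads the first two values of the example as bandwidth in kbps and delay in tens of microseconds. The function name is made up for this note.

```python
def eigrp_composite_metric(bandwidth_kbps, delay_tens_usec):
    """Classic EIGRP composite metric, assuming default K values.

    With K1=K3=1 and K2=K4=K5=0 only bandwidth and delay contribute:
    metric = 256 * (10^7 / bandwidth + delay)
    """
    if bandwidth_kbps <= 0:
        raise ValueError("bandwidth must be positive")
    scaled_bandwidth = 10_000_000 // bandwidth_kbps
    return 256 * (scaled_bandwidth + delay_tens_usec)


# Vector metrics from the example: 1000 10 255 1 1500
print(eigrp_composite_metric(1000, 10))  # 2562560
```

Reliability, load and MTU are carried along with the route but do not change the result while the K values are left at their defaults.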

Configuration

IOS
router eigrp 100
address-family ipv4 vrf VPN autonomous-system 1
redistribute bgp 200
exit-address-family
!
router bgp 200
address-family ipv4 vrf VPN
redistribute eigrp 1

IOS-XR
router eigrp 100
vrf VPN
address-family ipv4
   autonomous-system 1
   redistribute bgp 200
!
router bgp 200
vrf VPN
address-family ipv4 unicast
   redistribute eigrp 1

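Redistribution of EIGRP into the BGP vrf must reference the EIGRP autonomous-system number configured under the vrf address-family (1 in the examples above), not the global EIGRP process number (100). Some software releases may accept the global EIGRP process number too.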

You can use the SoO extended community to prevent any possible loops.

BGP<=>ISIS

ISIS=>BGP
ISIS metric => BGP MED (auto)

ISIS=>BGP=>ISIS
ISIS metric => BGP MED => ISIS metric (auto)

OTHER=>BGP=>ISIS
BGP X => ISIS metric (manual)

You can use the following to set manually the ISIS metric:

redistribute bgp 100 metric X
redistribute bgp 100 route-map X

Clearing of vrf routing table might be needed every time a new prefix is redistributed.

ISIS metric = sum of interface metrics along the path (default 10 per interface)

Configuration

IOS
router isis 100
address-family ipv4 vrf VPN
redistribute bgp 200
!
router bgp 200
address-family ipv4 vrf VPN
redistribute isis 100

IOS-XR
router isis 100
vrf VPN
redistribute bgp 200
!
router bgp 200
vrf VPN
address-family ipv4 unicast
redistribute isis 100

Redistribution doesn't take into account the IS-IS connected routes. You have to explicitly define them.

In order to avoid a possible loop while doing redistribution (when L1 is involved), you can change the distance of the ISIS advertised routes (excluding connected) on the PE to be higher than BGP's.

IOS
router isis 100
vrf VPN
distance 201 0.0.0.0 255.255.255.255 ISIS-NOT-CONNECTED-ACL

BGP<=>OSPF

OSPF=>BGP
OSPF metric => BGP MED + 1 (auto)
OSPF Area/LSA => BGP extended community "OSPF RT" (auto)

OSPF=>BGP=>OSPF
OSPF metric => BGP MED + 1 => OSPF metric (auto)

original intra-area routes appear as inter-area routes when redistributed (if same OSPF Domain-ID)
original intra-area routes appear as external-2 routes when redistributed (if different OSPF Domain-ID)
type-4 LSAs are not redistributed into BGP
original external routes appear as external-2 routes when redistributed (requires "match external" in redistribution from OSPF to BGP)

OTHER=>BGP=>OSPF
BGP X => OSPF metric (manual)

You can always use the following to manually set the OSPF metric:

redistribute bgp 200 metric X
redistribute bgp 200 route-map X

Clearing of vrf routing table might be needed every time a new prefix is redistributed.

OSPF metric = interface cost (0-65535)

"OSPF RT" Extended Community

"OSPF RT" format is "Area:LSA-Type:External-Type"

LSA Type to OSPF RT conversion

Type-1/2 => RT 2
Type-3 => RT 3
Type-5 => RT 5
Type-7 => RT 7
Sham-links => RT 129

Examples

OSPF RT:0.0.0.0:2:0

area 0.0.0.0
LSA-Type 1/2

OSPF RT:0.0.0.0:5:0

LSA-Type 5
External 1

OSPF RT:0.0.0.0:5:1

LSA-Type 5
External 2
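To make the "Area:LSA-Type:External-Type" layout easier to read in show output, here is a small Python sketch that splits the string and maps the numbers back to the table above. The helper names are invented for this note, and the external type mapping (0 = External 1, 1 = External 2) simply follows the examples above.

```python
from typing import NamedTuple


class OspfRt(NamedTuple):
    area: str
    route_type: int
    option: int


# Mapping taken from the "LSA Type to OSPF RT conversion" list
ROUTE_TYPE_NAMES = {
    2: "Type-1/2 LSA",
    3: "Type-3 LSA",
    5: "Type-5 LSA",
    7: "Type-7 LSA",
    129: "sham-link",
}


def parse_ospf_rt(text):
    """Split a string such as 'OSPF RT:0.0.0.0:5:1' into its three fields."""
    prefix = "OSPF RT:"
    if not text.startswith(prefix):
        raise ValueError("not an OSPF RT community: " + text)
    area, route_type, option = text[len(prefix):].split(":")
    return OspfRt(area, int(route_type), int(option))


def describe(rt):
    """Return a short human-readable description of a parsed OSPF RT."""
    name = ROUTE_TYPE_NAMES.get(rt.route_type, "unknown")
    if rt.route_type in (5, 7):
        external = "External 2" if rt.option & 1 else "External 1"
        return f"area {rt.area}, {name}, {external}"
    return f"area {rt.area}, {name}"


print(describe(parse_ospf_rt("OSPF RT:0.0.0.0:5:1")))
```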

Configuration

IOS
router ospf 100 vrf VPN
redistribute bgp 200 subnets
!
router bgp 200
address-family ipv4 vrf VPN
redistribute ospf 100 vrf VPN

IOS-XR
router ospf 100
vrf VPN
redistribute bgp 200
!
router bgp 200
vrf VPN
address-family ipv4 unicast
redistribute ospf 100

In IOS, if you don't include the vrf name in the redistribution of OSPF into BGP, it gets automatically added to the configuration.

The DN Bit and the VPN Route Tag

A PE needs to know whether a particular prefix has been learned from another PE router, so that it does not re-advertise the prefix into BGP and cause a loop.

Two mechanisms are mainly used for loop prevention when OSPF is used as PE-CE protocol.

the DN bit
the VPN Route (or OSPF Domain) tag

By default, when a type 3, 5, 7 LSA is sent from a PE to a CE, the DN bit is set by the PE.

When another PE receives a type 3, 5 or 7 LSA with the DN bit set from a CE router, the prefix information from that LSA is not used during the OSPF route calculation, which means that the prefix doesn't get installed into the PE's BGP table.

Almost all Cisco software releases support the setting of DN bit only for Type-3 LSAs and they use a 32-bit VPN Route tag for Type-5/7 LSAs. The configuration and inclusion of the VPN Route Tag is required by all implementations for backward compatibility with older implementations that do not set the DN bit in type 5/7 LSAs.

If a PE router receives an LSA that contains the same VPN Route Tag as the locally configured tag, then the local PE router knows that another PE router (from the same domain) generated this route and the LSA is ignored.

16bit ASNs

VPN Route tag Format: 1101 000000000000 ASN_of_VPN_Backbone

32bit ASNs

VPN Route tag must be defined manually

You can change this default value by using the "domain-tag" command within the OSPF VRF process configuration.

IOS
router ospf 100 vrf VPN
domain-tag 12345

IOS-XR
router ospf 100
vrf VPN
domain-tag 12345
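The default tag format for 16-bit ASNs (1101, then twelve zero bits, then the backbone ASN) can be checked with a few lines of Python. This is only a sketch of the bit layout given above; the function name is made up.

```python
def default_vpn_route_tag(backbone_asn):
    """Build the default 32-bit VPN Route tag for a 16-bit backbone ASN.

    Layout: 1101 | 000000000000 | 16-bit ASN
    """
    if not 0 <= backbone_asn <= 0xFFFF:
        raise ValueError("only 16-bit ASNs have a default tag")
    return (0b1101 << 28) | backbone_asn


print(default_vpn_route_tag(100))  # 3489661028
```

For AS 100 this gives 3489661028 (0xD0000064), which is the route tag value that shows up in the show outputs of these notes.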

In case of Multi-VRF (VRF-Lite), the router that is accepting the LSA with the DN bit is actually a CE router with no BGP VPNv4 functionality, so there is no danger of redistributing this prefix into BGP. In order to bypass this DN bit check, the following configuration can be enabled.

IOS
router ospf 100 vrf VPN
capability vrf-lite

IOS-XR
router ospf 100
vrf VPN
disable-dn-bit-check

Verification

IOS
R1#sh ip ospf 100 database summary 10.7.7.7

            OSPF Router with ID (10.1.3.1) (Process ID 100)

                Summary Net Link States (Area 0)

Routing Bit Set on this LSA in topology Base with MTID 0
LS age: 1196
Options: (No TOS-capability, DC, Downward)
LS Type: Summary Links(Network)
Link State ID: 10.7.7.7 (summary Network Number)
Advertising Router: 10.1.2.2
LS Seq Number: 80000005
Checksum: 0x2761
Length: 28
Network Mask: /32
        MTID: 0         Metric: 2

R1#sh ip ospf 100 database external 7.7.7.7

            OSPF Router with ID (10.1.3.1) (Process ID 100)

                Type-5 AS External Link States

Routing Bit Set on this LSA in topology Base with MTID 0
LS age: 1302
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 7.7.7.7 (External Network Number )
Advertising Router: 10.1.2.2
LS Seq Number: 80000004
Checksum: 0x6DCF
Length: 36
Network Mask: /32
        Metric Type: 2 (Larger than any link state path)
        MTID: 0
        Metric: 20
        Forward Address: 0.0.0.0
        External Route Tag: 3489661028

Links

IETF - RFC 4576 (the DN-bit)

OSPF Domain-ID

OSPF Domain-ID is an attribute that defines how (internal, external) the OSPF routes will be transferred from one CE to another CE over their PEs' BGP VPNv4 session.

On a PE, if the OSPF Domain-ID of the received BGP prefixes (encoded as extended community) is the same as the OSPF Domain-ID of the local OSPF process, then:

the MPLS core is treated like a SuperBackbone area (which is considered higher than area 0)
the PE is treated like an ABR (instead of an ASBR)
internal routes are being redistributed as Type-3 LSAs (instead of Type-5)

IOS-XR uses a null Domain-ID by default, so this needs to be changed if the other PE is running IOS (which encodes the OSPF process-id as the domain-id). The OSPF Domain-ID needs to be changed on the PEs (where redistribution between BGP and OSPF takes place), not on the CEs.

The "type" value can be different in some cases for backwards compatibility (like in 0005 vs 8005).

Detailed Steps

OSPF=>BGP redistribution on PE1

if the OSPF Domain tag of the local OSPF process is the same as the VPN Route tag of the prefix, then that route isn't installed into BGP
if the OSPF DN bit check is enabled in the local OSPF process and the OSPF route has this bit set, then that route isn't installed into BGP
if the route is installed into BGP

the Domain-ID of the local OSPF process is encoded into OSPF DOMAIN ID community on the prefix

the area and the LSA type of the OSPF prefix is encoded into OSPF RT community on the prefix

the Router-ID of the local OSPF process is encoded into OSPF ROUTER ID community on the prefix

BGP=>OSPF redistribution on PE2

if the Domain-ID of the local OSPF process is the same as the OSPF DOMAIN ID community of the prefix, then that route is passed to the CE as internal else as external
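The OSPF=>BGP checks on PE1 listed above can be written down as a tiny decision function. This is a sketch of the steps in these notes only, not of any real implementation; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OspfRoute:
    prefix: str
    dn_bit: bool = False              # DN bit seen on the LSA
    route_tag: Optional[int] = None   # VPN Route tag on Type-5/7 LSAs


def installs_into_bgp(route, local_domain_tag, dn_bit_check=True):
    """Return True if the PE would redistribute this OSPF route into BGP."""
    # Same tag as the local one: another PE of this domain generated the route
    if route.route_tag is not None and route.route_tag == local_domain_tag:
        return False
    # DN bit set and the check is not bypassed (e.g. by capability vrf-lite)
    if dn_bit_check and route.dn_bit:
        return False
    return True


local_tag = 3489661028
print(installs_into_bgp(OspfRoute("7.7.7.7/32", route_tag=local_tag), local_tag))
print(installs_into_bgp(OspfRoute("10.7.7.7/32", dn_bit=True), local_tag))
print(installs_into_bgp(OspfRoute("10.7.7.7/32", dn_bit=True), local_tag, dn_bit_check=False))
```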

Configuration

IOS
router ospf 100 vrf VPN
domain-id type 0005 value 000000440101

IOS-XR
router ospf 100
vrf VPN
domain-id type 0005 value 000000440101

Verification

You can use "sh ip ospf" to see the Domain-ID of the local OSPF process.

You can use "sh bgp vpnv4 unicast" to see the Domain-ID encoded as extended community in the BGP prefixes (OSPF RT is included too).

R2#sh ip ospf 100
Routing Process "ospf 100" with ID 10.1.2.2
   Domain ID type 0x0005, value 0x000000440101
Start time: 00:13:37.092, Time elapsed: 00:36:17.144
Supports only single TOS(TOS0) routes
Supports opaque LSA
Supports Link-local Signaling (LLS)
Supports area transit capability
Connected to MPLS VPN Superbackbone, VRF VPN
Event-log disabled
It is an area border and autonomous system boundary router
Redistributing External Routes from,
    bgp 100, includes subnets in redistribution

R2#sh bgp vpnv4 unicast vrf VPN 1.1.1.1/32
BGP routing table entry for 100:1:1.1.1.1/32, version 2
Paths: (1 available, best #1, table VPN)
Advertised to update-groups:
     1
Local
    10.1.2.1 from 0.0.0.0 (2.2.2.2)
      Origin incomplete, metric 2, localpref 100, weight 32768, valid, sourced, best
      Extended Community: RT:100:1 OSPF DOMAIN ID:0x0005:0x000000440101
        OSPF RT:0.0.0.0:3:0 OSPF ROUTER ID:10.1.2.2:0
      mpls labels in/out 28/nolabel

R2#sh ip ospf 100 database

            OSPF Router with ID (10.1.2.2) (Process ID 100)
...
                Summary Net Link States (Area 0)

Link ID         ADV Router      Age         Seq#       Checksum
1.1.1.1         10.1.1.1        980         0x80000002 0x00F336
...

LSA Type-3 (Summary) in local OSPF table is encoded as "OSPF RT:0.0.0.0:3:0" in local BGP table.

Propagation of OSPF routes between CE1 and CE2

same domain-id

CE1 O => CE2 IA

different domain-id

CE1 O => CE2 E2

sham-link (regardless of domain-id)

CE1 O => CE2 O
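The propagation cases above fit in a short lookup function. This is a sketch limited to the cases mentioned in these notes (internal routes, external routes, sham-link); the function name and the route code strings are my own choice.

```python
def route_type_at_ce2(original, same_domain_id, sham_link=False):
    """Return the OSPF route code seen on CE2 for a route that is `original` on CE1.

    original: "O" (intra-area), "IA" (inter-area), "E1" or "E2" (external)
    """
    if original in ("E1", "E2"):
        # External routes stay external and show up as E2, as noted earlier
        return "E2"
    if sham_link and original == "O":
        # Intra-area over a sham-link, regardless of the domain-id
        return "O"
    # Internal routes: Type-3 if the domain-ids match, otherwise Type-5
    return "IA" if same_domain_id else "E2"


print(route_type_at_ce2("O", same_domain_id=True))                   # IA
print(route_type_at_ce2("O", same_domain_id=False))                  # E2
print(route_type_at_ce2("O", same_domain_id=False, sham_link=True))  # O
```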

Extra care needs to be taken if route tags are changed manually on OSPF=>BGP redistribution, because external OSPF routes are tagged with the BGP ASN when BGP=>OSPF redistribution takes place, which means that the original tag is lost (which could lead to a loop).

IOS
R6#sh ip route 1.1.1.1
Routing entry for 1.1.1.1/32
Known via "ospf 100", distance 110, metric 2
Tag Complete, Path Length == 1, AS 100, , type extern 2, forward metric 1
Last update from 10.10.10.5 on POS4/0, 00:00:03 ago
Routing Descriptor Blocks:
* 10.10.10.5, from 10.10.10.5, 00:00:03 ago, via POS4/0
Route metric is 2, traffic share count is 1
Route tag 3489661028

If a BGP VPNv4 route is redistributed into OSPF, then redistributed into another IGP like RIP (where all the information (DN bit, VPN Route-Tag) needed to prevent looping is lost), and then redistributed back into OSPF, then it is possible that it could be redistributed back into BGP as a VPNv4 route, thereby causing a loop.

You can use route tags at every step of redistribution in order to avoid possible routing loops, either caused by the above scenario or by mutual redistribution in two places.

NTS: BGP

BGP

BGP (Border Gateway Protocol) is defined in RFC 4271.

Uses TCP port 179.

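Both peers may try to open the TCP connection. If two connections collide, the one initiated by the router with the higher router-id is kept, so that router ends up as the TCP client.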

Best Path Selection

#	Attribute	Rule	Default	Notes	Affects Traffic	Applies to route-map
1	Prefix Length	Longest match		Always checked	inbound & outbound
2	Cost Community	Lowest Cost Community Number Lowest Cost Community ID	2147483647	Checked if "set extcommunity cost pre-bestpath" is configured. Skipped if "bgp bestpath cost-community ignore" is configured.
3	WEIGHT	Highest Weight	32768	Local to the router. Local originated prefixes have weight 32768 by default. Only for Cisco; not recommended for general use.
4	LOCAL PREFERENCE	Highest Local Preference	100	Used for separating customer/peering/transit traffic.	outbound	inbound
5		Prefer local-sourced routes		Everything announced through network/aggregate commands or redistribution is considered as local-sourced.
6	AS-PATH	Shortest as-path		Ignored if "bgp bestpath as-path ignore" is configured.	inbound	outbound
7	ORIGIN	Lowest Origin Type (IGP<EGP<Incomplete)
8	MED	Lowest Multi-Exit Discriminator (MED)		1st AS must be the same, unless "bgp always-compare-med" is configured.	inbound	outbound
9		Prefer eBGP over iBGP
10		Lowest IGP metric to the BGP next hop
11	Cost Community	Lowest Cost Community Number Lowest Cost Community ID	2147483647	Checked if "set extcommunity cost igp" is configured. Skipped if "bgp bestpath cost-community ignore" is configured.
12		Check for BGP Multipath
13		If both paths are external, prefer the path received first
14		Lowest BGP router-id
15		If the originator-id or the router-id is the same for multiple paths, prefer path with minimum cluster list length		Only for route reflectors
16		Prefer the path with the lowest neighbor address

MED

bgp deterministic-med

compare MED when choosing routes advertised by different neighbors in the same autonomous system
routes from the same autonomous system are grouped together and the best entries of each group are compared

bgp always-compare-med

compare MED when choosing routes advertised by different neighbors in different autonomous systems

"bgp deterministic-med" and "bgp always-compare-med" are recommended in order to always have a consistent best path selection.

In order to avoid mis-interpreting a missing MED (zero vs infinity), when setting MED on a peering, it's best to set it also on all other peerings. Otherwise the command "bgp bestpath med missing-as-worst" can be used.
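A short Python sketch of the two MED questions above: what value a missing MED gets, and when two paths have their MEDs compared at all. The names are invented, and WORST_MED is only a stand-in for "infinity" (the exact value a router uses is implementation-specific).

```python
WORST_MED = 2**32 - 1  # stand-in for "infinity", not a vendor-exact value


def effective_med(med, missing_as_worst=False):
    """Return the MED used for comparison; `med` is None when it is missing."""
    if med is not None:
        return med
    return WORST_MED if missing_as_worst else 0


def med_is_compared(first_as_a, first_as_b, always_compare_med=False):
    """MED is compared if both paths start with the same AS, or always if forced."""
    return always_compare_med or first_as_a == first_as_b


# A missing MED beats MED 50 by default, but loses with missing-as-worst
print(effective_med(None) < 50)                         # True
print(effective_med(None, missing_as_worst=True) < 50)  # False
```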

Common Comparisons

Prefix Length
Highest Local Preference
Shortest as-path
Lowest Multi-Exit Discriminator (MED)
Prefer eBGP over iBGP
Lowest IGP metric to the BGP next hop
Lowest BGP router-id
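The common comparisons can be expressed as a sort key, which is a handy way to memorise the order. This is a simplified sketch: it assumes all paths are for the same prefix (so prefix length is already decided), that MED is always compared, and it skips weight, origin, cost community and the other steps of the full table. The Path class and its field names are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Path:
    name: str
    local_pref: int = 100
    as_path: tuple = ()
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def preference_key(path):
    """Smaller key = more preferred, following the order of the list above."""
    return (
        -path.local_pref,                   # highest local preference
        len(path.as_path),                  # shortest as-path
        path.med,                           # lowest MED
        0 if path.is_ebgp else 1,           # eBGP over iBGP
        path.igp_metric,                    # lowest IGP metric to the next hop
        int(IPv4Address(path.router_id)),   # lowest router-id (numeric compare)
    )


def best_path(paths):
    return min(paths, key=preference_key)


paths = [
    Path("via-R2", local_pref=100, as_path=(200,), router_id="2.2.2.2"),
    Path("via-R3", local_pref=200, as_path=(300, 400, 200), router_id="3.3.3.3"),
]
print(best_path(paths).name)  # via-R3, local preference wins over as-path length
```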

origin

The origin attribute indicates how BGP learned about a particular route.

i (IGP)

interior to the originating AS (e.g. when the network configuration command is used to inject the route into BGP)

e (EGP)

learned via EGP (rarely seen)

? (incomplete)

unknown or learned via some other way (i.e. redistributed into BGP, or from eBGP)

Address Families

AFIs

1 (IPv4)
2 (IPv6)
25 (L2VPN)

SAFIs

1 (Unicast)
2 (Multicast)
4 (NLRI with MPLS labels)
65 (VPLS)
128 (VPN with MPLS labels (VRFs))

Configuration

IOS
router bgp 2
no synchronization
bgp log-neighbor-changes
network 5.5.5.5 mask 255.255.255.255
neighbor 20.4.5.4 remote-as 1
no auto-summary

IOS-XR
router bgp 1
bgp log neighbor changes detail
address-family ipv4 unicast
network 4.4.4.4/32
!
neighbor 20.4.5.5
remote-as 2
address-family ipv4 unicast

In IOS-XR, if there is no loopback configured with an ipv4 address, the BGP session won't come up, until you explicitly configure the bgp router-id.

BGP is designed to refuse a session with itself because of the router-id check. You can use a per-vrf assignment of BGP router-id in order to have a VRF-to-VRF peering on the same router.

In IOS-XR, every eBGP session requires an explicit route-policy in order to allow incoming/outgoing updates. It's good practice to create one named PASS-RPL with default action "pass" and use it when first activating each eBGP session. Afterwards you can create the required route-policy and use that instead.

When told to advertise a prefix into BGP, prefer to use the "network" statement, unless told to do otherwise. Also prefer to do the "network" advertisements to another AS on routers running eBGP to that AS.

You can use the "network x.x.x.x backdoor" command in order to change the admin distance of an eBGP route (default 20) to that of iBGP (200), so that the equivalent IGP route can be preferred.

Route Aggregation

redistribution of static or IGP
aggregate-address

summary-only
suppress-map
unsuppress-map (per neighbor)
advertise-map

inject-map

A more specific prefix must exist in the BGP table before doing aggregation.

Communities

Standard Communities (32-bit)

Used for well-known communities and for specific communities of type $ASN:$TAG in BGP
send-community (IOS)
send-community-ebgp (IOS-XR)

Extended Communities (64-bit) are defined in

Used in MPLS VPNs for RT and SOO
send-community extended (IOS)
send-extended-community-ebgp (IOS-XR)

Communities are configured through community lists.

When regular expressions are required, expanded (standard or extended) community lists must be used.

Configuration

Standard

ip community-list 1 permit 100:10
ip community-list standard X-COMMLIST permit 100:10

Expanded Standard

ip community-list 100 permit 100:*
ip community-list expanded X-COMMLIST permit 100:*

Extended

ip extcommunity-list 1 permit rt 200:20
ip extcommunity-list standard X-COMMLIST permit rt 200:20

Expanded Extended

ip extcommunity-list 100 permit rt 200:*
ip extcommunity-list expanded X-COMMLIST permit rt 200:*

In IOS, no communities are sent by default on iBGP or eBGP sessions.

In IOS-XR, all communities are sent by default on iBGP sessions, but not on eBGP sessions.

Well-known communities

internet
no-export (don't advertise to eBGP neighbor)
local-as (don't advertise to other confederation sub-AS)
no-advertise (don't advertise to any neighbor)
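Since a standard community is just a 32-bit number, the $ASN:$TAG notation and the well-known names can be converted back and forth with a bit shift. A minimal Python sketch follows; the function names are made up, and the "internet" keyword is left out on purpose.

```python
# Reserved values of the well-known communities
WELL_KNOWN = {
    "no-export": 0xFFFFFF01,
    "no-advertise": 0xFFFFFF02,
    "local-as": 0xFFFFFF03,
}


def encode_community(text):
    """Turn 'ASN:TAG' or a well-known name into the 32-bit community value."""
    if text in WELL_KNOWN:
        return WELL_KNOWN[text]
    asn, tag = (int(part) for part in text.split(":"))
    if not (0 <= asn <= 0xFFFF and 0 <= tag <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | tag


def decode_community(value):
    """Turn a 32-bit community value back into its usual notation."""
    for name, known in WELL_KNOWN.items():
        if value == known:
            return name
    return f"{value >> 16}:{value & 0xFFFF}"


print(encode_community("100:10"))    # 6553610
print(decode_community(0xFFFFFF01))  # no-export
```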

Delete communities

IOS
route-map DELCOMM1-ROUTEMAP permit 10
set comm-list 1 delete
!
route-map DELCOMM2-ROUTEMAP permit 10
set community none
!
route-map DELCOM3-ROUTEMAP permit 10
set extcomm-list 1 delete

IOS-XR
route-policy DELCOM1-RPL
delete community in (*:*)
end-policy
!
route-policy DELCOM3-RPL
delete extcommunity rt all
end-policy

Use the "additive" keyword to add communities to existing ones.

Synchronization

A BGP router with synchronization enabled does not install iBGP learned routes into its routing table and propagate them to an eBGP peer, if it is not able to validate those routes in its IGP first. It's used to ensure that there are no black holes inside the AS caused by intermediate routers that do not run BGP.

It's disabled by default ("no synchronization"), because nowadays most networks run iBGP or MPLS.

Route Reflectors

Route Reflectors modify iBGP split-horizon rules.

Routes learned on a RR from a RR-Client are propagated to other RR-Clients and Non-Clients
Routes learned on a RR from a Non-Client are propagated only to RR-Clients
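The two reflection rules can be captured in a few lines of Python. This sketch covers only iBGP peers of the RR, and the role strings are my own labels.

```python
def rr_reflects(learned_from, send_to):
    """Return True if the RR propagates a route between two iBGP peer roles.

    Roles: "client" (RR-Client) or "non-client".
    """
    roles = ("client", "non-client")
    if learned_from not in roles or send_to not in roles:
        raise ValueError("role must be 'client' or 'non-client'")
    if learned_from == "client":
        # From a client: reflected to other clients and to non-clients
        return True
    # From a non-client: reflected only to clients
    return send_to == "client"


for src in ("client", "non-client"):
    for dst in ("client", "non-client"):
        print(f"{src} -> {dst}: {rr_reflects(src, dst)}")
```

The only False case is non-client to non-client, which is the plain iBGP split-horizon rule left untouched.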

RRs can be assigned per address-family.

RRs do not modify the next-hop of advertised routes by default.

RRs can be in the forwarding path or not.

Use "no bgp client-to-client reflection" on RRs, when their clients are also fully meshed.

An RR reflecting a route received from an RR-Client adds the following attributes:

Originator ID

the Router ID of the originator of the route
if the update comes back to the originator (so the local Router-ID is the same as the Originator-ID), the update is ignored

Cluster List

a list of Cluster IDs that an update has passed through
when an RR reflects a route, the local Cluster ID is prepended to the Cluster List
if the update comes back to the RR (so the local Cluster-ID is contained in the prefix Cluster List) the update is ignored

Originator and Cluster List are used to prevent loops in RR environments.
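A small Python sketch of how the two attributes stop a loop. Routes are plain dictionaries here, the key names are hypothetical, and the reflect step follows RFC 4456 in adding the Cluster ID on every reflection.

```python
def reflect(route, rr_router_id, rr_cluster_id=None):
    """Return a copy of the route as an RR would reflect it."""
    cluster_id = rr_cluster_id or rr_router_id  # default Cluster-ID = Router-ID
    reflected = dict(route)
    # Originator ID is set once, by the first RR that reflects the route
    reflected.setdefault("originator_id", route["learned_from_router_id"])
    reflected["cluster_list"] = [cluster_id] + list(route.get("cluster_list", []))
    return reflected


def accepts(route, local_router_id, local_cluster_id):
    """Return False if the update looped back to its originator or to the same cluster."""
    if route.get("originator_id") == local_router_id:
        return False
    if local_cluster_id in route.get("cluster_list", []):
        return False
    return True


route = {"prefix": "10.0.0.0/8", "learned_from_router_id": "1.1.1.1"}
reflected = reflect(route, rr_router_id="9.9.9.9")
print(reflected["originator_id"], reflected["cluster_list"])
print(accepts(reflected, "1.1.1.1", "1.1.1.1"))  # originator ignores its own route
print(accepts(reflected, "2.2.2.2", "2.2.2.2"))  # any other router accepts it
```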

By default Cluster-ID = RR Router-ID. In case of two RRs, two different Cluster-IDs will be used. This increases memory utilization, because the same route is stored multiple times, each one with a different Cluster-ID.

You can use a common Cluster-ID on redundant RRs (in order to decrease memory utilisation, although rarely needed), but only when you're sure that connectivity for RR clients won't break if an RR client loses one of its RR connections.

IOS
router bgp 100
neighbor 2.2.2.2 remote-as 100
neighbor 2.2.2.2 update-source Loopback 0
neighbor 2.2.2.2 route-reflector-client

IOS-XR
router bgp 100
neighbor 2.2.2.2
remote-as 100
update-source Loopback0
address-family ipv4 unicast
route-reflector-client

Links

IETF - RFC 4456

Confederations

The AS is split into smaller autonomous systems in order to reduce the number of iBGP sessions.

It's common practice to use the private AS range (64512 – 65534) to denote a sub-autonomous system.

These internal ASNs are hidden and only a single external ASN is announced to eBGP neighbors.

BGP confederations modify iBGP as-path processing.

When sending:

updates to iBGP neighbors

as-path is not changed

updates to intra-confederation eBGP neighbors

the intra-confederation ASN is prepended to the as-path

updates to eBGP neighbors

the intra-confederation ASNs are removed and the external ASN is prepended to the as-path
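The three sending rules above can be sketched in Python by marking each as-path entry as either a confederation sub-AS or a regular AS. This is only a model of the rules in these notes; the tuple layout and the function name are my own.

```python
def outbound_as_path(as_path, peer_type, sub_as, confed_id):
    """Return the as-path sent to a neighbor of the given type.

    as_path entries are ("confed", asn) or ("as", asn).
    peer_type is "ibgp", "confed-ebgp" or "ebgp".
    """
    if peer_type == "ibgp":
        # as-path is not changed
        return list(as_path)
    if peer_type == "confed-ebgp":
        # the intra-confederation ASN is prepended
        return [("confed", sub_as)] + list(as_path)
    if peer_type == "ebgp":
        # intra-confederation ASNs are removed, the external ASN is prepended
        external_only = [hop for hop in as_path if hop[0] != "confed"]
        return [("as", confed_id)] + external_only
    raise ValueError("unknown peer type: " + peer_type)


path = [("confed", 65200), ("as", 9)]
print(outbound_as_path(path, "ibgp", sub_as=65100, confed_id=1))
print(outbound_as_path(path, "confed-ebgp", sub_as=65100, confed_id=1))
print(outbound_as_path(path, "ebgp", sub_as=65100, confed_id=1))
```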

Intra-confederation eBGP session is:

like eBGP session when establishing the session (ebgp-multihop)
like iBGP session when sending routing updates (local pref, next-hop, etc.)

IOS
router bgp INTERNAL-ASN-100
bgp confederation identifier EXTERNAL-ASN-1
bgp confederation peers INTERNAL-ASN-200 INTERNAL-ASN-300
neighbor 2.2.2.2 remote-as INTERNAL-ASN-200
neighbor 3.3.3.3 remote-as INTERNAL-ASN-300
neighbor 9.9.9.9 remote-as EXTERNAL-ASN-9

IOS-XR
router bgp INTERNAL-ASN-100
bgp confederation peers
INTERNAL-ASN-200
INTERNAL-ASN-300
!
bgp confederation identifier EXTERNAL-ASN-1
!
neighbor 2.2.2.2
remote-as INTERNAL-ASN-200
!
neighbor 3.3.3.3
remote-as INTERNAL-ASN-300
!
neighbor 9.9.9.9
remote-as EXTERNAL-ASN-9

EXTERNAL-ASNs define the ASNs used for eBGP sessions between different ASNs.

INTERNAL-ASNs define the ASNs used for eBGP sessions between different sub-ASNs of the same ASN.

Example

IOS
router bgp 65100
bgp confederation identifier 1
bgp confederation peers 65200 65300
neighbor 2.2.2.2 remote-as 65200
neighbor 3.3.3.3 remote-as 65300
neighbor 9.9.9.9 remote-as 9

IOS-XR
router bgp 65100
bgp confederation peers
65200
65300
!
bgp confederation identifier 1
!
neighbor 2.2.2.2
remote-as 65200
!
neighbor 3.3.3.3
remote-as 65300
!
neighbor 9.9.9.9
remote-as 9

Links

IETF - RFC 5065

Next-Hop

advertisement to eBGP peer

next-hop changes to self
use "next-hop-unchanged" to not change

advertisement to iBGP peer

next-hop doesn't change
use "next-hop-self" to change

You can't use next-hop-self to set the next-hop of reflected iBGP routes. Use an outbound route-map instead.

In IOS-XR, you can use "ibgp policy out enforce-modifications" in combination with an outbound route-policy in order to force modification of the route attributes (including next-hop) when sent to an iBGP neighbor.

keepalive & holdtime

Neighbor holdtime timers are negotiated while initially setting the BGP session and the smaller one gets used by both neighbors. Keepalive timers are then based on that holdtime value. It's not recommended to have less than 3 secs as a holdtime.

The fastest convergence on a BGP session that can be achieved by changing the keepalive/holdtime timers is 3 sec.

In order to protect the control-plane, you can put a limit on the lowest holdtime number accepted by using the "min-holdtime" command. If the neighbor doesn't comply, then the BGP session is rejected.
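A minimal Python sketch of the timer negotiation described above. The one-third ratio between keepalive and holdtime is the conventional value and is an assumption here (a router may also cap it by its configured keepalive); the function name and the min_holdtime parameter are hypothetical.

```python
def negotiate_timers(local_holdtime, peer_holdtime, min_holdtime=0):
    """Return (holdtime, keepalive) in seconds, or None if the session is rejected."""
    # Reject a neighbor that proposes less than the locally accepted minimum
    if peer_holdtime < min_holdtime:
        return None
    # The smaller holdtime is used by both neighbors
    holdtime = min(local_holdtime, peer_holdtime)
    # Assumption: keepalive is one third of the negotiated holdtime
    keepalive = holdtime // 3
    return holdtime, keepalive


print(negotiate_timers(180, 90))                  # (90, 30)
print(negotiate_timers(180, 3))                   # (3, 1)
print(negotiate_timers(180, 9, min_holdtime=30))  # None
```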

Mass Neighbor Configuration

In order to minimize neighbor configuration regarding the BGP session parameters you can use the following:

peer groups (IOS)

router bgp 100
neighbor PEER-GROUP peer-group
neighbor PEER-GROUP remote-as 100
neighbor PEER-GROUP update-source Loopback0
!
neighbor 1.1.1.1 peer-group PEER-GROUP
!
address-family vpnv4
neighbor PEER-GROUP send-community extended
neighbor 1.1.1.1 activate

neighbor groups (IOS-XR)

router bgp 100
neighbor-group NEI-GROUP
remote-as 100
update-source Loopback0
!
neighbor 1.1.1.1
use neighbor-group NEI-GROUP

peer session templates (IOS)

router bgp 100
template peer-session PEER-TEMPLATE
remote-as 100
update-source Loopback0
!
neighbor 1.1.1.1 inherit peer-session PEER-TEMPLATE

eBGP Peerings & TTL

IOS
router bgp 100
neighbor 2.2.2.2 ttl-security hops X
neighbor 3.3.3.3 ebgp-multihop Y

IOS-XR
router bgp 100
neighbor 2.2.2.2
ttl-security
neighbor 3.3.3.3
ebgp-multihop Y

IOS
R1#sh bgp nei | i TTL|Session
Session: 2.2.2.2
Mininum incoming TTL 255-X, Outgoing TTL 255
Session: 3.3.3.3
Mininum incoming TTL 0, Outgoing TTL Y

eBGP Multihop
It allows a neighbor connection between two external peers that do not have direct connection. You should also configure an IGP or static routing to allow the neighbors without direct connection to reach each other.

TTL Security Check
It's a lightweight security mechanism to protect eBGP neighbor sessions from CPU utilization-based attacks (DoS attacks that flood the network with IP packets that contain forged source IP addresses).

IOS
When configured for an eBGP neighbor, the router accepts only IP packets with a TTL count that is greater or equal to maximum TTL value (255) minus the hop count that is configured locally for the relevant eBGP session. If the TTL value in the IP packet is less than the maximum TTL value (255) minus the hops configured value, the incoming packet is silently discarded.

Supports both directly connected neighbor sessions and multihop eBGP neighbor sessions

IOS-XR
When configured for a directly adjacent eBGP neighbor, the router accepts only IP packets with a TTL count that is equal to the maximum TTL value (255). If the TTL value in the IP packet is less than the maximum TTL value (255), the incoming packet is silently discarded.

TTL values according to BGP setup:

R1 config: neighbor R2

R1 sends packets to R2 with TTL=1

R1 config: neighbor R2 ttl-security hops X

R1 sends packets to R2 with TTL=255

R1 config: neighbor R2 ebgp-multihop X

R1 sends packets to R2 with TTL=X

ebgp-multihop combined with ttl-security on two eBGP routers

R1: ebgp-multihop X (<255)
R2: ttl-security hops Y (<254)

R1 sends packets to R2 with TTL=X

R2 doesn't reply back
R2 accepts packets with TTL >= 255-Y

R2 sends packets to R1 with TTL=255

R1 replies back
R1 accepts packets with any TTL

General Rule

If ( X - ActualHops >= 255 - Y ) then the eBGP session can be established.
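The general rule, together with the IOS acceptance check described earlier (TTL greater or equal to 255 minus the configured hops), is easy to play with in Python. This is a sketch of the rule exactly as written in these notes, with made-up function names; how ActualHops is counted is left to the reader.

```python
MAX_TTL = 255


def ttl_security_accepts(received_ttl, hops):
    """IOS ttl-security check: accept only if TTL >= 255 - configured hops."""
    return received_ttl >= MAX_TTL - hops


def session_can_establish(multihop_x, actual_hops, ttl_security_y):
    """General rule: R1 uses ebgp-multihop X, R2 uses ttl-security hops Y."""
    return multihop_x - actual_hops >= MAX_TTL - ttl_security_y


print(ttl_security_accepts(254, 1))       # True
print(ttl_security_accepts(253, 1))       # False
print(session_can_establish(255, 2, 2))   # True:  253 >= 253
print(session_can_establish(2, 1, 1))     # False: 1 < 254
print(session_can_establish(2, 1, 254))   # True:  1 >= 1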

Interesting Cases

R1: ebgp-multihop X
R2: ttl-security hops 254

R1 sends packets to R2 with TTL=X

R2 replies back

R2 sends packets to R1 with TTL=255

R1 replies back

R1: ebgp-multihop 255
R2: ttl-security hops Y

R1 sends packets to R2 with TTL=255

R2 replies back
R2 accepts packets with TTL >= 255-Y

R2 sends packets to R1 with TTL=255

R1 replies back
R1 accepts packets with any TTL

If ebgp-multihop is set to 255 or ttl-security is set to 254 (aka when at least one of these parameters is set to its max), then the eBGP session can be established, as long as their packets can reach each other.

R1: ttl-security hops X
R2: ttl-security hops Y

R1 sends packets to R2 with TTL=255

R2 replies back
R2 accepts packets with TTL >= 255-Y

R2 sends packets to R1 with TTL=255

R1 replies back
R1 accepts packets with TTL >= 255-X

If both routers use ttl-security, then the eBGP session can be established regardless of the hop values used, as long as their packets can reach each other.

R1: ebgp-multihop X
R2: ebgp-multihop Y

R1 sends packets to R2 with TTL=X

R2 replies back
R2 accepts packets with any TTL

R2 sends packets to R1 with TTL=Y

R1 replies back
R1 accepts packets with any TTL

If both routers use ebgp-multihop, then the eBGP session can be established regardless of the hop values used, as long as their packets can reach each other.

If loopback interfaces are used to connect single-hop eBGP peers, you must configure the "neighbor disable-connected-check" command before you can establish the eBGP peering session.

PMTUD

IOS
R2#sh bgp vpnv4 unicast all nei 19.19.19.19 | i tcp|segment
Transport(tcp) path-mtu-discovery is enabled
Datagrams (max data segment is 1432 bytes):

If you have BGP PMTUD enabled (by default in most releases), BGP packets will be sent with DF bit set.

You can disable BGP PMTUD (either for all neighbors or for a specific neighbor) with the following commands.

IOS
router bgp 100
no bgp transport path-mtu-discovery
neighbor 19.19.19.19 transport path-mtu-discovery disable

If global command "ip tcp path-mtu-discovery" is disabled (default) and BGP PMTUD is disabled too, then the default MSS (536) is used for BGP neighbors.

If "ip tcp path-mtu-discovery" is enabled but BGP PMTUD is disabled, then the maximum MSS is used for BGP neighbors.

You can use "ip tcp mss X" to change the global TCP MSS.

IOS-XR
tcp path-mtu-discovery

Links

IETF - RFC 1191
