Thursday, July 31, 2014

EVPN for Layer 2 stretch between Data Centers Pt.1

EVPN (Ethernet VPN) is a great technology for stretching Layer 2 between data centers (aka Data Center Interconnect, or DCI). It uses MP-BGP for control plane exchange of tenant information and MAC addresses. Data plane traffic is carried inside a tunneling protocol such as MPLS, VXLAN or PBB. EVPN is used in lieu of VPLS because it provides better control over BUM traffic (Broadcast, Unknown Unicast, and Multicast). It also supports multihoming and forwarding traffic over multiple active paths. EVPN over MPLS additionally provides the benefits of traffic engineering and fast convergence.

In Part 1, I've created a small single-homed setup to show how this works.


The first step is to create the trunk port facing the Leaf switch. The leaf switch is a standard TOR switch with no special config.

set interfaces et-2/2/1 description TO-LEAF1
set interfaces et-2/2/1 flexible-vlan-tagging
set interfaces et-2/2/1 encapsulation flexible-ethernet-services
set interfaces et-2/2/1 unit 100 encapsulation vlan-bridge
set interfaces et-2/2/1 unit 100 vlan-id 100
 
With the sub-interface (unit 100) created, I placed it into an EVPN routing instance.

set routing-instances evpn100 instance-type evpn
set routing-instances evpn100 vlan-id 100
set routing-instances evpn100 interface et-2/2/1.100
set routing-instances evpn100 route-distinguisher 4.4.4.4:100
set routing-instances evpn100 vrf-target target:65000:100
set routing-instances evpn100 protocols evpn interface et-2/2/1.100
set routing-instances evpn100 protocols evpn label-allocation per-instance


Instance configuration looks like a normal VPLS configuration except for the instance-type and evpn protocol parameters.
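
If you have more than one VLAN to stretch, the instance stanza above is easy to template. Here is a minimal Python sketch, not an official tool: the function name evpn_instance_commands is made up, and reusing the VLAN ID in the unit number, route distinguisher and route target is simply what this one example does, so adjust it to your own numbering plan.

```python
def evpn_instance_commands(name, vlan_id, interface, router_id, asn):
    """Return Junos set commands for a single-VLAN EVPN instance.

    Assumption: unit number, RD suffix and RT suffix all reuse the VLAN ID,
    which mirrors the evpn100 example in this post.
    """
    ifl = f"{interface}.{vlan_id}"  # logical interface, e.g. et-2/2/1.100
    prefix = f"set routing-instances {name}"
    return [
        f"{prefix} instance-type evpn",
        f"{prefix} vlan-id {vlan_id}",
        f"{prefix} interface {ifl}",
        f"{prefix} route-distinguisher {router_id}:{vlan_id}",
        f"{prefix} vrf-target target:{asn}:{vlan_id}",
        f"{prefix} protocols evpn interface {ifl}",
        f"{prefix} protocols evpn label-allocation per-instance",
    ]


if __name__ == "__main__":
    for line in evpn_instance_commands("evpn100", 100, "et-2/2/1", "4.4.4.4", 65000):
        print(line)
```

Called with the values used in this post, it reproduces the seven set commands shown above.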

Next I configure BGP to exchange control plane info.

set protocols bgp group IBGP type internal
set protocols bgp group IBGP local-address 4.4.4.4
set protocols bgp group IBGP family inet unicast
set protocols bgp group IBGP family evpn signaling
set protocols bgp group IBGP neighbor 5.5.5.5

Note the new address family, evpn.
After that comes the usual core-side configuration: MPLS, your preferred flavor of MPLS signaling (LDP here), an IGP (IS-IS here), and the core-facing MPLS interfaces.

set protocols mpls interface all
set protocols mpls interface fxp0.0 disable
set protocols mpls interface lo0.0
set protocols isis interface all
set protocols isis interface fxp0.0 disable
set protocols isis interface lo0.0 passive
set protocols ldp interface all
set protocols ldp interface fxp0.0 disable
set protocols ldp interface lo0.0

set interfaces et-2/0/0 description TO-CORE1
set interfaces et-2/0/0 unit 0 family inet address 192.168.24.4/24
set interfaces et-2/0/0 unit 0 family iso
set interfaces et-2/0/0 unit 0 family mpls


Once configured, MP-BGP exchanges "control plane" information.

# run show bgp summary
Groups: 1 Peers: 1 Down peers: 0
Table          Tot Paths  Act Paths Suppressed    History Damp State    Pending
inet.0              
                       0          0          0          0          0          0
bgp.evpn.0          
                       2          2          0          0          0          0
Peer                     AS      InPkt     OutPkt    OutQ   Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
5.5.5.5               65000        137        136       0       0       57:42 Establ
  inet.0: 0/0/0/0
  bgp.evpn.0: 2/2/2/0
  evpn100.evpn.0: 2/2/2/0

  __default_evpn__.evpn.0: 0/0/0/0


# run show route receive-protocol bgp 5.5.5.5

inet.0: 24 destinations, 24 routes (24 active, 0 holddown, 0 hidden)

inet.3: 3 destinations, 3 routes (3 active, 0 holddown, 0 hidden)

iso.0: 1 destinations, 1 routes (1 active, 0 holddown, 0 hidden)

mpls.0: 15 destinations, 15 routes (15 active, 0 holddown, 0 hidden)

bgp.evpn.0: 2 destinations, 2 routes (2 active, 0 holddown, 0 hidden)
  Prefix          Nexthop           MED     Lclpref    AS path
  2:5.5.5.5:100::100::00:00:05:ed:ae:01/304                  
*                         5.5.5.5                      100        I
  3:5.5.5.5:100::100::5.5.5.5/304                  
*                         5.5.5.5                      100        I

evpn100.evpn.0: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
  Prefix          Nexthop           MED     Lclpref    AS path
  2:5.5.5.5:100::100::00:00:05:ed:ae:01/304                  
*                         5.5.5.5                      100        I
  3:5.5.5.5:100::100::5.5.5.5/304                  
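
The route strings in that output pack several fields together. As I read them, the leading number is the EVPN route type (2 is a MAC/IP advertisement, 3 is an inclusive multicast Ethernet tag route, per RFC 7432), followed by the route distinguisher, the Ethernet tag (the VLAN ID here), and then either a MAC address or the originating router's IP. A small Python sketch to pull those fields apart; the names parse_evpn_route and EvpnRoute are hypothetical, and the pattern only covers the two route types seen in this post.

```python
import re
from typing import NamedTuple

# Route type names as defined in RFC 7432 (only the types seen in this post).
ROUTE_TYPES = {
    2: "MAC/IP advertisement",
    3: "Inclusive multicast Ethernet tag",
}


class EvpnRoute(NamedTuple):
    route_type: int
    type_name: str
    rd: str       # route distinguisher, e.g. 5.5.5.5:100
    tag: int      # Ethernet tag ID (the VLAN ID in this setup)
    value: str    # MAC address for type 2, originator IP for type 3
    length: int   # the /304 suffix printed by Junos


_ROUTE_RE = re.compile(
    r"^(?P<type>\d+):(?P<rd>[^:]+:\d+)::(?P<tag>\d+)::(?P<value>.+)/(?P<len>\d+)$"
)


def parse_evpn_route(text):
    """Split a Junos EVPN route string into its fields."""
    match = _ROUTE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized EVPN route: {text!r}")
    route_type = int(match.group("type"))
    return EvpnRoute(
        route_type=route_type,
        type_name=ROUTE_TYPES.get(route_type, "unknown"),
        rd=match.group("rd"),
        tag=int(match.group("tag")),
        value=match.group("value"),
        length=int(match.group("len")),
    )


if __name__ == "__main__":
    print(parse_evpn_route("2:5.5.5.5:100::100::00:00:05:ed:ae:01/304"))
    print(parse_evpn_route("3:5.5.5.5:100::100::5.5.5.5/304"))
```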

You can also check the status of the EVPN instance and its MAC table.


# run show evpn mac-table

MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, P - Persistent static, C - Control MAC
           SE - statistics enabled, NM - non configured MAC, R - remote PE MAC)


Ethernet switching table : 2 entries, 2 learned
Routing instance : evpn100
    Vlan                MAC                 MAC         Age    Logical                NH        RTR
    name                address             flags              interface              Index     ID
    __evpn100__         00:00:05:ed:ad:49   D             -   et-2/2/1.100        
    __evpn100__         00:00:05:ed:ae:01   DC            -   pip-13.010010000000    1048577   1048577


This shows both locally learned MACs (the entry on et-2/2/1.100) and MACs learned over the WAN (the entry flagged DC, pointing at a pip- interface).
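
If you want to separate the two kinds of entries in a script, something like the following Python sketch could work. It is based only on the output above: treating the C (Control MAC) flag as "learned from the remote PE through BGP" is my reading of this one table, not a documented rule, and classify_mac_entries is a made-up name.

```python
def classify_mac_entries(output):
    """Parse 'show evpn mac-table' text and label each MAC local or remote.

    Assumption: an entry whose flags include C (Control MAC) was installed by
    the control plane, so it is treated as remote. Everything else is local.
    """
    entries = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows carry at least: vlan, mac, flags, age, logical interface.
        if len(fields) < 5 or fields[1].count(":") != 5:
            continue
        vlan, mac, flags, _age, interface = fields[:5]
        origin = "remote" if "C" in flags else "local"
        entries.append(
            {"vlan": vlan, "mac": mac, "interface": interface, "origin": origin}
        )
    return entries


if __name__ == "__main__":
    sample = """\
    __evpn100__         00:00:05:ed:ad:49   D             -   et-2/2/1.100
    __evpn100__         00:00:05:ed:ae:01   DC            -   pip-13.010010000000    1048577   1048577
    """
    for entry in classify_mac_entries(sample):
        print(entry["mac"], entry["origin"], entry["interface"])
```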

# run show evpn statistics   
Instance: evpn100
   Local interface: et-2/2/1.100, Index: 338
     Broadcast packets:                     1
     Broadcast bytes  :                    60
     Multicast packets:                     0
     Multicast bytes  :                     0
     Flooded packets  :                  4240
     Flooded bytes    :               6341604
     Unicast packets  :               3292539
     Unicast bytes    :            3528822524
     Current MAC count:                     1 (Limit 0)

In Part 2 I'll go further into configuring gateway information to prevent the trombone effect.


Sunday, July 27, 2014

Juniper MC-LAG configuration and behavior

A customer had an unusual requirement. Their spine switches didn't have any out-of-band management connectivity, and they were not yet going to run any IP routing protocols, so we couldn't use a loopback and redistribute it into an IGP. Their spine switches were also using MC-LAG. The problem was how to reach the switches to manage them. We basically set up in-band IP addresses on the MC-LAG, reachable through static routes. The question we then ran into was which MC-LAG member would be seen as the owner of the IP address. There is an option called status-control which does this. I ran a test and found that it seems to work the opposite of what we expected.

First, EX1's status-control is set to standby and EX2's to active.

jnpr@EX1# set interfaces ae0 aggregated-ether-options mc-ae status-control standby

jnpr@EX2# set interfaces ae0 aggregated-ether-options mc-ae status-control active

I put IRBs on both MC-LAG spine switches and on the QFX leaf in VLAN 100. 100.1.1.1 is the MC-LAG's IP and 100.1.1.100 is the QFX's, just for this test. From the QFX I try to access the spine.

jnpr@QFX5100-LEAF# run show arp no-resolve  
MAC Address       Address         Interface     Flags
00:00:5e:00:01:02 10.161.39.254   vme.0                none
4c:96:14:6b:bb:f0 100.1.1.1       ae0.0                none
4c:96:14:f2:b6:e3 192.168.1.1     em2.32768            none
Total entries: 4

{master:0}[edit]
jnpr@QFX5100-LEAF# run telnet 100.1.1.1
Trying 100.1.1.1...
Connected to 100.1.1.1 
Escape character is '^]'.

EX1 (ttyp1)

login: ^C
Client aborted login
Connection closed by foreign host.

I'm in EX1?!


Then I swap the status-control settings.

jnpr@EX1# set interfaces ae0 aggregated-ether-options mc-ae status-control active 

jnpr@EX2# set interfaces ae0 aggregated-ether-options mc-ae status-control standby 

{master:0}[edit]
jnpr@QFX5100-LEAF# run show arp no-resolve  
MAC Address       Address         Interface     Flags
00:00:5e:00:01:02 10.161.39.254   vme.0                none
a8:d0:e5:f7:bf:f0 100.1.1.1       ae0.0                none
4c:96:14:f2:b6:e3 192.168.1.1     em2.32768            none
Total entries: 6

{master:0}[edit]
jnpr@QFX5100-LEAF# run telnet 100.1.1.1      
Trying 100.1.1.1...
Connected to 100.1.1.1
Escape character is '^]'.

EX2 (ttyp1)

login: ^C
Client aborted login

Weird. Not sure why this behavior seems backwards.

So the next issue is: how do you access the other MC-LAG member? There are two ways. You can access it via the IP address used for the ICCP connection. Or, if you have the resources, you can basically have two MC-LAGs per spine switch and make one of them standby on one IRB and the other standby on a different IRB, say 101, so both chassis are IP reachable for management.

MC-LAG configuration example


EX1
set chassis redundancy graceful-switchover
set chassis aggregated-devices ethernet device-count 2
set interfaces et-2/0/1 description TO-LEAF
set interfaces et-2/0/1 ether-options 802.3ad ae0
set interfaces et-2/2/1 description TO-LEAF
set interfaces et-2/2/1 ether-options 802.3ad ae0
set interfaces xe-3/1/0 description ICCP
set interfaces xe-3/1/0 unit 0 family inet address 200.1.1.1/30
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 aggregated-ether-options lacp system-priority 100
set interfaces ae0 aggregated-ether-options lacp system-id 00:00:00:00:00:05
set interfaces ae0 aggregated-ether-options lacp admin-key 1
set interfaces ae0 aggregated-ether-options mc-ae mc-ae-id 1
set interfaces ae0 aggregated-ether-options mc-ae redundancy-group 1
set interfaces ae0 aggregated-ether-options mc-ae chassis-id 0
set interfaces ae0 aggregated-ether-options mc-ae mode active-active
set interfaces ae0 aggregated-ether-options mc-ae status-control standby
set interfaces ae0 unit 0 multi-chassis-protection 200.1.1.2 interface xe-9/1/1.0
set interfaces ae0 unit 0 family ethernet-switching interface-mode trunk
set interfaces ae0 unit 0 family ethernet-switching vlan members all
set interfaces irb unit 100 family inet address 100.1.1.1/24
set vlans v100 vlan-id 100
set vlans v100 l3-interface irb.100


EX2
set chassis redundancy graceful-switchover
set chassis aggregated-devices ethernet device-count 2
set interfaces et-2/0/1 description TO-LEAF
set interfaces et-2/0/1 ether-options 802.3ad ae0
set interfaces et-2/2/1 description TO-LEAF
set interfaces et-2/2/1 ether-options 802.3ad ae0
set interfaces xe-3/1/0 description ICCP
set interfaces xe-3/1/0 unit 0 family inet address 200.1.1.2/30
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 aggregated-ether-options lacp system-priority 100
set interfaces ae0 aggregated-ether-options lacp system-id 00:00:00:00:00:05
set interfaces ae0 aggregated-ether-options lacp admin-key 1
set interfaces ae0 aggregated-ether-options mc-ae mc-ae-id 1
set interfaces ae0 aggregated-ether-options mc-ae redundancy-group 1
set interfaces ae0 aggregated-ether-options mc-ae chassis-id 1
set interfaces ae0 aggregated-ether-options mc-ae mode active-active
set interfaces ae0 aggregated-ether-options mc-ae status-control active
set interfaces ae0 unit 0 multi-chassis-protection 200.1.1.1 interface xe-3/1/1.0
set interfaces ae0 unit 0 family ethernet-switching interface-mode trunk
set interfaces ae0 unit 0 family ethernet-switching vlan members all
set interfaces irb unit 100 family inet address 100.1.1.1/24
set vlans v100 vlan-id 100
set vlans v100 l3-interface irb.100
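
Most of the ae0 configuration is identical on both peers, and it is easy to lose track of the few lines that must differ. The following Python sketch (hypothetical helper name mc_ae_commands, defaults copied from the example above) generates the aggregated-ether-options lines for each peer and prints only the ones that differ. It is an illustration of the pattern in this post, not a complete MC-LAG configuration.

```python
def mc_ae_commands(chassis_id, status_control, ae="ae0", mc_ae_id=1,
                   redundancy_group=1, system_id="00:00:00:00:00:05",
                   admin_key=1, system_priority=100):
    """Return the LACP and mc-ae set commands for one MC-LAG peer."""
    if chassis_id not in (0, 1):
        raise ValueError("chassis_id must be 0 or 1")
    if status_control not in ("active", "standby"):
        raise ValueError("status_control must be 'active' or 'standby'")
    base = f"set interfaces {ae} aggregated-ether-options"
    return [
        # Shared values: both peers must present the same LACP identity.
        f"{base} lacp active",
        f"{base} lacp system-priority {system_priority}",
        f"{base} lacp system-id {system_id}",
        f"{base} lacp admin-key {admin_key}",
        f"{base} mc-ae mc-ae-id {mc_ae_id}",
        f"{base} mc-ae redundancy-group {redundancy_group}",
        # Per-peer values.
        f"{base} mc-ae chassis-id {chassis_id}",
        f"{base} mc-ae mode active-active",
        f"{base} mc-ae status-control {status_control}",
    ]


if __name__ == "__main__":
    ex1 = mc_ae_commands(chassis_id=0, status_control="standby")
    ex2 = mc_ae_commands(chassis_id=1, status_control="active")
    for left, right in zip(ex1, ex2):
        if left != right:
            print("EX1:", left)
            print("EX2:", right)
```

With the values from this post, only the chassis-id and status-control lines differ. The ICCP address, multi-chassis-protection peer and physical member links also differ per switch but are left out of the sketch.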


EX1
# run show interfaces ae0 extensive
Physical interface: ae0 (MC-AE-1, active), Enabled, Physical link is Up
  Interface index: 186, SNMP ifIndex: 561, Generation: 189
  Link-level type: Ethernet, MTU: 1518, Speed: 40Gbps, BPDU Error: None, MAC-REWRITE Error: None, Loopback: Disabled,
  Source filtering: Disabled, Flow control: Disabled, Minimum links needed: 1, Minimum bandwidth needed: 1bps
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x4000
  Current address: 4c:96:14:6b:bb:c0, Hardware address: 4c:96:14:6b:bb:c0
  Last flapped   : 2014-07-25 17:13:08 PDT (1d 06:39 ago)
  Statistics last cleared: Never
  Traffic statistics:
   Input  bytes  :                 2141                    0 bps
   Output bytes  :             22808481                 2616 bps
   Input  packets:                   30                    0 pps
   Output packets:               313340                    5 pps
   IPv6 transit statistics:
    Input  bytes  :                   0
    Output bytes  :                   0
    Input  packets:                   0
    Output packets:                   0
  Dropped traffic statistics due to STP State:
   Input  bytes  :                    0
   Output bytes  :                    0
   Input  packets:                    0
   Output packets:                    0
  Input errors:
    Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Giants: 0, Policed discards: 0, Resource errors: 0
  Output errors:
    Carrier transitions: 0, Errors: 0, Drops: 0, MTU errors: 0, Resource errors: 0
  Ingress queues: 8 supported, 4 in use
  Queue counters:       Queued packets  Transmitted packets      Dropped packets
    0                                0                    0                    0
    1                                0                    0                    0
    2                                0                    0                    0
    3                                0                    0                    0
  Egress queues: 8 supported, 4 in use
  Queue counters:       Queued packets  Transmitted packets      Dropped packets
    0                            39562                39562                    0
    1                                0                    0                    0
    2                                0                    0                    0
    3                           253116               253116                    0
  Queue number:         Mapped forwarding classes
    0                   best-effort
    1                   expedited-forwarding
    2                   assured-forwarding
    3                   network-control

  Logical interface ae0.0 (Index 348) (SNMP ifIndex 563) (Generation 177)
    Flags: Up SNMP-Traps 0x24024000 Encapsulation: Ethernet-Bridge
    Statistics        Packets        pps         Bytes          bps
    Bundle:
        Input :             0          0             0            0
        Output:        309312          4      28634313         2008
    Adaptive Statistics:
        Adaptive Adjusts:          0
        Adaptive Scans  :          0
        Adaptive Updates:          0
    Link:
      et-2/0/1.0
        Input :             0          0             0            0
        Output:        313340          4      29939385         2008
    LACP info:        Role     System             System      Port    Port  Port
                             priority          identifier  priority  number   key
      et-2/0/1.0     Actor        100  00:00:00:00:00:05       127       1     1
      et-2/0/1.0   Partner        127  4c:96:14:f2:b6:e0       127       2     1
    LACP Statistics:       LACP Rx     LACP Tx   Unknown Rx   Illegal Rx
      et-2/0/1.0            111233      111816            0            0
    Marker Statistics:   Marker Rx     Resp Tx   Unknown Rx   Illegal Rx
      et-2/0/1.0                 0           0            0            0
    Protocol eth-switch, MTU: 1518, Generation: 229, Route table: 6
      Flags: Trunk-Mode

EX2

show interfaces ae0 extensive
Physical interface: ae0 (MC-AE-1, active), Enabled, Physical link is Up
  Interface index: 219, SNMP ifIndex: 501, Generation: 351
  Link-level type: Ethernet, MTU: 1518, Speed: 40Gbps, BPDU Error: None, MAC-REWRITE Error: None, Loopback: Disabled,
  Source filtering: Disabled, Flow control: Disabled, Minimum links needed: 1, Minimum bandwidth needed: 1bps
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x4000
  Current address: a8:d0:e5:f7:bf:c3, Hardware address: a8:d0:e5:f7:bf:c3
  Last flapped   : 2014-07-25 17:13:10 PDT (1d 06:39 ago)
  Statistics last cleared: Never
  Traffic statistics:
   Input  bytes  :               153427                    0 bps
   Output bytes  :             13628185                 1832 bps
   Input  packets:                 2414                    0 pps
   Output packets:               193271                    1 pps
   IPv6 transit statistics:
    Input  bytes  :                   0
    Output bytes  :                   0
    Input  packets:                   0
    Output packets:                   0
  Dropped traffic statistics due to STP State:
   Input  bytes  :                    0
   Output bytes  :                    0
   Input  packets:                    0
   Output packets:                    0
  Input errors:
    Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Giants: 0, Policed discards: 0, Resource errors: 0
  Output errors:
    Carrier transitions: 0, Errors: 0, Drops: 0, MTU errors: 0, Resource errors: 0
  Ingress queues: 8 supported, 4 in use
  Queue counters:       Queued packets  Transmitted packets      Dropped packets
    0                                0                    0                    0
    1                                0                    0                    0
    2                                0                    0                    0
    3                                0                    0                    0
  Egress queues: 8 supported, 4 in use
  Queue counters:       Queued packets  Transmitted packets      Dropped packets
    0                      90356345631          90356345631                    0
    1                                0                    0                    0
    2                                0                    0                    0
    3                           402965               402965                    0
  Queue number:         Mapped forwarding classes
    0                   best-effort
    1                   expedited-forwarding
    2                   assured-forwarding
    3                   network-control

  Logical interface ae0.0 (Index 343) (SNMP ifIndex 18551) (Generation 128345)
    Flags: Up SNMP-Traps 0x24024000 Encapsulation: Ethernet-Bridge
    Statistics        Packets        pps         Bytes          bps
    Bundle:
        Input :             0          0             0            0
        Output:        189232          1      13150169          512
    Adaptive Statistics:
        Adaptive Adjusts:          0
        Adaptive Scans  :          0
        Adaptive Updates:          0
    Link:
      et-2/2/1.0
        Input :             0          0             0            0
        Output:        193272          1      14459095          512
    LACP info:        Role     System             System      Port    Port  Port
                             priority          identifier  priority  number   key
      et-2/2/1.0     Actor        100  00:00:00:00:00:05       127   32769     1
      et-2/2/1.0   Partner        127  4c:96:14:f2:b6:e0       127       1     1
    LACP Statistics:       LACP Rx     LACP Tx   Unknown Rx   Illegal Rx
      et-2/2/1.0            111288      111983            0            0
    Marker Statistics:   Marker Rx     Resp Tx   Unknown Rx   Illegal Rx
      et-2/2/1.0                 0           0            0            0
    Protocol eth-switch, MTU: 1518, Generation: 37237, Route table: 3
      Flags: Trunk-Mode



LEAF

{master:0}[edit]
jnpr@QFX5100-LEAF# run show interfaces ae0 extensive
Physical interface: ae0, Enabled, Physical link is Up
  Interface index: 659, SNMP ifIndex: 550, Generation: 150
  Description: TO-EX2
  Link-level type: Ethernet, MTU: 1514, Speed: 80Gbps, BPDU Error: None,
  MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled,
  Flow control: Disabled, Minimum links needed: 1, Minimum bandwidth needed: 0
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x4000
  Current address: 4c:96:14:f2:b7:a0, Hardware address: 4c:96:14:f2:b7:a0
  Last flapped   : 2014-07-25 16:49:38 PDT (1d 07:04 ago)
  Statistics last cleared: Never
  Traffic statistics:
   Input  bytes  :       12286985366334                 2832 bps
   Output bytes  :       12315040186254                 2208 bps
   Input  packets:         179102021645                    3 pps
   Output packets:         179048324703                    2 pps
   IPv6 transit statistics:
    Input  bytes  :                   0
    Output bytes  :                   0
    Input  packets:                   0
    Output packets:                   0
  Input errors:
    Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Giants: 0, Policed discards: 0,
    Resource errors: 0
  Output errors:
    Carrier transitions: 4, Errors: 0, Drops: 0, MTU errors: 0, Resource errors: 0
  Egress queues: 12 supported, 5 in use
  Queue counters:       Queued packets  Transmitted packets      Dropped packets
    0 best-effort                    0             95319185                    0
    3 fcoe                           0                    0                    0
    4 no-loss                        0                    0                    0
    7 network-cont                   0               242393                    0
    8 mcast                          0         178948686348                    0
  Queue number:         Mapped forwarding classes
    0                   best-effort
    3                   fcoe
    4                   no-loss
    7                   network-control
    8                   mcast

  Logical interface ae0.0 (Index 557) (SNMP ifIndex 553) (Generation 167)
    Flags: SNMP-Traps 0x24024000 Encapsulation: Ethernet-Bridge
    Statistics        Packets        pps         Bytes          bps
    Bundle:
        Input :             0          0             0            0
        Output:          3338          0        183668            0
    Link:
      et-0/0/50.0
        Input :             0          0             0            0
        Output:          5685          0       1470451            0
      et-0/0/51.0
        Input :             0          0             0            0
        Output:          4057          0       1386350            0
    LACP info:        Role     System             System      Port    Port  Port
                             priority          identifier  priority  number   key
      et-0/0/50.0    Actor        127  4c:96:14:f2:b6:e0       127       1     1
      et-0/0/50.0  Partner        100  00:00:00:00:00:05       127   32769     1
      et-0/0/51.0    Actor        127  4c:96:14:f2:b6:e0       127       2     1
      et-0/0/51.0  Partner        100  00:00:00:00:00:05       127       1     1
    LACP Statistics:       LACP Rx     LACP Tx   Unknown Rx   Illegal Rx
      et-0/0/50.0           111855      111309            0            0
      et-0/0/51.0           111853      111308            0            0
    Marker Statistics:   Marker Rx     Resp Tx   Unknown Rx   Illegal Rx
      et-0/0/50.0                0           0            0            0
      et-0/0/51.0                0           0            0            0
    Protocol eth-switch, MTU: 1514, Generation: 181, Route table: 3
      Flags: Trunk-Mode




Monday, July 21, 2014

Use Zero Touch Provisioning (ZTP) to auto-configure and upgrade new or replacement switches in a datacenter.

A typical data center can host tens if not hundreds of Top of Rack (TOR) switches. Managing and configuring each one of these can become a tedious task, and replacing a switch that goes out of service is just as time consuming. ZTP is an automation method that reduces deployment time, minimizes errors and removes the need for a network engineer to be on location. You would only need a junior engineer or technician to rack the units, re-cable the links and power them on, without having to console in or add any configuration.


HOW ZTP WORKS



ZTP uses a combination of DHCP and TFTP/FTP/HTTP servers to dynamically allocate IP addresses, deliver configuration files and upgrade switch software images. Juniper EX and QFX switches default to ZTP when they boot with the factory-default configuration, and basically become DHCP clients.

To start, you would configure a DHCP server by adding a few options to its dhcpd.conf file: DHCP option 43 with vendor-specific sub-options, and DHCP option 150 or 66, which contains the address of the TFTP server. On the TFTP or FTP server you would store all of your switches' configurations and software images.
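
To make the option 43 part less abstract: on the wire, vendor-specific information is a series of small records, each one a sub-option code byte, a length byte and then the value. The Python sketch below encodes and decodes that layout. The sub-option numbers (0 to 3) are my assumption based on Juniper's ZTP documentation, so verify them against the release you run; the function names are made up for this example.

```python
# Assumed Juniper ZTP sub-option codes inside DHCP option 43.
SUBOPTION_CODES = {
    "image-file-name": 0,
    "config-file-name": 1,
    "image-file-type": 2,
    "transfer-mode": 3,
}


def encode_option43(suboptions):
    """Pack {name: text} into code/length/value records."""
    payload = bytearray()
    for name, value in suboptions.items():
        data = value.encode("ascii")
        if len(data) > 255:
            raise ValueError(f"{name} is longer than 255 bytes")
        payload += bytes([SUBOPTION_CODES[name], len(data)]) + data
    if len(payload) > 255:
        raise ValueError("option 43 payload is longer than 255 bytes")
    return bytes(payload)


def decode_option43(payload):
    """Unpack code/length/value records back into {name: text}."""
    names = {code: name for name, code in SUBOPTION_CODES.items()}
    result = {}
    i = 0
    while i < len(payload):
        code, length = payload[i], payload[i + 1]
        value = payload[i + 2 : i + 2 + length]
        result[names.get(code, f"unknown-{code}")] = value.decode("ascii")
        i += 2 + length
    return result


if __name__ == "__main__":
    raw = encode_option43({
        "config-file-name": "PE3713320070.conf",
        "transfer-mode": "tftp",
    })
    print(raw.hex())
    print(decode_option43(raw))
```

In practice the DHCP server builds this payload for you from its configuration file; the sketch is only meant to show what the switch receives.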

On your linux server the dhcpd.conf file would look similar to this:

host <EX SWITCH NAME> {
  # MAC address of the management interface. Dynamic IP allocation also works,
  # and ZTP can use any network port instead (MAC address: chassis MAC + 1).
  hardware ethernet 4c:96:14:e5:a3:41;
  fixed-address 100.1.1.90;     # Switch's irb ip address
  option option-150 100.1.1.1; # TFTP Server address to download config and image
  option host-name "EX4300-1";
  option VENDOR_OP.image-file-name "jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz";
  option VENDOR_OP.config-file-name "PE3713320070.conf";
  option VENDOR_OP.transfer-mode "tftp";
  option VENDOR_OP.image-file-type "symlink";
}
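
With tens or hundreds of switches you probably don't want to type these stanzas by hand. Here is a rough Python sketch that renders one host stanza per switch from a small inventory. Two things in it are assumptions rather than something taken from my server: the OPTION_DECLARATIONS text, which is how I would expect the VENDOR_OP option space and option-150 to be declared once near the top of dhcpd.conf before the host stanzas can use them, and the helper name host_stanza.

```python
# Assumed global declarations; check them against your DHCP server and
# Juniper's ZTP documentation before use.
OPTION_DECLARATIONS = """\
option space VENDOR_OP;
option VENDOR_OP.image-file-name code 0 = text;
option VENDOR_OP.config-file-name code 1 = text;
option VENDOR_OP.image-file-type code 2 = text;
option VENDOR_OP.transfer-mode code 3 = text;
option VENDOR_OP-encapsulation code 43 = encapsulate VENDOR_OP;
option option-150 code 150 = { ip-address };
"""


def host_stanza(name, mac, ip, tftp_server, image, config, transfer_mode="tftp"):
    """Render one dhcpd.conf host stanza in the style used in this post."""
    return "\n".join([
        f"host {name} {{",
        f"  hardware ethernet {mac};",
        f"  fixed-address {ip};",
        f"  option option-150 {tftp_server};",
        f'  option host-name "{name}";',
        f'  option VENDOR_OP.image-file-name "{image}";',
        f'  option VENDOR_OP.config-file-name "{config}";',
        f'  option VENDOR_OP.transfer-mode "{transfer_mode}";',
        '  option VENDOR_OP.image-file-type "symlink";',
        "}",
    ])


if __name__ == "__main__":
    inventory = [
        {
            "name": "EX4300-1",
            "mac": "4c:96:14:e5:a3:41",
            "ip": "100.1.1.90",
            "config": "PE3713320070.conf",
        },
    ]
    print(OPTION_DECLARATIONS)
    for switch in inventory:
        print(host_stanza(
            switch["name"], switch["mac"], switch["ip"],
            tftp_server="100.1.1.1",
            image="jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz",
            config=switch["config"],
        ))
```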

ZTP works on untagged interfaces on any ports on the switch (data ports or management ports).

If you were to console into the box during a ZTP sequence it would look like this:

---------------------------------------

Committing autoinstall config                                                 
                                                                              
First, the switch tries DHCP over the management interface (vme or me).
                                                                              
Auto Image Upgrade: DHCP OFFER Client vme.0: Invalid config, no file server information. OFFER REJECTED.                                                                              

If no server is reachable it will try all the interfaces that are up on the switch using the default vlan and a temporary irb.

It will then check the DHCP options passed between the server and the switch, noting the TFTP server IP address, the configuration file name and the software image to be installed.

Auto Image Upgrade: DHCP Options for client interface irb.0:
ConfigFile: PE3713320070.conf ImageFile: jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz Gateway: 100.1.1.1DHCP Server: 100.1.1.20 File Server: 100.1.1.1 Options state: All options set
                                                                              
Auto Image Upgrade: DHCP Client Bound interfaces: irb.0   vme.0                                                                                 
                                                                              
Auto Image Upgrade: DHCP Client Unbound interfaces: ge-0/0/0.0   ge-0/0/1.0   ge-0/0/2.0   ge-0/0/3.0   ge-0/0/4.0   ge-0/0/5.0   ge-0/0/6.0   ge-0/0/7.0   ge-0/0/8.0   ge-0/0/9.0   ge-0/0/10.0   ge-0/0/11.0   ge-0/0/12.0   ge-0/0/13.0   ge-0/0/14.0   ge-0/0/15.0   ge-0/0/16.0   ge-0/0/17.0   ge-0/0/18.0   ge-0/0/19.0   ge-0/0/20.0   ge-0/0/21.0   ge-0/0/22.0   ge-0/0/23.0   ge-0/0/24.0   ge-0/0/25.0   ge-0/0/26.0   ge-0/0/27.0   ge-0/0/28.0   ge-0/0/29.0   ge-0/0/30.0   ge-0/0/31.0   ge-0/0/32.0   ge-0/0/33.0   ge-0/0/34.0   ge-0/0/35.0   ge-0/0/36.0   ge-0/0/37.0   ge-0/0/38.0   ge-0/0/39.0   ge-0/0/40.0   ge-0/0/41.0   ge-0/0/42.0   ge-0/0/43.0   ge-0/0/44.0   ge-0/0/45.0   ge-0/0/46.0   ge-0/0/47.0
                                                                                 
                                                                              
Auto Image Upgrade: To stop, on CLI apply "delete chassis auto-image-upgrade" and commit

The EX switch will then parse the DHCP response.
                                                                              
Auto Image Upgrade: Active on client interface: irb.0                                                                              
                                                                              
Auto Image Upgrade: Interface::   "irb"                                       

Auto Image Upgrade: Server::      "100.1.1.1"                                 

Auto Image Upgrade: Image File::  "jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz"

Auto Image Upgrade: Server File:: "PE3713320070.conf"                         

Auto Image Upgrade: Gateway::     "100.1.1.254"                                 

Auto Image Upgrade: Protocol::    "tftp"        
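
If you capture console output from many switches, these key/value lines are handy for checking that each switch picked up the server, image and config you intended. A minimal Python sketch for extracting them follows. The function name parse_ztp_fields is made up, the pattern is based only on the lines shown in this post, and values that the console wrapped onto a second line are stitched back together.

```python
import re

# Matches lines such as:  Auto Image Upgrade: Server::      "100.1.1.1"
_FIELD_RE = re.compile(r'Auto Image Upgrade:\s+([A-Za-z ]+?)::\s+"([^"]*)"')


def parse_ztp_fields(console_text):
    """Return {field name: value} for the quoted ZTP fields in a console log."""
    fields = {}
    for key, value in _FIELD_RE.findall(console_text):
        # Remove line breaks the console inserted in the middle of a value.
        fields[key.strip()] = re.sub(r"\s*\n\s*", "", value)
    return fields


if __name__ == "__main__":
    sample = (
        'Auto Image Upgrade: Interface::   "irb"\n'
        'Auto Image Upgrade: Server::      "100.1.1.1"\n'
        'Auto Image Upgrade: Server File:: "PE3713320070.conf"\n'
        'Auto Image Upgrade: Protocol::    "tftp"\n'
    )
    for name, value in parse_ztp_fields(sample).items():
        print(f"{name}: {value}")
```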
                              
                                                                            
The EX switch will then download the config file and the software image. 
                                                                              
Auto Image Upgrade: Start fetching PE3713320070.conf file from server 100.1.1.1 through irb using tftp

Auto Image Upgrade: File PE3713320070.conf fetched from server 100.1.1.1 through irb

Auto Image Upgrade: Start fetching jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz file from server 100.1.1.1 through irb using tftp

If the installed version on the switch and the version on the TFTP server are the same, then the upgrade process aborts.

Auto Image Upgrade: Aborting image installation of jinstall-ex-4300-13.2X51-D21.1-domestic-signed.tgz received from 100.1.1.1 through irb: Installed and fetched image version same
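
You can predict this outcome before a switch ever boots by comparing the version embedded in the image file name with what the switch reports as installed. The Python sketch below does that. It only mirrors the behavior seen in these logs (same version means abort, anything else means install); I don't know how the switch itself performs the comparison, and the function names and the version pattern are my own guesses based on file names like the ones in this post.

```python
import re

# Matches release strings such as 13.2X51-D20.2 inside an image file name.
_VERSION_RE = re.compile(r"\d+\.\d+[A-Z]\d+-D\d+(?:\.\d+)?")


def image_version(filename):
    """Extract the Junos release string from a jinstall image file name."""
    match = _VERSION_RE.search(filename)
    if match is None:
        raise ValueError(f"no version found in {filename!r}")
    return match.group(0)


def should_install(installed_version, image_filename):
    """True if the fetched image differs from the installed release."""
    return image_version(image_filename) != installed_version


if __name__ == "__main__":
    image = "jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz"
    print(image_version(image))
    print(should_install("13.2X51-D21.1", image))
```

Note that "different" includes older: a switch running a newer release than the image on the server would, by this logic, be moved to the server's version.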
                       
If the images are not the same, the EX switch will auto upgrade.

Auto Image Upgrade: File jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz fetched from server 100.1.1.1 through irb

Auto Image Upgrade: To install /var/tmp/jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz image fetched from server 100.1.1.1 through irb
                                                                                                                                                             
WARNING!!! On successful image installation, system will reboot automatically 

Auto Image Upgrade: Installation of /var/tmp/jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz image fetched from server 100.1.1.1 through irb is done, proceeding for reboot of system
                                                                              
                                                                              
Broadcast Message from root@EX4300-1                                          
        (no tty) at 5:47 UTC...                                               
                                                                              
Auto image Upgrade: Stopped                                                   
                                                                              
                                                                              
*** System shutdown message from root@EX4300-1 ***                          

System going down in 1 minute                                                 

*** FINAL System shutdown message from root@EX4300-1 ***                    

System going down IMMEDIATELY     

### AFTER REBOOT
EX4300-1 (ttyu0)

login: jnpr
Password:

--- JUNOS 13.2X51-D20.2 built 2014-04-29 08:43:38 UTC
{master:0}
jnpr@EX4300-1>
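
The version check seen in the logs above can be sketched roughly in Python. This is only an illustration, not Juniper's implementation: the function names and the regular expression are my own assumptions. It treats any mismatch as a reason to install (not only a newer image), since the log above shows a D20.2 image being installed.

```python
import re

# Matches Junos version strings such as "13.2X51-D21.1" inside an image file name.
# The pattern is an assumption that covers the file names used in this post.
_VERSION_RE = re.compile(r"\d+\.\d+[A-Z]\d+(?:-[A-Z]\d+)?(?:\.\d+)?")


def image_version(filename):
    """Return the version embedded in a jinstall file name, or None if absent."""
    match = _VERSION_RE.search(filename)
    return match.group(0) if match else None


def should_install(installed_version, image_filename):
    """Install only when the fetched image differs from what is running."""
    fetched = image_version(image_filename)
    if fetched is None:
        raise ValueError("no version found in " + image_filename)
    return fetched != installed_version


# Same version as installed: the upgrade would abort.
print(should_install("13.2X51-D21.1", "jinstall-ex-4300-13.2X51-D21.1-domestic-signed.tgz"))  # False
# Different version: the switch would install and reboot.
print(should_install("13.2X51-D21.1", "jinstall-ex-4300-13.2X51-D20.2-domestic-signed.tgz"))  # True
```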

The EX is now running the new version of code and the downloaded configuration file and is ready for production.
Here's the config I see on the switch after reboot. It matches the config I saved on the TFTP server.

------------
jnpr@EX4300-1# show
## Last changed: 2014-07-20 07:56:01 UTC
version 13.2X51-D21.1;
/*
 * dhcpd-generated /var/etc/dhcpd.options.conf
 * Version: JDHCPD release 13.2X51-D21.1 built by builder on 2014-05-29 13:06:11 UTC
 * Written: Sun Jul 20 07:49:45 2014
 */

system {
    host-name EX4300-1;
    root-authentication {
        encrypted-password "$1$byLFhlG6$my6QnZANcF7DqD9m9Op5s."; ## SECRET-DATA
    }
    login {
        user jnpr {
            uid 2005;
            class super-user;
            authentication {
                encrypted-password "$1$FNz57vVN$lQYXYBuxDKlPwtTBFQXWa0"; ## SECRET-DATA
            }
        }
    }
    services {                         
        ssh;
        telnet;
        web-management {
            http;
        }
    }
    syslog {
        user * {
            any emergency;
        }
        host 100.1.1.72 {
            any any;
        }
        file messages {
            any notice;
            authorization info;
        }
        file interactive-commands {
            interactive-commands any;
        }
    }
    ntp {
        server 100.1.1.73;            
    }
}
interfaces {
    ge-0/0/0 {
        unit 0 {
            family ethernet-switching {
                vlan {
                    members default;
                }
            }
        }
    }
    irb {
        unit 0 {
            family inet {
                address 100.1.1.90/24;
            }
        }
    }
}
vlans {
    default {
        vlan-id 1;
        l3-interface irb.0;
    }
}

dhcpd.conf file on your unix box
--------------------------------

#STARTING OPTIONS
option subnet-mask 255.255.255.0;
option routers 100.1.1.1;   # Default GW
option option-150 code 150 = ip-address;

#Vendor Specific Option
option space VENDOR_OP;        #Define the Vendor Specific Option called VENDOR_OP
option VENDOR_OP-encapsulation code 43 = encapsulate VENDOR_OP;
option VENDOR_OP.image-file-name code 0 = text;
option VENDOR_OP.config-file-name code 1 = text;
option VENDOR_OP.image-file-type code 2 = text;
option VENDOR_OP.transfer-mode code 3 = text;

# DHCP IP Pool for your PCs, etc.

subnet 100.1.1.0 netmask 255.255.255.0 {
  range 100.1.1.50 100.1.1.60;
  option routers 100.1.1.1;
  option broadcast-address 100.1.1.255;
  option subnet-mask 255.255.255.0;
  option domain-name-servers 8.8.8.8;
  option domain-name "mydomain.net";
}

### EX Switch entries

host EX4300-1 {
  # MAC address of the management interface. Dynamic IP allocation also works,
  # and ZTP can use any network port's MAC address (chassis MAC + 1) instead.
  hardware ethernet 4c:96:14:e5:a3:41;
  fixed-address 100.1.1.90;     # Switch's irb ip address to be assigned
  option routers 100.1.1.1;     # Default GW in case tftp is on another subnet
  option option-150 200.1.1.1; # TFTP Server address to download config and image
  option host-name "EX4300-1";
  option VENDOR_OP.image-file-name "jinstall-ex-4300-13.2X51-D21.1-domestic-signed.tgz";
  option VENDOR_OP.config-file-name "PE3713320070.conf";
  option VENDOR_OP.transfer-mode "tftp";
  option VENDOR_OP.image-file-type "symlink";

  option log-servers 100.1.1.72;
  option ntp-servers 100.1.1.73;
}
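
To make the vendor-specific option less abstract, here is a minimal Python sketch of how the four sub-options defined above would be packed into the option 43 payload as code/length/value triplets. The sub-option codes (0 to 3) come from the dhcpd.conf above. The function name is hypothetical, and emitting the sub-options in ascending code order is an assumption made only to keep the output deterministic.

```python
def encode_option43(suboptions):
    """Pack {code: text} into code/length/value bytes for DHCP option 43."""
    payload = bytearray()
    for code, text in sorted(suboptions.items()):
        data = text.encode("ascii")
        # Both the code and the length must fit in a single byte.
        if not 0 <= code <= 255 or len(data) > 255:
            raise ValueError("sub-option %r does not fit in one TLV" % code)
        payload += bytes([code, len(data)]) + data
    return bytes(payload)


# Sub-option codes as defined in the VENDOR_OP option space above.
ztp = {
    1: "PE3713320070.conf",  # config-file-name
    3: "tftp",               # transfer-mode
}
print(encode_option43(ztp))  # b'\x01\x11PE3713320070.conf\x03\x04tftp'
```

The switch reads the same triplets back out of option 43 to learn which image and config file to request.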







Thursday, July 17, 2014

In-service software upgrade (ISSU) on a Juniper QFX5100 leaf switch with minimal traffic disruption.

QFX5100 ISSU

The Juniper QFX5100 switch can be upgraded while in service (in production) with minimal impact. This is useful when, for example, the QFX is a leaf switch with servers directly attached to it. Since most TOR (Top-of-Rack) switches do not have redundant CPUs, i.e. a redundant control plane, this feature is a necessity.

Without ISSU, a Data Center has to migrate the VMs on the affected host servers to hosts attached to other TOR switches, take the switch out of service, upgrade it, and then migrate the VMs back. This can take several hours and can also be an issue if resources on the other switches are limited.

How ISSU works

Just like Virtualization in the Server world, the QFX switch has a hypervisor (KVM) and a VM that runs JUNOS (JVM).
You would need to configure GRES (Graceful Routing-Engine Switchover), non-stop routing and non-stop bridging before starting.
From the JVM (running the current code) you would issue the ISSU upgrade command: request system software in-service-upgrade <image version>
Using the new code image a second JVM is launched as a backup.
Next, a Master-Backup election happens between the two VMs, with the VM on the current version of code elected Master.
All the state tables are then synced between the two VMs.
The Backup VM then connects to the Packet Forwarding Engine (i.e. the ASIC).
The device drivers detach from the old Master and simultaneously attach to the Backup JVM.
Mastership switches over so that the Backup becomes the new Master.
The old JVM running the old version of code is killed.
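
Since the steps above depend on GRES, non-stop routing and non-stop bridging being configured first, a quick pre-flight check can help. Below is a minimal Python sketch, assuming you have saved the output of "show configuration | display set" to a string. The function name and the short labels are hypothetical; the three statements match the hierarchy used in the configuration in this post.

```python
# The three statements ISSU relies on, in "display set" form.
REQUIRED = {
    "GRES": "set chassis redundancy graceful-switchover",
    "NSR": "set routing-options nonstop-routing",
    "NSB": "set protocols layer2-control nonstop-bridging",
}


def missing_issu_prereqs(display_set_output):
    """Return the labels of prerequisite statements absent from the config."""
    present = {line.strip() for line in display_set_output.splitlines()}
    return [name for name, stmt in REQUIRED.items() if stmt not in present]


config = """
set chassis redundancy graceful-switchover
set routing-options autonomous-system 100
"""
print(missing_issu_prereqs(config))  # ['NSR', 'NSB']
```

An empty list means all three statements are present and the upgrade command can be issued.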


In this test I set up IXIA traffic testers that injected 10K OSPF routes, 2K BGP routes and 10K MAC addresses. Traffic was then sent to these destination addresses.

jnpr> show route summary
Autonomous system number: 100
Router ID: 1.1.1.1

inet.0: 12015 destinations, 12015 routes (12015 active, 0 holddown, 0 hidden)
              Direct:      7 routes,      7 active
               Local:      6 routes,      6 active
                OSPF:  10001 routes,  10001 active
                 BGP:   2000 routes,   2000 active
              Static:      1 routes,      1 active

{master:0}
jnpr> show ethernet-switching table summary
Total dynamic and static MAC addresses learned globally : 10004
Configured static MAC addresses learned globally       : 0

{master:0}
jnpr> edit  
Entering configuration mode

{master:0}[edit]
jnpr# show routing-options
nonstop-routing;
autonomous-system 100;

{master:0}[edit]
jnpr# show protocols layer2-control
nonstop-bridging;

{master:0}[edit]
jnpr# show chassis
redundancy {
    graceful-switchover;
}



jnpr# show interfaces
xe-0/0/0:0 {
    unit 0 {
        family ethernet-switching {
            interface-mode trunk;
            vlan {
                members all;
            }
        }
    }
}
xe-0/0/0:1 {
    unit 0 {
        family ethernet-switching {
            interface-mode trunk;
            vlan {
                members all;
            }
        }
    }
}
irb {
    unit 100 {
        family inet {
            address 100.1.1.1/24;
        }
    }
    unit 101 {
        family inet {
            address 101.1.1.1/24;
        }
    }
    unit 200 {                         
        family inet {
            address 200.1.1.1/24;
        }
    }
    unit 201 {
        family inet {
            address 201.1.1.1/24;
        }
    }
    unit 300 {
        family inet {
            address 110.1.1.254/24;
        }
    }
}
lo0 {
    unit 0 {
        family inet {
            address 1.1.1.1/32;
        }
    }
}

{master:0}[edit]
jnpr# show protocols
bgp {
    group EBGP {
        type external;
        neighbor 200.1.1.2 {
            peer-as 200;
        }
        neighbor 201.1.1.2 {
            peer-as 201;
        }
    }
}
ospf {
    area 0.0.0.0 {
        interface irb.100;
        interface irb.101;
    }
}
layer2-control {
    nonstop-bridging;
}


UPGRADE PROCESS:


{master:0}
jnpr> show version
fpc0:
--------------------------------------------------------------------------
Model: qfx5100-24q-2p
JUNOS Base OS Software Suite [13.2X51-D15.5]
JUNOS Base OS boot [13.2X51-D15.5]
JUNOS Crypto Software Suite [13.2X51-D15.5]
JUNOS Online Documentation [13.2X51-D15.5]
JUNOS Kernel Software Suite [13.2X51-D15.5]
JUNOS Packet Forwarding Engine Support (qfx-x86-32) [13.2X51-D15.5]
JUNOS Routing Software Suite [13.2X51-D15.5]
JUNOS Enterprise Software Suite [13.2X51-D15.5]
JUNOS py-base-i386 [13.2X51-D15.5]
JUNOS Host Software [13.2X51-D15.5]


{master:0}
jnpr> request system software in-service-upgrade ftp://jnpr:pass123@192.168.1.1:/jinstall-qfx-5-13.2X51-D21.1-domestic-img.tgz  
warning: Do NOT use /user during ISSU. Changes to /user during ISSU may get lost!
ISSU: Validating Image
Fetching package...
ISSU: Preparing Backup RE
Prepare for ISSU
ISSU: Backup RE Prepare Done
Spawning the backup RE
Spawn backup RE, index 0 successful
GRES in progress
GRES done in 13 seconds
Waiting for backup RE switchover ready
GRES operational
Copying home directories
Copying home directories successful
Initiating Chassis In-Service-Upgrade
Chassis ISSU Started
ISSU: Preparing Daemons
ISSU: Daemons Ready for ISSU
ISSU: Starting Upgrade for FRUs
ISSU: FPC Warm Booting
ISSU: FPC Warm Booted
ISSU: Preparing for Switchover
ISSU: Ready for Switchover
Checking In-Service-Upgrade status
  Item           Status                  Reason
  FPC 0          Online (ISSU)       
Send ISSU done to chassisd on backup RE
Chassis ISSU Completed
ISSU: IDLE
Initiate em0 device handoff
Connection closed by foreign host.
[jnpr-laptop:~] jnpr%
[jnpr-laptop:~] jnpr% expect telly 10.161.33.53
spawn telnet -K 10.161.33.53
Trying 10.161.33.53...
Connected to 10.161.33.53.
Escape character is '^]'.

 (ttyp0)

login: jnpr
Password:

--- JUNOS 13.2X51-D21.1 built 2014-05-29 11:41:16 UTC
{master:0}
jnpr> show version
fpc0:
--------------------------------------------------------------------------
Model: qfx5100-24q-2p
JUNOS Base OS Software Suite [13.2X51-D21.1]
JUNOS Base OS boot [13.2X51-D21.1]
JUNOS Crypto Software Suite [13.2X51-D21.1]
JUNOS Online Documentation [13.2X51-D21.1]
JUNOS Kernel Software Suite [13.2X51-D21.1]
JUNOS Packet Forwarding Engine Support (qfx-ex-x86-32) [13.2X51-D21.1]
JUNOS Routing Software Suite [13.2X51-D21.1]
JUNOS Enterprise Software Suite [13.2X51-D21.1]
JUNOS py-base-i386 [13.2X51-D21.1]
JUNOS Host Software [13.2X51-D15.5]


IXIA Results

That's 52 frames lost with a service disruption of 2.6 ms (milliseconds).
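
For a sense of scale, the two numbers can be related with simple arithmetic. The sketch below assumes the disruption time was derived as frames lost divided by the transmit frame rate, which is a common way for traffic testers to report it; the post does not state the configured rate, so the rate printed here is only what those two figures imply. The function name is hypothetical.

```python
def implied_frame_rate(frames_lost, disruption_us):
    """Frames per second implied by a loss count and a disruption time in microseconds."""
    return frames_lost * 1_000_000 / disruption_us


# 52 frames lost over 2.6 ms (2600 microseconds).
print(implied_frame_rate(52, 2600))  # 20000.0
```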