This is a collection of my Cisco ACI notes during my studies.

Imperative Model vs Declarative Model

  • There are two operational models describing how the hardware acts on the intent of the network administrator: imperative or declarative.
  • The imperative model is what we, network engineers, did for years before the appearance of ACI: we tell network equipment how to implement a feature or a protocol by “programming” it. The result is immediately visible.
  • In the declarative model, by contrast, we tell the hardware where and when we need such and such features, but we do not tell it how to implement them.
  • ACI builds by default a zero-trust network, i.e. communication is not allowed unless specified, which is the opposite of a traditional network, which is trust-based.

VMware VDS vs Cisco AVS

VMware VDS

  • is purely an L2 virtual switch.
  • spans one or more virtualization hosts, unlike the Virtual Standard Switch VSS (not to be confused with the VSS feature on Catalyst switches)
    • VSS is not supported when integrating ACI with vCenter.
  • can either be managed by vCenter or by ACI when the vSphere environment is integrated with ACI through a VMM Domain. In the latter case, we call it an ACI-managed DVS.
  • does not support OpFlex.
  • supports CDP and LLDP.
  • has closed code that only VMware has access to. That is why APIC follows an imperative model with the DVS.

Cisco AVS

  • Application Virtual Switch, is a multihypervisor (compatible with many hypervisor vendors) virtual switch that comes with ACI free of charge.
  • is built on the successful 1000V switch.
  • is completely managed by ACI, unlike the 1000V that was managed by a Virtual Supervisor Module VSM.
  • supports L2/L3 functionality
  • supports OpFlex
  • is supported by VMware vSphere up to vSphere 6.5.
  • integrates a Distributed Firewall which can be in Disabled mode, Learning mode or Enabled mode.
  • supports both VLAN and VXLAN encapsulations
  • its successor is the Cisco ACI Virtual Edge (AVE)

The choice of using VDS or Cisco AVS is made in the menu during the configuration of the VMM domain integration.

Local Station Table, Global Station Table

  • LST (Local Station Table) contains entries in the format “Endpoint IP address — switch port”
  • GST (Global Station Table) contains entries in the format “Endpoint IP address — Leaf ID”
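
For reference, the endpoints that feed these tables can also be listed from the APIC REST API. The sketch below is a minimal example under a few assumptions: the APIC address and credentials are placeholders, and the query uses the fvCEp class, which holds the endpoints the fabric has learned.

```python
import requests

# Hypothetical APIC address and credentials -- replace with your own.
APIC = "https://apic.example.com"
USER, PWD = "admin", "password"

session = requests.Session()

# Log in to the APIC REST API (aaaLogin) to obtain an auth cookie.
login_payload = {"aaaUser": {"attributes": {"name": USER, "pwd": PWD}}}
resp = session.post(f"{APIC}/api/aaaLogin.json", json=login_payload, verify=False)  # lab only: cert check disabled
resp.raise_for_status()

# Query the fvCEp class: every endpoint the fabric has learned,
# regardless of which leaf's local or global station table holds it.
resp = session.get(f"{APIC}/api/class/fvCEp.json", verify=False)
for obj in resp.json()["imdata"]:
    attrs = obj["fvCEp"]["attributes"]
    print(attrs["mac"], attrs.get("ip", "-"), attrs["dn"])
```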

Integrating Workloads with ACI

  • Physical workload: is a subset of compute, storage and network resources dedicated to a single entity. In the IT industry we distinguish physical workloads and virtual workloads. A physical workload is then the subset of compute, storage and network used by a physical machine.
  • A Virtual Workload is the same subset being used by a virtual machine.
  • When integrating a physical workload (bare-metal servers) with ACI we should very likely configure policies on each physical NIC or virtual NIC on the server.
  • To integrate ACI with Microsoft platforms, we have two options:
    • integration with Microsoft SCVMM
    • integration with Azure Pack
      • provides a ready-to-use management portal and administrator portal
      • reflects the same experience as Microsoft Azure cloud.

ACI Migration Modes

When companies migrate from a traditional network to ACI, they can adopt either one of the following approaches:

  • network-centric mode:
    • the network administrator creates one subnet per VLAN per Bridge Domain and puts the servers that were in this VLAN into an EPG. This mode is also known as the VLAN=BD=EPG mode.
    • in this mode we take our existing VLANs and subnets from the old network and create them in ACI; i.e. we “reproduce” the network in ACI (see the payload sketch after this section).
    • can be a one-tenant or a multi-tenant setup.
  • Application-centric mode.
  • Hybrid mode: some servers remain grouped by VLAN, others according to other criteria such as application or business need, rather than by VLAN as in the traditional network. The hybrid mode is the combination of the network-centric mode and some features from the application-centric mode.

There is nothing wrong with either migration mode, i.e. you are not forced to migrate to the application-centric mode if you don’t have a need to. Always ask the question “Is my customer happy with my network design?”
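
To make the VLAN=BD=EPG idea concrete, here is a minimal sketch of the kind of object tree a network-centric migration could push to APIC for one legacy VLAN. The tenant, BD, EPG and subnet values are made up for illustration, and domain/AEP associations are left out.

```python
import json

# One Bridge Domain and one EPG per legacy VLAN (network-centric mode).
# The payload would be POSTed to https://<apic>/api/mo/uni.json after authenticating.
legacy_vlan = {"id": 10, "subnet": "10.1.10.1/24"}  # invented example values

payload = {
    "fvTenant": {
        "attributes": {"name": "Prod"},
        "children": [
            {
                "fvBD": {
                    "attributes": {"name": f"BD_VLAN{legacy_vlan['id']}"},
                    "children": [
                        # The default gateway of the old VLAN becomes the BD subnet.
                        {"fvSubnet": {"attributes": {"ip": legacy_vlan["subnet"]}}}
                    ],
                }
            },
            {
                "fvAp": {
                    "attributes": {"name": "Legacy_AP"},
                    "children": [
                        {
                            "fvAEPg": {
                                "attributes": {"name": f"EPG_VLAN{legacy_vlan['id']}"},
                                "children": [
                                    # Bind the EPG to its Bridge Domain.
                                    {"fvRsBd": {"attributes": {"tnFvBDName": f"BD_VLAN{legacy_vlan['id']}"}}}
                                ],
                            }
                        }
                    ],
                }
            },
        ],
    }
}

print(json.dumps(payload, indent=2))
```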

ACI Policies

Blacklist Model vs Whitelist Model

In an organisation the corporate security guidelines and policies follow one of the following security models: blacklist or whitelist. In a blacklist model everything is open unless specifically denied. In the whitelist model every communication is denied unless specifically authorized. A quick analogy to understand the whitelist model is Cisco IOS Access Lists.
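
To illustrate the whitelist model in ACI terms, here is a minimal sketch of a contract that permits only HTTPS. All names are hypothetical; the point is simply that nothing flows between EPGs until an object like this exists and is attached to a provider and a consumer EPG.

```python
import json

# Filter + contract permitting only TCP/443, expressed as an APIC REST payload.
# It would be POSTed to https://<apic>/api/mo/uni.json after authenticating.
payload = {
    "fvTenant": {
        "attributes": {"name": "Prod"},
        "children": [
            {
                # Filter: permit only TCP destination port 443.
                "vzFilter": {
                    "attributes": {"name": "https-only"},
                    "children": [
                        {"vzEntry": {"attributes": {
                            "name": "https", "etherT": "ip", "prot": "tcp",
                            "dFromPort": "443", "dToPort": "443"}}}
                    ],
                }
            },
            {
                # Contract referencing the filter; EPGs still have to provide/consume it.
                "vzBrCP": {
                    "attributes": {"name": "web-to-app"},
                    "children": [
                        {"vzSubj": {
                            "attributes": {"name": "https"},
                            "children": [
                                {"vzRsSubjFiltAtt": {"attributes": {"tnVzFilterName": "https-only"}}}
                            ]}}
                    ],
                }
            },
        ],
    }
}

print(json.dumps(payload, indent=2))
```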

VLAN Pool

  • assigned statically or dynamically.
  • In ACI, the VLAN allocation to EPGs is either static or dynamic.
    • a statically allocated VLAN to the EPG is called static binding. This case is seen with Legacy Bridge Domains.
      • Attention however: We need to distinguish two concepts related to VLANs: internal namespace and external namespace.
        • Internal namespace refers to the internal VLAN ID used by the leaf to switch the endpoint frames within the fabric.
        • External namespace refers to the static VLAN encapsulation that the ACI administrator assigns to an EPG.
    • We can also assign a range of VLAN IDs to an EPG. This VLAN range is called a VLAN Pool.
  • it is recommended to design VLAN pools based on functional role, e.g. Firewall_VLP (see the payload sketch after this list).
  • attaches to one or more domains. Beware that when using the same VLAN Pool for more than one domain, the VLAN significance is local to each domain. The recommendation however is not to reuse the same VLAN Pool.
  • When a physical non-virtualization server is connecting to the ACI fabric, configure static VLAN mapping.
  • When a virtualization server is connecting to the ACI fabric, use dynamic VLAN allocation.
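
A minimal sketch of what such a role-named pool could look like as an APIC REST payload, assuming a dynamic pool destined for a VMM domain; the name and VLAN range are made up.

```python
import json

# VLAN pool named after its functional role, with dynamic allocation.
# The payload would be POSTed to https://<apic>/api/mo/uni/infra.json;
# a static pool would simply use allocMode "static".
payload = {
    "fvnsVlanInstP": {
        "attributes": {"name": "Firewall_VLP", "allocMode": "dynamic"},
        "children": [
            # The encapsulation block defines the VLAN ID range of the pool.
            {"fvnsEncapBlk": {"attributes": {"from": "vlan-1100", "to": "vlan-1199"}}}
        ],
    }
}

print(json.dumps(payload, indent=2))
```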

VXLAN

  • VXLAN in the Cisco ACI fabric is different from the standard VXLAN protocol.
  • VXLAN offers 16 million segments. Each segment is distinguished by a VNI (VXLAN Network Identifier).
  • Just as VLANs segment a network, VXLAN segments the network too. We call them simply VXLAN segments.
  • Each VXLAN segment is uniquely identified by a VNI or VNID (VXLAN Network Identifier). A VXLAN segment is an L2 broadcast segment.
  • At the ingress leaf, the endpoint usually generates traffic that is encapsulated in VLANs (some endpoints support VXLAN and NVGRE too!). The ACI fabric leaf (see the sketch after this list):
    • encapsulates the ingress frame in a Cisco VXLAN packet that contains both UDP and VXLAN headers. The VXLAN header includes the right VNI (there is a VNI-to-VLAN mapping on each fabric leaf).
    • encapsulates the VXLAN packet into an IP packet
    • routes the packet to the destination VTEP using the IS-IS underlay network.
  • At the egress leaf, the packet is decapsulated from the Cisco VXLAN header and encapsulated into a VLAN frame (or whatever the encapsulation at the endpoint is).
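
For intuition, the sketch below builds a plain VXLAN-in-UDP packet with scapy. Note that this is standard VXLAN framing (UDP port 4789, 24-bit VNI), not ACI's extended iVXLAN header that also carries policy information, and all addresses and the VNI are invented.

```python
from scapy.layers.l2 import Ether
from scapy.layers.inet import IP, UDP
from scapy.layers.vxlan import VXLAN

# Original frame as the endpoint sent it (VLAN tag already stripped by the leaf).
inner = (Ether(src="00:50:56:aa:bb:01", dst="00:50:56:aa:bb:02")
         / IP(src="10.1.10.11", dst="10.1.10.12"))

# Outer headers added by the ingress leaf: VTEP-to-VTEP IP, UDP/4789, and the
# VXLAN header carrying the VNI that the leaf mapped from the ingress VLAN.
outer = (Ether()
         / IP(src="10.0.96.64", dst="10.0.96.65")
         / UDP(dport=4789)
         / VXLAN(vni=867300)
         / inner)

outer.show()
```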

  • There is a “deny any” entry at the end of the ACL. Cisco ACI employs the whitelist model by default.
  • Unknown multicast traffic is multicast traffic crossing the ACI fabric without an IGMP Join message.

Initial APIC Setup

ACI Domains

VMware NIC cards

  • vNIC: the virtual NIC on a VM
  • vmnic: aka pNIC: the physical NIC on the virtualization host
  • vmknic: the VMkernel NIC, a virtual interface on the hypervisor itself, used to transport infrastructure traffic from/to the hypervisor itself.

Bridge Domains

Application Profiles

EPG (End Point Groups)

Contracts

Attachable Access Entity Profile (AEP)

Tenants

VRF

ACI L4-7 Service insertion

  • NTP must be configured and synchronized on APIC and all fabric nodes. Here is a quick tutorial on setting up NTP on ACI.
  • New nodes being added to the fabric are automatically discovered by APIC through LLDP. As soon as they pop up in the APIC GUI you can add or block them from joining the fabric, based on their Serial Numbers.
  • New fabric nodes send DHCP requests and receive replies from APIC.
  • APIC sends TEP addresses to the new leafs
  • VTEP aka TEP
    • Tunnel EndPoint addresses
    • the address pool is laid down during initial APIC setup and is recommended to be a /16 or a /17 subnet. By default it is the 10.0.0.0/16 subnet. ACI version 2 allows a VTEP address pool as small as a /22 subnet.
    • Switches in a Pod share the same VTEP prefix
  • Giving lower numerical IDs to the spines is recommended. The subsequent higher IDs should be reserved for the leafs.
  • All fabric nodes and APICs should be connected to an OOB network for management purposes.
  • Access to leaf switches through a console cable is possible but offers only read capabilities.
  • OS image management occurs on the APIC, which supports TFTP
  • in ACI there is no need to:
    • configure loopback addresses on new switches
    • configure an IGP protocol and neighborships
    • configure custom routing timers
    • configure the list of allowed VLANs on trunks.
  • Management of the fabric can also be performed using an external management station connected to the fabric on tenant “mgmt”. In this scenario you must:
    • configure a VLAN Pool, an AEP, a physical domain
    • assign the VLAN Pool to the domain
    • associate the domain with the AEP
  • Traffic classification
    • ACI fabric performs traffic classification when an end host or a NIC is attached to it. The purpose is to be able to correctly assign the endpoint to a preconfigured EPG.
    • is based on one of the following criteria, depending on whether we attach a physical workload or a virtual workload, or whether we use an ACI-managed DVS or AVS:
      • source MAC address
      • source IP address
      • port and VLAN encapsulation
      • port and VXLAN encapsulation
      • etc.
  • Provisioning a switch port in traditional networks is completely different from the ACI world:
    • in a traditional switch you configure interfaces separately
    • in ACI, you configure many constructs and objects at first, such as domain, AEP, VLAN Pool, Switch Profile, Interface Profile… which may seem a burden at first. But ACI’s power lies in its flexibility and extensibility. For example, if you want to add an interface with a similar configuration to a previous one, simply add it to the Interface Profile.
  • an Application in the ACI model is not a virtual/physical machine, but the combination of:
    • workloads, either physical or virtual
    • L2 – L7 policies: VLANs, subnets, L4 ports, ACL, QoS policies, filtering policies, load balancing policies,…
  • In terms of the number of supported Spines, the ACI fabric supports a minimum of 2 and a maximum of 6, in even numbers (2, 4, 6).
  • ACI fabric operates on a whitelist model: no communication is allowed unless specified.
  • Frames in ACI are routed, but the L2 switching semantics are preserved.
  • ACI fabric design on multiple sites:
    • stretched fabric: the ACI fabric is stretched over both sites. There is always one APIC cluster: one APIC is installed on one site and two APICs on the other site. Some leafs from site A physically connect to some spines of site B.
    • Multi-pod
    • dual fabric design: each site has its own APIC cluster and its own ACI fabric. Both ACI fabrics are connected over L2 or L3 networks, which are carried by some leafs at each site.
    • multi-site design: this is an evolution of the dual fabric design. Both ACI fabrics are connected over the WAN. The WAN is connected at the spines of each site.
  • MP-BGP
    • internal MP-BGP and iBGP use the same ASN
    • runs on selected spines chosen by the APIC administrator
    • runs between Leafs and Spines to propagate routing information regarding external routes.
    • carries endpoint reachability information between pods.
    • one or more Spines can play the role of BGP Route Reflector. This is chosen by the APIC administrator.
  • Infrastructure VLAN
    • is used within the fabric
    • must be unique on the whole network, including end host VLANs
    • must be extended (manually configured) to Blade Systems
    • recommended but not mandatory: use VLAN ID 3967.
  • ACI Basic vs Advanced GUI
    • Basic GUI
      • use cases:
        • for small ACI deployments
        • for network administrators who do not need full ACI features such as L4-7 integration.
      • allows configuration of tenants, leaf ports and access profiles
      • allows configuring one port at a time
      • is not supported anymore above ACI v3.0
    • Advanced GUI
      • allows configuring multiple ports through the access selector and Interface Profiles
      • recommended to be used.
  • Extending the ACI fabric:
    • new leafs should have a leaf ID numbered higher than 100
    • new spines should have a Spine ID numbered higher than 200.
  • Fabric Extenders can be attached to the ACI fabric. Not all FEXs are supported by ACI. A FEX can attach to only one leaf.
  • APIC firmware upgrade:
    • you define an APIC firmware policy first, which defines when to perform the upgrade
    • you can launch the firmware upgrade immediately or plan it using a Scheduler
  • Switch firmware upgrade
    • you define maintenance groups and maintenance policies
    • Clients with critical ACI environments use the four-group method, i.e. they define 4 maintenance groups and start by upgrading the Leaf groups first:
      • red Leaf group
      • blue Leaf group
      • red Spine group
      • blue Spine group
    • The maintenance policy defines when and how to perform the upgrade:
      • the “when”: immediately or using a Scheduler
      • the “how”: the upgrade process, once launched, must obey the following rules:
        • not exceeding the concurrCap value, which is the maximum number of switches being upgraded simultaneously
        • only one switch per VPC
  • Configuration management in ACI is performed with either snapshots or backups.
    • Snapshots
      • are fast (a couple of clicks) to store a config or to restore it
      • store settings of the fabric or of the tenants
      • do not store the complete configuration
    • Backups
      • used to store and restore the ACI configuration
      • require an external server
      • in a backup operation, a backup file is generated. In a restore, an import file is needed
      • require an export policy in the save operation, and an import policy in the restore operation.
      • two types of backup/restore operation:
        • best effort:
          • when a difference in shards is encountered, the portion of data is ignored.
          • when the imported config belongs to an ACI version different from the currently running one, the difference is ignored.
        • atomic:
          • when a difference in shards is encountered, the portion of data is ignored.
          • when the imported config belongs to an ACI version different from the currently running one, the backup/restore operation is aborted.
  • Endpoint learning
    • there are three so-called Station tables
      • local station table:
        • each leaf has a local station table
        • contains all endpoints connected to the local leaf
      • global station table
        • each leaf has a global station table
        • contains cached information about some remote endpoints. Leafs are not supposed to possess forwarding information about all endpoints in the fabric.
      • proxy station table
        • resides on the spines
        • all the spines have the same proxy station table
        • contains forwarding information about all endpoints attached to the leafs (aka the endpoint reachability information).
          • The endpoint reachability information includes:
            • L2 information: VLAN, endpoint MAC address
            • L3 information: endpoint IP address
            • location information: Leaf ID, access port ID.
  • Software overlay network: the logically built overlay network between virtual switches located on hypervisors
    • In a dual-hypervisor setup, each hypervisor runs its own software network overlay, and the two overlays do not communicate with each other.
    • the software overlay network does not communicate with the physical network either, unless a software gateway is installed.
  • Docker containers (the equivalent of VMs) in the Linux Docker technology do not have their own TCP/IP stack but rather a namespace in the TCP/IP stack of the host machine.
  • OpFlex:
    • finds out which physical ports of the virtualization host are connected to leaf ports, if the virtualization host is directly plugged into the ACI fabric. Otherwise, LLDP (enabled by default) or CDP (not enabled by default) is used. Remember that a virtual switch connects to physical NIC ports of the virtualization host, and the physical NIC ports connect to the ACI fabric.
    • runs on the ACI Infrastructure VLAN
  • Microsegmentation
    • leads to the distinction between the original EPG (aka Base EPG) and the microsegmented EPG (aka uSeg EPG)
    • The purpose of microsegmenting an EPG is to automate the assignment of selected Virtual Machines to a particular EPG using rules, instead of the VMware administrator having to manually assign them.
    • Each rule is in the format “match-any | match-all {u-attribute}”, where u-attributes are the microsegmentation attributes
    • Only two u-attributes are supported by a uSeg EPG when attached to bare-metal servers:
      • IP Address
      • MAC Address
    • the list of available u-attributes of a uSeg EPG attached to a VMM domain is richer:
      • IP Address
      • MAC Address
      • VM Name
      • VM OS
      • VM tag
    • a rule can be a pure “match-any” filter, a pure “match-all” filter, or a combination of both.
    • if there are many clauses in the rule, then beware of the precedence among the u-attributes, e.g. the u-attribute “VM Name” has a higher precedence than “VM tag”. So if the u-attribute “VM Name” matches first, further clauses of the rule won’t be inspected by APIC.
    • available for both physical and VMM Domains (see the uSeg EPG payload sketch after this list)
  • A Blade system (or Blade Chassis) is composed of Blade Servers and Blade Switches
    • Blade Switches are physical
    • Blade Servers are physical and contain Virtual Switches
  • ACI plugin for vCenter
    • allows virtualization administrators to interact with APIC in an easy way, without requiring prior networking knowledge:
    • the virtualization administrator can add/delete/modify ACI constructs (tenants, VRF, Bridge Domains, App Profile, EPG, uSeg EPG), add/modify Port Groups, add/modify VM to Port Group associations, etc.
  • Integrating OpenStack with ACI
    • OpenStack is a private cloud environment
    • has three components or node types: compute nodes (aka Nova), storage nodes (aka Swift) and network nodes (aka Quantum, later renamed Neutron).
    • A compute node hosts one or more instances (the equivalent of virtual machines). Each instance is referenced by its Instance ID.
    • ML2 (Modular Layer 2) is a framework developed to interact with the networking node. Cisco has developed its own variation of it in order to integrate ACI with OpenStack.
    • to add an OpenStack environment to ACI as a VMM Domain, you do this not on ACI but on the OpenStack network node itself using the Cisco ML2 plugin. Then when you create an OpenStack project, an ACI tenant is created too, along with a bridge domain and a VRF.
    • ML2 allows only one EPG per Bridge Domain.
    • OpenStack has the concept of L3out like ACI. The L3out can be dedicated per OpenStack project or shared. In the latter case, configure the shared L3out in the ACI tenant common.
    • OpenStack supports Source NAT (SNAT). When an instance connects out of the node, it takes an IP address from the global IP address range reserved for it, whose subnet is defined in the bridge domain.
      • The opposite way: ingress packets from the outside do not communicate directly with the real IP address of the instance, but rather with a floating IP address.
    • GBP (Group-Based Policy) is a network policy framework simpler than ML2 when interacting with Neutron. It does not intend to replace ACI but its constructs are very similar to those of ACI.
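
Referenced from the microsegmentation notes above: a minimal sketch of a uSeg EPG that pulls any VM whose name contains “web” out of its base EPG. The EPG/BD names are made up, and the fvCrtrn/fvVmAttr attribute values reflect my understanding of the microsegmentation objects, so verify them against your APIC version.

```python
import json

# uSeg EPG with a VM-name criterion. The payload would be POSTed under the
# application profile, e.g. https://<apic>/api/mo/uni/tn-Prod/ap-Legacy_AP.json.
payload = {
    "fvAEPg": {
        "attributes": {"name": "uSeg_Web", "isAttrBasedEPg": "yes"},
        "children": [
            # The uSeg EPG still needs its Bridge Domain binding.
            {"fvRsBd": {"attributes": {"tnFvBDName": "BD_VLAN10"}}},
            {
                # Criterion block: match-any of the attribute clauses below.
                "fvCrtrn": {
                    "attributes": {"name": "web-vms", "match": "any"},
                    "children": [
                        # u-attribute clause: VM Name contains "web" (assumed attribute values).
                        {"fvVmAttr": {"attributes": {
                            "name": "by-vm-name", "type": "vm-name",
                            "operator": "contains", "value": "web"}}}
                    ],
                }
            },
        ],
    }
}

print(json.dumps(payload, indent=2))
```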

ACI Multi-Pod Design

  • A multi-pod design is considered an evolution of the stretched fabric design.
  • Pods can be in the same physical location (intra-DC) or in separate locations (inter-DC), separated by a point-to-point network like dark fiber or DWDM, or by a traditional L3 infrastructure network like an MPLS network.
    • Whether it is MPLS or point-to-point, the transport network must have a maximum of 50 ms RTT. This value also depends on the ACI firmware release.
  • Each Pod owns a separate control plane. However, the spines on both Pods exchange COOP entries using MultiProtocol BGP over Ethernet VPN (MP-BGP EVPN).
  • It involves an Inter-Pod Network (IPN) consisting of IPN devices.
  • IPN devices:
    • can be routers or modular switches, which support MP-BGP.
    • must support Multicast PIM BiDir mode in order to correctly forward BUM traffic between Pods.
    • at least one IPN device connects to some of the spines in each Pod. Ideally two IPN devices connect to all spines in each Pod.
    • establish OSPF peering with the spines of each pod.
    • have in their routing tables the TEP pool prefixes of the Pods.
    • Each IPN device installs a multicast source-group pair (*, G), with G = the GIPo value of each Bridge Domain of the attached Pod.
    • It is recommended to ensure the existence of a physical path at all times between an IPN device A and an IPN device B, whether they are connected to the same Pod or not.
    • Between IPN devices, use 10/40/100 Gbps connectivity
  • The spines that peer with the IPN devices perform mutual redistribution:
    • they redistribute IS-IS prefixes (the local TEP pool prefix) into OSPF, and
    • redistribute the OSPF prefixes they learned from the IPN devices (these OSPF prefixes are the remote TEP pool prefixes) into IS-IS, in order to let the local leafs learn them and know how to reach remote TEP addresses.
  • when IP communication fails between the Pods:
    • the Pod with the APIC majority still operates in read/write mode
    • the Pod with the APIC minority operates in read-only mode. When communication is restored, it synchronizes its database.

ACI Integration With Puppet

  • Puppet is a data center orchestration framework.
  • Puppet configuration includes preparing modules that will be downloaded onto Puppet-compatible hardware platforms
  • Puppet components:
    • Puppet Master: the server that hosts the modules
    • Puppet Agents: installed on the Nexus switches
  • The Cisco Nexus 9000 supports Puppet natively in its API, i.e. we can install Puppet modules on the Nexus switch.
  • A Puppet module contains the configuration of a certain feature, for example SNMP, VRF, interface speed,… As soon as the module is downloaded on the switch, the changes in the config are visible.

I’ve distilled all these notes from my ACI study material.

