Spanning Tree Protocol IEEE 802.1D

November 6, 2016 by Neel Rao

Filed under Network

Last modified November 6, 2016

A robust network design not only includes efficient transfer of packets or frames but also considers how to recover quickly from faults in the network. In a Layer 3 environment, the routing protocols in use keeps track of redundant paths to a destination network so that a secondary path can be quickly utilized if the primary path fails. Layer 3 routing allows many paths to a destination to remain up and active, and allows load sharing across multiple paths.

In a Layer 2 environment (switching or bridging), however, no routing protocols are used, and active redundant paths are not allowed. Instead, some form of bridging provides data transport between networks or switch ports. The Spanning Tree Protocol (STP) provides network link redundancy so that a Layer 2 switched network can recover from failures without intervention in a timely manner.

The STP is defined in the IEEE 802.1D standard.

STP is discussed in relation to the problems it solves in the sections that follow.


Bridging Loops

Recall that a Layer 2 switch mimics the function of a transparent bridge. A transparent bridge must offer segmentation between two networks, while remaining transparent to all the end devices connected to it. For the purpose of this discussion, consider a two-port Ethernet switch and its similarities to a two-port transparent bridge.

A transparent bridge (and the Ethernet switch) must operate as follows:

  • The bridge has no initial knowledge of any end device’s location; therefore, the bridge must“listen” to frames coming into each of its ports to figure out on which network a device resides.The source address in in incoming frame is the clue to a device’s whereabouts—the bridge assumes the source device is located behind the port that the frame arrived on. As the listening process continues, the bridge builds a table containing source MAC addresses and the Bridge Port numbers associated with them.

The bridge can constantly update its bridging table upon detecting the presence of a new MAC address or upon detecting a MAC address that has changed location from one Bridge Port to another. The bridge can then forward frames by looking at the destination address, looking up the address in the bridge table, and sending the frame out the port where the destination device is located.

  • If a frame arrives with the broadcast address as the destination address, the bridge must forward, or flood, the frame out all available ports. However, the frame is not forwarded out the port that initially received the frame. In this way, broadcasts can reach all available networks. A bridge only segments collision omains—it does not segment broadcast domains.
  • If a frame arrives with a destination address that is not found in the bridge table, the bridge is unable to determine which port to forward the frame to for transmission. This type of frame is known as an unknown unicast. In this case, the bridge treats the frame as if it were a broadcast and forwards it out all remaining ports. After a reply to that frame is overheard, the bridge learns the location of the unknown station and adds it to the bridge table for future use.
  • Frames forwarded across the bridge cannot be modified by the bridge itself. Therefore, the bridging process is effectively transparent.

Bridging or switching in this fashion works well. Any frame forwarded, whether to a known or unknown destination, will be forwarded out the appropriate port or ports so that it is likely to be received successfully at the end device. Figure 9-1 shows a simple two-port switch functioning as a bridge, forwarding frames between two end devices. However, this network design offers no additional links or paths for redundancy, should the switch or one of its links fail.




To add some redundancy, you can add a second switch between the two original network segments, as shown in Figure 9-2. Now, two switches offer the transparent bridging function in parallel.




Consider what happens when PC-1 sends a frame to PC-4. For now, assume that both PC-1 and PC-4 are known to the switches and are in their address tables. PC-1 sends the frame out onto network Segment A. Switch A and Switch B both receive the frame on their 1/1 ports. Because PC-4 is already known to the switches, the frame is forwarded out ports 2/1 on each switch onto Segment B.

The end result is that PC-4 receives two copies of the frame from PC-1. This is not ideal but is not disastrous either.

Now, consider the same process of sending a frame from PC-1 to PC-4. This time, however, neither switch knows anything about PC-1 or PC-4. PC-1 sends the frame to PC-4 by placing it on Segment A. The sequence of events is as follows:

  • Step 1>Both Switch A and Switch B receive the frame on their 1/1 ports. Because PC-1’s MAC address has not yet been seen or recorded, each switch records PC-1’s MAC address in its address table along with the  receiving port number, 1/1. From this information, both switches  infer that PC-1 must reside on Segment A.
  • Step 2> Because PC-4’s location is unknown, both switches forward the frame out all available ports (their 2/1 ports) and onto Segment B.
  • Step 3> Each switch places a new frame on its 2/1 port on Segment B. PC-4, located on Segment B, receives the two frames destined for it. However,        Switch A hears the new frame forwarded by Switch B, and Switch B hears the new frame forwarded by Switch A.
  • Step 4> Switch A sees that the “new” frame is from PC-1 to PC-4. From the address table, the switch had learned that PC-1 was on port 1/1, or Segment A. However, PC-1’s source address has just been heard on port 2/1 on Segment B. By definition, the switch must relearn PC-1’s location, which is now incorrectly assumed to be Segment B. (Switch B follows the same procedure, based on the “new” frame from Switch A.)
  • Step 5> At this point, neither Switch A nor Switch B has learned PC-4’s locationbecause no frames have been received with PC-4 as the source address. Therefore, the frame must be forwarded out all available ports in an attempt to find PC-4. This frame is then sent out Switch A’s 1/1 port and onto Segment A.
  • Step 6> Now, both switches relearn PC-1’s location as Segment A and forward the “new” frames back onto  segment B; then the entire process epeats.


Preventing Loops with Spanning Tree Protocol

 Bridging loops form because parallel switches (or bridges) are unaware of each other. STP was developed to overcome the possibility of bridging loops so that redundant switches and switch paths could be used for their benefits. Basically, the protocol enables switches to become aware of each other so they can negotiate a loop-free path through the network.

Loops are discovered before they are made available for use, and redundant links are shut down to prevent the loops from forming. In the case of redundant links, switches can be made aware that a link shut down for loop prevention should be quickly brought up in case of a link failure. This is discussed in later sections of this chapter.

STP is communicated between all connected switches on a network. Each switch executes the Spanning Tree Algorithm based on information received from other neighboring switches. The algorithm chooses a reference point in the network and calculates all the redundant paths to that reference point. When redundant paths are found, the Spanning Tree Algorithm picks one path by which to forward frames and disables, or blocks, forwarding on the other redundant paths.

As its name implies, STP computes a tree structure that spans all switches in a subnet or network. Redundant paths are placed in a Blocking or Standby state to prevent frame forwarding. The switched network is then in a loop-free condition. However, if a forwarding port fails or becomes disconnected, the Spanning Tree Algorithm recomputes the Spanning Tree topology so that blocked links can be reactivated.

Spanning Tree Communication: Bridge Protocol Data Units

 STP operates as switches communicate with one another. Data messages are exchanged in the form of Bridge Protocol Data Units (BPDUs). A switch sends a BPDU frame out a port, using the unique MAC address of the port itself as a source address. The switch is unaware of the other switches around it. Therefore, the BPDU frame has a destination address of the well-known STP multicast address 01-80-c2-00-00-00 to reach all listening switches.

Two types of BPDU exist:

  • Configuration BPDU, used for Spanning Tree computation
  • Topology Change Notification (TCN) BPDU, used to announce changes in the network topology.

The Configuration BPDU message contains the fields shown in Table 9-2.

Table 9-2 Configuration BPDU Message Content


Field Description                                                    Number of Bytes

Protocol ID (always 0)                                               2

Version (always 0)                                                     1

Message Type (Configuration or TCN BPDU)       1

Flags                                                                             1

Root Bridge ID                                                            8

Root Path Cost                                                           4

Sender Bridge ID                                                       8

Port ID                                                                       2

Message Age (in 256ths of a second)                    2

Maximum Age (in 256ths of a second)                 2

Hello Time (in 256ths of a second)                       2

Forward Delay (in 256ths of a second)                2

The exchange of BPDU messages works toward the goal of electing reference points as a foundation for a stable Spanning Tree topology. Loops can also be identified and removed by placing specific redundant ports in a Blocking or Standby state. Notice that several key fields in the BPDU are related to bridge (or switch) identification, path costs, and timer values. These all work together so that the network of switches can converge upon a common Spanning Tree topology and select the same reference points within the network. These reference points are defined in the sections that follow.

BPDUs are sent out all switch ports every two seconds so that current topology information is exchanged and loops are identified quickly.

Electing a Root Bridge

For all switches in a network to agree on a loop-free topology, a common frame of reference must exist to use as a guide. This reference point is called the Root Bridge.

An election process among all connected switches chooses the Root Bridge. Each switch has a unique Bridge ID that identifies it to other switches. The Bridge ID is an 8-byte value consisting of the following fields:

  • Bridge Priority (2 bytes)—The priority or weight of a switch in relation to all other switches. The priority field can have a value of 0 to 65,535 and defaults to 32,768 (or 0x8000) on every Catalyst switch.
  • MAC Address (6 bytes)—The MAC address used by a switch can come from the Supervisor module, the backplane, or a pool of 1024 addresses that are assigned to every Supervisor or backplane depending on the switch model. In any event, this address is hardcoded and unique, and the user cannot change it.

When a switch first powers up, it has a narrow view of its surroundings and assumes that it is the Root Bridge itself. This notion will probably change as other switches check in and enter the election process. The election process then proceeds as follows: Every switch begins by sending out BPDUs with a Root Bridge ID equal to its own Bridge ID and a Sender Bridge ID of its own Bridge ID. The Sender Bridge ID simply tells other switches who is the actual sender of the BPDU message. (After a Root Bridge is decided upon, configuration BPDUs are only sent by the Root

Bridge. All other bridges must forward or relay the BPDUs, adding their own Sender Bridge IDs to the message.)

Received BPDU messages are analyzed to see if a “better” Root Bridge is being announced. A Root Bridge is considered better if the Root Bridge ID value is lower than another. Again, think of the Root Bridge ID as being broken up into Bridge Priority and MAC address fields. If two Bridge Priority values are equal, the lower MAC address makes the Bridge ID better. When a switch hears of a better Root Bridge, it replaces its own Root Bridge ID with the Root Bridge ID announced in the BPDU. The switch is then required to recommend or advertise the new Root Bridge ID in its own BPDU messages; although, it will still identify itself as the Sender Bridge ID.

Sooner or later, the election converges and all switches agree on the notion that one of them is the Root Bridge. As might be expected, if a new switch with a lower Bridge Priority powers up, it begins advertising itself as the Root Bridge. Because the new switch does indeed have a lower Bridge ID, all the switches will soon reconsider and record it as the new Root Bridge. This can also happen if the new switch has a Bridge Priority equal to the existing Root Bridge but a lower MAC address. Root Bridge election is an ongoing process, triggered by Root Bridge ID changes in the BPDUs every two seconds.

As an example, consider the small network shown in Figure 9-3. For simplicity, assume that each Catalyst switch has a MAC address of all 0s with the last hex digit equal to the switch label.




In this network, each switch has the default Bridge Priority of 32,768. The switches are interconnected with FastEthernet links, having a default path cost of 19. All three switches try to elect themselves as the Root, but all of them have equal Bridge Priority values. The election is determined by the lowest MAC address—that of Catalyst A.

Electing Root Ports

 Now that a reference point has been nominated and elected for the entire switched network, each nonroot switch must figure out where it is in relation to the Root Bridge. This action can be performed by selecting only one Root Port on each nonroot switch.

STP uses the concept of cost to determine many things. Selecting a Root Port involves evaluating the Root Path Cost. This value is the cumulative cost of all the links leading to the Root Bridge. A particular switch link has a cost associated with it, too, called the Path Cost. To understand the difference between these values, remember that only the Root Path Cost is carried inside the BPDU. (See Table 9-2 again.) As the Root Path Cost travels along, other switches can modify its value to make it cumulative. The Path Cost, however, is not contained in the BPDU. It is known only to the local switch where the port (or “path” to a neighboring switch) resides.

Path Costs are defined as a 1-byte value, with the default values shown in Table 9-3. Generally, the higher the bandwidth of a link, the lower the cost of transporting data across it. The original IEEE 802.1D standard defined Path Cost as 1000 Mbps divided by the link bandwidth in Mbps. These values are shown in the center column of the table. Modern networks commonly use GigabitEthernet and OC-48 ATM, which are both either too close to or greater than the maximum scale of 1000 Mbps.

The IEEE now uses a nonlinear scale for Path Cost, as shown in the right column of the table. The Root Path Cost value is determined in the following manner:

Table 9-3 STP Path Cost

Link Bandwidth                   Old STP Cost New             STP Cost

4 Mbps                                    250                                          250

10 Mbps                                  100                                          100

16 Mbps                                  63                                            62

45 Mbps                                  22                                            39

100 Mbps                                10                                            19

155 Mbps                                6                                             14

622 Mbps                                2                                              6

1 Gbps                                    1                                               4

10 Gbps                                  0                                              2

The Root Path Cost value is determined in the following manner:

  • The Root Bridge sends out a BPDU with a Root Path Cost value of 0 because its ports sit directly on the Root Bridge.
  • When the next-closest neighbor receives the BPDU, it adds the Path Cost of its own port where the BPDU arrived. (This is done as the BPDU is received.).
  • The neighbor sends out BPDUs with this new cumulative value as the Root Path Cost.
  • This value is added to by subsequent switch port Path Costs as each switch receives the BPDU on down the line.

After incrementing the Root Path Cost, a switch also records the value in its memory. When a BPDU is received on another port and the new Root Path Cost is lower than the previously recorded value, this lower value becomes the new Root Path Cost. In addition, the lower cost tells the switch that the path to the Root Bridge must be better using this port than it was on other ports. The switch has now determined which of its ports has the best path to the Root—the Root Port.




The Root Bridge, Catalyst A, has already been elected. Therefore, every other switch in the network must choose one port that has the best path to the Root Bridge. Catalyst B selects its port 1/1, with a Root Path Cost of 0 plus 19. Port 1/2 is not chosen because its Root Path Cost is 0 (BPDU from Catalyst A) plus 19 (Path Cost of A-C link) plus 19 (Path Cost of C-B link), or a total of 38. Catalyst C makes a similar choice of port 1/1.

Electing Designated Ports

 By now, you should begin to see the process unfolding: a starting or reference point has been identified, and each switch “connects” itself toward the reference point with the single link that has the best path. A tree structure is beginning to emerge, but links have been identified only at this point. All links are still connected and could be active, leaving bridging loops.

To remove the possibility of bridging loops, STP makes a final computation to identify one Designated Port on each network segment. Suppose that two or more switches have ports connected to a single common network segment. If a frame appears on that segment, all the bridges attempt to forward it to its destination. Recall that this behavior was the basis of a bridging loop and should be avoided.

Instead, only one of the links on a segment should forward traffic to and from that segment. This location is the Designated Port. Switches choose a Designated Port based on the lowest cumulative Root Path Cost to the Root Bridge. For example, a switch always has an idea of its own Root Path Cost, which it announces in its own BPDUs. If a neighboring switch on a shared LAN segment sends a BPDU announcing a lower Root Path Cost, the neighbor must have the Designated Port. If a switch learns only of higher Root Path Costs from other BPDUs received on a port, however, it then correctly assumes that its own receiving port is the Designated Port for the segment.

Notice that the entire STP determination process has served only to identify bridges and ports. All ports are still active, and bridging loops might still lurk in the network. STP has a set of progressive states that each port must go through, regardless of the type or identification. These states actively prevent loops from forming and are described in the next section.

NOTE In each determination process discussed so far, two or more links having identical Root Path Costs is possible. This results in a tie condition, unless other factors are considered. All STP decisions are based on the following sequence of four conditions:

  1. Lowest Root Bridge ID
  2. Lowest Root Path Cost to Root Bridge
  3. Lowest Sender Bridge ID
  4. Lowest Sender Port ID

STP States

To participate in STP, each port of a switch must progress through several states. A port begins its life in a Disabled state, moving through several passive states and, finally, into an active state if allowed to forward traffic. The STP port states are as follows:

  • Disabled—Ports that are administratively shut down by the network administrator, or by the system due to a fault condition, are in the Disabled state. This state is special and is not part of the normal STP progression for a port.
  • Blocking—After a port initializes, it begins in the Blocking state so that no bridging loops can form. In the Blocking state, a port cannot receive or transmit data and cannot add MAC addresses to its address table. Instead, a port is allowed to receive only BPDUs so that the switch can hear from other neighboring switches. In addition, ports that are put into standby mode to remove a bridging loop enter the Blocking state.
  • Listening—The port will be moved from Blocking to Listening if the switch thinks that the port can be selected as a Root Port or Designated Port. In other words, the port is on its way to begin forwarding traffic. In the Listening state, the port still cannot send or receive data frames. However, the port is allowed to receive and send BPDUs so that it can actively participate in the Spanning Tree topology process. Here, the port is finally allowed to become a Root Port or Designated Port because the switch can advertise the port by sending BPDUs to other switches. Should the port lose its Root Port or Designated Port status, it returns to the Blocking state.
  • Learning—After a period of time called the Forward Delay in the Listening state, the port is allowed to move into the Learning state. The port still sends and receives BPDUs as before. In addition, the switch can now learn new MAC addresses to add to its address table. This gives the port an extra period of silent participation and allows the switch to assemble at least some address table information.
  • Forwarding—After another Forward Delay period of time in the Learning state, the port is allowed to move into the Forwarding state. The port can now send and receive data frames, collect MAC addresses in its address table, and send and receive BPDUs. The port is now a fully functioning switch port within the Spanning Tree topology.

To check who is root bridge :

Switch# show spanning-tree brief

STP Timers

STP operates as switches send BPDUs to each other in an effort to form a loop-free topology. The BPDUs take a finite amount of time to travel from switch to switch. In addition, news of a topology change (such as a link or Root Bridge failure) can suffer from propagation delays as the announcement travels from one side of a network to the other. Because of the possibility of these delays, keeping the Spanning Tree topology from settling out or converging until all switches have had time to receive accurate information is important.

STP uses three timers to make sure that a network converges properly before a bridging loop can form. The timers and their default values are as follows:

  • Hello Time—The time interval between Configuration BPDUs sent by the Root Bridge. The Hello Time value configured in the Root Bridge switch determines the Hello Time for all nonroot switches because they just relay the Configuration BPDUs as they are received from the root. However, all switches have a locally configured Hello Time that is used to time TCN BPDUs when they are retransmitted. The IEEE 802.1D standard specifies a default Hello Time value of 2 seconds.
  • Forward Delay—The time interval that a switch port spends in both the Listening and Learning states. The default value is 15 seconds.
  • Max (maximum) Age—The time interval that a switch stores a BPDU before discarding it. While executing the STP, each switch port keeps a copy of the “best” BPDU that it has heard. If the BPDU’s source loses contact with the switch port, the switch notices that a topology change occurred after the Max Age time elapses and the BPDU is aged out. The default Max Age value is 20 seconds.

The STP timers can be configured or adjusted from the switch command line. However, the timer values should never be changed from the defaults without careful consideration. Then, the values should be changed only on the Root Bridge switch. Recall that the timer values are advertised in fields within the BPDU. The Root Bridge ensures that the timer values propagate to all other switches.

Types of STP

So far, this chapter has discussed STP in terms of its operation to prevent loops and to recover from topology changes in a timely manner. STP was originally developed to operate in a bridged environment, basically supporting a single LAN (or one VLAN). Implementing STP into a switched environment has required additional consideration and modification to support multiple VLANs.

Because of this, the IEEE and Cisco have approached STP differently. This section reviews the three traditional types of STP that are encountered in switched networks and how they relate to one another. No specific configuration commands are associated with the various types of STP. Rather, you need a basic understanding of how they interoperate in a network.

NOTE The IEEE has produced additional standards for Spanning Tree enhancements that greatly improve on its scalability and convergence aspects. These are covered in Chapter 12, “Advanced Spanning Tree Protocol.” After you have a firm understanding of the more traditional forms of STP presented in this chapter, you can grasp the enhanced versions much easier.

Common Spanning Tree (CST)

 The IEEE 802.1Q standard specifies how VLANs are to be trunked between switches. It also specifies only a single instance of STP for all VLANs. This instance is referred to as the Common Spanning Tree (CST). All CST BPDUs are transmitted over the native VLAN as untagged frames.

Having a single STP for many VLANs simplifies switch configuration and reduces switch CPU load during STP calculations. However, having only one STP instance can cause limitations, too. Redundant links between switches will be blocked with no capability for load balancing. Conditions can also occur that would cause forwarding on a link that does not support all VLANs, while other links would be blocked.

Per-VLAN Spanning Tree (PVST)

 Cisco has a proprietary version of STP that offers more flexibility than the CST version. Per-VLAN Spanning Tree (PVST) operates a separate instance of STP for each individual VLAN. This allows the STP on each VLAN to be configured independently, offering better performance and tuning for specific conditions. Multiple Spanning Trees also make load balancing possible over redundant links when the links are assigned to different VLANs.

Due to its proprietary nature, PVST requires the use of Cisco Inter-Switch Link (ISL) trunking encapsulation between switches. In networks where PVST and CST coexist, interoperability problems occur. Each requires a different trunking method, so BPDUs will never be exchanged between STP types.

Per-VLAN Spanning Tree Plus (PVST+)

Cisco has a second proprietary version of STP that allows devices to interoperate with both PVST and CST. Per-VLAN Spanning Tree Plus (PVST+) effectively supports three groups of STP operating in the same campus network:

  • Catalyst switches running PVST
  • Catalyst switches running PVST+
  • Switches running CST over 802.1Q

To do this, PVST+ acts as a translator between groups of CST switches and groups of PVST switches. PVST+ can communicate directly with PVST by using ISL trunks. To communicate with CST, however, PVST+ exchanges BPDUs with CST as untagged frames over the native VLAN.

BPDUs from other instances of STP (other VLANs) are propagated across the CST portions of the network by tunneling. PVST+ sends these BPDUs by using a unique multicast address so that the CST switches forward them on to downstream neighbors without interpreting them first. Eventually, the tunneled BPDUs reach other PVST+ switches where they are understood.

By : N.R.Rao

SkyBird Technology Solutions Pvt Ltd.





Leave a Comment