Quality of Service (QoS)
Quality of service (QoS) configurations give special treatment to certain traffic at the expense of other traffic. In a network with several hops, the available end-to-end bandwidth is limited to that of the slowest link in the path. When multiple applications share the same links, the bandwidth available to each application is smaller still: roughly the slowest link's bandwidth divided by the number of flows.
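As a rough illustration of the bandwidth arithmetic above, the following sketch computes the per-flow share of a path. The function name and the link speeds are hypothetical, and the equal split between flows is a simplifying assumption.

```python
def per_flow_bandwidth(link_speeds_mbps, flows):
    """Approximate per-flow share of the slowest link on a path, in Mbps."""
    if flows < 1:
        raise ValueError("flows must be at least 1")
    bottleneck = min(link_speeds_mbps)  # the slowest link limits the whole path
    return bottleneck / flows           # assumes flows share the link equally


# Hypothetical path: 1 Gbps LAN, 10 Mbps WAN, 100 Mbps LAN, shared by 5 flows.
print(per_flow_bandwidth([1000, 10, 100], 5))  # 2.0
```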
Using QoS in the network can mitigate several issues including lack of bandwidth for important applications, delay of sensitive data or packet loss due to data being dropped at a congested interface. Network traffic experiences four types of delay:
- Processing Delay: The time it takes a packet to move from the input interface of a router to the output interface.
- Queuing Delay: The length of time a packet waits in the interface queue before being sent to the transmit ring.
- Serialization Delay: The length of time it takes to place the bits from the interface transmit ring onto the wire.
- Propagation Delay: The length of time it takes the packet to move from one end of the link to the other.
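Two of these delays can be estimated with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the signal speed of about 200,000 km/s (roughly two-thirds of the speed of light, a common approximation for copper and fiber) is an assumption.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time to clock every bit of the packet onto the wire, in milliseconds."""
    return packet_bytes * 8 * 1000 / link_bps


def propagation_delay_ms(distance_km, speed_km_per_s=200_000):
    """Time for a signal to travel the length of the link, in milliseconds."""
    return distance_km * 1000 / speed_km_per_s


# A 1500-byte packet on a 64 kbps link, and a 1000 km link.
print(serialization_delay_ms(1500, 64_000))  # 187.5
print(propagation_delay_ms(1000))            # 5.0
```

Serialization delay shrinks as the link gets faster, while propagation delay depends only on distance.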
When traffic is forwarded out of an interface, it initially passes through one software queue and one hardware queue. By default, when the software queue is full (congested), the switch or router drops any further traffic arriving for that queue. This is called ‘tail drop.’ Packet loss can cause jerky voice or video, slow application performance, and even corrupted data.
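Tail drop can be sketched as a bounded FIFO queue that refuses new packets once it is full. The class name and the three-packet limit below are hypothetical and chosen only for illustration.

```python
from collections import deque


class TailDropQueue:
    """A FIFO queue that drops newly arriving packets when it is full."""

    def __init__(self, limit):
        self.limit = limit
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet):
        if len(self.packets) >= self.limit:
            self.dropped += 1  # tail drop: the arriving packet is discarded
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        return self.packets.popleft() if self.packets else None


q = TailDropQueue(limit=3)
results = [q.enqueue(f"pkt{i}") for i in range(5)]
print(results)    # [True, True, True, False, False]
print(q.dropped)  # 2
```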
Hardware Queue
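The hardware queue, also called the transmit queue (TxQ) or transmit ring, always sends packets in the order in which they arrive. The number of packets the TxQ can hold depends on the speed of the interface.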
Software Queue
This is where you can influence the order in which packets are scheduled into the TxQ and how many packets are sent. When queuing mechanisms are used, several logical queues are created, and arriving traffic is placed into the appropriate software queue. Each software queue has a size limit, and packets that arrive once that limit is reached are dropped.
Techniques to Implement QoS in an Enterprise Network
QoS can be implemented in an enterprise through classification, marking, and queuing techniques. To classify packets, use an ACL or NBAR (a deep packet inspection tool). Packets must be classified before any queuing technique can be applied on an interface.
Classification groups network traffic into classes composed of traffic that needs the same type of QoS treatment. For instance, voice traffic is separated from web or email traffic, whereas email traffic may be placed in the same class as web traffic. IP packets were traditionally marked using the three IP precedence bits, but marking the six Differentiated Services Code Point (DSCP) bits in the IP header is now considered the standard method.
Class of Service (CoS) uses the three 802.1p bits in the 802.1Q trunking tag to mark traffic. These three bits have eight possible values, ranging from zero to seven. IP Precedence uses three bits in the ToS byte of the IP header and has the same range of values as CoS.
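The bit positions described above can be shown with a little bit shifting. This is a minimal sketch with hypothetical function names: IP precedence is the top three bits of the ToS byte, DSCP is the top six bits, and CoS is the top three bits of the 16-bit 802.1Q tag control field.

```python
def decode_tos(tos_byte):
    """Extract the IP precedence and DSCP markings from the ToS byte."""
    if not 0 <= tos_byte <= 0xFF:
        raise ValueError("ToS is a single byte")
    return {
        "ip_precedence": tos_byte >> 5,  # top three bits, values 0-7
        "dscp": tos_byte >> 2,           # top six bits, values 0-63
    }


def decode_cos(tag_control_info):
    """Extract the three 802.1p priority bits from the 16-bit 802.1Q tag field."""
    return (tag_control_info >> 13) & 0b111


# 0xB8 is the ToS byte for DSCP 46 (Expedited Forwarding), often used for voice.
print(decode_tos(0xB8))             # {'ip_precedence': 5, 'dscp': 46}
# A tag carrying CoS 5 on VLAN 100.
print(decode_cos((5 << 13) | 100))  # 5
```

Because the DSCP field contains the three precedence bits, a device that reads only IP precedence still sees a sensible value (5 in this example).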
Queuing Mechanisms
FIFO – By default, most interfaces use FIFO queuing. There is just one software queue and traffic is buffered and then scheduled onto the interface in the order it is received.
Priority Queuing – With priority queuing, traffic is assigned different priority values and placed in one of four queues. The high-priority queue is a strict priority queue, which means that it is serviced before anything else until it is empty. After that, each lower-priority queue is serviced in turn, as long as the queues above it remain empty. The lower-priority queues may never be serviced if there is sufficient traffic in the higher-priority queues.
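A minimal sketch of this scheduling rule follows. It assumes the four queues are named high, medium, normal, and low, and it always services the highest-priority queue that has traffic. The class name and the packet labels are hypothetical.

```python
from collections import deque

PRIORITIES = ["high", "medium", "normal", "low"]  # serviced in this order


class PriorityQueuing:
    """Always send from the highest-priority queue that is not empty."""

    def __init__(self):
        self.queues = {name: deque() for name in PRIORITIES}

    def enqueue(self, packet, priority):
        self.queues[priority].append(packet)

    def dequeue(self):
        for name in PRIORITIES:
            if self.queues[name]:
                return self.queues[name].popleft()
        return None  # every queue is empty


pq = PriorityQueuing()
pq.enqueue("web-1", "normal")
pq.enqueue("voice-1", "high")
pq.enqueue("bulk-1", "low")
pq.enqueue("voice-2", "high")
order = [pq.dequeue() for _ in range(4)]
print(order)  # ['voice-1', 'voice-2', 'web-1', 'bulk-1']
```

If high-priority packets kept arriving, `bulk-1` would never be sent, which is the starvation problem described above.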
Class-Based Weighted Fair Queuing (CBWFQ) – Addresses some of the problems with FIFO. It allows manual configuration of traffic classification and minimum bandwidth guarantees. You can group traffic into classes based on any of the available classification criteria, and the mechanism is widely used in real QoS configurations.
Each class becomes its own FIFO queue at the interface. Each queue is assigned a weight based on the class bandwidth guarantee, and the scheduler takes packets from each queue based on those weights.
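The effect of per-class weights can be approximated with a simple weighted round-robin. This is a deliberate simplification and not the scheduler a real router uses; the function name, class names, and weights are hypothetical.

```python
from collections import deque


def weighted_round_robin(queues, weights, rounds):
    """In each round, take up to `weight` packets from each class queue."""
    sent = []
    for _ in range(rounds):
        for name, weight in weights.items():
            for _slot in range(weight):
                if queues[name]:
                    sent.append(queues[name].popleft())
    return sent


queues = {
    "signaling": deque(f"sig{i}" for i in range(10)),
    "web": deque(f"web{i}" for i in range(10)),
}
weights = {"signaling": 1, "web": 3}  # web is guaranteed three times the share

sent = weighted_round_robin(queues, weights, rounds=2)
print(sent)
# ['sig0', 'web0', 'web1', 'web2', 'sig1', 'web3', 'web4', 'web5']
```

Each class keeps its own FIFO order, and no class is starved as long as its weight is at least one.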
Low-Latency Queuing (LLQ) – LLQ adds a strict priority queue to CBWFQ: traffic is sent from the priority queue before any other queue.
However, when the interface is congested, the priority queue is not allowed to exceed its configured bandwidth, so that it does not starve the other queues.
Voice traffic is typically queued into the LLQ. You can place more than one class of traffic in the LLQ.
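The following sketch shows the idea of a priority queue that is capped only when other traffic is waiting. It is a simplified model with hypothetical names: the cap is counted in packets per interval rather than in bandwidth, and priority traffic over the cap simply waits instead of being dropped.

```python
from collections import deque


def llq_schedule(priority_q, data_q, slots, priority_limit):
    """Fill up to `slots` transmit slots for one scheduling interval."""
    sent = []
    priority_sent = 0
    while len(sent) < slots:
        if priority_q and priority_sent < priority_limit:
            sent.append(priority_q.popleft())  # priority traffic goes first
            priority_sent += 1
        elif data_q:
            sent.append(data_q.popleft())      # cap reached: serve other traffic
        elif priority_q:
            sent.append(priority_q.popleft())  # no competing traffic, so no cap
        else:
            break
    return sent


voice = deque(f"v{i}" for i in range(5))
data = deque(f"d{i}" for i in range(5))
sent = llq_schedule(voice, data, slots=6, priority_limit=3)
print(sent)  # ['v0', 'v1', 'v2', 'd0', 'd1', 'd2']
```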
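With tail drop, when the interface queue fills, packets from all sessions are dropped at once, and all TCP sessions reduce their sending rate at the same time. As the queue drains, the sessions increase their sending rate together until the queue fills and packets are dropped again. Eventually, this results in waves of increased and decreased transmission, causing under-utilization of the interface.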
Congestion Avoidance
RED and WRED – Random Early Detection (RED) and Weighted Random Early Detection (WRED) attempt to avoid tail drops by preventing the interface from becoming totally congested. Once the queue fills above a threshold level, RED begins dropping random packets from the interface queue, dropping a higher percentage of packets as the queue continues to fill. Basic RED does not distinguish between flows or types of traffic. Weighted RED, on the other hand, drops traffic differently depending on its IP precedence or DSCP value.
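The drop behavior can be sketched as a probability that rises linearly between a minimum and a maximum threshold. The function names, thresholds, and maximum probability below are hypothetical values chosen for illustration, not defaults of any particular platform.

```python
def red_drop_probability(queue_depth, min_threshold, max_threshold, max_probability):
    """Drop probability for RED at a given queue depth."""
    if queue_depth < min_threshold:
        return 0.0  # below the minimum threshold, nothing is dropped
    if queue_depth >= max_threshold:
        return 1.0  # at or above the maximum threshold, everything is dropped
    fraction = (queue_depth - min_threshold) / (max_threshold - min_threshold)
    return fraction * max_probability


# Hypothetical WRED profiles keyed by IP precedence:
# (min threshold, max threshold, max drop probability).
WRED_PROFILES = {
    0: (20, 40, 0.10),  # best-effort traffic starts dropping early
    5: (35, 40, 0.10),  # higher-precedence traffic starts dropping later
}


def wred_drop_probability(queue_depth, ip_precedence):
    min_th, max_th, max_p = WRED_PROFILES[ip_precedence]
    return red_drop_probability(queue_depth, min_th, max_th, max_p)


print(wred_drop_probability(30, 0))  # 0.05
print(wred_drop_probability(30, 5))  # 0.0
```

At the same queue depth, the lower-precedence traffic is already being dropped while the higher-precedence traffic is not, which is the weighting that distinguishes WRED from basic RED.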
Enterprise-Wide QoS Deployment Guide
A company might use a Service Level Agreement (SLA) to contract with its ISP for certain levels of service. The SLA typically specifies levels of throughput, delay, packet loss, and link availability, along with penalties for missing them. Service providers use a set number of classes, and your marking must conform to their guidelines to use QoS SLAs.
A Quick Guide
Classify and mark traffic as close to the edge as possible, based on the SLA. Once traffic is marked, apply queuing mechanisms as needed to meet your SLA. For example, use LLQ for real-time traffic, and use WRED in the data queues while setting bandwidth guarantees for non-VoIP traffic.
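The classify, mark, and queue sequence can be pictured as a small lookup. Everything in this sketch is a hypothetical example: the port numbers, the class names, and the DSCP values (46 for voice, 18 for data) are illustrative assumptions and would need to match your provider's classes.

```python
# Hypothetical policy: class name -> marking and queuing treatment.
POLICY = {
    "voice": {"dscp": 46, "queue": "LLQ", "wred": False},
    "data": {"dscp": 18, "queue": "CBWFQ", "wred": True},
    "default": {"dscp": 0, "queue": "CBWFQ", "wred": True},
}


def classify(protocol, dst_port):
    """Classify a packet at the network edge (assumed example rules)."""
    if protocol == "udp" and 16384 <= dst_port <= 32767:
        return "voice"  # assumed RTP port range
    if protocol == "tcp" and dst_port in (25, 80, 443):
        return "data"   # email and web share one class
    return "default"


def treatment(protocol, dst_port):
    """Return the marking and queuing treatment for a packet."""
    return POLICY[classify(protocol, dst_port)]


print(treatment("udp", 20000))  # {'dscp': 46, 'queue': 'LLQ', 'wred': False}
print(classify("tcp", 25))      # data
```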

