This is another topic of heated debate, and it changes from network to network, but I have found a simple approach that works in most cases. Since I have four queues, I need to sort my important traffic into four classes. Strictly for explanation purposes, I took some liberty in defining four classes of traffic that work well in both large and small networks: Real-time Transport Protocol (RTP), Network Management (NetMgt), Business Critical, and Default.
Assume that in this particular network we will use IP-based phones and video endpoints. We then need a class reserved strictly for those voice and video streams, and nothing else. I will not put any TCP traffic in this class whatsoever, because TCP is bursty and grows to fill whatever bandwidth is available. My intent is to protect the voice and video RTP streams, which are extremely sensitive to delay (latency and jitter) and packet loss (drops). Neither is acceptable for these traffic flows.
Network Management traffic is all traffic used to manage, maintain, and monitor the network. This includes Telnet, SSH, NetFlow, SNMP, and Call Control (SCCP, H.323, SIP, MGCP). This class is often overlooked, but it is extremely important. In many networks the Network Management class (or its equivalent) is ranked lower than the Business Critical class, which can slow your ability to assess a problem.
There is never very much Network Management traffic, but if it can’t reach the problem devices because of congestion, it does you no good whatsoever. Don’t cut yourself off from your network; give yourself and your tools the priority they need to solve problems when they occur.
The Business Critical class changes for every network. It all depends on which network services (Oracle, SQL, Web, etc.) are in use and critical. The key word here is critical. What is critical? E-mail is not. Critical means time-sensitive and crucial to the operations of your business. For a call center, the database queries that pull up information on the people calling in are critical. In a hospital environment, life-support monitoring is critical. In a stock brokerage, the service that buys and sells stocks is critical. Every business is different and has to be evaluated individually. The main thing to remember is that the traffic must be time-critical: if you momentarily lose it, the impact on your business would be huge.
The Default class is all the other traffic on our network that we have not singled out. If we have done our job correctly, we have thinned out this queue by moving certain traffic flows to the other queues, thus leaving room for the remaining traffic.
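As a rough illustration of the four classes, here is a minimal Python sketch of a classifier. It is not taken from the paper: the `classify` helper is hypothetical, the DSCP values are an assumed marking plan, the RTP UDP port range is an assumed site setting, and the Business Critical ports are only examples (the default Oracle listener and SQL Server ports). The Network Management entries use the well-known default ports for each protocol; NetFlow is left out because its export port is configured per site.

```python
# Assumed marking plan (not from the paper): class name -> DSCP value.
DSCP = {"RTP": 46, "NetMgt": 16, "BusCritical": 18, "Default": 0}

# Assumption: the UDP port range the voice/video endpoints use for RTP media.
RTP_PORT_RANGE = (16384, 32767)

# Well-known default ports for management and call-control protocols.
NETMGT_PORTS = {
    ("tcp", 22),    # SSH
    ("tcp", 23),    # Telnet
    ("udp", 161),   # SNMP
    ("udp", 162),   # SNMP traps
    ("tcp", 2000),  # SCCP
    ("tcp", 1720),  # H.323 call signaling
    ("udp", 5060),  # SIP
    ("tcp", 5060),  # SIP
    ("udp", 2427),  # MGCP
}

# Example only: what counts as business critical is different for every network.
BUSCRITICAL_PORTS = {
    ("tcp", 1521),  # Oracle listener (default port)
    ("tcp", 1433),  # Microsoft SQL Server (default port)
}


def classify(proto: str, dst_port: int) -> str:
    """Return one of the four class names for a flow."""
    proto = proto.lower()
    # Only UDP can land in the RTP class; TCP is never placed there.
    if proto == "udp" and RTP_PORT_RANGE[0] <= dst_port <= RTP_PORT_RANGE[1]:
        return "RTP"
    if (proto, dst_port) in NETMGT_PORTS:
        return "NetMgt"
    if (proto, dst_port) in BUSCRITICAL_PORTS:
        return "BusCritical"
    # Everything not singled out, e-mail included, stays in Default.
    return "Default"
```

Checking the RTP range first and restricting it to UDP keeps TCP out of the voice and video class, and anything that matches no rule falls through to Default.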
How Much Bandwidth Do I Allocate to Each Class?
How much bandwidth? Well, if you don’t already have a good feel for the amount required for each class, then stick with the defaults (on switches it is 25% for each of the four queues). Also, since these are “minimum” bandwidth guarantees, a class can use more than 25% if the bandwidth is available. If you don’t know how the allocation needs to change, then leave it alone and spend the time setting up monitoring tools to see what you really need.
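To make the idea of a "minimum" guarantee concrete, here is a small Python sketch of how four queues at 25% each could share a link. It is a simplified model, not the scheduler any particular switch implements: the `allocate` function, the 100 Mbps link, and the per-class demands are all assumptions chosen for illustration.

```python
def allocate(link_mbps, guarantees, demands):
    """Share a link among classes that have minimum-bandwidth guarantees.

    guarantees: class -> guaranteed fraction of the link (should sum to <= 1.0)
    demands:    class -> offered load in Mbps
    """
    # Step 1: every class gets what it asks for, up to its guarantee.
    alloc = {c: min(demands[c], link_mbps * guarantees[c]) for c in guarantees}
    spare = link_mbps - sum(alloc.values())

    # Step 2: hand unused bandwidth to classes that still want more,
    # in proportion to their guarantees.
    while spare > 1e-9:
        hungry = [c for c in guarantees if demands[c] - alloc[c] > 1e-9]
        total_weight = sum(guarantees[c] for c in hungry)
        if not hungry or total_weight == 0:
            break
        given = 0.0
        for c in hungry:
            share = spare * guarantees[c] / total_weight
            extra = min(share, demands[c] - alloc[c])
            alloc[c] += extra
            given += extra
        spare -= given
    return alloc


# Hypothetical demands on a 100 Mbps link with the 25% defaults.
defaults = {"RTP": 0.25, "NetMgt": 0.25, "BusCritical": 0.25, "Default": 0.25}
demands = {"RTP": 5, "NetMgt": 2, "BusCritical": 30, "Default": 90}
result = allocate(100, defaults, demands)
```

In this model the RTP and NetMgt classes use far less than their 25%, so Business Critical gets its full 30 Mbps and Default ends up with 63 Mbps, well over its guarantee. If every class offered more than 25 Mbps, each would be held to exactly 25 Mbps.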
Should I Have a “Scavenger” Class?
The scavenger class is intended for undesirable traffic (e.g., viruses and worms) and for non-productive or employee-distracting applications. We used to be able to include BitTorrent traffic and social networking web sites in this class, but BitTorrent is becoming common for downloading large files (e.g., Linux distributions), and many companies use social networking sites (Facebook, YouTube, LinkedIn, etc.) for marketing and support information.
The scavenger class of traffic will reside in the same queue as the default class of traffic. Some switches (those with adjustable thresholds) allow you to have multiple classes in each queue and still penalize one class more than another. You need to check the capabilities of your switches to determine whether you have adjustable thresholds on your queues; without them, a scavenger class doesn’t do much good.
Generally, I don’t create a scavenger class the first time through, but as I fine-tune my QoS policy, I can add one later if it is beneficial.
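The adjustable thresholds described above can be pictured with a short Python sketch: two classes share one queue, but the scavenger class is refused once the queue is partly full. The `ThresholdQueue` class, the queue depth of 10 packets, and the 40% scavenger threshold are all assumptions for illustration and do not model a specific switch.

```python
from collections import deque


class ThresholdQueue:
    """One queue shared by several classes, each with its own drop threshold."""

    def __init__(self, capacity, thresholds):
        self.capacity = capacity      # queue depth in packets
        self.thresholds = thresholds  # class -> fraction of capacity it may fill
        self.packets = deque()

    def enqueue(self, traffic_class):
        """Queue the packet, or return False if its class threshold is reached."""
        limit = self.thresholds[traffic_class] * self.capacity
        if len(self.packets) >= limit:
            return False  # tail-drop for this class only
        self.packets.append(traffic_class)
        return True


# Assumed thresholds: Scavenger is dropped at 40% depth, Default at 100%.
queue = ThresholdQueue(10, {"Default": 1.0, "Scavenger": 0.4})
scavenger_results = [queue.enqueue("Scavenger") for _ in range(5)]
default_results = [queue.enqueue("Default") for _ in range(7)]
```

Here the first four scavenger packets are accepted and the fifth is dropped, while default traffic keeps being accepted until the queue is completely full. Without per-class thresholds, both classes would be dropped at the same depth and the scavenger marking would change nothing.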
Do I Implement From the Core Out to the Edge or the Edge Into the Core?
This is a personal preference, but I do have an opinion: I suggest you go from the edge in toward the core. It does me no good whatsoever to implement QoS in the core layers if nothing is classified or marked at the edge, because then nothing is trusted and everything is treated the same. Yet I do benefit from implementing QoS (classifying, marking, and queuing) at the edge while having nothing at the core. At least the traffic headed to the core is marked properly and can be given the correct preferential treatment. We can also throttle back or even drop abnormal traffic at the edge.
Do I Trust or Not Trust Existing Markings?
A trust boundary is an imaginary line, normally at the edge of your network, where you classify and mark the packets that enter your network. An administrator can choose to “trust” the existing markings or can classify and mark the packets as they see fit. Once inside this boundary, you should trust all DSCP markings at every interface, because the packets have already been classified and marked. The concept here is that nothing should be allowed across this imaginary line without being marked appropriately, and from that point on we can trust the markings.
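The trust decision can be sketched in a few lines of Python. This is only an illustration of the concept: `mark_at_ingress` is a hypothetical helper, and the DSCP values in `POLICY_DSCP` are an assumed marking plan, not values from this paper.

```python
# Assumed marking plan (not from the paper): class name -> DSCP value.
POLICY_DSCP = {"RTP": 46, "NetMgt": 16, "BusCritical": 18, "Default": 0}


def mark_at_ingress(dscp_in: int, traffic_class: str, trusted: bool) -> int:
    """Return the DSCP value a packet carries after entering a port.

    trusted=True  -> keep the marking the packet arrived with
    trusted=False -> ignore it and mark according to our own classification
    """
    if trusted:
        return dscp_in
    return POLICY_DSCP[traffic_class]


# A PC marks its own bulk download as DSCP 46, hoping for priority.
# The untrusted edge port classifies the flow as Default and re-marks it.
at_edge = mark_at_ingress(46, "Default", trusted=False)

# Inside the boundary, ports trust what the edge already wrote.
in_core = mark_at_ingress(at_edge, "Default", trusted=True)
```

The bogus marking is corrected once, at the boundary, and every interface after that simply carries the edge's value forward.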
Should I Use CoS or DSCP Values in My QoS Policy?
Layer 2 CoS values exist only in the 802.1Q tag, so they are carried only on 802.1Q trunk links. Any time a packet crosses a non-trunk link, the tag and its CoS value are thrown away. That means we would have to reclassify the packet, which is not a good idea because of the processing load required. DSCP values, on the other hand, are Layer 3: they live in the IP header and stay with the packet for its entire life unless a device deliberately re-marks them. Thus we never have to reclassify once it has been done. To avoid unnecessary headaches and processing load, just use the DSCP values.
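The difference can be shown with a toy model in Python. The `Frame` type and `cross_link` function are hypothetical names, and the model deliberately ignores any CoS value a switch might derive again from its own configuration; it only tracks what the frame itself carries from hop to hop.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    dscp: int           # Layer 3 marking, carried in the IP header
    cos: Optional[int]  # Layer 2 marking, carried in the 802.1Q tag (None if untagged)


def cross_link(frame: Frame, is_trunk: bool) -> Frame:
    """Return the frame as it looks after crossing one link."""
    if is_trunk:
        # The 802.1Q tag, if present, survives on a trunk.
        return frame
    # A non-trunk link carries no 802.1Q tag, so the CoS value is lost.
    return Frame(dscp=frame.dscp, cos=None)


start = Frame(dscp=46, cos=5)
after_trunk = cross_link(start, is_trunk=True)
after_access = cross_link(after_trunk, is_trunk=False)
after_second_trunk = cross_link(after_access, is_trunk=True)
```

After the non-trunk hop the CoS value is gone and does not come back on the next trunk without reclassification, while the DSCP value is unchanged at every step.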
Now that we have covered the basics, let’s review the steps for developing a QoS policy that is just right and capable of growing if necessary.
Step 1. Start with four queues, and define four classes. I have found RTP, NetMgt, BusCritical, and Default to be a great starting point. You can always expand later if necessary.
Step 2. Identify what traffic goes into each class. Do your homework, find out what is truly critical, and try not to promote too much out of the default class.
Step 3. Start at the access layer of your network. Don’t trust any existing markings; create your own policies that classify and mark the traffic to meet your needs.
Step 4. Carry these same policies (markings and classes) throughout your entire network: the access, distribution, and core layers, and the WAN.
Step 5. Watch and monitor your network and adjust as needed.
Yes, this is a simplified approach to designing and implementing QoS for your network. Remember to keep it simple and consistent throughout your entire network. This will make your network much easier to manage and troubleshoot.
Reproduced from Global Knowledge White Paper: QoS, Keeping It Simple