Network Congestion Part 3: Packet Loss and the Impact on Network Performance
In part one of this series I discussed network congestion inside a private network and in part two, congestion across the Internet. In this month’s installment I will discuss packet loss and the impact this has on network performance.
What is “packet loss”? A packet is, to put it simply, an envelope of data. Let’s say you are sending an email to a friend. To make it possible to send the message, the data is encoded using MIME (Multipurpose Internet Mail Extensions), a standard for formatting message content, including attachments and non-English text, so that it can be sent across the Internet. Once the message has been encoded it is broken up into small pieces (packets), which the Internet can transmit to your (or your ISP’s) mail server. Packet loss, then, is merely the loss or corruption of these blocks or “packets” of data.
When packets are lost or corrupted they have to be resent. Each packet is assigned a calculated value (a checksum) as it is sent, and that value goes along for the ride to the packet’s final destination. At the other end the packet’s value is recalculated and compared against the value that was originally assigned. If the two match, the packet is considered good and it is saved. If not, the packet is discarded and, with a reliable protocol such as TCP, the transmitter ends up sending it again. The process repeats until all packets are present, accounted for and correct.
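To make the "calculated value" idea concrete, here is a minimal sketch in Python. It assumes the 16-bit ones' complement checksum described in RFC 1071, which is used in IP, TCP and UDP headers; other layers use different checks (Ethernet, for example, uses a CRC). The function names and the sample payload are made up for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Treat each pair of bytes as one 16-bit word and add it up.
        total += (data[i] << 8) | data[i + 1]
    # Fold any carry bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def is_intact(data: bytes, checksum: int) -> bool:
    """Receiver side: recalculate and compare with the value that travelled along."""
    return internet_checksum(data) == checksum


payload = b"Hello, friend!"          # hypothetical packet contents
sent_checksum = internet_checksum(payload)

print(is_intact(payload, sent_checksum))            # True: packet arrived unchanged
print(is_intact(b"Hellp, friend!", sent_checksum))  # False: one byte was damaged
```

A packet that fails this comparison is simply thrown away, which from the sender's point of view looks the same as a packet that never arrived.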
So why care about packet loss? The most basic answer is that it takes more time to resend a packet that is deemed bad, which can make your connection look saturated. If you have a lot of packet loss you could be looking at a great deal more time spent transferring information. Generally speaking, up to 1% loss isn’t too bad, but it’s best to at least try to get as close to 0% as possible. This is the guideline we use at Skyway West.
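As a rough illustration of the cost, the sketch below estimates how many extra packets have to be sent at a given loss rate. It assumes a deliberately simple model that is not from any particular protocol: every packet is lost independently with the same probability and is resent until it gets through, so the average number of sends per packet is 1 / (1 - loss rate). The function name is hypothetical.

```python
def expected_transmissions(loss_rate: float) -> float:
    """Average sends needed per packet if each send is lost with probability loss_rate."""
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be at least 0 and less than 1")
    return 1.0 / (1.0 - loss_rate)


for loss in (0.0, 0.01, 0.05, 0.20):
    extra = (expected_transmissions(loss) - 1) * 100
    print(f"{loss:.0%} loss -> about {extra:.1f}% more packets sent")
```

Treat these figures as a floor rather than a prediction. A real TCP connection also has to wait to notice that a packet is missing, and it slows its sending rate when it sees loss, so the slowdown you actually experience is usually larger than the extra packet count suggests.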
How to Test For Packet Loss
The easiest way to test for packet loss is to use our old friends ping and traceroute. You may remember these tools from last month’s article. I will be performing similar tasks here. Before I begin though I want to discuss the two most likely places you might expect to find packet loss:
1. Your network: Every cable and every device connected by cables on your network is susceptible to packet loss at some point in its life. Cables can wear out or be damaged over time. Most electronics have a life span of up to 6 years.
2. The Internet: As above, every cable and every piece of hardware from your ISP’s network to those across the entire Internet is also susceptible to packet loss. While failures do occur, most reliable ISPs monitor their networks, so resolution of these kinds of problems usually happens quickly. Packet loss can also occur as a result of large volumes of Internet traffic.
Testing Your Own Network for Packet Loss
To test for packet loss we first start by looking at our own network. In the first post of this series I spoke to you about testing your network for saturation. Pinging the devices in your network (assuming they are reachable and not being blocked by a software firewall) is a good indicator of loss on your network. Remember to also try pinging your network gateway. The following is an example of a healthy response:
Ping 192.168.1.1
Pinging 192.168.1.1 with 32 bytes of data:
Reply from 192.168.1.1: bytes=32 time<1ms TTL=64
Reply from 192.168.1.1: bytes=32 time<1ms TTL=64
Reply from 192.168.1.1: bytes=32 time<1ms TTL=64
Reply from 192.168.1.1: bytes=32 time<1ms TTL=64
Ping statistics for 192.168.1.1:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms
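If you want to check many devices, reading the summary by eye gets tedious. The following is a minimal Python sketch that pulls the numbers out of the "Packets:" line of a ping result. It assumes the Windows-style wording shown above; other operating systems word the summary differently, and the function name is made up for this example.

```python
import re

# Matches the Windows-style summary, e.g. "Sent = 4, Received = 4, Lost = 0 (0% loss)"
STATS_PATTERN = re.compile(
    r"Sent = (\d+), Received = (\d+), Lost = (\d+) \((\d+)% loss\)"
)


def parse_ping_stats(output: str):
    """Return the ping summary as a dict, or None if no summary line is found."""
    match = STATS_PATTERN.search(output)
    if match is None:
        return None
    sent, received, lost, percent = (int(group) for group in match.groups())
    return {"sent": sent, "received": received, "lost": lost, "loss_percent": percent}


sample = """Ping statistics for 192.168.1.1:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
"""
print(parse_ping_stats(sample))
```

Four pings is a very small sample, so for a more trustworthy percentage it helps to send more of them (on Windows, `ping -n 100 192.168.1.1`).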
Responses that look like this might be an indicator of something wrong.
Reply from 192.168.1.22: bytes=32 time<1ms TTL=64
Reply from 192.168.1.22: Destination host unreachable (emphasis added)
Reply from 192.168.1.22: bytes=32 time<1ms TTL=64
Reply from 192.168.1.22: bytes=32 time<1ms TTL=64
The above could just mean the device on this IP is too busy to respond (which is a whole other kettle of fish; see part one of this series). It could also mean that something is wearing out, has been damaged or has a programming issue. The example below, however, indicates a definite problem:
Reply from 192.168.1.22: Destination host unreachable
Reply from 192.168.1.22: Destination host unreachable
Reply from 192.168.1.22: Destination host unreachable
Reply from 192.168.1.22: Destination host unreachable
Again, this could be a firewall or programming issue, but once you have verified these things and confirmed you are checking the correct IP address, you are most likely left with a packet loss problem.
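Rather than relying only on the summary line, you can count the individual replies yourself. The sketch below is one way to do that in Python, assuming Windows-style reply lines like the ones in the examples above. It treats "unreachable" and "Request timed out" lines as failures; the function name is hypothetical. Counting lines directly is useful because, on some Windows versions, "Destination host unreachable" replies can still be counted as received in the summary.

```python
def count_replies(output: str):
    """Count good and failed replies in ping output; return (good, bad, loss_percent)."""
    good = bad = 0
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Reply from") and "bytes=" in line:
            good += 1  # a normal echo reply carries a byte count
        elif "unreachable" in line.lower() or line.startswith("Request timed out"):
            bad += 1   # no usable answer came back for this ping
    total = good + bad
    loss_percent = 100.0 * bad / total if total else 0.0
    return good, bad, loss_percent


sample = """Reply from 192.168.1.22: bytes=32 time<1ms TTL=64
Reply from 192.168.1.22: Destination host unreachable
Reply from 192.168.1.22: bytes=32 time<1ms TTL=64
Reply from 192.168.1.22: bytes=32 time<1ms TTL=64
"""
print(count_replies(sample))  # (3, 1, 25.0)
```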
Testing the Internet for Packet Loss
This is a little more challenging in that inside your network there is typically only one hop between you and whatever device you are checking. On the Internet there will most likely be several. For the purposes of this testing we’ll need to start out with the traceroute (tracert) command. As we did last month, we’ll use the www.mywebsite.com site as our example:
tracert www.mywebsite.com
tracing to www.mywebsite.com (216.250.121.107), over a maximum of 30 hops
1 router.skywaywest.net (216.251.128.254) 2.310 ms 2.520 ms 2.724 ms
2 216.251.132.60 (216.251.132.60) 0.191 ms 0.184 ms 0.170 ms
3 v525.core1.yvr1.he.net (216.218.185.185) 0.265 ms 0.436 ms 0.422 ms
4 10gigabitethernet3-2.core1.mci3.he.net (184.105.222.22) 48.391 ms 48.422 ms 48.138 ms
5 206.51.7.2 (206.51.7.2) 48.347 ms 48.670 ms 48.308 ms
6 ae-1.gw-dista-a.ga.mkc.us.oneandone.net (74.208.6.121) 48.272 ms 48.558 ms 48.712 ms
7 vl-982.gw-ps6.ga.mkc.us.oneandone.net (74.208.6.131) 48.781 ms 48.751 ms 49.160 ms
8 86 ms 85 ms 86 ms perfora.net [216.250.121.107]
Trace complete.
The above is our healthy/normal example. Each hop (each line of text) returns a response correctly. Now what about a response like this:
1 router.skywaywest.net (216.251.128.254) 2.310 ms 2.520 ms 2.724 ms
2 216.251.132.60 (216.251.132.60) 0.191 ms 0.184 ms 0.170 ms
3 v525.core1.yvr1.he.net (216.218.185.185) 0.265 ms 0.436 ms 0.422 ms
4 10gigabitethernet3-2.core1.mci3.he.net (184.105.222.22) 48.391 ms 48.422 ms 48.138 ms
5 Destination host unreachable * * *
6 ae-1.gw-dista-a.ga.mkc.us.oneandone.net (74.208.6.121) 48.272 ms 48.558 ms 48.712 ms
7 vl-982.gw-ps6.ga.mkc.us.oneandone.net (74.208.6.131) 48.781 ms 48.751 ms 49.160 ms
8 86 ms 85 ms 86 ms perfora.net [216.250.121.107]
Notice line 5? This could mean one of two things. One option is packet loss. The more likely answer is that the router is just too busy to answer or has been programmed not to answer. This is more likely because the next hop returns a reply, and the probes that reach hop 6 have to pass through the router on line 5 to get there. Now for an example of what packet loss might look like:
1 router.skywaywest.net (216.251.128.254) 2.310 ms 2.520 ms 2.724 ms
2 216.251.132.60 (216.251.132.60) 0.191 ms 0.184 ms 0.170 ms
3 v525.core1.yvr1.he.net (216.218.185.185) * 0.436 ms *
4 10gigabitethernet3-2.core1.mci3.he.net (184.105.222.22) 48.391 ms * 48.138 ms
5 206.51.7.2 (206.51.7.2) 48.347 ms 48.670 ms *
6 ae-1.gw-dista-a.ga.mkc.us.oneandone.net (74.208.6.121) 48.272 ms * *
7 vl-982.gw-ps6.ga.mkc.us.oneandone.net (74.208.6.131) * 48.751 ms 49.160 ms
8 86 ms 85 ms * perfora.net [216.250.121.107]
In this case look at lines 1 and 2. They appear to be fine. Line 3, however, is missing some responses. Now look at the rest of the trace. Every line after it shows a similar pattern of missing responses, all the way to the destination. While this could just be an indication of a busy router, this is most likely packet loss. It would take other diagnostic tools that are beyond the scope of this article to say for certain, but we can try to confirm a bit more by pinging the IP address found on line 3, 216.218.185.185. If we see lines like the following, that pretty well confirms our assertion.
Reply from 216.218.185.185: Destination host unreachable
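The rule of thumb used with the traces above (a single silent hop followed by clean hops is probably just a busy or quiet router, while missing responses that continue all the way to the destination suggest real loss) can be sketched in Python. This is only an illustration: it assumes each hop line starts with the hop number and marks every unanswered probe with `*`, as in the examples above, and the function names are made up.

```python
def parse_hops(trace_lines):
    """Return a list of (hop_number, unanswered_probes) from traceroute-style lines."""
    hops = []
    for line in trace_lines:
        tokens = line.split()
        if not tokens or not tokens[0].isdigit():
            continue  # skip headers and footers such as "Trace complete."
        hops.append((int(tokens[0]), tokens.count("*")))
    return hops


def first_persistent_loss_hop(trace_lines):
    """Return the hop where loss begins and continues to the final hop, else None."""
    start = None
    # Walk backwards from the destination while every hop shows missing replies.
    for hop, lost in reversed(parse_hops(trace_lines)):
        if lost == 0:
            break
        start = hop
    return start


busy_router_trace = """\
4 10gigabitethernet3-2.core1.mci3.he.net (184.105.222.22) 48.391 ms 48.422 ms 48.138 ms
5 Destination host unreachable * * *
6 ae-1.gw-dista-a.ga.mkc.us.oneandone.net (74.208.6.121) 48.272 ms 48.558 ms 48.712 ms
7 vl-982.gw-ps6.ga.mkc.us.oneandone.net (74.208.6.131) 48.781 ms 48.751 ms 49.160 ms
8 86 ms 85 ms 86 ms perfora.net [216.250.121.107]
""".splitlines()

lossy_trace = """\
1 router.skywaywest.net (216.251.128.254) 2.310 ms 2.520 ms 2.724 ms
2 216.251.132.60 (216.251.132.60) 0.191 ms 0.184 ms 0.170 ms
3 v525.core1.yvr1.he.net (216.218.185.185) * 0.436 ms *
4 10gigabitethernet3-2.core1.mci3.he.net (184.105.222.22) 48.391 ms * 48.138 ms
5 206.51.7.2 (206.51.7.2) 48.347 ms 48.670 ms *
6 ae-1.gw-dista-a.ga.mkc.us.oneandone.net (74.208.6.121) 48.272 ms * *
7 vl-982.gw-ps6.ga.mkc.us.oneandone.net (74.208.6.131) * 48.751 ms 49.160 ms
8 86 ms 85 ms * perfora.net [216.250.121.107]
""".splitlines()

print(first_persistent_loss_hop(busy_router_trace))  # None
print(first_persistent_loss_hop(lossy_trace))        # 3
```

A single traceroute only sends a few probes per hop, so a result like this is a hint about where to look next, not proof. Running the trace several times and comparing the results gives a more reliable picture.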
Solutions: Correcting the problem
On your own network you can start by swapping the cables to the devices that are not responding. If that doesn’t help, you can check your configurations and see if you can reach the device from another PC on the network. There may be a problem with the computer you are using. If you are still unsuccessful you will need to consider changing out hardware.
The Internet is a completely different animal, and resolving the issue won’t be as easy. What you can do depends on where the packet loss is. If the loss starts at the top of the traceroute, you can contact your ISP and they can investigate the possibility of loss on your service, within their networks or with their upstream provider(s). If the packet loss starts toward the bottom of the trace, you might be able to contact the company at the other end to see if they know of the problem and are working on it internally, or let them know they need to contact their ISP to resolve the problem. Loss starting in the middle of a traceroute is a little more challenging. Your ISP may be able to send an email to the owner of the IP to have them investigate the issue. However, there is never a guarantee a response will come of it. If, like Skyway West, your ISP has multiple upstream providers, it may be possible to re-route your traffic until the problem goes away.
And there you have it. My hope is that you now have a bit of an understanding of what network congestion and packet loss are. Diagnosing these issues can be somewhat of a pain when both congestion and loss happen together, and the testing mapped out here is only what a networking professional might call first steps. There is a lot of reading on these topics out there. If you are interested in learning more about network troubleshooting, try searching on Google and Wikipedia.
Got a question or an idea for a topic you would like to see covered in one of my upcoming blogs? Write to support@skywaywest.com and let me know. I’ll do what I can to address your questions or concerns either personally in a reply email or on the blog. Until next month, take care.
–Wes