Why does packet loss seriously hurt application performance over the WAN? How do we address it?

TCP was never optimized for high-bandwidth WANs or for interactive applications running over the WAN. By design, packet loss has the greatest impact on application performance over the WAN.

Why is packet loss such a killer? There are many reasons, most having to do with how TCP was designed, and especially with how TCP does congestion control and congestion avoidance. The key issue is dealing with contention for limited bandwidth.

TCP is designed to use all available bandwidth, and to use it “fairly” across flows on average. But neither end station of a TCP flow knows how much bandwidth is actually available: not when a single flow happens to be the only one using the end-to-end path, and certainly not in the more typical case where multiple flows share it and the amount available changes from moment to moment. So the sender of the TCP data needs a way to know when “enough is enough,” and packet loss is the basic signal of this.

TCP and routers together are designed to control data flow to prevent over-utilization of the network and the congestion that results. The goals of TCP’s design are to minimize the amount of time that data flow grinds to a halt (congestion avoidance), and to react appropriately by reducing traffic when it does (congestion control).

TCP packets received by the receiving station are acknowledged back to the sending station. TCP is a window-based protocol, meaning that it can have only a certain amount of traffic “in flight” between sending station and receiving station at any one time. It is designed to back off and substantially reduce the amount of bandwidth offered (cutting it in half) when packet loss is observed. Further, until the lost packet is retransmitted, received and acknowledged by the receiver, only a limited number of additional packets will be offered. Even for applications that use multiple TCP flows, a similar principle applies: only so many new flows can be opened and only so many packets sent until a lost packet is received at the other end and its receipt acknowledged.
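To make that window behavior concrete, here is a minimal sketch in Python of the additive-increase/multiplicative-decrease idea behind TCP, using made-up numbers and a single simulated loss; it illustrates the principle rather than any particular TCP implementation.

```python
# Toy illustration of TCP-style additive-increase / multiplicative-decrease (AIMD).
# All numbers and the single simulated loss event are invented for illustration.

cwnd = 10          # congestion window: how many segments may be "in flight"
ssthresh = 64      # slow-start threshold, in segments

def on_round_trip_of_acks():
    """Each successfully acknowledged window lets the sender offer a bit more."""
    global cwnd
    if cwnd < ssthresh:
        cwnd = min(cwnd * 2, ssthresh)   # slow start: roughly doubles per round trip
    else:
        cwnd += 1                        # congestion avoidance: grows slowly

def on_loss_detected():
    """A single detected loss halves the amount of traffic offered."""
    global cwnd, ssthresh
    ssthresh = max(cwnd // 2, 2)
    cwnd = ssthresh                      # back off by half; recovery from here is slow

for rtt in range(16):
    on_loss_detected() if rtt == 10 else on_round_trip_of_acks()
    print(f"round trip {rtt:2d}: window = {cwnd} segments")
```

Once the loss hits, the window recovers by only one segment per round trip, which is exactly why a lossy, high-latency path takes so long to get back up to speed.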

Packet loss is detected in one of two ways. For a longer transfer where just a packet or two is lost, the sender notices and reacts to the loss when subsequent packets are acknowledged by the receiver but the missing one is not. Alternatively – and more typically for new or short TCP flows – packet loss is detected by the occurrence of a “timeout”: the absence of any acknowledgement of the packet. The amount of time before a “timeout” is deemed to have occurred typically varies between a couple hundred milliseconds and three seconds.
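Here is a simplified sketch of those two detection paths, assuming the conventional three-duplicate-ACK threshold and an illustrative one-second timer; real TCP stacks derive the retransmission timeout from measured round-trip times, and the class and names below are purely illustrative.

```python
import time

DUP_ACK_THRESHOLD = 3    # conventional fast-retransmit trigger
RTO_SECONDS = 1.0        # illustrative retransmission timeout

class LossDetector:
    """Simplified view of the two ways a TCP sender decides a packet was lost."""

    def __init__(self):
        self.highest_ack = 0
        self.dup_acks = 0
        self.sent_at = {}                # sequence number -> send time

    def on_send(self, seq):
        self.sent_at[seq] = time.monotonic()

    def on_ack(self, ack):
        # Duplicate ACKs: later packets arrived, but one in the middle is missing.
        if ack == self.highest_ack:
            self.dup_acks += 1
            if self.dup_acks >= DUP_ACK_THRESHOLD:
                return "loss detected: fast retransmit"
        else:
            self.highest_ack, self.dup_acks = ack, 0
        return None

    def check_timer(self, seq):
        # Timeout: no acknowledgement at all, typical for new or short flows.
        if time.monotonic() - self.sent_at.get(seq, time.monotonic()) > RTO_SECONDS:
            return "loss detected: retransmission timeout"
        return None
```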

TCP is an elegant protocol, designed over 40 years ago when CPU and memory were extremely expensive. It worked – and continues to work – fantastically well on high-bandwidth, low-latency LANs and on low-bandwidth, high-latency WANs. But TCP wasn’t designed to work optimally in the medium-to-high-bandwidth, high-latency environment that characterizes most WAN use today, nor for running interactive applications (web browsing, remote desktop) across very long-distance WANs. Thus, application performance suffers as packet loss and latency increase.

TCP in particular was designed so that each end station could make its decisions completely independently of every other end station. This conservative approach contributes to network stability and helps minimize congestion.

Because the amount of data offered into the network is cut in half when a single packet loss is detected by the sending station – and only increased slowly thereafter as successfully received packets are acknowledged – WAN packet loss can have a huge impact on the performance of large transfers. This is why private networks, such as MPLS, VPLS or IEPL, improve application performance so significantly: they nearly eliminate packet loss.
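One way to get a feel for the size of that impact is the well-known Mathis approximation for steady-state TCP throughput, roughly MSS / (RTT × √p), where p is the packet loss rate. The short Python sketch below plugs in illustrative numbers (the 1460-byte segment, the 80 ms round trip and the loss rates are assumptions, not measurements):

```python
from math import sqrt

MSS_BYTES = 1460        # typical TCP segment payload on Ethernet
RTT_SECONDS = 0.080     # illustrative 80 ms WAN round trip

def mathis_throughput_bps(loss_rate):
    """Rough steady-state TCP throughput: MSS / (RTT * sqrt(p))."""
    return (MSS_BYTES * 8) / (RTT_SECONDS * sqrt(loss_rate))

for p in (0.0001, 0.001, 0.01):
    print(f"loss rate {p:.2%}: roughly {mathis_throughput_bps(p) / 1e6:.1f} Mbit/s per flow")
```

Even a 0.1% loss rate caps a single flow at a few megabits per second on an 80 ms path, no matter how much bandwidth was purchased.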

What else can be done about packet loss? Well, at a standards-compliant end station, pretty much nothing. But for an intelligent device in the middle of the network, and especially one at a key WAN edge location, there are many possibilities. There are at least six different approaches to minimizing the impact of WAN packet loss on application performance:

– Drastically reduce the number of WAN packets transmitted.

– React differently to loss (given good knowledge of the network in between).

– Mitigate the effects of the loss and hide it from the end station.

– Enable the end stations to react more quickly to loss.

– Avoid much of the loss in the first place (think MPLS, VPLS, IEPL).

– Avoid the additional loss that often follows after a burst of loss.

Application-layer solutions are the first, most obvious approach here. A replicated file service avoids WAN packet loss in accessing files, delivering full LAN-speed performance, because all client access to the data is in fact done locally.

Similarly, “static” caching of objects via a local web (HTTP) object cache completely avoids WAN access for those objects, and thus any impact from packet loss.
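As a rough illustration of the idea (not any particular product's implementation), a local object cache simply answers repeat requests without ever touching the WAN; the fetch_over_wan function below is just a hypothetical placeholder for a real HTTP client.

```python
# Minimal sketch of a local HTTP object cache: repeat requests for the same
# URL never cross the WAN, so they cannot be affected by WAN packet loss.

cache = {}

def fetch_over_wan(url):
    raise NotImplementedError("placeholder for a real WAN fetch")

def get_object(url):
    if url in cache:
        return cache[url]           # served locally, at LAN speed
    body = fetch_over_wan(url)      # only the first access pays the WAN cost
    cache[url] = body
    return body
```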

Beyond these, drastically reducing the number of packets transmitted is an area where WAN Optimization offerings do a great job. Now, since we’re talking about reducing the number of packets transmitted, you might think first of memory-based compression, which is one of the techniques almost every WAN Optimization solution offers. Memory-based compression can reduce the time it takes to do the first-time transmission of data – a factor of two for compressible data is typical – but in fact it doesn’t do proportionately better in the face of packet loss than when there is little or no loss. Reducing the amount of data sent by 50% doesn’t really help that much when it comes to packet loss and its impact on a window-based protocol like TCP. So while memory-based compression certainly doesn’t hurt here, it’s not really the answer when the problem is WAN packet loss.
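For a feel of what memory-based compression does, the sketch below runs Python's standard zlib over a made-up payload; note that artificially repetitive data like this compresses far better than the roughly 2:1 that is typical for real traffic.

```python
import zlib

# Made-up, highly repetitive payload; real traffic mixes compress much less well.
payload = b"GET /reports/quarterly?region=emea&format=csv HTTP/1.1\r\n" * 200

compressed = zlib.compress(payload, 6)
print(f"original:   {len(payload)} bytes")
print(f"compressed: {len(compressed)} bytes "
      f"({len(payload) / len(compressed):.1f}:1 ratio)")
```

Fewer bytes means fewer packets, but the packets that are sent remain subject to the same halve-the-window reaction to any loss.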

There are two other technologies found in most WAN Optimization products that do have a large impact on application performance in the face of packet loss: data deduplication, and a CIFS-specific application proxy.

Data deduplication essentially does “dynamic” caching of data locally, and while this requires at least one round-trip across the WAN, it will always involve far fewer such round-trip transactions than when the data is not stored locally. Besides saving bandwidth and speeding up data transfers in the more typical case of little to no packet loss, the application speed-up is proportionately greater still in the face of any meaningful amount of packet loss. And data deduplication is usually applicable for any application, not just for file access.
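The core idea of deduplication can be sketched in a few lines: break the data into chunks, fingerprint each chunk, and send only a tiny reference for any chunk the far side already holds. Real products use content-defined chunking and much more sophisticated indexing; the fixed-size chunks and names below are purely illustrative.

```python
import hashlib
import os

CHUNK_SIZE = 4096        # illustrative fixed-size chunks
remote_store = {}        # fingerprints the far-side appliance already holds

def send_deduplicated(data: bytes):
    """Return what would actually have to cross the WAN for this transfer."""
    wan_payload = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint in remote_store:
            wan_payload.append(("ref", fingerprint))   # tiny reference only
        else:
            remote_store[fingerprint] = chunk          # first time: ship the bytes
            wan_payload.append(("data", chunk))
    return wan_payload

data = os.urandom(20000)
first = send_deduplicated(data)
repeat = send_deduplicated(data)   # same data again: references only
print(sum(1 for kind, _ in first if kind == "data"), "chunks shipped the first time")
print(sum(1 for kind, _ in repeat if kind == "data"), "chunks shipped on the repeat")
```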

For the very chatty Microsoft CIFS protocol, data deduplication is usually combined with an application-specific proxy that reduces round-trip requests still further. By essentially doing local CIFS termination, a CIFS proxy provides much faster access to files on a remotely located file server, even for the first access. The impact on application performance of the combination of data deduplication and CIFS proxy can be a factor of 10 to 40 even when there is no packet loss; in the face of packet loss, the additional benefit can be another 2x to 10x, meaning a combined performance impact of anywhere from 20x to 400x or more. For files that have been previously accessed across the WAN, this is essentially full LAN-speed performance, versus the very slow, often unusable performance of accessing large files across a lossy WAN completely unaided.
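Taking those ranges at face value, the combined effect is simply the product of the two factors:

```python
# Combining the ranges quoted above: the proxy-plus-deduplication gain on a
# clean network, times the additional gain once packet loss appears.
no_loss_gain = (10, 40)    # CIFS proxy + deduplication, no packet loss
loss_gain = (2, 10)        # extra benefit in the face of packet loss

print(f"combined speed-up: {no_loss_gain[0] * loss_gain[0]}x "
      f"to {no_loss_gain[1] * loss_gain[1]}x")   # 20x to 400x
```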

Andy Gottlieb is a twenty-five-year data networking veteran who founded Talari Networks, a pioneer in WAN Virtualization technology, and served as its first CEO; he is now leading product management at Aryaka Networks. Andy is the author of an upcoming book on Next-generation Enterprise WANs. His blog is located at http://www.networkworld.com/community/blog/26142
