Network Traffic Capture & Aggregation: Why buffer size is crucial


Posted on 15 August 2017 by Matthew Knight

In the context of network traffic capture, buffering is the process by which an aggregation tap stores incoming network packets (frames), usually in memory, before egressing them, primarily to absorb short-term oversubscription of the output port(s) by the input port(s).

Three main ways to capture Ethernet traffic

There are three ways in which you can capture Ethernet traffic:

  1. Tapping a link: Optical taps are widely used. However, because they require fibre rather than copper media, can only tap the link they are physically plugged into, and take up additional rack space, Layer 1 switches are increasingly becoming the enterprise standard for tapping. Layer 1 switches offer higher tap port density and can be dynamically reconfigured to tap any port connected to or through them. Metamako devices even offer Layer 2-style statistics on every port.

  2. Configuring an Ethernet switch: Many switches can be configured to mirror traffic from one or more incoming or outgoing links to a dedicated egress port for off-box capture and analysis. This is typically referred to as a SPAN port.

  3. On a networked host: On a Linux host, the libpcap interface provides the ability to request that the kernel replicate and deliver raw frames entering or leaving the host to an application running locally.

Each of the above capture methods has its pros and cons (also read: Avoiding the network tap dance); however, in all cases the ultimate goal is to capture traffic without dropping possibly important packets (frames). One key point worth reinforcing is that a 10 GbE Ethernet link is bi-directional and can carry 20 Gbps of traffic. Capture ports are inherently unidirectional, so for complete bi-directional capture, two capture ports are required per link.

Why buffering is useful

The vast majority of network links are not running at anything near their capacity. Less than 10% average link utilisation is very common. Given that ports on capture devices can cost tens of thousands of dollars each, it really does not make sense to dedicate them to links running at a fraction of their capacity. Whatever type of capture device is ultimately used, links can be combined, or aggregated: multiple links with low utilisation feeding into a smaller number of capture ports.

This is the role of the aggregation tap. In any situation with more links in than out, without buffering the instantaneous aggregate ingress bandwidth cannot exceed the bandwidth of the egress port or packets (frames) will be dropped, simply because there is nowhere to store even a single packet (frame) if two packets (frames) arrive simultaneously. By buffering incoming packets (frames) arriving on multiple ports simultaneously, all packets (frames) are egressed and available for capture, as long as the buffer does not overflow (i.e. the aggregate ingress rate does not exceed the egress rate for too long).
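The effect of a buffer can be sketched with a toy tick-based simulation (illustrative only: the port counts, utilisation and buffer size below are made-up parameters, not a model of any particular device):

```python
import random

def aggregate(streams, buffer_slots):
    """Aggregate per-tick packet arrivals from several capture ports into one
    egress port that can forward at most one packet per tick. Overflow beyond
    buffer_slots is dropped; buffer_slots=0 models an unbuffered tap."""
    buffered = dropped = 0
    for arrivals in zip(*streams):       # one tuple of arrivals per tick
        buffered += sum(arrivals)        # ingress from all ports
        buffered = max(0, buffered - 1)  # egress drains one packet per tick
        if buffered > buffer_slots:      # anything that does not fit is lost
            dropped += buffered - buffer_slots
            buffered = buffer_slots
    return dropped

# Two ports, each ~30% utilised: collisions are common, sustained overload is not.
random.seed(1)
a = [random.random() < 0.3 for _ in range(10_000)]
b = [random.random() < 0.3 for _ in range(10_000)]

print("unbuffered drops:", aggregate([a, b], buffer_slots=0))
print("64-slot buffer drops:", aggregate([a, b], buffer_slots=64))
```

Without a buffer, every simultaneous arrival costs a packet; with even a modest buffer and average load below the egress rate, drops disappear.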

Aggregating ports without a buffer

The following diagram represents a situation where two Ethernet input streams are aggregated into a single stream with no buffering. The aggregated stream can only be switched between one or the other input stream at any time. Any time packets (frames) arrive on both input streams simultaneously, one of the packets (frames) will have to be dropped.

[Figure: Unbuffered 2-way Ethernet switch]

The two diagrams below show a (contrived) scenario in which packets (frames) are interleaved successfully because they never arrive simultaneously, and the real-world case where they do and loss results: two of Stream B's packets (frames), new to the second diagram, coincide with Stream A's packets (frames) and so must be dropped.

 

[Figure: Unbuffered aggregation with no simultaneous arrivals (no loss)]

[Figure: Unbuffered aggregation with simultaneous arrivals (packets dropped)]

Managing oversubscription ratios

When aggregating captured traffic using an aggregation switch, it is worth taking the time to understand the trade-off in the ratio of capture ports to egress ports. If too many capture ports are aggregated into a single egress port of the same speed (e.g. 1 GbE or 10 GbE), and the aggregate instantaneous bandwidth of the capture ports exceeds the bandwidth of the egress port, the excess is either buffered or dropped. To mitigate dropped packets (frames), either larger buffers can be used or the ratio of capture ports per egress port must be reduced. Even a 2:1 aggregation ratio can cause a problem if the sum of the instantaneous traffic bandwidth on the capture ports exceeds the bandwidth of the output port.

Buffering and line rate

The term line rate is readily understood to mean that a given link is saturated with data. In the case of Ethernet this means frames of any valid size separated by an average of 12 bytes of Inter-Frame Gap (IFG). It is a term that relates to bandwidth. It is important to keep in mind that bandwidth is generally expressed in units of bits per second or bps. So if a 10 GbE link is at 50% bandwidth utilisation over a second, 625 MB is being transferred in that second (actually somewhat less when IFGs are taken into account).
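As a quick sanity check on that arithmetic, the utilisation-to-bytes conversion can be written out (a back-of-envelope helper; the function name is illustrative):

```python
def bytes_transferred(link_gbps, utilisation, seconds=1.0):
    """Bytes moved in a period at a given average utilisation.
    Ignores the Inter-Frame Gap, so it slightly overestimates payload."""
    return link_gbps * 1e9 * utilisation * seconds / 8  # bits -> bytes

# A 10 GbE link at 50% average utilisation over one second:
print(bytes_transferred(10, 0.5) / 1e6, "MB")  # 625.0 MB
```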

What is not apparent however is the distribution of that data over the second:

  • Was all the data sent in the first half of the second and the line was idle for the second half?
  • Were the frames spread out so the average utilisation was consistently 50% throughout the second?
  • Was the frame distribution throughout the second random but with an average bandwidth utilisation of 50%?

In fact, the instantaneous bandwidth utilisation of an Ethernet link can really only be either 100% (a frame is on the wire) or 0% (idle). Though Ethernet bandwidth is usually quoted over a second, when it comes to buffering, the traffic patterns within that second can be extremely important, especially the length of the periods where the link is at line rate (100%), i.e. back-to-back frames.

The importance of buffer size

At this point, it is worth looking at what constitutes instantaneous. Network bandwidth is usually expressed in kilo-, mega- or gigabits per second (kbps, Mbps, Gbps) and memory in bytes with the same prefixes. For any given amount of buffer memory, bandwidth can be converted into the time it takes to fill it. For example, assuming a typical aggregation switch with 10 MB of buffer and two 10 GbE capture ports feeding a single 10 GbE egress port, what is the longest line rate burst that can be absorbed without dropping any packets (frames)?

The formula is:

$$\textrm{Time to fill Buffer}=\cfrac{\textrm{Buffer Size (MB)}}{\left(\textrm{Number of Capture Ports}\times\cfrac{\textrm{Capture Port Bandwidth (Gbps)}}{\textrm{8 (bits)}}\right)-\cfrac{\textrm{Egress Port Bandwidth (Gbps)}}{\textrm{8 (bits)}}}$$

N.B. Because the buffer drains to the egress port at 10 Gbps, it fills at only the net 10 Gbps rather than the incoming 20 Gbps.

Filling it in:

$$\cfrac{\textrm{10 (MB)}}{\left(\textrm{2}\times\cfrac{\textrm{10 (Gbps)}}{\textrm{8 (bits)}}\right)-\cfrac{\textrm{10 (Gbps)}}{\textrm{8 (bits)}}}=\cfrac{\textrm{10 (MB)}}{\textrm{1.25 (GB/s)}}=0.008\textrm{ s or }8\textrm{ ms}$$

 

If we increase the aggregation ratio to a more realistic 10:1, how does this affect our ability to absorb bursts?

$$\cfrac{\textrm{10 (MB)}}{\left(\textrm{10}\times\cfrac{\textrm{10 (Gbps)}}{\textrm{8 (bits)}}\right)-\cfrac{\textrm{10 (Gbps)}}{\textrm{8 (bits)}}}=\cfrac{\textrm{10 (MB)}}{\textrm{11.25 (GB/s)}}=0.0009\textrm{ s or }900 \mu\textrm{s}$$

 

In summary, with 2:1 oversubscription, a line rate burst on both capture ports longer than 8 ms would cause the buffer to overflow and almost certainly result in lost capture packets (frames). At 10:1 oversubscription, a line rate burst on all ten capture ports longer than roughly 900 µs would have the same impact. It is important to keep in mind that in these worked examples, the aggregate per second line bandwidth utilisation could be as low as 1% and the buffer would still overflow. All it takes is a short burst of traffic (a microburst) arriving on multiple otherwise idle aggregated Ethernet links at the same time.

The Solution: Using Deep Buffers

Deep buffering allows many more ports to be aggregated into a single link per device, and hence requires fewer aggregation taps.

These deep buffers permit aggregation at significantly higher link utilisation and/or higher aggregation ratios. The buffers can hold multiple seconds of captured packets (frames) at 10 GbE coming in across multiple links. In short, deep buffers can smooth out much longer bursts and weather far longer periods where egress bandwidth is oversubscribed without dropping packets (frames).

An Overview of Metamako Deep Buffering

Metamako's MetaWatch application turns one of Metamako's K-Series devices into an extremely powerful aggregation tap. The K-Series devices can have 32, 48 or 96 ports and can capture any 30 ports into either an 8 GB or a 32 GB buffer. To put this into perspective, this buffer is 1000 times larger than that of most ASIC-based aggregation taps. Taking our earlier worked examples and adjusting them for 8 GB of buffer:

2 x 10 GbE capture:

$$\cfrac{\textrm{8 000 (MB)}}{\left(\textrm{2}\times\cfrac{\textrm{10 (Gbps)}}{\textrm{8 (bits)}}\right)-\cfrac{\textrm{10 (Gbps)}}{\textrm{8 (bits)}}}=\cfrac{\textrm{8 000 (MB)}}{\textrm{1.25 (GB/s)}}=6.4\textrm{ s}$$

10 x 10 GbE capture:

$$\cfrac{\textrm{8 000 (MB)}}{\left(\textrm{10}\times\cfrac{\textrm{10 (Gbps)}}{\textrm{8 (bits)}}\right)-\cfrac{\textrm{10 (Gbps)}}{\textrm{8 (bits)}}}=\cfrac{\textrm{8 000 (MB)}}{\textrm{11.25 (GB/s)}}=0.7\textrm{ s}$$
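The same calculation confirms the deep-buffer figures (the helper below is illustrative, applying the earlier formula with an 8 GB buffer):

```python
def time_to_fill(buffer_mb, n_capture_ports, capture_gbps, egress_gbps):
    """Seconds of line-rate burst a buffer absorbs: size over net fill rate."""
    return buffer_mb / ((n_capture_ports * capture_gbps - egress_gbps) * 1000 / 8)

print(time_to_fill(8000, 2, 10, 10))   # 2:1  -> 6.4 s
print(time_to_fill(8000, 10, 10, 10))  # 10:1 -> ~0.71 s
```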


Scale Capture/Analytics for average rather than peak aggregated rates

MetaWatch has a further trick up its sleeve: it can accept IEEE 802.3x PAUSE frames on its aggregated egress ports. When it receives them, the port applies back pressure to the buffer, reducing the egress bandwidth. Any device consuming the aggregated stream that supports Ethernet Flow Control can take advantage of this useful feature. For example, the ixgbe Linux driver for Intel 10 GbE adapters has Ethernet Flow Control enabled by default, and the majority of Ethernet switches support it. Capture and analytics devices that can generate PAUSE frames can therefore be scaled down in performance, and hence cost, to take advantage of the deep upstream buffer.
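On a Linux host consuming the aggregated stream, flow control can usually be inspected and toggled with ethtool (the interface name below is a placeholder, and exact behaviour depends on the NIC and driver):

```shell
# Show the NIC's current pause (flow control) parameters.
ethtool -a eth0

# Ask the NIC to honour and emit IEEE 802.3x PAUSE frames.
# With the ixgbe driver this is typically already on by default.
ethtool -A eth0 rx on tx on
```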

In Summary

  • Buffers are required in any realistic aggregation scenario
  • The size of the aggregation tap's buffer dictates how many ports can be aggregated
  • The ASICs in most aggregation taps have only around 10 MB of buffer, limiting simultaneous line rate bursts at 10 GbE on two ports to single-digit milliseconds
  • MetaWatch has buffers of 8 GB or 32 GB allowing far greater aggregation ratios to be configured with a vastly reduced likelihood of packet (frame) loss
  • MetaWatch can also moderate its aggregated egress rate in response to Ethernet Flow Control configured on the consuming device allowing a less costly device to be used
