MetaMux breaks the 100 ns barrier

Posted on 18 April 2015 by David Snowdon

Our devices are some of the best-tested in the world. Not only do we (at Metamako) put them through their paces during design, simulation, qualification and testing, but our customers perform their own tests. We are regularly asked to prove our bold claims, and it's very important to us that we are honest about our devices' capabilities and performance. We describe here in detail how we've tested the MetaMux latency to our own satisfaction, but if you see holes in the methodology, we'd be very keen to hear about them – info@metamako.com

I've already discussed our methodology for testing device latency in this article, including the sources of error in the measurements. Things have moved on a bit since I wrote that article, so I will describe the methodology again here.

Test configuration

At a high level, the test setup is as one would expect. We generate packets on 10GbE, then pass them through a system under test. We record the time at which each packet enters the system under test and the time at which it exits, using splitters and an Endace DAG 9.2X2 card, which has a published accuracy of 7.5 ns. Since both timestamps are recorded by the same device, there is no need for synchronisation to another clock source.

Basic test setup

Our internal test infrastructure is configured using two MetaConnect 48 devices – one acting as a patch panel in our production rack, and another acting as a patch panel in our test rack. There are a number of passive direct-attach cables between them. As such, we actually use a MetaConnect 48 device to perform the splitting for these tests. Any latency added by these devices is calibrated out as part of the tests below. Using MetaConnect devices to perform the patching in our test and production racks gives us a lot of flexibility while maintaining determinism (as evidenced by the results below). Our test setup looks like this:

Actual test setup

To generate packets we simply use a flood ping between the two NICs. This is sufficient for latency testing the system under test; alternative traffic patterns could easily be generated. Traffic outbound from NIC 1 is passed through the system under test, while return traffic from NIC 2 goes directly through the MetaConnect. Each packet is therefore seen on both DAG capture ports, and the delay between the two timestamps is the time for the packet to pass through the 7 m direct-attach copper cable, through a MetaConnect 48, and then through the system under test. We then vary the system under test in order to compare multiple configurations.
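As a rough illustration of that calculation, here is a minimal Python sketch of pairing the two copies of each packet and computing the per-packet delay. It assumes the DAG captures have already been decoded into (ICMP sequence number, timestamp in nanoseconds) pairs; the function and data below are illustrative only, not Metamako's or Endace's actual tooling.

    # Hypothetical sketch: match the two copies of each ping packet seen on the
    # two DAG capture ports and compute the per-packet delay.
    def pair_latencies(entry_records, exit_records):
        """Match packets by ICMP sequence number; return latencies in ns."""
        entry_times = {seq: ts for seq, ts in entry_records}   # packet enters the SUT
        latencies = []
        for seq, exit_ts in exit_records:                      # packet exits the SUT
            entry_ts = entry_times.get(seq)
            if entry_ts is not None:
                latencies.append(exit_ts - entry_ts)
        return latencies

    # Example with made-up timestamps (ns):
    entering = [(1, 1_000_000), (2, 2_000_000)]
    exiting  = [(1, 1_000_099), (2, 2_000_101)]
    print(pair_latencies(entering, exiting))   # -> [99, 101]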

Methodology

The methodology used for testing is very similar to the one discussed for the MetaConnect 16. We start out with five fibres connected using fibre couplers. The system under test looks like this:

Four fibre couplers

This forms the zero-hop case – there are no hops through the MetaMux in this configuration, and it gives a baseline measurement against which we can compare the MetaMux performance.

We configure a MetaMux device with four 8-to-1 muxes. Each mux is configured to receive data on eight ports and send it out a single port. We disconnect one of the fibre couplers, connect the fibre carrying the incoming stream to one mux's input, and connect the other fibre to the corresponding MetaMux output. It's connected like this:

One hop

Having measured the latency through that configuration, we then test a third system – with two hops... 

Two hops

And so on, until we have replaced all of the fibre couplers with hops through the MetaMux. Since the fibre couplers themselves add no latency, the difference in latency between the test configurations is due to the number of hops through the MetaMux – in other words, the only thing that changes from one test configuration to another is the number of hops through the MetaMux (the fibre distance stays the same in all cases).

The fibres used were 1 m OM1 multi-mode, and the SFP+ modules were Finisar FTLX8571D3BCV, running at 10G. For completeness, the network cards were Mellanox ConnectX-3.

Each test ran for 30 s. The flood ping generated approximately 800,000 packets in that time (i.e. each result below is based on around 800,000 samples).
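For reference, the traffic-generation step can be as simple as a flood ping with a deadline. The snippet below is a minimal sketch assuming Linux iputils ping (flood mode requires root) and a placeholder address for NIC 2; it is not necessarily the exact invocation used in these tests.

    # Minimal sketch of the traffic-generation step, assuming Linux iputils ping.
    import subprocess

    subprocess.run(
        ["ping",
         "-f",          # flood ping: send packets as fast as replies come back
         "-w", "30",    # stop after a 30-second deadline
         "192.0.2.2"],  # placeholder address for NIC 2
        check=True,
    )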

Results

The results were captured using the Emulex tools and then run through this script – decode.py.
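The script itself is not reproduced here, but a sketch of the kind of summary it produces – minimum, mean, median, maximum and standard deviation over the per-packet latencies – might look like this (the sample data at the end is made up):

    # Not the actual decode.py: a sketch of the summary statistics it produces,
    # given per-packet latencies in nanoseconds (e.g. from pair_latencies() above).
    import statistics

    def summarise(latencies_ns):
        return {
            "min":    min(latencies_ns),
            "mean":   statistics.mean(latencies_ns),
            "median": statistics.median(latencies_ns),
            "max":    max(latencies_ns),
            "stddev": statistics.stdev(latencies_ns),
        }

    print(summarise([89.4, 95.1, 96.9, 96.8, 104.3]))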

Hops  Min (ns)    Mean (ns)   Median (ns)  Max (ns)    StdDev (ns)
0      89.406967   95.636179   96.857548   104.308128   4.888860
1     178.813934  194.978040  193.715096   216.066837   5.478998
2     275.671482  292.934851  290.572643   312.924385   5.652144
3     365.078449  390.376401  387.430191   409.781933   6.354743
4     469.386578  492.820367  491.738319   514.090061   6.485671

Plotting these results and fitting a linear model, we see a clear trend. The slope of this line is the number of nanoseconds per hop – the latency of each MetaMux hop. The high R^2 value gives good confidence that the results are meaningful, and the slope of 98.977 ns shows that the MetaMux latency is 99 ns per hop, including the latency of the SFP+ modules. The minimum, maximum and standard deviation values show the determinism of the mux – the non-determinism increases only slightly compared with the zero-hop case. Note that there is no tail – the maximum and minimum values above are the extremes; there are no outliers beyond them.
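For anyone who wants to reproduce the fit, the slope and R^2 can be recomputed from the mean values in the table above with a few lines of Python; numpy is used here purely for illustration.

    # Reproduce the linear fit from the mean latencies in the table above.
    import numpy as np

    hops = np.array([0, 1, 2, 3, 4])
    mean_ns = np.array([95.636179, 194.978040, 292.934851, 390.376401, 492.820367])

    slope, intercept = np.polyfit(hops, mean_ns, 1)   # slope = ns per MetaMux hop

    predicted = slope * hops + intercept
    ss_res = np.sum((mean_ns - predicted) ** 2)
    ss_tot = np.sum((mean_ns - mean_ns.mean()) ** 2)
    r_squared = 1 - ss_res / ss_tot

    print(f"{slope:.3f} ns/hop, R^2 = {r_squared:.5f}")   # slope comes out at ~98.977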

MetaMux latency results

Similar tests conducted for loopback through the matrix switch in MetaMux showed the latency to be 5.1 ns per hop for pure layer 1 switching. This differs from MetaConnect's 4 ns latency because of the much larger number of ports that can be connected via MetaMux's internal matrix switch.

Conclusions

We have measured the MetaMux latency and shown it to be under 100 ns, including the latency incurred by the Finisar SFP+ modules used. For more information, use cases and so on, have a look at the MetaMux product page.