The FD.io Vector Packet Processing (VPP) release 20.05 is now available at https://packagecloud.io/fdio/release. Full release notes can be found at https://docs.fd.io/vpp/20.05/d7/d84/release_notes_2005.html
But for those looking for the “reader’s digest” highlights, this blog is for you. There are three key takeaways herein:
- First, Release 20.05 introduces a number of feature enhancements – each of which is summarized below, including a brief explanation of why you should care.
- Second, the FD.io community is particularly enthusiastic about VPP’s IPsec performance. IPsec is notoriously difficult to support without cratering throughput, unless you invest in large and costly hardware. VPP breaks that paradigm. We’ll use this blog to highlight the value of the CSIT project, as well as its most recent IPsec performance testing results.
- Third, and perhaps most importantly, many of our blog readers might get a bit ‘bleary-eyed’ from detailed feature and performance factoids. We understand. We’ll take this opportunity to highlight a key use case story, and several well-written technology articles that keep the ‘bigger picture’ front of mind for all of us.
Read on, and let us know anytime how we can communicate with you even better!
Before we delve into performance metrics, we’d like to mention the project that generates our source data. The Continuous System Integration and Testing (CSIT) project provides ongoing, automated testing for FD.io software projects. The project’s tests focus on functional and performance regressions – constantly validating key network test benchmarks like No Drop Rate (NDR), and Maximum Receive Rate (MRR). For more information on the CSIT tests and reports, visit: https://docs.fd.io/csit/master/doc, https://docs.fd.io/csit/master/report and https://docs.fd.io/csit/master/trending.
The first performance figures we’d like to highlight are around IPsec. FD.io VPP 20.05 testing with a 1518 byte packet size on four (4) cores yields an impressive NDR of 47 Gbps. These results hold whether using 1000, 10,000, 20,000 or even 40,000 tunnels. Full testing was performed with varying packet sizes and encryption algorithms (see highlighted results below), using a 2.5GHz Skylake Platinum 8180 CPU with Turbo Boost off, and two (2) threads/core. The NICs used were Intel XXV710-DA2 2p25GE.
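As a rough sanity check on the headline figure, the throughput can be converted into a packet rate. The sketch below is illustrative arithmetic only. It assumes the 47 Gbps figure counts just the 1518-byte frame, and it also shows the alternative case where 20 bytes of preamble and inter-frame gap per frame are included. The function name and the `l1_overhead` parameter are hypothetical and are not part of CSIT.

```python
def packets_per_second(gbps: float, frame_bytes: int, l1_overhead: int = 0) -> float:
    """Convert a throughput in Gbps into packets per second.

    l1_overhead is the number of extra on-wire bytes per frame (assumption:
    20 bytes for Ethernet preamble plus inter-frame gap, 0 if not counted).
    """
    bits_per_frame = (frame_bytes + l1_overhead) * 8
    return gbps * 1e9 / bits_per_frame


# Headline figure from the text: 47 Gbps with 1518-byte packets.
rate_l2 = packets_per_second(47, 1518)
rate_l1 = packets_per_second(47, 1518, l1_overhead=20)

print(f"Frame bytes only:       {rate_l2 / 1e6:.2f} Mpps")
print(f"With 20 B L1 overhead:  {rate_l1 / 1e6:.2f} Mpps")
```

Either way, the result is a few million packets per second, which is the load the four cores are encrypting at this packet size.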
|Packet size|#cores|#tunnels|Throughput (Gbps)|
For a list of all the CSIT IPsec test results please visit https://docs.fd.io/csit/master/report/detailed_test_results/vpp_performance_results/crypto.html
Remote Direct Memory Access (RDMA) is a technology that allows computers in a network to exchange data in main memory without involving the processor, cache or operating system of either computer. VPP 20.05 now supports RDMA. CSIT has started testing with the Mellanox-CX556A. Using a packet size of 64 bytes, here are some of the highlights. CSIT RDMA trending data can be found at https://docs.fd.io/csit/master/trending/trending/ip4-2n-clx-cx556a.html#b-ip4routing-base-rdma
Maximum Receive Rate (MRR):
Without impeding performance, FD.io VPP continues to be enriched with new features. The full feature list for 20.05 can be found in the release notes referenced above. Highlights are excerpted here for convenience:
VPP is now fully integrated with the Kubernetes Calico plugin. As a result, containers can take advantage of VPP’s unrivaled IPsec performance. To see how this transpired, read this article: https://medium.com/fd-io-vpp/getting-to-40g-encrypted-container-networking-with-calico-vppon-commodity-hardware-d7144e52659a.
Snap packaging includes everything needed to run an application in a single, compressed binary package. This is quite useful for dodging distro library version issues. Users can now build on Ubuntu 20.04 and run on Ubuntu 18.04, or vice versa, without incident.
Arm Neoverse-N1 is now covered by VPP’s multi-architecture support mechanism. Key node processing functions are compiled with dedicated, optimized compiler options tuned per CPU architecture, and the appropriate variant is selected at runtime during the initialization stage. End users will now see higher performance when running VPP on Arm Neoverse-N1 CPUs.
Generic Segmentation Offload (GSO) is a technique that improves throughput. NICs receive jumbo frames up to 64KB in size from applications. GSO then reduces per packet processing in the network stack. Essentially, packets are segmented into smaller chunks using software-based segmentation, if the physical NIC does not support TCP Segmentation Offload (TSO). Hardware NICs also support some encapsulated types, e.g., VXLAN, and IPIP for TSO.
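To make the idea of software-based segmentation concrete, here is a minimal sketch that splits one large payload into chunks no bigger than a maximum segment size. It is not VPP code. The function name and the 1460-byte segment size are assumptions chosen for illustration, and a real GSO implementation also has to replicate and fix up headers (lengths, sequence numbers, checksums) for every segment.

```python
from typing import List


def segment_payload(payload: bytes, mss: int) -> List[bytes]:
    """Split a large payload into chunks of at most `mss` bytes.

    Illustrative only: header replication and checksum updates, which a
    real segmentation path must perform, are deliberately left out.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


# A 64KB buffer handed down by an application (assumed size from the text).
big = bytes(64 * 1024)
segments = segment_payload(big, 1460)  # 1460 is an assumed segment size

print(f"{len(segments)} segments, last one {len(segments[-1])} bytes")
```

The benefit comes from where the split happens: the stack handles one large buffer for most of its path, and the per-packet work is deferred to the last possible point.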
Although GSO/TSO was already supported on physical NICs that use DPDK, the VPP bonding driver lacked this support. If a physical interface was part of a bond and GSO was not supported in bonding, the TSO benefit of the physical NIC went unrealized. By adding GSO support to the VPP bonding driver, the full benefit of GSO for the physical NIC is achievable.
Additionally, VPP previously implemented software-based segmentation for regular non-encapsulated traffic. But, overlay network support was missing for GSO. Overlay networks, therefore, could not benefit from GSO throughput optimization. With 20.05, software-based GSO support for VXLAN tunnels has been implemented in VPP. VXLAN-based overlay networks can now see improved network throughput.
Internet Key Exchange Protocol (IKEv2)
Users can now use the ikev2_profile_set_ipsec_udp_port API message to specify a custom UDP port for IPsec communication (different from 4500, which IPsec uses by default when UDP encapsulation is turned on). Both IKE peers are now able to detect the liveness of their partner by periodically sending an empty informational request. In a case where the responder goes down, the initiator starts a new initiation process. If the responder cannot reach the initiator, it cleans up all associated connections for that initiator.
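The liveness behavior described above can be sketched as a small counter of unanswered probes plus a role-dependent reaction. This is a simplified model, not the VPP IKEv2 plugin. The class, the function, and the `max_missed` threshold are hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class LivenessMonitor:
    """Tracks unanswered liveness probes (empty informational requests)."""
    max_missed: int  # hypothetical threshold before the peer is declared dead
    missed: int = 0

    def on_probe_sent(self) -> None:
        # Every probe counts as missed until a reply arrives.
        self.missed += 1

    def on_reply(self) -> None:
        # Any reply proves the peer is alive and resets the counter.
        self.missed = 0

    def peer_is_dead(self) -> bool:
        return self.missed >= self.max_missed


def action_on_dead_peer(role: str) -> str:
    """Reaction described in the text: the initiator starts a new initiation,
    the responder cleans up the connections tied to that initiator."""
    return "reinitiate" if role == "initiator" else "cleanup"


monitor = LivenessMonitor(max_missed=3)
monitor.on_probe_sent()
monitor.on_reply()            # peer answered, counter resets
for _ in range(3):
    monitor.on_probe_sent()   # three probes in a row go unanswered

if monitor.peer_is_dead():
    print("initiator:", action_on_dead_peer("initiator"))
    print("responder:", action_on_dead_peer("responder"))
```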
FD.io VPP now uses DPDK version 20.02.
The goal of Quicly Crypto Offloading is to use the VPP crypto native engine for the encryption/decryption of QUIC packets.
Default behavior has been to pass each packet, one at a time, to the Quicly external library. With this patch, VPP now retains several packets, and uses its crypto API to encrypt/decrypt packets on a batch basis, thus optimizing QUIC communication and increasing throughput.
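The difference between per-packet and batched processing can be illustrated with a small sketch. It is not the VPP crypto API. The class name and batch size are hypothetical, and the XOR transform is a placeholder standing in for a real cipher so that the example stays self-contained.

```python
from typing import Callable, List


def xor_batch(packets: List[bytes]) -> List[bytes]:
    # Placeholder transform, NOT real cryptography: stands in for one
    # call into a crypto engine that processes a whole batch.
    return [bytes(b ^ 0x5A for b in p) for p in packets]


class BatchingEncryptor:
    """Holds packets back and hands them to the engine in batches."""

    def __init__(self, encrypt_batch: Callable[[List[bytes]], List[bytes]],
                 batch_size: int) -> None:
        self.encrypt_batch = encrypt_batch
        self.batch_size = batch_size
        self.pending: List[bytes] = []
        self.calls = 0  # number of engine invocations, for comparison

    def submit(self, packet: bytes) -> List[bytes]:
        self.pending.append(packet)
        if len(self.pending) >= self.batch_size:
            return self.flush()
        return []

    def flush(self) -> List[bytes]:
        if not self.pending:
            return []
        batch, self.pending = self.pending, []
        self.calls += 1
        return self.encrypt_batch(batch)


enc = BatchingEncryptor(xor_batch, batch_size=4)  # batch size is an assumption
packets = [bytes([i]) * 8 for i in range(10)]
out: List[bytes] = []
for p in packets:
    out.extend(enc.submit(p))
out.extend(enc.flush())  # drain whatever is left at the end

print(f"{len(out)} packets processed in {enc.calls} engine calls")
```

With one packet per call, the engine would be invoked once per packet. Batching amortizes the fixed cost of each invocation across several packets, at the price of holding packets briefly before they are processed.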
VRRPv3 (RFC 5798) provides failover capability for routers on IPv4 and IPv6 networks. Two or more routers, if configured with a VRRP Virtual Router, elect a master to forward packets. If the master becomes unavailable, the remaining peers will elect a new master, which will seamlessly take over forwarding for the Virtual Router.
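The outcome of an election can be sketched as picking the router with the highest priority, with ties broken by the higher primary IP address, as RFC 5798 specifies. The sketch below is a centralized view of the result only. In real VRRP the decision is distributed and driven by advertisements and timers. The type, function name, and addresses are hypothetical.

```python
import ipaddress
from typing import List, NamedTuple


class VrrpRouter(NamedTuple):
    name: str
    priority: int          # higher value is preferred
    primary_address: str   # used only to break priority ties


def elect_master(routers: List[VrrpRouter]) -> VrrpRouter:
    """Return the router that would win: highest priority first,
    then highest primary IP address as the tie-breaker."""
    return max(
        routers,
        key=lambda r: (r.priority, int(ipaddress.ip_address(r.primary_address))),
    )


routers = [
    VrrpRouter("r1", 200, "192.0.2.1"),
    VrrpRouter("r2", 100, "192.0.2.2"),
    VrrpRouter("r3", 100, "192.0.2.3"),
]

master = elect_master(routers)

# Simulate the master becoming unavailable: the remaining peers re-elect.
survivors = [r for r in routers if r != master]
new_master = elect_master(survivors)

print(f"master: {master.name}, after failover: {new_master.name}")
```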
When creating an IPsec SA via API, the UDP source port can now be specified (needed for NAT traversal). When IKE negotiates through a NAT, the source port is changed from port 4500 to an alternate port. With the SA now having this knowledge, packets are properly encapsulated upon return. For more information on VPP multi-point tunnels, visit: https://wiki.fd.io/view/VPP/IPSec#Multi-point_Tunnels
The Linux kernel provides the virtual network devices TUN and TAP. TAP is a layer 2 device. VPP has implemented a TAP interface using a virtio backend for fast communication between VPP and applications running on top of the Linux network stack.
TUN is a layer 3 point-to-point interface. It is faster than a TAP interface, as IP packets can traverse it without an Ethernet header. TUN support has been implemented in VPP using a virtio backend, so users can leverage faster communication between VPP and host applications.
A number of interesting articles have been written about FD.io VPP recently. Here are a few that are noteworthy:
- US Army Cyber School Deploys 100 Gbps Router Network with TNSR®: https://www.netgate.com/blog/us-army-cyber.html?utm_campaign=TNSR&utm_content=129157545&utm_medium=social&utm_source=twitter&hss_channel=tw-80797684
- Kernel bypass networking with FD.io and VPP: https://medium.com/swlh/kernel-bypass-networking-with-fd-io-and-vpp-fc3a53a669f9?source=friends_link&sk=ab92fa42f7ffdfb6dca39ae9601f3d3e
- Building fast QUIC sockets in VPP: https://blogs.cisco.com/cloud/building-fast-quic-sockets-in-vpp
- How FD.io VPP Release Improves Multicore IPSec: https://www.lfnetworking.org/blog/2020/03/23/how-fd-io-20-01-release-improves-multicore-ipsec/
- The UDPI (Universal Deep Packet Inspection) project: https://wiki.fd.io/view/UDPI