DPDK 02
Introduction
In this section we will write a simple C application to receive packets.
Let’s dive in!
PreReqs
Make sure that
- DPDK is built
- A DPDK-compatible NIC is bound to the igb_uio driver
- Hugepages are set up
- Minimum system requirements are met
Please see DPDK-01 if any of these prerequisites are not in place.
File Setup
From here on out we will refer to our DPDK installation directory as RTE_SDK.
- Go into your DPDK installation directory and run the following command
export RTE_SDK=$(pwd)
- The directory should contain the following files and folders
app/
build/
buildtools/
.ci/
config/
devtools/
doc/
drivers/
.editorconfig
examples/
.gitattributes
.gitignore
kernel/
lib/
license/
MAINTAINERS
Makefile
meson.build
meson_options.txt
README
.travis.yml
usertools/
VERSION
- For our purposes, we are concerned with the examples directory and the usertools directory.
- The examples directory contains the example code, and we will also create our application here.
- The usertools directory contains helper scripts such as
  - cpu_layout.py: prints out the CPU architecture
  - dpdk-devbind.py: for binding/unbinding PCI devices
  - dpdk-hugepages.py: for setting up hugepages
- We used some of these at the time of setup.
- Now, go to the examples folder and create a folder for our app
cd $RTE_SDK/examples
mkdir my_app
cd my_app
Makefile
- Let's create a simple Makefile
vim Makefile
- We are going to name our app my_app and use main.c as the source C file
# binary name
APP = my_app
# all sources are stored in SRCS-y
SRCS-y := main.c
# Build using pkg-config variables; stop if DPDK is not installed
ifneq ($(shell pkg-config --exists libdpdk && echo 0),0)
$(error "no installation of DPDK found")
endif
all: shared
.PHONY: shared static
# Recipe lines below must start with a tab character
shared: build/$(APP)-shared
	ln -sf $(APP)-shared build/$(APP)
static: build/$(APP)-static
	ln -sf $(APP)-static build/$(APP)
PKGCONF ?= pkg-config
PC_FILE := $(shell $(PKGCONF) --path libdpdk 2>/dev/null)
CFLAGS += -O3 $(shell $(PKGCONF) --cflags libdpdk)
LDFLAGS_SHARED = $(shell $(PKGCONF) --libs libdpdk)
LDFLAGS_STATIC = $(shell $(PKGCONF) --static --libs libdpdk)
CFLAGS += -DALLOW_EXPERIMENTAL_API
build/$(APP)-shared: $(SRCS-y) Makefile $(PC_FILE) | build
	$(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_SHARED)
build/$(APP)-static: $(SRCS-y) Makefile $(PC_FILE) | build
	$(CC) $(CFLAGS) $(SRCS-y) -o $@ $(LDFLAGS) $(LDFLAGS_STATIC)
build:
	@mkdir -p $@
.PHONY: clean
clean:
	rm -f build/$(APP) build/$(APP)-static build/$(APP)-shared
	test -d build && rmdir -p build || true
Main.c
- Now let's move on to the main program
#include <stdio.h>
#include <stdlib.h>
#include <rte_eal.h>
#include <rte_common.h>
int main(int argc, char* argv[]){
    printf("\n");
    return 0;
}
EAL: “Environment Abstraction Layer”
- The first thing that is required is initialising the EAL. We use the rte_eal_init() function.
  - It gets its parameters from the command line and sets up, among other things, the following:
    - cpu_init: fill the cpu_info structure
    - log_init
    - config_init: create the memory configuration in shared memory
    - pci_init: scan the PCI bus
    - memory_init (hugepages)
    - memzone_init: initialize the memzone subsystem
    - lcore_init: create a thread per lcore
    - pci_probe: probe all physical devices
- Let's start by including rte_eal.h
  - It contains the EAL configuration functions
  - It contains the declaration of rte_eal_init()
    - rte_eal_init() returns a negative value in case of an error.
    - On success it returns the number of arguments it parsed, which is why the code below skips that many entries of argv.
  - See more here.
- We will also include rte_common.h
  - It contains generic, commonly used macro and inline function definitions for DPDK.
  - It contains the declaration of rte_exit()
    - We need it to exit in case of an error.
  - See more here
#include <stdio.h>
#include <stdlib.h>
#include <rte_eal.h>
#include <rte_common.h>
int main(int argc, char* argv[]){
    printf("\n");
    int ret;
    /* The EAL arguments are passed when calling the program */
    ret = rte_eal_init(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "EAL Init failed\n");
    argc -= ret;
    argv += ret;
    return 0;
}
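To make the `argc -= ret; argv += ret;` step a bit more concrete, here is a minimal sketch in plain C with no DPDK dependency. `fake_eal_init()` and `split_args()` are hypothetical names made up for this illustration; `fake_eal_init()` is not a DPDK function. It only mimics the idea that the EAL consumes everything up to and including `--` and returns how many arguments it consumed. Keeping the program name in front of the application options is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for rte_eal_init(): consumes the arguments up to
 * and including "--" and returns how many it consumed, or -1 on error. */
int fake_eal_init(int argc, char **argv)
{
    int i;

    if (argc < 1 || argv == NULL)
        return -1;
    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--") == 0)
            break;
    }
    if (i == argc)      /* no "--": every argument belonged to the EAL */
        i = argc - 1;
    argv[i] = argv[0];  /* assumption: program name stays in front */
    return i;
}

/* Applies the same shift as the tutorial code and hands back the
 * arguments that are left for the application itself. */
int split_args(int argc, char **argv, char ***app_argv)
{
    int ret = fake_eal_init(argc, argv);

    if (ret < 0)
        return -1;
    *app_argv = argv + ret;  /* same as: argv += ret; */
    return argc - ret;       /* same as: argc -= ret; */
}
```

With the arguments `./my_app -l 0-3 -- -p 0x3`, the sketch leaves three entries for the application: the program name, `-p` and `0x3`.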
Get Port Count
- We are going to be receiving on one port and transmitting on the other. For that we need an even number of ports.
- Let's first include rte_ethdev.h in our program.
  - It includes functions to set up/configure an Ethernet device.
  - Ethernet devices are represented by a generic data structure of type rte_eth_dev.
  - It contains the declaration of rte_eth_dev_count_avail()
    - It returns the number of Ethernet devices available to the application, i.e. the DPDK-bound devices.
  - See more here
- Let's do that next within our main function
uint16_t nb_ports;
nb_ports = rte_eth_dev_count_avail();
/* We need at least two ports, and an even number of them */
if (nb_ports < 2 || (nb_ports & 1))
    rte_exit(EXIT_FAILURE, "Invalid port number\n");
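The condition `nb_ports < 2 || (nb_ports & 1)` rejects port counts that cannot be split into pairs. As a small sketch in plain C (the helper name `port_count_is_valid` is made up here and is not part of DPDK), the same test can be written the other way round:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the ports can be grouped into pairs: at least two ports and
 * an even count. The lowest bit of an odd number is always 1. */
bool port_count_is_valid(uint16_t nb_ports)
{
    return nb_ports >= 2 && (nb_ports & 1) == 0;
}
```

So 2 and 4 ports are accepted, while 0, 1 and 3 ports make the application exit.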
Logs
- We are going to include rte_log.h for logging.
  - It contains the log API for RTE applications.
  - It contains the RTE_LOG() macro we are going to use for logging.
- Define the user log type as follows:
#define RTE_LOGTYPE_APP RTE_LOGTYPE_USER1
- Now update the code as follows
/*...*/
nb_ports = rte_eth_dev_count_avail();
if (nb_ports < 2 || (nb_ports & 1))
    rte_exit(EXIT_FAILURE, "Invalid port number\n");
RTE_LOG(INFO, APP, "Number of ports:%u\n", nb_ports);
/*...*/
MBUF Pool Creation
We need to reserve some memory for holding the packets in our application. In this section we are going to do just that.
- We are going to include rte_mbuf.h for memory management.
  - The mbuf library provides the ability to create and destroy buffers that may be used by the RTE application to store message buffers.
  - We are going to use rte_pktmbuf_pool_create() to set up a buffer pool to hold packets.
  - It uses the mempool library.
  - Read the DPDK docs regarding mbuf here
  - A good read about general concepts here
- First we are going to define NB_MBUFS and MBUF_CACHE_SIZE, along with the RX and TX ring sizes that we will need later.
#include <rte_mbuf.h>
#define NB_MBUFS 8191
#define MBUF_CACHE_SIZE 250
#define RX_RING_SIZE 128
#define TX_RING_SIZE 512
- Now we are going to call rte_pktmbuf_pool_create() in the main function.
  - This function creates and initializes a packet mbuf pool.
  - This function uses rte_memzone_reserve() to allocate memory.
    - It reserves a portion of physical memory from hugepages
  - The required params are as follows
|---------------|-------------------------------------------------------|
| param         | Description                                           |
|---------------|-------------------------------------------------------|
| name          | The name of the mbuf pool. We are setting it to       |
|               | "MBUF_POOL"                                           |
| n             | The number of elements in the mbuf pool. We are       |
|               | setting it to NB_MBUFS * number of ports. The optimum |
|               | size (in terms of memory usage) for a mempool is when |
|               | n is a power of two minus one: n = (2^q - 1).         |
| cache_size    | Size of the per-core object cache.                    |
|               | Set to MBUF_CACHE_SIZE.                               |
| priv_size     | Size of application private area between the rte_mbuf |
|               | structure and the data buffer. Set to 0.              |
| data_room_size| Size of data buffer in each mbuf,                     |
|               | including RTE_PKTMBUF_HEADROOM.                       |
| socket_id     | The socket identifier where the memory should be      |
|               | allocated. The value can be SOCKET_ID_ANY             |
|               | if there is no NUMA constraint for the reserved zone. |
|---------------|-------------------------------------------------------|
- If the creation is unsuccessful we are going to exit the program.
struct rte_mempool *mbuf_pool;
/* Create a new mbuf mempool */
mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL",
        NB_MBUFS * nb_ports,
        MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
        rte_socket_id());
if (mbuf_pool == NULL)
    rte_exit(EXIT_FAILURE, "mbuf_pool create failed\n");
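The table mentions that a mempool is most memory-efficient when n has the form 2^q - 1. NB_MBUFS on its own fits that form (8191 = 2^13 - 1), but NB_MBUFS * 2 = 16382 does not (the next such value is 16383 = 2^14 - 1). The sketch below, in plain C, shows one way to check this. Both helper names are made up for this illustration and are not part of DPDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when n has the form 2^q - 1 (q >= 1), i.e. n + 1 is a power of two. */
bool is_pow2_minus_one(uint32_t n)
{
    uint64_t m = (uint64_t)n + 1;  /* widen so n = UINT32_MAX cannot overflow */

    return n != 0 && (m & (m - 1)) == 0;
}

/* Largest value of the form 2^q - 1 that does not exceed n (0 if n is 0). */
uint32_t round_down_pow2_minus_one(uint32_t n)
{
    uint32_t r = 0;

    /* Grow r as 1, 3, 7, 15, ... while it still fits under n */
    while (r != UINT32_MAX && ((r << 1) | 1u) <= n)
        r = (r << 1) | 1u;
    return r;
}
```

Whether the small saving is worth adjusting the pool size is a judgment call; the pool works with any n that fits in memory.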
Ports Initialisation
- We will create a function port_init for initialising the ports.
static int port_init(uint16_t port, struct rte_mempool *mbuf_pool){
    return 0;
}
- We will loop through all the ports and initialise them one by one.
int main(int argc, char* argv[]){
    uint16_t portid;
    /*...*/
    /* Initialize all ports */
    for (portid = 0; portid < nb_ports; portid++){
        if (port_init(portid, mbuf_pool) != 0)
            rte_exit(EXIT_FAILURE, "port init failed\n");
    }
    /*...*/
}
- Now let's update the port_init function
- First we set the rte_eth_conf
  - A structure used to configure an Ethernet port
  - We can set up Receive mode and Transmit mode flags
  - If you want to enable RSS (Receive Side Scaling) this would be the place to start.
struct rte_eth_conf port_conf = {
    .rxmode = { .max_rx_pkt_len = RTE_ETHER_MAX_LEN }
};
- Now we define the number of rx/tx queues we are going to use per port, i.e. 1 RX and 1 TX queue per port
const uint16_t nb_rx_queues = 1;
const uint16_t nb_tx_queues = 1;
- Next we use the rte_eth_dev_configure() function to set up the ports.
  - It configures the Ethernet device.
  - This function must be invoked before any other function in the Ethernet API.
/* configure the ethernet device */
ret = rte_eth_dev_configure(port,
        nb_rx_queues,
        nb_tx_queues,
        &port_conf);
- Now we allocate one RX queue per port.
- We will use the function rte_eth_rx_queue_setup()
  - The function allocates a contiguous block of memory for receive descriptors
  - Read more here
- We will set up 1 RX queue per port
for (q = 0; q < nb_rx_queues; q++){
    ret = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
            rte_eth_dev_socket_id(port),
            NULL, mbuf_pool);
    if (ret < 0)
        return ret;
}
- Now we allocate one TX queue per port.
- We will use the function rte_eth_tx_queue_setup()
  - It allocates and sets up a transmit queue for an Ethernet device.
  - Read more here
- We will set up 1 TX queue per port
for (q = 0; q < nb_tx_queues; q++){
    ret = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
            rte_eth_dev_socket_id(port),
            NULL);
    if (ret < 0)
        return ret;
}
- All together now:
static int port_init(uint16_t port, struct rte_mempool *mbuf_pool){
    struct rte_eth_conf port_conf = {
        .rxmode = { .max_rx_pkt_len = RTE_ETHER_MAX_LEN }
    };
    const uint16_t nb_rx_queues = 1;
    const uint16_t nb_tx_queues = 1;
    uint16_t q;
    int ret;
    /* configure the ethernet device */
    ret = rte_eth_dev_configure(port,
            nb_rx_queues,
            nb_tx_queues,
            &port_conf);
    if (ret != 0)
        return ret;
    /* Allocate and setup 1 RX queue per Ethernet port */
    for (q = 0; q < nb_rx_queues; q++){
        ret = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
                rte_eth_dev_socket_id(port),
                NULL, mbuf_pool);
        if (ret < 0)
            return ret;
    }
    /* Allocate and setup 1 TX queue per Ethernet port */
    for (q = 0; q < nb_tx_queues; q++){
        ret = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
                rte_eth_dev_socket_id(port),
                NULL);
        if (ret < 0)
            return ret;
    }
    return 0;
}
- Finally, we start the device using rte_eth_dev_start()
- We will also enable promiscuous mode with rte_eth_promiscuous_enable()
  - In promiscuous mode the network adapter accepts every packet it sees on the wire, not only those addressed to its own MAC address
- Update the port_init function as follows, just before its final return 0;
/* start the ethernet port */
ret = rte_eth_dev_start(port);
if (ret < 0){
    return ret;
}
/* Enable RX in promiscuous mode for the Ethernet device */
rte_eth_promiscuous_enable(port);
Check Link Status
- We are going to create a new function to check the link status of the ports
  - It is going to tell us which of the links are up and which are down.
static int check_link_status(uint16_t nb_ports)
{
    struct rte_eth_link link;
    uint16_t port;
    for (port = 0; port < nb_ports; port++){
        rte_eth_link_get(port, &link);
        if (link.link_status == ETH_LINK_DOWN){
            RTE_LOG(INFO, APP, "Port: %u Link DOWN\n", port);
            return -1;
        }
        RTE_LOG(INFO, APP, "Port: %u Link UP Speed %u\n",
                port, link.link_speed);
    }
    /* Every link is up */
    return 0;
}
- Now we can call our function in main()
ret = check_link_status(nb_ports);
if (ret < 0){
    RTE_LOG(WARNING, APP, "Some ports are down\n");
}
Workers Init
- We will create a worker function that will run on all the worker cores
  - This worker will be responsible for
    - receiving packets
    - parsing packets
    - transmitting packets
int worker_main(void *arg){
    /* Run until app is killed or quit */
    for(;;){
    }
    return 0;
}
- In the main program we will launch the worker_main() function on the available lcores
- We will use rte_eal_mp_remote_launch() for this
  - It launches a function on all lcores
  - The first argument is the function (worker_main) that needs to be launched on the cores
  - The second argument is the argument to pass to that function. Since we have nothing to pass we will set this to NULL.
  - The third argument specifies whether the function also runs on the main core
    - including the MASTER/MAIN core (CALL_MAIN flag)
    - excluding the MASTER/MAIN core (SKIP_MAIN flag)
- Then we use rte_eal_mp_wait_lcore() so that the MASTER core waits for the worker cores.
  - This keeps the program from exiting.
  - We can print stats on the master core if needed.
ret = check_link_status(nb_ports);
if (ret < 0){
    RTE_LOG(WARNING, APP, "Some ports are down\n");
}
rte_eal_mp_remote_launch(worker_main, NULL, SKIP_MAIN);
rte_eal_mp_wait_lcore();
Receive/Transmit Packets
- Let's define BURST_SIZE first. This defines how many packets the core will pick up at a time.
#define BURST_SIZE 32
- In worker_main we will set up an array of struct rte_mbuf pointers of size BURST_SIZE
  - It is declared as struct rte_mbuf *bufs[BURST_SIZE];
- We will receive packets using the rte_eth_rx_burst() function
  - It requires
    - the port number on which to receive
    - the queue number on which to receive
    - the rte_mbuf pointer array to hold the packets
    - the maximum number of packets to receive
  - It returns the number of packets actually received (nb_rx)
- We will transmit the received packets on the opposite port.
  - Receive on 1, send on 0
  - Receive on 0, send on 1
- For transmitting we use the function rte_eth_tx_burst(). It requires
  - the port number on which to transmit
  - the queue number on which to transmit
  - the rte_mbuf pointer array that holds the packets
  - the number of packets to transmit
  - It returns the number of packets actually queued for transmission (nb_tx)
- After that we loop through the packets again and free all the unsent ones, i.e. those from index nb_tx up to nb_rx. The packets that were sent are freed by the driver once transmitted, so we must not free them ourselves.
  - Packets are freed using the rte_pktmbuf_free() function
int worker_main(void *arg){
    const uint16_t nb_ports = rte_eth_dev_count_avail();
    uint16_t port;
    uint16_t dest_port;
    (void)arg; /* unused */
    /* Run until app is killed or quit */
    for(;;){
        /* Receive packets on each port in turn */
        for (port = 0; port < nb_ports; port++){
            struct rte_mbuf *bufs[BURST_SIZE];
            uint16_t nb_rx;
            uint16_t nb_tx;
            uint16_t buf;
            /* Get a burst of RX packets */
            nb_rx = rte_eth_rx_burst(port, 0,
                    bufs, BURST_SIZE);
            if (unlikely(nb_rx == 0))
                continue;
            /* Send the burst of TX packets to the
             * paired port
             */
            dest_port = port ^ 1;
            nb_tx = rte_eth_tx_burst(dest_port, 0,
                    bufs, nb_rx);
            /* Free only the unsent packets; sent ones belong to the driver */
            for (buf = nb_tx; buf < nb_rx; buf++)
                rte_pktmbuf_free(bufs[buf]);
        }
    }
    return 0;
}
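Two small details in the worker are easy to get wrong: how a port finds its partner, and which buffers still belong to us after a TX burst. The sketch below isolates both in plain C with no DPDK dependency. `peer_port()` and `count_unsent()` are hypothetical helpers made up for this illustration; the second one only counts the buffers that the real loop would hand to rte_pktmbuf_free().

```c
#include <stdint.h>

/* Ports are paired (0,1), (2,3), ...: flipping the lowest bit of the
 * port number gives the partner port. */
uint16_t peer_port(uint16_t port)
{
    return (uint16_t)(port ^ 1);
}

/* After a TX burst, buffers [0, nb_tx) were accepted for transmission and
 * buffers [nb_tx, nb_rx) were not. Only the second range must be freed
 * by the application. */
uint16_t count_unsent(uint16_t nb_rx, uint16_t nb_tx)
{
    uint16_t buf;
    uint16_t freed = 0;

    for (buf = nb_tx; buf < nb_rx; buf++)
        freed++;  /* the real loop calls rte_pktmbuf_free(bufs[buf]) here */
    return freed;
}
```

If all 32 received packets are accepted for transmission nothing is freed; if only 20 are accepted, the last 12 are.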
Some Stats
- Let's print some stats when we exit the program.
- We'll need to add a signal handler to first catch the interrupt.
  - First let's create a flag that we will change when the signal is caught (bool needs stdbool.h, and the signal functions need signal.h)
static volatile bool force_quit;
- Now the handler, which updates the flag and displays the stats
static void print_stats(void); /* defined further down */
static void signal_handler(int signum){
    if (signum == SIGINT || signum == SIGTERM){
        printf("\n\nSignal %d received, preparing to exit...\n", signum);
        force_quit = true;
        print_stats();
    }
}
- Let's register the handler
  - Add the following in the main function
force_quit = false;
signal(SIGINT, signal_handler);
signal(SIGTERM, signal_handler);
- Update the for loop condition in worker_main
  - Use while(!force_quit) instead of for(;;)
- Lastly, let's create a function to display the stats
  - It uses the rte_eth_stats struct and the rte_eth_stats_get() function
  - The function takes the port_id for which the stats are required and a pointer to an rte_eth_stats struct
  - It fills the rte_eth_stats struct, which contains
    - ipackets: packets received by the interface
    - opackets: packets sent by the interface
    - imissed: packets dropped by the interface
static void
print_stats(void){
    struct rte_eth_stats stats;
    uint16_t nb_ports = rte_eth_dev_count_avail();
    uint16_t port;
    for (port = 0; port < nb_ports; port++){
        printf("\nStatistics for the port %u\n", port);
        rte_eth_stats_get(port, &stats);
        /* The counters are uint64_t; PRIu64 comes from inttypes.h */
        printf("RX:%9"PRIu64" Tx:%9"PRIu64" dropped:%9"PRIu64"\n",
                stats.ipackets, stats.opackets, stats.imissed);
    }
}
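The quit-flag pattern from this section can be tried out without DPDK. The sketch below is a variation rather than the code used above: the handler only sets a flag of type `volatile sig_atomic_t` and leaves any printing to the code that runs after the loop, since printf() is not guaranteed to be safe inside a signal handler. All names here (`quit_requested`, `on_signal`, `install_handlers`, `run_until_quit`) are made up for this illustration.

```c
#include <signal.h>

/* The C standard guarantees that a volatile sig_atomic_t object can be
 * written safely from a signal handler. */
volatile sig_atomic_t quit_requested = 0;

/* Handler: record the request and return, nothing else. */
void on_signal(int signum)
{
    if (signum == SIGINT || signum == SIGTERM)
        quit_requested = 1;
}

void install_handlers(void)
{
    signal(SIGINT, on_signal);
    signal(SIGTERM, on_signal);
}

/* Stand-in for worker_main(): loops until the flag is set and returns the
 * number of iterations. raise_after must be >= 1; the loop sends itself
 * SIGINT on that iteration to simulate Ctrl-C. */
unsigned long run_until_quit(unsigned long raise_after)
{
    unsigned long iterations = 0;

    while (!quit_requested) {
        iterations++;
        if (iterations == raise_after)
            raise(SIGINT);
    }
    return iterations;
}
```

In the real application the stats would then be printed on the main core once rte_eal_mp_wait_lcore() returns.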
Now Lets Run this
- First we build our program
  - This should create a build directory within our app directory
make
- We will pass our program arguments just like L2FWD.
  - -l indicates the cores to run on
  - -n indicates the number of memory channels
  - -p indicates the portmask
    - 0x1 indicates 1 port, i.e. binary mask 0001
    - 0x3 indicates 2 ports, i.e. binary mask 0011
    - 0x7 indicates 3 ports, i.e. binary mask 0111
  - Everything before -- is consumed by the EAL; everything after it is left for the application
./build/my_app -l 0-3 -n 3 -- -p 0x3
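The code shown in this tutorial does not decode the -p option itself. As a sketch of how a hexadecimal portmask such as 0x3 could be turned into a set of enabled ports, here is a small plain C example in the spirit of what L2FWD does. The three function names are made up for this illustration, and treating a mask of 0 as an error is an assumption of this sketch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a hexadecimal port mask such as "0x3" or "3".
 * Returns 0 when the string is empty, malformed or out of range. */
uint32_t parse_portmask(const char *arg)
{
    char *end = NULL;
    unsigned long mask;

    if (arg == NULL || *arg == '\0')
        return 0;
    errno = 0;
    mask = strtoul(arg, &end, 16);  /* base 16 accepts an optional 0x prefix */
    if (errno != 0 || *end != '\0' || mask > UINT32_MAX)
        return 0;
    return (uint32_t)mask;
}

/* Bit N of the mask enables port N. */
int port_enabled(uint32_t mask, unsigned int port)
{
    return port < 32 && ((mask >> port) & 1u) != 0;
}

/* Number of ports enabled by the mask. */
unsigned int count_enabled_ports(uint32_t mask)
{
    unsigned int n = 0;

    for (; mask != 0; mask >>= 1)
        n += mask & 1u;
    return n;
}
```

With `-p 0x3` ports 0 and 1 are enabled, which matches the two-port setup used in this tutorial.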
- You should see something like
EAL: Detected 18 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: Invalid NUMA socket, default to 0
EAL: Invalid NUMA socket, default to 0
EAL: Invalid NUMA socket, default to 0
EAL: Invalid NUMA socket, default to 0
EAL: No legacy callbacks, legacy socket not created
APP: Number of ports:2
APP: MAC address swapping enabledi
APP: Port: 0 Link UP Speed 10000
APP: Port: 1 Link UP Speed 10000
APP: Some ports are down
APP: lcore 2 exiting
APP: lcore 3 exiting
- Press Ctrl-C to exit the program and print some stats
Signal 2 received,preparing to exit...
Statistics for the port 0
RX: 0 Tx: 13664640 dropped: 0
Statistics for the port 1
RX: 13664640 Tx: 0 dropped: 0
WORD
- Give yourself a pat on the back, you are done. (FOR NOW!)
- We will parse a few layers in the next one.
- I followed ferruhy/dpdk-simple-app and it was a great help to me.
  - Please do check his repo out.
  - In fact this whole thing was inspired by this simple learning repo of his.
- You can look at the code here
- Check out the DPDK documentation as well.