DPDK 03

July 2, 2021 - 8 mins read

Introduction

Hello There. Today we will parse a few layers.

This is the continuation of the DPDK series. You will need to have gone through DPDK-02.

Let us begin!

PreReqs

DPDK-02
A live DPDK-bound port with traffic

The Layers

To understand packet filtering, you first have to understand packets and how they are handled at each layer of the TCP/IP protocol stack:
1. Application layer (e.g., FTP, Telnet, HTTP)
2. Transport layer (TCP or UDP)
3. Internet layer (IP)
4. Network access layer (e.g., Ethernet, FDDI, ATM)
We are going to parse the Ethernet, IP and Transport layers.
We will only read a few values from each and log them.
We will extract the following parameters:
Source/Destination MAC Addresses
- From Ethernet Layer
Source/Destination IP
- From IP Layer
Source/Destination Port
- From Transport Layer
Let’s step into our working folder

cd $RTE_SDK/examples/my_app

Here is what we are aiming for

design

The Network Access Layer

The packet here has two parts:
- Ethernet header
  - Packet kind
  - Ethernet source address
  - Ethernet destination address
- Ethernet body
  - rest of packet data
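As a rough picture of that layout, the sketch below mirrors the 14-byte Ethernet header with a plain C struct and copies it out of a raw frame. The names `eth_hdr_sketch` and `eth_parse` are hypothetical stand-ins for illustration only; DPDK's own definition is `struct rte_ether_hdr`, which we use in the next section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN 6

/* Hypothetical stand-in for the on-wire Ethernet header (not DPDK's struct). */
struct eth_hdr_sketch {
    uint8_t  dst[ETH_ADDR_LEN]; /* destination MAC comes first on the wire */
    uint8_t  src[ETH_ADDR_LEN]; /* source MAC */
    uint16_t ether_type;        /* packet kind, still in network byte order */
};

/* Copy the 14-byte header (6 + 6 + 2) out of a raw frame.
 * Returns 0 on success, -1 if the frame is too short to hold a header. */
static int eth_parse(const uint8_t *frame, size_t len, struct eth_hdr_sketch *out)
{
    assert(frame != NULL && out != NULL);
    if (len < sizeof(*out))
        return -1;
    memcpy(out, frame, sizeof(*out));
    return 0;
}
```

Everything after those 14 bytes is the Ethernet body, which is where the IP header starts.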

The code

Let’s write some code to parse the layer now.
We are going to use the rte_ether_hdr struct from rte_ether.h
- it has the following members
  - struct rte_ether_addr d_addr
    - this contains the destination MAC address array
  - struct rte_ether_addr s_addr
    - this contains the source MAC address array
  - ether_type
    - frame type
  - (d_addr and s_addr were renamed to dst_addr and src_addr in DPDK 21.11, so adjust the names on newer releases)
- see more here
In the main.c file, go to the worker_main function and add a loop that goes over all the received packets.

int worker_main(void *arg){
  const u_int8_t nb_ports = rte_eth_dev_count_avail();
  u_int8_t port;
  u_int8_t dest_port;

  /* Run until app is killed or quit */
  for(;;){
    /* Receive packets on port */
    for(port=0;port< nb_ports;port++){
      struct rte_mbuf *bufs[BURST_SIZE];
      u_int16_t nb_rx;
      u_int16_t nb_tx;
      u_int16_t buf;

      /* Get burst of RX packets */
      nb_rx = rte_eth_rx_burst(port,0,
          bufs,BURST_SIZE);
      if (unlikely(nb_rx==0))
        continue;

      for(int i =0;i<nb_rx;i++){
        /* Write Code here */
      }

      /* send burst of Tx packets to the 
       * second port
       */
      dest_port = port ^ 1;
      nb_tx= rte_eth_tx_burst(dest_port, 0,
          bufs, nb_rx);
      
      /* Free any unsent packets; sent ones are freed by the driver. */
      for (buf=nb_tx;buf<nb_rx;buf++)
        rte_pktmbuf_free(bufs[buf]);

    }
  }
  return 0;
}

Let’s first declare the Ethernet header pointer

struct rte_ether_hdr *ethernet_header;

Now we will declare arrays to store the source/destination MAC addresses
The length of each array is defined by RTE_ETHER_ADDR_LEN, which is 6

u_int8_t source_mac_address[RTE_ETHER_ADDR_LEN];
u_int8_t destination_mac_address[RTE_ETHER_ADDR_LEN];

Now we use the rte_pktmbuf_mtod() function
- The way to remember this function is packet-mbuf-to-data.
- It returns a pointer to the start of the packet data, cast to the type we ask for; we use it to point ethernet_header at the Ethernet header.

ethernet_header = rte_pktmbuf_mtod(bufs[i], struct rte_ether_hdr *);

Now we simply populate our value holders.

u_int16_t ethernet_type;
ethernet_type = ethernet_header->ether_type;
rte_memcpy(source_mac_address,&ethernet_header->s_addr,sizeof(u_int8_t)*RTE_ETHER_ADDR_LEN);
rte_memcpy(destination_mac_address,&ethernet_header->d_addr,sizeof(u_int8_t)*RTE_ETHER_ADDR_LEN);
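One thing to keep in mind: ether_type is copied as-is, so it is still in network (big-endian) byte order. On a little-endian host, logging it with %d prints 8 rather than 2048 for IPv4. A minimal sketch of the conversion, using POSIX `ntohs()` as a stand-in for DPDK's `rte_be_to_cpu_16()`; the helper name `ethertype_host_order` is hypothetical.

```c
#include <arpa/inet.h> /* ntohs(): stand-in for rte_be_to_cpu_16() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the EtherType at bytes 12-13 of a frame and return it in host order.
 * Hypothetical helper; the caller must guarantee at least 14 readable bytes. */
static uint16_t ethertype_host_order(const uint8_t *frame)
{
    uint16_t be;

    assert(frame != NULL);
    memcpy(&be, frame + 12, sizeof(be)); /* memcpy avoids unaligned access */
    return ntohs(be);                    /* bytes 08 00 on the wire -> 0x0800 */
}
```

With the value in host order, it can be compared directly against constants such as 0x0800 (IPv4) or 0x86DD (IPv6).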

Finally we will log all the values extracted.
All together it looks like below:

/* Get burst of RX packets */
nb_rx = rte_eth_rx_burst(port,0,
    bufs,BURST_SIZE);
if (unlikely(nb_rx==0))
  continue;

for(int i=0;i<nb_rx;i++){
  /* Write Code here */
  struct rte_ether_hdr *ethernet_header;
  u_int8_t source_mac_address[RTE_ETHER_ADDR_LEN];
  u_int8_t destination_mac_address[RTE_ETHER_ADDR_LEN];
  u_int16_t ethernet_type;
  ethernet_header = rte_pktmbuf_mtod(bufs[i], struct rte_ether_hdr *); 
  ethernet_type = ethernet_header->ether_type;
  rte_memcpy(source_mac_address,&ethernet_header->s_addr,sizeof(u_int8_t)*RTE_ETHER_ADDR_LEN);
  rte_memcpy(destination_mac_address,&ethernet_header->d_addr,sizeof(u_int8_t)*RTE_ETHER_ADDR_LEN);
  RTE_LOG(INFO,APP,"Source Mac: ");
  for(int j=0;j<RTE_ETHER_ADDR_LEN;j++)
    printf("%x ",source_mac_address[j]);
  printf("\n");
  RTE_LOG(INFO,APP,"Destination Mac: ");
  for(int j=0;j<RTE_ETHER_ADDR_LEN;j++)
    printf("%x ",destination_mac_address[j]);
  printf("\n");
  RTE_LOG(INFO,APP,"ether type: %d\n",ethernet_type);
}
/* 
* send burst of Tx packets to the 
* second port
*/
dest_port = port ^ 1;
nb_tx= rte_eth_tx_burst(dest_port, 0,
    bufs, nb_rx);
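The loops above print each byte with %x, so a byte like 0x05 shows up as a single digit. If you prefer the usual colon-separated form, a small formatter could look like the sketch below. `format_mac` is a hypothetical helper written for this post; DPDK also ships rte_ether_format_addr() in rte_ether.h for the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAC_LEN     6
#define MAC_STR_LEN 18 /* "xx:xx:xx:xx:xx:xx" plus the terminating NUL */

/* Format a 6-byte MAC address as colon-separated, zero-padded hex.
 * Hypothetical helper; returns the number of characters written. */
static int format_mac(const uint8_t mac[MAC_LEN], char out[MAC_STR_LEN])
{
    assert(mac != NULL && out != NULL);
    return snprintf(out, MAC_STR_LEN, "%02x:%02x:%02x:%02x:%02x:%02x",
                    mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

The resulting string can then be passed to a single RTE_LOG call instead of mixing RTE_LOG and printf.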

The Internet Layer

This layer is responsible for routing messages between different local networks.
IP addresses in IPv4 follow a format of xxx.xxx.xxx.xxx, where each decimal value (0–255) translates into 8 binary bits called an octet.
We are going to use struct rte_ipv4_hdr from rte_ip.h
- contains IP-related defines
- contains struct rte_ipv4_hdr and struct rte_ipv6_hdr
rte_ipv4_hdr has the following members
- version_ihl
  - version and header length
- type_of_service
  - type of service
- total_length
  - Length of packet
- packet_id
  - packet ID
- fragment_offset
  - fragmentation offset
- time_to_live
  - time to live
- next_proto_id
  - protocol ID
- hdr_checksum
  - header checksum
- src_addr
  - source ip address
- dst_addr
  - destination ip address
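The version_ihl member packs two 4-bit fields into one byte, which is easy to misread. A minimal sketch of how it splits, working on the first byte of a raw IPv4 header; the helper names are hypothetical and not part of DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper nibble of the first header byte: the IP version (4 for IPv4). */
static unsigned ipv4_version(const uint8_t *ip_hdr)
{
    assert(ip_hdr != NULL);
    return ip_hdr[0] >> 4;
}

/* Lower nibble counts 32-bit words, so multiply by 4 to get bytes. */
static unsigned ipv4_header_len(const uint8_t *ip_hdr)
{
    assert(ip_hdr != NULL);
    return (ip_hdr[0] & 0x0fu) * 4u;
}
```

For the common first byte 0x45 this gives version 4 and a 20-byte header; 0x46 would mean 24 bytes, i.e. one word of IP options.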

The Code

The last two bytes of the Ethernet header (the EtherType) tell us which protocol the next layer carries.
Let’s write a function that takes this 16-bit value and tells us whether the next layer is IPv4 or not.
- the value is 2048 (0x0800) in case of IPv4
Write down the following functions

#include <stdbool.h>

#define IPV4_PROTO 2048   /* 0x0800 */
#define IPV6_PROTO 34525  /* 0x86DD */

/* pointer must point just past the Ethernet header */
u_int16_t get_Ether_Type(char *pointer)
{
    u_int8_t slb= 0;
    u_int8_t lb= 0;
    slb= *(pointer - 2 );// second last byte of ETH layer
    lb= *(pointer - 1 );// last byte of ETH layer
    return (slb* 256) +lb;
}

bool is_ipv4(u_int16_t val)
{
  return val == IPV4_PROTO;
}

Now let’s create a pointer that we will use to traverse the packet bytes

void* pHdrTraverse = (void*) ((unsigned char*) ethernet_header + sizeof (struct rte_ether_hdr));// Pointer to the layer after Ethernet

Now let’s call get_Ether_Type() and assign the value to next_proto

u_int16_t next_proto = get_Ether_Type(pHdrTraverse);// holds the last two bytes of the Ethernet header

Now let’s check what the next layer is
- in case next_proto is not IPv4 we will log it

if (is_ipv4(next_proto)){

}else
{
  RTE_LOG(INFO,APP,"NOT IPv4\n");
}

Now within the if clause we will parse the IPv4 header.

struct rte_ipv4_hdr *pIP4Hdr;
u_int32_t u32SrcIPv4;
u_int32_t u32DstIPv4;
pIP4Hdr = rte_pktmbuf_mtod_offset(bufs[i], struct rte_ipv4_hdr*, sizeof (struct rte_ether_hdr));

Okay, so now the pointer refers to the IPv4 header; however, we still need to check whether it is a valid IPv4 packet or not.
- for that we create the is_valid_ipv4_pkt() function

static inline int is_valid_ipv4_pkt(struct rte_ipv4_hdr *pkt, uint32_t link_len)
{
    /* From http://www.rfc-editor.org/rfc/rfc1812.txt section 5.2.2 */
    /*
     * 1. The packet length reported by the Link Layer must be large
     * enough to hold the minimum length legal IP datagram (20 bytes).
     */
    if (link_len < sizeof(struct rte_ipv4_hdr))
        return -1;
    /* 2. The IP checksum must be correct. */
    /* this is checked in H/W */
    /*
     * 3. The IP version number must be 4. If the version number is not 4
     * then the packet may be another version of IP, such as IPng or
     * ST-II.
     */
    if (((pkt->version_ihl) >> 4) != 4)
        return -3;
    /*
     * 4. The IP header length field must be large enough to hold the
     * minimum length legal IP datagram (20 bytes = 5 words).
     */
    if ((pkt->version_ihl & 0xf) < 5)
        return -4;
    /*
     * 5. The IP total length field must be large enough to hold the IP
     * datagram header, whose length is specified in the IP header length
     * field.
     */
    if (rte_cpu_to_be_16(pkt->total_length) < sizeof(struct rte_ipv4_hdr))
        return -5;
    return 0;
}
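Rule 2 above is left to the NIC. If checksum offload is not available on your port, the check could be done in software along the lines of the sketch below. It computes the standard one's-complement header checksum over raw bytes; `ipv4_hdr_checksum` is a hypothetical helper for illustration, and DPDK provides rte_ipv4_cksum() in rte_ip.h for real use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum over an IPv4 header (RFC 1071 style).
 * Hypothetical helper: hdr points at the first header byte and len is the
 * header length in bytes (IHL * 4). Returns the checksum in host order. */
static uint16_t ipv4_hdr_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    assert(hdr != NULL && len % 2 == 0);
    for (size_t i = 0; i < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]); /* big-endian 16-bit words */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);            /* fold carries back in */
    return (uint16_t)~sum;
}
```

Run over a header whose checksum field is already filled in, the function returns 0 when the header is intact. Run over a header with the checksum field zeroed, it returns the value that belongs in that field.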

Now let’s call it in our if clause.
We are going to hold the IPv4 source and destination addresses in unsigned 32-bit variables.

if (is_valid_ipv4_pkt(pIP4Hdr,bufs[i]->pkt_len)>=0){
    /* update TTL and CHKSM */
    --(pIP4Hdr->time_to_live);
    ++(pIP4Hdr->hdr_checksum);

    u32SrcIPv4 = rte_bswap32(pIP4Hdr->src_addr);
    u32DstIPv4 = rte_bswap32(pIP4Hdr->dst_addr);
} else {
  RTE_LOG(INFO,APP,"invalid IPv4\n");
}

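rte_bswap32 swaps the bytes of a 32-bit value. Here it converts the addresses from network (big-endian) byte order to host order on a little-endian CPU; rte_be_to_cpu_32 from rte_byteorder.h is the portable equivalent.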
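Once the address is in host order, it is just a 32-bit number, which is not very readable in logs. A minimal sketch of turning it into dotted decimal; `format_ipv4` is a hypothetical helper, not a DPDK function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define IPV4_STR_LEN 16 /* "255.255.255.255" plus the terminating NUL */

/* Format a host-order IPv4 address (as held in u32SrcIPv4) as dotted decimal.
 * Hypothetical helper; returns the number of characters written. */
static int format_ipv4(uint32_t addr, char out[IPV4_STR_LEN])
{
    assert(out != NULL);
    return snprintf(out, IPV4_STR_LEN, "%u.%u.%u.%u",
                    (unsigned)((addr >> 24) & 0xff),
                    (unsigned)((addr >> 16) & 0xff),
                    (unsigned)((addr >> 8) & 0xff),
                    (unsigned)(addr & 0xff));
}
```

For example, the host-order value 3232200221 formats as 192.167.118.29.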

The Transport Layer

The protocols of this layer provide host-to-host communication services for applications.
- Applications use either TCP (connection-oriented) or UDP (connectionless) to exchange data.
- The ports used by the application are contained within the header as the source port and destination port.

The Code

The next layer can be identified by next_proto_id within the rte_ipv4_hdr struct.
- We are going to extract the source and destination ports in case of
  - TCP, indicated by IPPROTO_TCP i.e. 6
  - UDP, indicated by IPPROTO_UDP i.e. 17

switch(pIP4Hdr->next_proto_id){
  case IPPROTO_TCP:    
    break;

  case IPPROTO_UDP: 
      break;

  default:
      u16DstPort = 0;
      u16SrcPort = 0;
      break;

}

The default case handles a next_proto_id that is neither TCP nor UDP.
- We set both ports to 0 to indicate this case
Now let’s parse the TCP header
- We will create a pointer to the rte_tcp_hdr struct defined in rte_tcp.h
- Read more about rte_tcp.h here
- The rte_tcp_hdr contains
  - src_port
  - dst_port
  - sent_seq
  - recv_ack
  - data_off
  - tcp_flags
  - rx_win
  - cksum
  - tcp_urp
- Read more about rte_tcp_hdr here
Include the header

#include <rte_tcp.h>

Create the rte_tcp_hdr pointer

struct rte_tcp_hdr *pTcpHdr;

Lastly, handle the TCP case

case IPPROTO_TCP:
    pTcpHdr = (struct rte_tcp_hdr *) ((unsigned char *) pIP4Hdr + sizeof(struct rte_ipv4_hdr));
    u16DstPort = rte_bswap16(pTcpHdr->dst_port);
    u16SrcPort = rte_bswap16(pTcpHdr->src_port);
    break;
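Note that sizeof(struct rte_ipv4_hdr) is 20 bytes, so the cast above assumes the IPv4 header carries no options. A more defensive variant could take the header length from the IHL field instead. The sketch below does this on raw bytes; `l4_ports` is a hypothetical helper, and it relies on TCP and UDP both starting with the source port followed by the destination port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the TCP/UDP source and destination ports that follow an IPv4 header.
 * Hypothetical helper: ip points at the first IPv4 header byte and len is the
 * number of readable bytes from there. Returns 0 on success, -1 if the
 * header length is invalid or the buffer is too short. */
static int l4_ports(const uint8_t *ip, size_t len, uint16_t *src, uint16_t *dst)
{
    size_t ihl_bytes;

    assert(ip != NULL && src != NULL && dst != NULL);
    if (len < 20)
        return -1;
    ihl_bytes = (size_t)(ip[0] & 0x0f) * 4; /* IHL counts 32-bit words */
    if (ihl_bytes < 20 || len < ihl_bytes + 4)
        return -1;
    /* Ports are big-endian on the wire; assemble them in host order. */
    *src = (uint16_t)((ip[ihl_bytes] << 8) | ip[ihl_bytes + 1]);
    *dst = (uint16_t)((ip[ihl_bytes + 2] << 8) | ip[ihl_bytes + 3]);
    return 0;
}
```

In the DPDK code the same idea would be to offset from pIP4Hdr by (version_ihl & 0x0f) * 4 rather than by the struct size.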

Now let’s parse the UDP header
Include the header

#include <rte_udp.h>

We will create a pointer to the rte_udp_hdr struct defined in rte_udp.h

struct rte_udp_hdr *pUdpHdr;

The rte_udp_hdr contains
- src_port
- dst_port
- dgram_len
- dgram_cksum
- Read more about rte_udp_hdr here
Read more about rte_udp.h here
Lastly, handle the UDP case

case IPPROTO_UDP:
    pUdpHdr = (struct rte_udp_hdr *) ((unsigned char *) pIP4Hdr + sizeof (struct rte_ipv4_hdr));
    u16DstPort = rte_bswap16(pUdpHdr->dst_port);
    u16SrcPort = rte_bswap16(pUdpHdr->src_port); 
    break;

All Together

for(int i=0;i<nb_rx;i++){
    /* Write Code here */
    struct rte_ether_hdr *ethernet_header;
    u_int8_t source_mac_address[RTE_ETHER_ADDR_LEN];
    u_int8_t destination_mac_address[RTE_ETHER_ADDR_LEN];
    struct rte_ipv4_hdr     *pIP4Hdr;
    struct rte_ipv6_hdr     *pIP6Hdr;
    struct rte_udp_hdr      *pUdpHdr;
    struct rte_tcp_hdr      *pTcpHdr;
    u_int32_t u32SrcIPv4= 0;
    u_int32_t u32DstIPv4= 0;
    u_int16_t u16SrcPort= 0;
    u_int16_t u16DstPort= 0;
    u_int16_t ethernet_type;
    ethernet_header = rte_pktmbuf_mtod(bufs[i], struct rte_ether_hdr *);
    ethernet_type = ethernet_header->ether_type;
    rte_memcpy(source_mac_address,&ethernet_header->s_addr,sizeof(u_int8_t)*RTE_ETHER_ADDR_LEN);
    rte_memcpy(destination_mac_address,&ethernet_header->d_addr,sizeof(u_int8_t)*RTE_ETHER_ADDR_LEN);
    RTE_LOG(INFO,APP,"Source Mac: ");
    for(int j=0;j<RTE_ETHER_ADDR_LEN;j++)
        printf("%x ",source_mac_address[j]);
    printf("\n");
    RTE_LOG(INFO,APP,"Destination Mac: ");
    for(int j=0;j<RTE_ETHER_ADDR_LEN;j++)
        printf("%x ",destination_mac_address[j]);
    printf("\n");
    RTE_LOG(INFO,APP,"ether type: %d\n",ethernet_type);
    void* pHdrTraverse = (void*) ((unsigned char*) ethernet_header + sizeof (struct rte_ether_hdr));// Pointer to Next Layer to Eth
    u_int16_t next_proto = get_Ether_Type(pHdrTraverse);// holds last two byte value of ETH Layer
    RTE_LOG(INFO,APP,"next proto %u\n",next_proto);
    if (is_ipv4(next_proto)){
        pIP4Hdr = rte_pktmbuf_mtod_offset(bufs[i], struct rte_ipv4_hdr*, sizeof (struct rte_ether_hdr));
        /* check for valid packet */
        if (is_valid_ipv4_pkt(pIP4Hdr,bufs[i]->pkt_len)>=0){
            /* update TTL and CHKSM */
            --(pIP4Hdr->time_to_live);
            ++(pIP4Hdr->hdr_checksum);
            u32SrcIPv4 = rte_bswap32(pIP4Hdr->src_addr);
            u32DstIPv4 = rte_bswap32(pIP4Hdr->dst_addr);
            RTE_LOG(INFO,APP,"IPv4 src %u dst %u\n",u32SrcIPv4,u32DstIPv4);

            switch(pIP4Hdr->next_proto_id){
            case IPPROTO_TCP:
                pTcpHdr = (struct rte_tcp_hdr *) ((unsigned char *) pIP4Hdr + sizeof(struct rte_ipv4_hdr));
                u16DstPort = rte_bswap16(pTcpHdr->dst_port);
                u16SrcPort = rte_bswap16(pTcpHdr->src_port);
                break;

            case IPPROTO_UDP:
                pUdpHdr = (struct rte_udp_hdr *) ((unsigned char *) pIP4Hdr + sizeof (struct rte_ipv4_hdr));
                u16DstPort = rte_bswap16(pUdpHdr->dst_port);
                u16SrcPort = rte_bswap16(pUdpHdr->src_port);
                break;

            default:
               u16DstPort = 0;
               u16SrcPort = 0;
               break;

            }
            RTE_LOG(INFO,APP,"TL src %u dst %u\n",u16SrcPort,u16DstPort);
        }

    }
}
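If you want to try the parsing logic without a bound port, the same steps can be sketched against a plain byte buffer. This is not the DPDK code above: `parsed_pkt`, `rd16`, `rd32` and `parse_frame` are hypothetical names, and the offsets are the standard Ethernet, IPv4 and TCP/UDP field positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for the values this post extracts. */
struct parsed_pkt {
    uint8_t  src_mac[6], dst_mac[6];
    uint32_t src_ip, dst_ip;     /* host byte order */
    uint16_t src_port, dst_port; /* host byte order, 0 if not TCP/UDP */
    uint8_t  l4_proto;
};

/* Big-endian readers that work on any host byte order. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse Ethernet + IPv4 + TCP/UDP from a raw frame.
 * Returns 0 on success, -1 for short, non-IPv4 or malformed input. */
static int parse_frame(const uint8_t *f, size_t len, struct parsed_pkt *out)
{
    const uint8_t *ip;
    size_t ihl;

    assert(f != NULL && out != NULL);
    if (len < 14 + 20)
        return -1;
    memcpy(out->dst_mac, f, 6);     /* destination MAC is first on the wire */
    memcpy(out->src_mac, f + 6, 6);
    if (rd16(f + 12) != 0x0800)     /* EtherType: IPv4 only */
        return -1;

    ip = f + 14;
    ihl = (size_t)(ip[0] & 0x0f) * 4;
    if ((ip[0] >> 4) != 4 || ihl < 20 || len < 14 + ihl)
        return -1;
    out->src_ip = rd32(ip + 12);
    out->dst_ip = rd32(ip + 16);
    out->l4_proto = ip[9];
    out->src_port = out->dst_port = 0;

    if (out->l4_proto == 6 || out->l4_proto == 17) { /* TCP or UDP */
        if (len < 14 + ihl + 4)
            return -1;
        out->src_port = rd16(ip + ihl);
        out->dst_port = rd16(ip + ihl + 2);
    }
    return 0;
}
```

Feeding it a hand-built frame is a quick way to check offsets and byte order before running against live traffic.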

Now Let’s Run This

First we build our program
This should create a build directory within our directory

  make

Now run the program using the following command

  ./build/my_app -l 0-3 -n 3 -- -p 0x3

You should see logs similar to the following

  APP: Source Mac: 0 ff 56 88 84 ff
  APP: Destination Mac: 0 50 56 ff 84 ff
  APP: ether type: 8
  APP: next proto 2048
  APP: IPv4 src 3232200221 dst 1700832002
  APP: TL src 14550 dst 443

WORD

Meow

Have a cookie. Treat-yo-self! We are finally done with this series.
You can look at the code here
As always, check out the DPDK documentation as well.