
5. IPSec

5.1 Introduction

To provide security at the IP layer, the IETF has developed the IPSec protocol suite. The suite provides end-to-end security by implementing mechanisms for mutual authentication of the communicating entities and by using encryption and data integrity protection. The advantage of implementing security at the IP layer is that applications do not have to be altered to start using secure communication; they just continue to use TCP or UDP as before. The IPSec protocol suite has many parts, one of which is IPSec itself. Other protocols in the suite are IKE and ISAKMP.

When implementing IPSec there are several options. One is to integrate IPSec with the IP layer. This is becoming more common, particularly for IPv6, where IPSec support was originally specified as mandatory. Another option is to implement IPSec between the IP layer and the link layer, see Figure 5.1. This is done for many, especially older, IPv4 stacks. The option is referred to as bump-in-the-stack (BITS). BITS allows IPSec support to be added to stacks that originally had no IPSec support. The disadvantage is that one has to duplicate much of the IP handling in the IPSec layer. Another approach to adding IPSec is referred to as bump-in-the-wire (BITW). In a BITW implementation the IP packets are processed by hardware.

Figure 5.1: Two ways of implementing IPSec; integrated and "bump-in-the-stack".
Starting with Windows Vista (Windows NT 6.0) and Linux kernel 2.6 (NETKEY IPSec), IPSec is integrated in the TCP/IP network stacks of these operating systems. This implies that anyone who wants to trace packet processing in an IPSec-enabled communication stack has to be aware of how IPSec is implemented. Luckily there are tools that handle these problems for us.

Besides the cryptographic mechanisms, the IPSec protocol also has comprehensive mechanisms to handle and use session contexts, the so-called Security Associations (SAs). SAs contain all the information required to execute various network security services, such as IP layer services (for example header authentication and payload encapsulation), transport or application layer services, or self-protection of negotiation traffic. IPSec uses ISAKMP to handle SAs.

An SA consists of a source, a destination and an instruction. For example, an authentication SA may look like this:

add ah 13800 -A hmac-md5 "1234567890123456";

The above says 'traffic going from the source to the destination that needs an AH can be signed using HMAC-MD5 with the secret 1234567890123456'. This instruction is labeled with the SPI ('Security Parameter Index') '13800'. Both sides of a conversation share exactly the same SA. For two-way traffic, two SAs are needed, one for each direction.

A sample ESP SA may look like:

add esp 13801 -E aes-cbc "123456789012123456789012";

This instruction means 'traffic going from the source to the destination that needs encryption can be encrypted using aes-cbc with the key 123456789012123456789012'. The SPI is '13801'.

So far, we have seen that SAs describe possible instructions but do not describe the policy as to when these SAs need to be used. In fact, there could be an arbitrary number of nearly identical SAs that differ only in their SPI. To perform actual crypto operations, we need to describe a policy. This policy can include things such as 'use ipsec if available' or 'drop traffic unless we have ipsec'.
A typical simple Security Policy (SP) may look like this:
 spdadd any -P out ipsec

If entered on the source host, this means that all traffic going out to the destination host must be encrypted and wrapped in an AH authenticating header. Note that this does not describe which SA is to be used. That task is left for the kernel to determine.
To summarize our brief description: a Security Policy specifies WHAT we want from IPSec, and a Security Association describes HOW we want it.
For an efficient IPSec implementation it is important to handle SAs and SPs efficiently. The IPSec standard speaks of the SA Database (SAD) and the SP Database (SPD), but the actual implementation of the SAD and SPD is left to the implementer as long as the processing complies with what the standard prescribes.
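
The division of labour between the SPD and the SAD can be sketched in a few lines of Python. This is only an illustration and not how any kernel implements it: the class names, the function `outbound_decision` and the addresses 192.0.2.1 and 198.51.100.7 are made up for the example, while the SPI and key are taken from the ESP SA above.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class SecurityAssociation:
    """HOW to protect traffic: protocol, SPI, algorithm and key."""
    src: str
    dst: str
    protocol: str      # "ah" or "esp"
    spi: int
    algorithm: str
    key: bytes


@dataclass(frozen=True)
class SecurityPolicy:
    """WHAT must happen to traffic that matches a selector."""
    src_net: str
    dst_net: str
    direction: str     # "in" or "out"
    action: str        # "ipsec", "none" or "discard"
    protocol: str = "esp"


def find_policy(spd, src, dst, direction) -> Optional[SecurityPolicy]:
    """Return the first policy whose selector matches the packet."""
    for sp in spd:
        if (sp.direction == direction
                and ip_address(src) in ip_network(sp.src_net)
                and ip_address(dst) in ip_network(sp.dst_net)):
            return sp
    return None


def find_sa(sad, src, dst, protocol) -> Optional[SecurityAssociation]:
    """Return the SA for this source, destination and protocol, if any."""
    for sa in sad:
        if (sa.src, sa.dst, sa.protocol) == (src, dst, protocol):
            return sa
    return None


def outbound_decision(spd, sad, src, dst):
    """Decide what to do with an outgoing packet."""
    sp = find_policy(spd, src, dst, "out")
    if sp is None or sp.action == "none":
        return ("bypass", None)
    if sp.action == "discard":
        return ("drop", None)
    sa = find_sa(sad, src, dst, sp.protocol)
    # Without a matching SA, a real stack would ask IKE to negotiate one.
    return ("protect", sa) if sa else ("acquire", None)


# Hypothetical hosts; SPI and key as in the ESP example above.
SAD = [SecurityAssociation("192.0.2.1", "198.51.100.7", "esp", 13801,
                           "aes-cbc", b"123456789012123456789012")]
SPD = [SecurityPolicy("192.0.2.1/32", "198.51.100.7/32", "out", "ipsec")]

print(outbound_decision(SPD, SAD, "192.0.2.1", "198.51.100.7")[0])
```

The policy lookup answers the question whether the packet needs IPSec at all; only then is the SAD consulted for the keys and algorithms.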

5.2 Basic knowledge about ISAKMP

When tracing the process of establishing an IPSec connection one often meets ISAKMP messages. ISAKMP (Internet Security Association and Key Management Protocol) is the standard for the procedures and packet formats used to establish, negotiate, modify and delete Security Associations for IPSec. ISAKMP defines payload formats for exchanging key generation and authentication data. These formats provide a framework for transferring key and authentication data that is independent of the key generation technique, the encryption algorithm and the authentication mechanism. In this section we repeat some of the essential definitions of ISAKMP. An implementation of ISAKMP should at least be able to run over UDP on port 500 but may use other ports as well.

Datagram format

MAC header | IP header | UDP header | ISAKMP packet

UDP port: 500 (but may be different)

ISAKMP header:

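The ISAKMP header has the following definition (the specific fields are explained below):
                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Initiator cookie                        |
|                           (64 bits)                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Responder cookie                        |
|                           (64 bits)                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Payload  | Mjr ver| Mnr ver| Exchange type |    Flags     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Message ID                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Length                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Initiator cookie. 64 bits.
The cookie of the entity that initiated an SA establishment, SA notification, or SA deletion.
Responder cookie. 64 bits.
The cookie of the entity that is responding to an SA establishment request, SA notification, or SA deletion.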
Next payload. 8 bits.
Indicates the type of the first payload in the message.
Value     Payload                          Reference
0         None                             RFC 2408
1         Security Association             RFC 2408
4         Key Exchange                     RFC 2408
5         Identification                   RFC 2408
6         Certificate                      RFC 2408
7         Certificate Request              RFC 2408
8         Hash                             RFC 2408
9         Signature                        RFC 2408
10        Nonce                            RFC 2408
11        Notification                     RFC 2408
12        Delete                           RFC 2408
13        Vendor ID                        RFC 2408
15        SAK, SA KEK Payload              RFC 3547
16        SAT, SA TEK Payload              RFC 3547
17        Key Download                     RFC 3547
18        Sequence Number                  RFC 3547
19        Proof of Possession              RFC 3547
20        NAT-D, NAT Discovery             RFC 3947
21        NAT-OA, NAT Original Address     RFC 3947
128-255   Private use

Mjr version. 4 bits.
The major version of the ISAKMP protocol in use.
Mnr version. 4 bits.
The minor version of the ISAKMP protocol in use.
Exchange type. 8 bits.
Indicates the type of exchange being used. This dictates the message and payload orderings in the ISAKMP exchanges.
Value     Exchange type
0         None
1         Base
2         Identity protection
3         Authentication only
4         Aggressive
5         Informational
6-31      ISAKMP future use
32-239    DOI specific use
240-255   Private use

Flags. 8 bits.
Indicates the options that are set for the ISAKMP exchange.

A, Authentication only. 1 bit
Intended for use with the Informational Exchange with a Notify payload and will allow the transmission of information with integrity checking, but no encryption.

C, Commit. 1 bit
Used to signal key exchange synchronization. It is used to ensure that encrypted material is not received prior to completion of the SA establishment.

E, Encryption. 1 bit
If set, all payloads following the header are encrypted using the encryption algorithm identified in the ISAKMP SA.

Message ID. 32 bits.
A unique value used to identify the protocol state during Phase 2 negotiations. It is randomly generated by the initiator of the Phase 2 negotiation.
Length. 32 bits.
The total length of the ISAKMP header and the encapsulated payloads in bytes.
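
When reading a trace it helps to be able to decode the fixed 28-byte header by hand. The following is a minimal sketch using Python's `struct` module, assuming the layout of RFC 2408 (64-bit cookies, with the E, C and A flags in the three least significant bits of the Flags octet). The names `IsakmpHeader`, `parse_isakmp_header` and the sample bytes are made up for this illustration.

```python
import struct
from typing import NamedTuple

# Cookies (2 x 8 bytes), next payload, version, exchange type, flags,
# message ID and length, all in network byte order: 28 bytes in total.
ISAKMP_HEADER = struct.Struct("!8s8sBBBBII")


class IsakmpHeader(NamedTuple):
    initiator_cookie: bytes
    responder_cookie: bytes
    next_payload: int
    major_version: int
    minor_version: int
    exchange_type: int
    encrypted: bool      # E flag
    commit: bool         # C flag
    auth_only: bool      # A flag
    message_id: int
    length: int


def parse_isakmp_header(data: bytes) -> IsakmpHeader:
    """Decode the fixed ISAKMP header at the start of a UDP payload."""
    if len(data) < ISAKMP_HEADER.size:
        raise ValueError("need at least 28 bytes for an ISAKMP header")
    (icky, rcky, next_payload, version, exchange_type,
     flags, message_id, length) = ISAKMP_HEADER.unpack_from(data)
    return IsakmpHeader(
        icky, rcky, next_payload,
        version >> 4, version & 0x0F,     # two 4-bit version fields
        exchange_type,
        bool(flags & 0x01), bool(flags & 0x02), bool(flags & 0x04),
        message_id, length,
    )


# A made-up header as it could appear in the first phase one message:
# responder cookie still zero, next payload 1 (SA), version 1.0,
# exchange type 2 (identity protection), no flags, message ID 0.
SAMPLE = (bytes.fromhex("0102030405060708") + bytes(8)
          + bytes([1, 0x10, 2, 0]) + struct.pack("!II", 0, 28))

header = parse_isakmp_header(SAMPLE)
print(header.exchange_type, header.encrypted, header.length)
```

Only the header is decoded here; walking the chain of payloads that follows it would need the generic payload header as well.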

5.3 Example

The example is a typical Roadwarrior setup with a pre-shared key for authentication. It consists of a laptop, called Roadwarrior, that is connected to the Internet, and a server that is connected to a Gateway, which in turn is connected to the Internet.

5.3.1 Purpose

The purpose of this worked example is to demonstrate a simple use case for IPsec and how IPsec can be configured. Another purpose is to make you comfortable with the configuration steps and the tracing tools that might come in handy.

5.3.2 Scenario

In detail, the setup for our example consists of the following pieces
  IPsec Client software:
    Shrew Soft VPN Client for Windows
  Network Protocol Analyzer:
    Wireshark
  IPsec Gateway:
    strongSwan for Linux

5.3.3 Configuration

For the Gateway
  The strongSwan configuration file looks as follows:
      config setup
        # plutodebug=all             # uncomment the settings for which the default is not wanted
        # crlcheckinterval=600
        # strictcrlpolicy=yes
        # cachecrls=yes
        # nat_traversal=yes
        # charonstart=no
        # plutostart=no
      conn  rw
        left=           #  left is the gateway itself, right is the other side
        leftsubnet=      #  intranet resources behind the Gateway
        pfs=no                         # without PFS the IPsec SA keys are derived from the phase one keys
        authby=secret                  # authentication with a pre-shared key
    { : PSK "ipsec" }                  # the pre-shared key

    We use a pre-shared key here because it is the simplest setup.

For the Roadwarrior Client

Next, create a VPN client instance and configure its parameters in the Access Manager of the ShrewSoft VPN Client program group as described below. IPsec, and particularly the IKE key exchange that is used for key establishment, can be configured in many ways because of the basic options of IKE and the choices offered by the VPN Client program. We walk through the configuration step by step.

Step 1: General

Remote IPsec Gateway:

The port which IKE uses:
UDP at port 500

IKE Mode Config protocol:
Pull mode

Address Method:
We use the local physical adapter and IP for the later IPsec communications.

Step 2: Client

For simplicity, we assume that there is no NAT and no firewall.

Dead Peer Detection (DPD) is an extension protocol to IPsec.

You can let the VPN Client IPsec Daemon forward ISAKMP failure notifications.

The Client Login Banner is used with the extended authentication methods Xauth+PSK and Xauth+RSA.

Step 3: Name Resolution

For simplicity, we only use the IP address in this test instead of a DNS name.

Step 4: Authentication - Local Identity

Here we select the PSK authentication method for the test.

Local Identity type is the IP Address.

Step 5: Authentication - Remote Identity

Remote Identity Type is also IP Address.

Step 6: Authentication - Credentials

Here you enter the pre-shared key under Credentials.

Step 7: IKE Phase 1

The proposal parameters in Phase 1 are used to establish a tunnel that protects the subsequent IKE negotiations. To reduce the renegotiation burden, the Key Life Time is usually set to a long period. Here it is one day.

Step 8: IKE Phase 2

The proposal parameters in Phase 2 are used to set up another tunnel that secures the actual sessions (i.e. the IPsec sessions). They are independent of the Phase 1 parameters, except that the session keys are derived from the Phase 1 keys.

The Key Life Time is usually shorter than that of the IKE keys. Here it is set to one hour.

Step 9: Policy

Here we manually configure the Remote Network Resource in the Policy, i.e. the range of IP addresses in the intranet behind the Gateway.

For further information about other options, please read the manual under the help menu.

5.3.4 Capturing IKE Traffic

We want to investigate the traffic that is exchanged. This requires some additional configuration and steps.

1) Enable the debug options via File -> Options in the VPN Trace application of the ShrewSoft VPN Client program group.

Here we set the Log output level to decode.

2) Run the VPN client instance created above in the Access Manager of the ShrewSoft VPN Client program group to capture the packets exchanged between the IPsec client and the gateway.

Here you can see some information on what the client is processing.

Here you can see some basic information about the tunnel status. Note that the Phase two negotiation does not start, and “Established” remains 0, as long as no traffic is initiated.

3) Stop the IKE and IPSec services in the VPN Trace application in order to open the dump files.

4) Open the dump files using the Network Protocol Analyzer, Wireshark.
The default directory:
  C:\Program Files\ShrewSoft\VPN Client\debug
Dumped files:
    dump-ike-decrypt.cap: a binary packet dump of the decrypted IKE conversation
    dump-ike-encrypt.cap: a binary packet dump of the encrypted IKE conversation
    dump-ipsec-pub.cap: a binary packet dump of the IPsec conversation
    dump-ipsec-prv.cap: a binary packet dump of the traffic before outbound or after inbound IPsec processing

In the next two sections we take a closer look at these dump files.

5.3.5 IKE Traffic Inside

For File dump-ike-decrypt.cap

There are 6 packets in phase one and 3 packets in phase two. Let’s check them one by one.

Phase One

1. 1st packet

The proposal list can contain any combination of cipher algorithm, hash algorithm and DH exchange group, but only one proposal appears here because the configuration was defined manually.
The initiator generates a cookie, sets the responder cookie to zero and sends the message to the responder.

2. 2nd packet

The responder checks the encryption and authentication algorithms it supports for IKE to decide whether it can accept one of the proposals. In this case, the responder supports the one that the initiator proposes.
The responder generates a responder cookie, copies the initiator cookie into the message and sends it to the initiator.

3. 3rd packet

The cookie pair is kept during the entire IKE negotiation process.
The initiator sends the Key Exchange payload, which is used to generate the DH key, and the Nonce payload, a random number that enters into the computation of the actual keys.

4. 4th packet

The responder replies with its own Key Exchange payload and Nonce data, which serve the same purpose as in the 3rd packet.
The ISAKMP SA is generated after this step, and the subsequent negotiation messages are encrypted under the ISAKMP SA.

5. 5th packet

The initiator announces its identity. The Hash data is the output of a function whose inputs are derived from the pre-shared key, the random numbers (nonces), the initiator and responder cookies, and the DH exchange data.

6. 6th packet

The responder announces its identity in the same way as in the 5th packet.
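
How the Hash data of the 5th and 6th packet depends on these inputs can be sketched as follows. The formulas are the ones RFC 2409 gives for pre-shared key authentication; the choice of HMAC-SHA1 as the pseudo-random function is an assumption (the real prf follows from the negotiated hash algorithm), and all byte strings below are dummy values, not data from the trace.

```python
import hmac


def prf(key: bytes, data: bytes, hash_name: str = "sha1") -> bytes:
    """Pseudo-random function; assumed here to be HMAC with SHA-1."""
    return hmac.new(key, data, hash_name).digest()


def skeyid_psk(psk: bytes, ni: bytes, nr: bytes) -> bytes:
    """SKEYID = prf(pre-shared-key, Ni_b | Nr_b)"""
    return prf(psk, ni + nr)


def hash_i(skeyid, g_xi, g_xr, cky_i, cky_r, sai_b, idii_b) -> bytes:
    """HASH_I = prf(SKEYID, g^xi | g^xr | CKY-I | CKY-R | SAi_b | IDii_b)"""
    return prf(skeyid, g_xi + g_xr + cky_i + cky_r + sai_b + idii_b)


def hash_r(skeyid, g_xi, g_xr, cky_i, cky_r, sai_b, idir_b) -> bytes:
    """HASH_R = prf(SKEYID, g^xr | g^xi | CKY-R | CKY-I | SAi_b | IDir_b)"""
    return prf(skeyid, g_xr + g_xi + cky_r + cky_i + sai_b + idir_b)


# Dummy values standing in for the contents of packets 1 to 4.
PSK = b"ipsec"
NI, NR = b"nonce-initiator", b"nonce-responder"
G_XI, G_XR = b"dh-public-initiator", b"dh-public-responder"
CKY_I, CKY_R = bytes.fromhex("0102030405060708"), bytes.fromhex("1112131415161718")
SAI_B = b"sa-payload-body"
IDII_B, IDIR_B = b"id-initiator", b"id-responder"

SKEYID = skeyid_psk(PSK, NI, NR)
print(hash_i(SKEYID, G_XI, G_XR, CKY_I, CKY_R, SAI_B, IDII_B).hex())
print(hash_r(SKEYID, G_XI, G_XR, CKY_I, CKY_R, SAI_B, IDIR_B).hex())
```

Because the pre-shared key enters through SKEYID, a peer that does not know the key cannot produce a Hash payload that the other side will accept.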

Phase Two

1. 1st packet

Many quick mode (phase two) instances can run under the protection of the same ISAKMP SA, and the cookies are identical for all of them, so the “Message ID” field is the only thing that distinguishes the phase two instances from each other.
The SPI chosen by the initiator is the key index for the later inbound IPsec data communication, where inbound denotes the direction of data entering the client. Each side picks the SPI for the traffic it will receive, so the gateway puts this SPI in the ESP packets it sends to the client.
The initiator sends its IPsec session proposals to the responder.
Let’s check the other items of the packet.

If the initiator doesn’t use its physical IP, the “identification payload” of the client should be a virtual IP or a subnet.

2. 2nd packet

The cookie pair and the “Message ID” are the same as in the first packet above.
The “SPI” chosen by the responder is the key index for the later outbound IPsec data communication, where outbound denotes the direction of data leaving the client.
Just like in phase one, the responder checks the encryption and authentication algorithms it supports for the subsequent “IPsec session” to decide whether it can accept one of the proposals. In this case, the responder supports the one that the initiator proposes.

3. 3rd packet

The initiator replies with a packet carrying a hash payload, which serves as data source authentication and protects against replay attacks.

For File dump-ike-encrypt.cap

The file is the encrypted version of the same conversation as in the file dump-ike-decrypt.cap.

Phase One

1. 1st-4th packets
The first four packets of the IKE conversation are in clear text and are the same as the first four packets in the dump-ike-decrypt.cap file. All other packets are encrypted and protected by the ISAKMP SA: the remaining phase one packets, the phase two packets, and also the informational exchanges of IKE (i.e. the 7th packet in this case).

2. 5th packet

Of course the ISAKMP header is still in clear text; otherwise the receiving machine could not know what kind of payload sits above the UDP layer or how to handle it.

3. 6th packet
Its header is almost the same as that of the 5th packet. It differs in the encrypted payload data, from which you can hardly tell anything.

Phase Two

1. 1st packet

The ISAKMP payload is encrypted. The header information consists of the cookie pair, the quick mode exchange type and the Message ID.

2. 2nd-3rd packets
Their headers are almost the same as that of the 1st packet except for the length value. The payload data is protected, so again you can tell nothing from it.

5.3.6 Data Communication Inside

For File dump-ipsec-prv.cap

The file dump-ipsec-prv.cap contains the data communication in clear text between the client and the intranet behind the Gateway.
Because the IPsec session tunnel has already been established, we can issue a “ping” command to test the connectivity to a machine in the remote intranet.

1. 1st packet
ARP Reply packet

It is an “ARP Reply” packet with a strange MAC answer, bb:bb:bb:bb:bb:00. If there were no VPN connection on the client, the first “ARP Reply” would come directly from the client’s default gateway, with the gateway IP as “sender IP address”, because the destination IP address is outside the client subnet. With the VPN connection, the destination address is consequently treated as a local IP inside the client subnet.
Here the MAC address bb:bb:bb:bb:bb:00 belongs to a virtual network interface that represents the IPsec tunnel. All IPsec traffic to the other side is forwarded to this MAC address and then encrypted.

2. 2nd packet
ICMP Request Message

This is a Ping Request packet. You can see that the traffic is sent to the destination MAC bb:bb:bb:bb:bb:00.

3. 3rd packet
ICMP Reply Message

This is the Ping Reply packet, the response to the Ping Request sent earlier. Both have the same “Sequence number”. The source MAC of this traffic is bb:bb:bb:bb:bb:00.

For File dump-ipsec-pub.cap

The file dump-ipsec-pub.cap is the encrypted version of the same traffic as in the file dump-ipsec-prv.cap.

Ping Request

Ping Reply

From these two packets, you can only learn that they carry ESP payloads between the two machines and which key indexes (SPIs) they use. You cannot tell what the machines are doing; in this example they are running a ping test. All information about the ping application above the IP layer is encrypted and packed into the ESP payload.

5.4 NAT-Traversal IPsec VPN

5.4.1 Introduction

IPsec uses the ESP protocol to communicate between the peers, which works well as long as there is no NAT on the path. The problem with NAT is that it maps all hosts with private IP addresses, and their applications that connect to outside peers, to different ports under one public IP address. Applications such as IPsec, which use fixed source ports (IKE) or no ports at all (ESP), therefore cannot run several instances simultaneously behind the same NAT. In principle, NAT Traversal (NAT-T) solves this by introducing a UDP header with port 4500 between the IP header and the ESP payload.
Below we discuss the NAT-T solution as standardized by the IETF.

5.4.2 NAT-T IKE negotiation process in RFC3947

Just as in the last example, the basic IKE protocol is the same, but there are a few differences in the implementation. Let’s see how it works.
IKE main mode with NAT-T consists of the following steps:

Phase one
Initiator                                    Responder
------------                               ------------
1)Hello message
  UDP(500,500),HDR, SA, VID -->
                                      <-- UDP(500,X),HDR, SA, VID
2)Key Generated
  UDP(500,500),HDR, KE, Ni, NAT-Dr, NAT-Di-->
                                      <-- UDP(500,X),HDR, KE, Nr, NAT-Di, NAT-Dr
3)Authenticated each other
  UDP(4500,4500),HDR*#, IDii, [CERT, ] SIG_I -->
                                      <-- UDP(4500,Y),HDR*#, IDir, [CERT, ], SIG_R
  The # sign indicates that those packets are sent to the changed port if NAT is detected.
  The * signifies payload encryption after the ISAKMP header.
  HDR is an ISAKMP header
  SA is a SA negotiation payload
  KE is the key exchange payload
  IDx is the identity payload for "x". x can be: "ii" or "ir" for the ISAKMP initiator and responder, respectively.
  AUTH is a generic authentication mechanism, such as HASH or SIG.
  Nx is the nonce payload for "x". x can be: "i" or "r" for the ISAKMP initiator and responder, respectively.
  CERT is the certificate.

Phase two
It is identical to phase two without NAT-T except that UDP port 4500 is used instead of port 500.

5.4.3 NAT-T Communication

This is also identical to the communication without NAT-T except that a UDP header with port 4500 is inserted between the IP header and the ESP header, as shown in the following diagram.

IP Header | UDP Header (port 4500) | ESP Header | ESP payload

5.5 NAT-T Example

The previous example scenario did not involve NAT. Today almost all IPsec VPN implementations have native support for NAT Traversal.
In this example we try to figure out how a NAT-based IPsec VPN works by considering a scenario in which a client behind a NAT/router wants to get access to a server via an IPsec gateway.

5.5.1 Scenario

The example has the following pieces.
  IPsec Client software
    Shrew Soft VPN Client for Windows
  Network Protocol Analyzer
    Wireshark
  NAT box
    iptables under Linux
  IPsec Gateway
    strongSwan for Linux

5.5.2 Configuration

For IPSec Gateway
This time we use certificate-based authentication, and all certificate files are stored under the relative path etc/ipsec.d. The strongSwan configuration file looks as follows:
    config setup
      nat_traversal=yes                # must be uncommented for the NAT support
    conn  rw
      left=%defaultrouter              # left is itself, right is the other side
      leftsubnet=        # Intranet Resource behind Gateway
      leftcert=fugaCert.pem            # certificate-based authentication
      rightsubnetwithin=  # right subnet IP pool
      pfs=no                           # without PFS the IPsec SA keys are derived from the phase one keys
  { : RSA fugaKey.pem "fuga" }         # the Gateway’s private key and passphrase

For the Client

In line with the gateway configuration above, the client also has to be set up for certificate-based authentication.
Next, create a VPN client instance and configure its parameters in the Access Manager of the ShrewSoft VPN Client program group. The configuration is almost the same as without NAT-T, except for the certificate-based authentication described below.

The Local Identity is of type ASN.1 Distinguished Name, because that is what was used when the client certificate was generated with the OpenSSL tools.

The gateway identity is also of type ASN.1 Distinguished Name, for the same reason as the Local Identity.

Here are the certificate files of the root CA and the client.

5.5.3 Capturing IKE Traffic

The procedure is exactly the same as without NAT-T, see Section 5.3.4.

5.5.4 IKE Traffic Inside with NAT-T

As in the case without NAT-T, the decrypted file dump-ike-decrypt.cap from the VPN client is used to describe all the steps.

Again there are 6 packets in Phase one and 3 packets in Phase two. Let’s check them one by one.

Phase One

1. 1st packet

(1)The standard IKE negotiation UDP port 500.
(2)The list of all supported combinations of encryption and authentication algorithms.
(3)The NAT-T negotiation in IKE, used to detect the presence of NAT between the peers. As seen here, a series of NAT-T versions was developed to improve the NAT-T facility, and RFC 3947 is the newest of them. Because some old IPsec implementations are still in use, each NAT-T version has a distinct “vendor ID” so that the old ones keep working.

2. 2nd packet

(1)The IKE UDP port 500.
(2)The cookie pair, which distinguishes this IKE session from others, stays the same during the lifetime of the ISAKMP SA.
(3)The combination of encryption and authentication algorithms selected in response to the initiator’s request.
(4)The NAT-T version chosen by the responder.

3. 3rd packet

(1)The IKE UDP port 500.
(2)The initiator’s KE payload, containing the DH group data.
(3)The Nonce data, a random number.
(4)A request for the responder’s certificate.
(5)The first NAT-Discovery field, which carries the hash value for the responder (the remote side)
  • HASH = HASH(CKY-i| CKY-r | IP-r| Port-r)
  • CKY-i: initiator’s cookie
  • CKY-r: responder’s cookie
  • IP-r: responder’s IP address
  • Port-r: responder’s UDP port
(6)The second NAT-Discovery field, which carries the hash value for the initiator (the local side)
  • HASH = HASH(CKY-i| CKY-r | IP-i| Port-i)
  • IP-i: initiator’s IP address
  • Port-i: initiator’s UDP port
The cookie pair is included in the hash to make precomputation attacks on the IP address and port impossible.
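
The NAT detection that follows from these two hashes can be sketched as below. The sketch assumes SHA-1 as the negotiated hash and uses made-up cookies and addresses (a client at 192.168.1.10 behind a NAT with the public address 203.0.113.5, and a gateway at 198.51.100.7); the function names are hypothetical.

```python
import hashlib
import struct
from ipaddress import ip_address


def nat_d_hash(cky_i: bytes, cky_r: bytes, ip: str, port: int,
               hash_name: str = "sha1") -> bytes:
    """HASH(CKY-i | CKY-r | IP | Port) as carried in a NAT-D payload."""
    data = cky_i + cky_r + ip_address(ip).packed + struct.pack("!H", port)
    return hashlib.new(hash_name, data).digest()


def nat_detected(received_hash: bytes, cky_i: bytes, cky_r: bytes,
                 observed_ip: str, observed_port: int) -> bool:
    """Compare a received NAT-D hash with the hash over what we observe.

    A mismatch means the address or port was rewritten on the way.
    """
    return received_hash != nat_d_hash(cky_i, cky_r, observed_ip, observed_port)


# Made-up values for illustration only.
CKY_I = bytes.fromhex("0102030405060708")
CKY_R = bytes.fromhex("1112131415161718")

# 3rd packet: the client hashes the gateway address and its own address.
hash_for_responder = nat_d_hash(CKY_I, CKY_R, "198.51.100.7", 500)
hash_for_initiator = nat_d_hash(CKY_I, CKY_R, "192.168.1.10", 500)

# The gateway sees the packet arrive from the NAT's public address and port.
print(nat_detected(hash_for_initiator, CKY_I, CKY_R, "203.0.113.5", 1))
# The gateway's own address was not rewritten.
print(nat_detected(hash_for_responder, CKY_I, CKY_R, "198.51.100.7", 500))
```

The first comparison fails because the NAT rewrote the source address and port, which tells the gateway that the client sits behind a NAT; the second succeeds, so there is no NAT in front of the gateway.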

4. 4th packet

(1)The IKE UDP port 500.
(2)The responder’s KE data for the DH group.
(3)The responder’s random number (nonce).
(4)A request for the initiator’s certificate.
(5)The hash value for the initiator (the remote side)
  • HASH = HASH(CKY-i| CKY-r | IP-i| Port-i)
(6)The hash value for the responder (the local side)
  • HASH = HASH(CKY-i| CKY-r | IP-r| Port-r)

5. 5th packet

(1)When the initiator receives the 4th packet from the responder, it detects the presence of a NAT in front of itself by comparing the hash in (6) of the 3rd packet with the hash in (5) of the 4th packet: they differ. There is no NAT in front of the gateway, because the hash values in (5) of the 3rd packet and (6) of the 4th packet are the same. The client therefore switches its IKE port from 500 to 4500 for NAT compatibility.
(2)This is a field of four zero octets (the Non-ESP Marker). It distinguishes an IKE UDP payload from an ESP UDP payload, because in an ESP UDP payload the same location holds the SPI field, which is non-zero.
(3)The initiator’s ID payload, the ASN.1 distinguished name from its certificate, i.e.:
  • commonName=pluto
  • organizationalUnitName=eit
  • organizationName=lund university
  • stateOrProvinceName=skane
  • countryName=se
(4)The Signature payload, signed with the initiator’s private key, which the responder uses to verify the initiator’s claimed identity.

As you can see, a certificate payload appears in this packet. Let’s look at the fields of the certificate payload in more detail.

(1)The certificate serial number, i.e. the sequence number of the certificate among those issued by the same root CA. Here it is the 3rd certificate.
(2)The issuer, who issued the certificate to the server or client, here:
  • commonName=root-ca
  • organizationalUnitName=eit
  • organizationName=lund university
  • stateOrProvinceName=skane
  • countryName=se
(3)The certificate’s validity period (issue and expiry time).
(4)The subject of the initiator’s certificate, here:
  • commonName=pluto
  • organizationalUnitName=eit
  • organizationName=lund university
  • stateOrProvinceName=skane
  • countryName=se
(5)The initiator’s public key value, 1024 bits.
(6)The signature value, created with the issuer’s private key, which certifies the initiator’s claimed identity.

6. 6th packet

(1)The responder, too, detected the NAT in front of the initiator when it received the 3rd packet and compared the hash values.
(2)Again the field of four zero octets (the Non-ESP Marker), which distinguishes an IKE UDP payload from an ESP UDP payload.
(3)The responder’s ID payload, the ASN.1 distinguished name from its certificate, i.e.:
  • commonName=fuga
  • organizationalUnitName=eit
  • organizationName=lund university
  • stateOrProvinceName=skane
  • countryName=se
(4)The Signature payload, signed with the responder’s private key, which the initiator uses to verify the responder’s claimed identity.

There is also a certificate payload that presents the responder’s certificate, just like that of the initiator in the 5th packet. The payload format is exactly the same, but the content is the responder’s certificate information. Let’s have a brief look.

(1)The certificate serial number, i.e. the sequence number of the certificate among those issued by the same root CA. Apparently this is the first certificate issued by this root CA.
(2)The issuer, who issued the certificate to the server or client, here:
  • commonName=root-ca
  • organizationalUnitName=eit
  • organizationName=lund university
  • stateOrProvinceName=skane
  • countryName=se
So it is the same root CA as for the initiator’s certificate.
(3)The certificate’s validity period (issue and expiry time).
(4)The subject of the responder’s certificate, here:
  • commonName=fuga
  • organizationalUnitName=eit
  • organizationName=lund university
  • stateOrProvinceName=skane
  • countryName=se
(5)The responder’s public key value, 1024 bits.
(6)The signature value, created with the issuer’s private key, which certifies the responder’s claimed identity.

As we see, the IKE UDP port switches from 500 to 4500 starting with the 5th packet. From then on, all packets of this IKE session use UDP port 4500, including the quick mode packets and auxiliary informational packets.

Phase Two

Apart from the changed UDP ports and the additional Non-ESP Marker, phase two is almost the same as for the IPsec VPN authenticated by a pre-shared key without NAT, even though certificate-based authentication is used here.
1. 1st packet

2. 2nd packet

3. 3rd packet

The descriptions above demonstrate the IKE protocol under NAT-T as seen by the client. The packets change slightly in format when they pass the NAT box. Let’s check what happens there.

5.5.5 IKE traffic while passing NAT

The parts above show IKE from the point of view of the IPsec VPN client (the initiator side). The gateway would see exactly the same packets if there were no NAT box in between. The NAT, however, makes some small changes to the IKE packets; to be precise, it changes the IP and UDP headers of the IKE packets that pass through it. Let us check in detail what happens, for a better understanding of IKE with NAT-T.

Phase One

1. 1st packet
Not passing NAT

Passing NAT

By comparison, the source IP address in the IP header is translated into the NAT’s Internet IP address, and the source port in the UDP header is changed from the standard port 500 to port 1. Hence the checksums in the IP and UDP headers are altered correspondingly. The parts above the UDP layer are not affected and need no changes.
An IP fragment appears after the NAT. As seen before, the 1st packet proposes many combinations of encryption and authentication algorithms for the peer to choose from, so all of this together can add up to a very large packet. This may not be a problem on the local network, but for remote transport, especially when the packet passes a device such as a NAT or another type of physical network, fragmentation was necessary.
2. 2nd packet
The NAT device needs to map the destination UDP port 1 back to UDP port 500, and the destination Internet address back to the private intranet address, when the reply packet travels to the initiator (the client). This is the reverse of the processing of the 1st packet.
3. 3rd-4th packets
The next 2 packets are treated like the preceding ones.
4. 5th packet
Not passing NAT

Here the UDP source port has switched to port 4500 in preparation for NAT traversal.

Passing NAT

The NAT device maps the client’s source UDP port 4500 to another UDP port, 1024.

5. 6th packet
In this step, besides translating the IP address, the NAT also needs to map the destination UDP port 1024 back to the client’s UDP port 4500 before it finally delivers the packet to the initiator (the client). This is the reverse of the NAT processing of the 5th packet.

Phase Two

The 3 packets of phase two are processed in the same way as the 5th and 6th packets.

5.5.6 ESP packet while passing NAT

Not passing NAT

Passing NAT

The MAC address, IP address, UDP source port and some checksums in IP and UDP header are changed. The ESP payloads, in essence, use the same NAT map table list as that of IKE starting from the 4th packet. Here it is a map from UDP port 1024 to UDP port 4500. In fact, any other subsequent packets involved in the same IKE and ESP, comprised of the information packet, UDP keep-alive packet and rekeying packet, have/use the same map table list.
A comparison with the example in the last chapter shows that there is now an additional UDP header inserted between the IP and ESP headers. Thanks to this UDP header, the NAT box can treat the packet as a normal UDP packet and has no problem with IPsec. The ESP payload is encapsulated in a UDP packet, and the UDP-encapsulated ESP payload is then left to the peers to process.
Because IKE and ESP use the same UDP port, a field is needed to distinguish the ESP payload from the IKE payload. As seen in the previous IKE part, the IKE UDP payload begins with a dedicated field with the value “000000”. In the ESP payload, that field occupies the same position as the SPI in the ESP header. The receiving program can tell the difference from the IKE UDP payload because the SPI value must be non-zero.
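
The demultiplexing rule described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the sketch assumes the marker is four zero bytes, which is how RFC 3948 defines the non-ESP marker that overlays the 32-bit SPI position.

```python
import struct

# Four zero bytes in the position where an ESP packet carries its SPI
# (assumption based on the non-ESP marker of RFC 3948).
NON_ESP_MARKER = b"\x00\x00\x00\x00"


def classify(payload: bytes) -> str:
    """Decide whether a UDP/4500 payload carries IKE or ESP."""
    if len(payload) < 4:
        raise ValueError("too short to hold an SPI or a non-ESP marker")
    if payload[:4] == NON_ESP_MARKER:
        return "ike"
    return "esp"  # a valid SPI is non-zero, so this must be ESP


def esp_spi(payload: bytes) -> int:
    """Extract the SPI (first 32 bits, network byte order) of an ESP payload."""
    (spi,) = struct.unpack("!I", payload[:4])
    return spi


ike_payload = NON_ESP_MARKER + b"ike-message-bytes"
esp_payload = struct.pack("!II", 0x12345678, 1) + b"ciphertext"
```

One-octet keep-alive packets are deliberately not handled here; a real receiver would check for them before applying this rule.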

5.5.7 Keep-alive UDP packet

When IPsec packets traverse a NAT box, the NAT box must create a mapping table entry for the session. By default, a time-out mechanism reclaims mapping entries that are no longer used. An IPsec session usually lasts much longer than other sessions, so an idle period without data exchange may exceed the NAT box’s time-out, and later packets could then be blocked because the mapping entry has been lost. The UDP keep-alive tackles this issue by sending a UDP packet regularly to keep the mapping entry of the IPsec session fresh. The keep-alive is a one-octet payload with the value “ff” in the UDP packet, generated by the peer that detects a NAT box in front of it.
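
The effect of keep-alives on the NAT mapping can be illustrated with a small timing model. This is a sketch under assumed values: the 30-second NAT time-out, the 20-second keep-alive interval and both function names are chosen for the example only and are not taken from the experiment.

```python
from typing import List

KEEPALIVE_PAYLOAD = b"\xff"  # the one-octet keep-alive payload


def mapping_survives(send_times: List[float], nat_timeout: float) -> bool:
    """True if no gap between consecutive packets reaches the NAT time-out."""
    return all(b - a < nat_timeout for a, b in zip(send_times, send_times[1:]))


def with_keepalives(data_times: List[float], interval: float) -> List[float]:
    """Fill idle gaps longer than `interval` with keep-alive send times.

    `data_times` must be a non-empty, ascending list of send times in seconds.
    """
    times: List[float] = []
    for prev, nxt in zip(data_times, data_times[1:]):
        times.append(prev)
        t = prev + interval
        while t < nxt:          # keep sending until real traffic resumes
            times.append(t)
            t += interval
    times.append(data_times[-1])
    return times


# Assumed scenario: data at t=0 s and t=100 s, NAT time-out of 30 s.
idle_session = [0.0, 100.0]
refreshed = with_keepalives(idle_session, interval=20.0)
```

Without keep-alives the 100-second gap exceeds the assumed time-out; with them, no gap is longer than the keep-alive interval.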


While doing the above experiment one may find that UDP port 4500 is kept unchanged. That is because some NAT implementations keep the client’s source port unchanged as long as that port is not already used in their mapping table. The port is changed, generally to a port above 1024, only when another connection using the same port reaches the NAT. Of course, the IP address is always changed when passing the NAT.
As seen in this example, the NAT box changed UDP port 500 in the first four packets of IKE phase one to a port below 1024, in part because port 500 is a standard Internet port (below 1024) rather than a common application port (above 1024).
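
The port-preserving behavior described above for port 4500 might be modeled as follows. This is a hypothetical allocation policy written for illustration; real NAT implementations differ, and the special treatment of well-known ports such as 500 is not modeled.

```python
from typing import Set


def allocate_public_port(requested: int, in_use: Set[int],
                         first_dynamic: int = 1024) -> int:
    """Keep the client's source port if it is free, else pick another one.

    `first_dynamic` is an assumed lower bound for replacement ports.
    """
    if requested not in in_use:
        in_use.add(requested)
        return requested
    for port in range(first_dynamic, 65536):
        if port not in in_use:
            in_use.add(port)
            return port
    raise RuntimeError("no free port left on the NAT")


ports_in_use: Set[int] = set()
first_client = allocate_public_port(4500, ports_in_use)   # port preserved
second_client = allocate_public_port(4500, ports_in_use)  # port remapped
```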

5.6 To find PMTU under IPsec

5.6.1 Introduction

Consider an ordinary application on a client requesting data from a server, for example when browsing web pages. A server sending large amounts of data needs to determine the largest packet size that can traverse the whole path without fragmentation, in order to avoid splitting packets and thus improve the efficiency of the data exchange. This detection mechanism is called PMTU discovery, and it yields a PMTU that is the minimum of all next-hop MTUs. However, when you switch to an IPsec tunnel, the data is encrypted and inserted into a new packet, which is larger than the original one. Packets of the discovered PMTU size therefore grow and get fragmented by an intermediate router or device on the same path. Such fragmentation along the IPsec tunnel is undesirable for an IPsec implementation because it degrades throughput and performance.
Here we try to explain the problems involved, to help better understand PMTU behavior under IPsec.
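
The two facts used throughout this section, that the PMTU is the minimum of the hop MTUs and that tunnel encapsulation pushes a full-size packet over that limit, can be stated in a few lines. The hop MTUs and the 60-byte overhead below are assumed figures for illustration; the exact overhead depends on the algorithms, as discussed later.

```python
from typing import Iterable


def path_mtu(hop_mtus: Iterable[int]) -> int:
    """The PMTU is the smallest next-hop MTU along the path."""
    return min(hop_mtus)


def exceeds_link(packet_size: int, tunnel_overhead: int, link_mtu: int) -> bool:
    """True if the packet no longer fits the link once it is encapsulated."""
    return packet_size + tunnel_overhead > link_mtu


# Assumed path with three hops and an assumed 60-byte tunnel overhead.
pmtu = path_mtu([1500, 1500, 1400])
too_big = exceeds_link(packet_size=1500, tunnel_overhead=60, link_mtu=1500)
```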

5.6.2 Simple scenario

For convenience consider the IPsec NAT-T example that we used earlier in Example xxx. Assume that every section of the network topology has the default MTU of 1500 bytes. As long as no IPsec tunnel is active between the client and the Gateway, the PMTU is 1500 bytes, assuming the client is reachable from the server without the ESP tunnel. Next, we activate an IPsec tunnel between the client and the Gateway and check what changes.

Test Steps
1. On the client
   Select the IPsec session algorithms in the Shrew soft VPN client (see the previous instructions).
     Encryption: AES
     Authentication: SHA1
     The IPsec session algorithms are negotiated in phase two of IKE.
2. Bring the IPsec tunnel up
3. On the Server 
   Issue the command
     # tracepath
     “tracepath” is a UDP-based application for PMTU detection under Linux.

Packets captured and inspected
Issuing the command produces responses that may look like this:
  # tracepath
    1: (                       1.048ms pmtu 1500
    1: (                       0.577ms 
    1: (                       0.682ms 
    2: (                       0.476ms pmtu 1422
    2: (                         4.034ms reached
    Resume: pmtu 1422 hops 2 back 127

The first hop is local, so the MTU remains at the usual 1500 bytes. At the second hop the packet reaches the IPsec tunnel between the gateway and the client and, as expected, the PMTU changes to 1422 bytes. Let’s check the packets involved in this change.

Source IP: Protocol: UDP Size: 1500

(1) IP packet size: 1500 bytes, ordinary Ethernet II frame.
(2) The Don’t Fragment flag is set.
(3) Route hops (TTL)
Each receiving hop decrements this field by 1. If the value reaches zero and the receiver is not the destination, it replies to the sender with a TTL expired ICMP message.
(4) Source and destination IPs
(5) Source UDP port and destination UDP port
As you see here, the request UDP packet has a size of 1500 bytes and has the Don’t Fragment flag set. When it reaches the IPsec gateway and is packed into the ESP tunnel using the AES and SHA1 algorithms, the new packet size becomes 1560 bytes, 60 bytes more than the next-hop interface MTU of 1500. Because the Don’t Fragment flag is set in the original packet, the IPsec gateway must not split it into smaller pieces for forwarding; instead it has to reply to the sender with a fragmentation needed ICMP message.

Source IP: Protocol: ICMP Type: Fragmentation needed

(1) Source and destination IPs
It is the gateway (in this case), rather than the client, that generates the fragmentation needed ICMP message.
(2) ICMP type
Fragmentation needed.
(3) MTU of next hop.
It becomes 1422 bytes now.
(4) The other payload of the ICMP message
The payload cut from the request UDP packet.

From the perspective of the network topology, it seems that the UDP request packet has one more hop to traverse, the NAT/Router box. However, the peers inside the tunnel do not see the possibly complicated network topology outside it. In this case there are logically only two hops between server and client.
Now return to the PMTU value. We already know that switching to IPsec adds some extra payload for encryption, but why is the value exactly 1422 bytes and not something else?
In fact, this value depends on the encryption and authentication algorithms. 1422 bytes is simply the value obtained with the AES and SHA1 algorithms selected in the first step of this experiment. Here are some reference figures for the overhead at MTU 1500 with different IPsec algorithms.
MD5 SHA-1 DES 3DES SEAL AES(128) Packet Size

Table: Packet size for different IPsec algorithms. IPsec mode: ESP tunnel. Original packet size: 1500.
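
The dependence of the PMTU on the algorithms can be reproduced with a short calculation. The sketch below is a reconstruction under stated assumptions, not necessarily the way the gateway computes it: an outer IPv4 header without options (20 bytes), the NAT-T UDP header (8 bytes), an ESP header of 8 bytes, a CBC-mode cipher with an IV of one block, a 2-byte ESP trailer, and a 96-bit (12-byte) truncated HMAC. With these assumptions it yields 1422 bytes for AES/SHA1 and 1438 bytes for 3DES/MD5 on a 1500-byte link, matching the tracepath output.

```python
OUTER_IP = 20     # outer IPv4 header, no options (assumption)
UDP_ENCAP = 8     # UDP header added by NAT-T
ESP_HEADER = 8    # SPI + sequence number
ESP_TRAILER = 2   # pad length + next header
ICV = 12          # HMAC-MD5-96 / HMAC-SHA1-96 (assumed truncation)

# cipher name -> (IV length, block size) in bytes, CBC mode assumed
CIPHERS = {"3des": (8, 8), "aes": (16, 16)}


def inner_pmtu(link_mtu: int, cipher: str) -> int:
    """Largest inner packet whose ESP-in-UDP encapsulation fits `link_mtu`."""
    iv_len, block = CIPHERS[cipher]
    # Room left for the encrypted part after all fixed-size fields.
    room = link_mtu - OUTER_IP - UDP_ENCAP - ESP_HEADER - iv_len - ICV
    room -= room % block            # ciphertext must be whole blocks
    return room - ESP_TRAILER       # trailer bytes are inside the ciphertext


pmtu_aes_sha1 = inner_pmtu(1500, "aes")
pmtu_3des_md5 = inner_pmtu(1500, "3des")
```

The hash algorithm does not change the result here because both HMACs are assumed to be truncated to 96 bits; the difference between 1422 and 1438 comes from the IV length and block size of the cipher.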

Next, change the algorithms to the combination of MD5 and 3DES and repeat the previous steps to see what happens to the PMTU.
1. Changes on the client 
   Encryption: 3DES
   Authentication: MD5
2. Results changed on the server
   # tracepath
   1: (                       0.145ms pmtu 1500
   1: (                       0.453ms 
   1: (                       1.960ms 
   2: (                       1.098ms pmtu 1438
   2: (                         4.413ms reached
   Resume: pmtu 1438 hops 2 back 127

Now the PMTU has changed again, to 1438 bytes. Let’s have a look at the Fragment Needed ICMP reply.

Source IP: Protocol: ICMP Type: Fragmentation needed

5.6.3 Complex scenario

We still use the IPsec NAT-T example from above. The only difference is that the MTUs of the individual sections are set arbitrarily: 1300 from the NAT/Router to the Gateway, 1000 from the NAT/Router to the Client, and the usual 1500 elsewhere. Altering the interface MTUs is easy because the Gateway, Server and NAT/Router all run Linux-based systems.

Test Steps
1. On the client
  Encryption: 3DES
  Authentication: MD5
2. On the NAT/Router
  # ifconfig eth1 mtu 1000
3. On the Gateway
  # ifconfig eth0 mtu 1300
4. Start the IPsec ESP tunnel
  While the tunnel is coming up, issue the following commands to start capturing packets.
  On the Gateway
    # tcpdump -i eth1 -s0 -w pmtu-gateway.cap
  On the Server
    # tcpdump -i eth0 -s0 -w pmtu-server.cap

Detecting PMTU
On the server, issue the “tracepath” command like this:
  # tracepath
  1: (                       0.215ms pmtu 1500
  1: (                       0.500ms 
  1: (                       0.395ms 
  2: (                       0.309ms pmtu 1238
  2: (                       0.448ms pmtu 942
  2: (                         2.221ms reached
  Resume: pmtu 942 hops 2 back 127

Packets inspected
The first-hop PMTU remains unchanged at 1500. However, the second hop is reported twice, with PMTU 1238 and 942 respectively.
The first of these second-hop values results from the combination of the gateway interface eth0 MTU of 1300 and the IPsec algorithms. Had the gateway’s interface MTU remained at 1500, the PMTU would have been 1438, exactly as in the last example using 3DES and MD5 as the IPsec algorithms. With the altered MTU of 1300 it is correspondingly smaller, 1238 instead of 1438. Let’s check how this happened.

Source IP: Protocol: UDP Size: 1500

(1) UDP packet size 1500
(2) Fragment field is set on (Don’t fragment)
(3) TTL field
(4) Source and Destination IPs
It is exactly the same as the request UDP packet in the last example.

Source IP: Protocol: ICMP Type: Fragmentation needed

(1)Source and Destination IPs
(2)ICMP type and MTU of next hop
  • Type: Fragment needed
  • MTU: 1238 bytes
(3)Other ICMP payload
  • Cuttings from the Request packet

The second of the second-hop values results from the combination of the NAT/Router interface eth1 MTU of 1000 and the IPsec algorithms. When the sender learns the first PMTU of 1238, it reduces the Request UDP packet to that size and sends it again. The gateway packs the reduced packet into the ESP tunnel with the Don’t Fragment flag set, forming a new ESP packet no larger than the Gateway’s interface eth0 MTU of 1300. When this new ESP packet arrives at the NAT/Router box, it is larger than the NAT/Router’s interface eth1 MTU of 1000, so a fragmentation needed ICMP message has to be generated and returned to the gateway (not to the sender on the server). The gateway receives the Fragment needed ICMP message, learns of the smaller MTU of 1000 on the path, and recalculates the PMTU for the sender on the server from the new MTU of 1000 by subtracting the payload added by encryption. Once the PMTU is calculated, the gateway identifies the sender behind it (there may be many servers) and sends a Fragment needed ICMP message carrying the new PMTU to the real sender whose packets failed on the path. Next, let’s check the packets in the order in which they were sent, transformed and answered.

After the first attempt, the server knows the new PMTU, adjusts the Request UDP packet size to 1238 bytes and sends it again.

Request from Server
Source IP: Protocol: UDP Size: 1238

(1) UDP packet size
  • The new UDP request packet size is 1238.
(2) Fragment field
  • Don’t fragment field is set on.
(3) TTL field
  • Still two hops, unchanged, because no TTL expired ICMP message was ever returned to the server.
(4) Source and Destination IPs
  • Packet from to

Encrypted Request packet
The Request UDP packet of 1238 bytes is packed into the ESP tunnel on the Gateway, forming a new ESP packet of 1296 bytes that can pass through the Gateway’s interface.
Source IP: Protocol: ESP Size:1296

(1) ESP Packet size
  • The encrypted packet of 1296 bytes, smaller than the gateway’s interface eth0 MTU of 1300, consists of a new IP header, the UDP header of IPsec NAT-T and the ESP payload, which contains the 1238-byte UDP packet from the sender on the server.
(2) Fragment field
  • The Don’t Fragment flag is also set, matching that of the request UDP packet.
(3) TTL field
  • It is unrelated to the sender’s TTL field because this is considered a new packet starting from
(4) Source and Destination IPs
  • Gateway IP:
  • NAT official IP:
(5) IPsec payload
  • UDP header of IPsec NAT-T and ESP payload
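
The size of 1296 bytes can be checked by adding up the parts listed above. The field sizes in this sketch are assumptions (IPv4 outer header without options, 3DES in CBC mode with an 8-byte IV, 96-bit truncated HMAC-MD5); the function name is hypothetical.

```python
def esp_packet_size(inner_size: int, iv_len: int = 8, block: int = 8,
                    icv_len: int = 12) -> int:
    """Size of the outer packet after ESP tunnel encapsulation with NAT-T.

    Defaults correspond to 3DES-CBC with HMAC-MD5-96 (assumed field sizes).
    """
    outer_ip, udp_encap, esp_header, esp_trailer = 20, 8, 8, 2
    encrypted = inner_size + esp_trailer
    encrypted += (-encrypted) % block   # pad up to a whole number of blocks
    return outer_ip + udp_encap + esp_header + iv_len + encrypted + icv_len


outer_size = esp_packet_size(1238)
```

Under these assumptions the 1238-byte request grows by 58 bytes, which stays below the 1300-byte MTU of the gateway’s eth0 but exceeds the 1000-byte MTU further down the path.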

Encrypted Reply packet
The ESP packet of 1296 bytes is larger than the NAT/Router eth1 interface MTU of 1000 and, with the Don’t Fragment flag set, has to be dropped. A Fragment needed ICMP message is generated and sent back to the gateway (not to the sender on the server).
Source: Protocol: ICMP Type: Fragment needed

(1) Source and Destination IPs
  • ICMP packet from NAT/Router to Gateway
(2) ICMP type and MTU of next hop
  • Type: Fragment needed.
  • MTU:1000
(3) ICMP payload
  • The cuttings from ESP packet sent by the gateway.

Decrypted Reply Packet
When the Fragment needed ICMP message reaches the Gateway, the PMTU for the sender is recalculated from the MTU of 1000. In the same way as before, the Gateway generates a Fragment needed ICMP message with the new MTU of 942 and sends it to the real sender behind the gateway.
Source: Protocol: ICMP Type: Fragment needed

(1)Source and Destination IPs
  • It is the gateway, rather than the NAT/Router, that sends the Fragment needed ICMP message back to the server, which differs from the case without IPsec.
(2)ICMP type and MTU of next hop
  • Type: Fragment needed
  • MTU: 942
(3)ICMP payload
  • The cuttings from Request UDP packet sent by server.


As you know, the NAT/Router has a small MTU even without IPsec; in that case the NAT/Router would generate the Fragment Needed ICMP message with the server’s IP as destination. With the IPsec tunnel active, however, the NAT/Router can only respond to the Gateway, because the ESP packet carries the gateway’s source IP and hides everything about its real sender. That is exactly what IPsec is meant to do. The IPsec Gateway therefore has to do additional work: it must recalculate the MTU and match the Fragment Needed ICMP message to the real sender, since many servers may sit behind the same IPsec Gateway.
For the server and client behind the gateways, the number of hops between the IPsec gateways is simply transparent. In this case, the sender on the server observes the MTU changing twice at the second hop, which looks a little strange to users unaware of the IPsec tunnel.
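
The whole two-stage discovery of this scenario can be replayed as a small simulation. It is a simplified model under assumptions, not a description of the gateway’s actual implementation: field sizes are those of 3DES-CBC with HMAC-MD5-96 over NAT-T with an IPv4 outer header, and all function names are hypothetical.

```python
from typing import List

FIXED = 20 + 8 + 8 + 8 + 12   # outer IP + UDP + ESP header + IV + ICV (assumed)
BLOCK, TRAILER = 8, 2         # 3DES block size, ESP trailer length


def inner_pmtu(outer_mtu: int) -> int:
    """Largest inner packet that still fits `outer_mtu` after encapsulation."""
    room = outer_mtu - FIXED
    return room - room % BLOCK - TRAILER


def esp_size(inner: int) -> int:
    """Outer packet size for an inner packet of `inner` bytes."""
    encrypted = inner + TRAILER
    return FIXED + encrypted + (-encrypted) % BLOCK


def discover(start: int, gateway_mtu: int, downstream: List[int]) -> List[int]:
    """PMTU values the gateway reports to the sender, in order."""
    reported: List[int] = []
    size, tunnel_mtu = start, gateway_mtu
    while True:
        limit = inner_pmtu(tunnel_mtu)
        if size > limit:                       # gateway: fragmentation needed
            reported.append(limit)
            size = limit                       # sender retries with new PMTU
            continue
        outer = esp_size(size)
        blocking = [m for m in downstream if m < outer]
        if not blocking:                       # packet reaches the client
            return reported
        tunnel_mtu = blocking[0]               # ICMP from that hop to gateway


# Gateway eth0 MTU 1300, NAT/Router eth1 MTU 1000, as configured above.
steps = discover(start=1500, gateway_mtu=1300, downstream=[1000])
```

The model returns the two values that tracepath shows for the second hop, in the same order.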

5.6.5 PMTU discovery protocol implementation

In our experiments, a UDP-based Linux application called “tracepath” is used to determine the PMTU. In practice, every developer of a UDP-based application has to deal with PMTU issues him/herself. This is different for TCP-based applications: their developers do not have to know about or deal with the PMTU problem, because TCP has a native PMTU implementation in the operating system’s network stack. The PMTU-aware mechanism of TCP is very similar to that of the “tracepath” application. A TCP session tries to send the largest packets allowed by the segment size negotiated in the initial handshake. If an intermediate device with a smaller MTU blocks the TCP packet, a “Fragment Needed” ICMP message with the right MTU is sent back to the sender, and the packet is sent again with a smaller size. For security reasons, some intermediate devices block ICMP packets, so the sender may never be notified of the right PMTU. How to handle this kind of missing notification is up to the operating system implementation. Some trigger a time-out mechanism and then retransmit the packet with the minimum Internet MTU of 576 bytes, which leads to poor efficiency. Others may keep retransmitting packets of the same size until the TCP session fails completely, which makes the service unavailable.
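
The sender-side reactions described above can be summarized as a small decision function. This is one possible policy written as a sketch; the function name and parameters are hypothetical, and actual operating systems implement more elaborate strategies.

```python
from typing import Optional

FALLBACK_MTU = 576  # minimum Internet MTU mentioned above


def next_packet_size(current: int, icmp_mtu: Optional[int] = None,
                     timed_out: bool = False) -> int:
    """Choose the size for the next transmission attempt.

    icmp_mtu  -- MTU from a Fragment Needed ICMP message, if one arrived
    timed_out -- True if the packet was lost and no ICMP message was seen
    """
    if icmp_mtu is not None and icmp_mtu < current:
        return icmp_mtu                     # normal PMTU discovery step
    if timed_out:
        return min(current, FALLBACK_MTU)   # ICMP probably filtered: fall back
    return current                          # nothing learned, keep the size


after_icmp = next_packet_size(1500, icmp_mtu=1238)
after_timeout = next_packet_size(1500, timed_out=True)
unchanged = next_packet_size(1500)
```

A sender that never takes the fallback branch corresponds to the second behavior described above: it keeps retransmitting the same size until the session fails.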
