Linux Kernel Overlapping Internet Fragment Bug

last updated 2020-09-15 16:13:24 by met

Back in 1997, some Linux kernel versions had a bug in their IP fragmentation code that crashed the system while it handled a malformed IP fragment.

What's an IP Fragment

The Internet Protocol allows packets larger than the maximum transmission unit to be sent as smaller fragments. A fragmented packet uses the "More Fragments" flag to indicate that further fragments follow, the "Fragment Offset" field to indicate each fragment's position, and the "IP Identifier" field to group the fragments that belong together.
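As a rough illustration of these three fields, the sketch below decodes the 16-bit flags/fragment-offset word of an IPv4 header. The bit layout follows RFC 791 (the offset is counted in 8-byte units); the struct and function names are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Bit layout of the 16-bit flags/fragment-offset word (RFC 791). */
#define IP_FLAG_MF     0x2000u  /* "More Fragments" flag */
#define IP_OFFSET_MASK 0x1FFFu  /* 13-bit fragment offset, in 8-byte units */

/* Hypothetical summary of the fragmentation fields of one packet. */
struct frag_info {
    uint16_t id;             /* IP Identifier: groups fragments of one datagram */
    uint32_t offset_bytes;   /* position of this fragment's payload, in bytes */
    int      more_fragments; /* non-zero if further fragments follow */
};

/* Both arguments are expected in host byte order. */
struct frag_info decode_frag(uint16_t id, uint16_t frag_word)
{
    struct frag_info fi;

    fi.id = id;
    fi.offset_bytes = (uint32_t)(frag_word & IP_OFFSET_MASK) * 8u;
    fi.more_fragments = (frag_word & IP_FLAG_MF) != 0;
    return fi;
}
```

For instance, a word of 0x2000 describes a first fragment (offset 0, more fragments coming), while 0x0010 describes a last fragment that starts at byte 128.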

The procedure for handling this in Linux kernel 2.1 is fairly simple. When the IP stack encounters a packet with the "More Fragments" flag set, it keeps the packet in a hash map until all fragments have been received. Because fragments can arrive out of order, the IP stack then sorts them. After ordering, the fragments are glued together by their offsets into a single packet for further processing.

[Figure: two IP fragments placed back to back, the first spanning offsets 0 to 128 and the second spanning offsets 128 to 256.]
A normal fragmented IP packet.

The figure shows an example of a fragmented IP packet that is ready to be glued. The first fragment starts at offset 0 and ends at offset 128, which is where the following fragment starts. If there were a gap caused by a lost fragment, the kernel would wait around 30 seconds for the missing fragment before discarding all related fragments. Assuming the second fragment has its "More Fragments" flag unset, these fragments are ready to be glued into a single packet.
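The "ready to glue" condition described above can be sketched as a small check over fragments that are already sorted by offset. This is a simplified model and not the kernel's code; the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, simplified fragment record. */
struct frag {
    int offset;          /* start offset in bytes */
    int end;             /* end offset in bytes (exclusive) */
    int more_fragments;  /* value of the "More Fragments" flag */
};

/* Returns 1 if the fragments, sorted by offset, cover the datagram
 * from byte 0 without gaps and the last one has MF unset. */
int datagram_complete(const struct frag *f, size_t n)
{
    int covered = 0;  /* bytes [0, covered) have been seen so far */
    size_t i;

    if (n == 0)
        return 0;
    for (i = 0; i < n; i++) {
        if (f[i].offset > covered)
            return 0;             /* gap: keep waiting for a fragment */
        if (f[i].end > covered)
            covered = f[i].end;
    }
    return !f[n - 1].more_fragments;
}
```

With the two fragments from the figure (0 to 128 with MF set, 128 to 256 with MF unset) the check succeeds; if the second fragment started at 192 instead, the gap between 128 and 192 would make it fail.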

Packet Overlapping

So now we know what happens if there is a gap between fragments. But what if we reduce the offset of the second fragment so that it overlaps the first one? RFC 791, which specifies the Internet Protocol, describes an example reassembly algorithm that allows overlapping, so a newer fragment can overwrite parts of earlier fragments. Linux simply skipped the overlapping part instead of overwriting it, so nothing spectacular happens on Linux.

[Figure: an IP fragment spanning offsets 0 to 128, with a second IP fragment spanning offsets 64 to 96 lying entirely inside it.]
A packet inside another packet.

Problems start to arise when we "set the fragment offset to be inside of the previous packet's payload (it overlaps inside the previous packet) but do not include enough payload to cover complete the datagram" (quoted from the inline comments in teardrop.c). In other words, the second fragment starts inside the first one and also ends before the first one does.

The Bug

As you might have guessed, the Linux kernel skipped the part of a fragment that overlapped the previous one and wrote only the non-overlapping part, that is, what is new in the fragment. To do so, it calculated the length of the fragment by subtracting the end of the first fragment from the end of the second fragment. This calculation produced negative lengths for fragments that overlap the previous fragment but end before it does.

/* Fill in the structure. */
fp->offset = offset;
fp->end = end;
fp->len = end - offset;
In this context, the "end" variable is the end of the second fragment and "offset" is the end of the first fragment.
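The arithmetic is easy to check with a simplified model of the adjustment. The function below is a hypothetical stand-in written for this article, not the kernel's code: it moves the start of the new fragment forward to the end of the previous one and then computes the length the same way as the excerpt above.

```c
/* Hypothetical, simplified model of the overlap adjustment.
 * prev_end:    end offset of the fragment that is already queued
 * offset, end: start and end offsets of the newly arrived fragment
 * Returns the value that would be stored in fp->len. */
int trimmed_len(int prev_end, int offset, int end)
{
    if (offset < prev_end) {
        /* Skip the part that overlaps the previous fragment. */
        offset = prev_end;
    }
    return end - offset;
}
```

With a first fragment ending at 128, a second fragment covering 64 to 192 yields 192 - 128 = 64 new bytes, which is harmless. A second fragment covering 64 to 96 yields 96 - 128 = -32, which is the negative length behind the crash.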

Once the length has been calculated as negative, the gluing code passes it to a function that takes an unsigned size, so a small negative value is interpreted as a very large number. What does the gluing code do? Let's see:

/* Copy the data portions of all fragments into the new buffer. */
fp = qp->fragments;
while(fp) {
    /* Reject a fragment that would overrun the new buffer. */
    if(count+fp->len > skb->len) {
        NETDEBUG(printk(KERN_ERR "Invalid fragment list: "
                "Fragment over size.\n"));
        return NULL;
    }
    /* fp->len is signed; memcpy takes an unsigned size. */
    memcpy((ptr + fp->offset), fp->ptr, fp->len);
    if (!count) {
        skb->dst = dst_clone(fp->skb->dst);
        skb->dev = fp->skb->dev;
    }
    count += fp->len;
    fp = fp->next;
}
The part of the ip_glue function that actually glues the fragments into a new buffer.

memcpy interprets len as a very large number and starts writing a huge amount of data at the offset address. A hotfix for this was released in version 2.1.63; it checks that the len variable is not negative before passing it to memcpy.
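To see why a small negative length turns into a huge copy, it helps to look at the conversion on its own. The size parameter of memcpy has type size_t, which is unsigned, so a negative int wraps around modulo SIZE_MAX + 1. The helper name below is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* What memcpy receives when it is handed a negative int length:
 * the value is converted to the unsigned type size_t. */
size_t as_copy_size(int len)
{
    return (size_t)len;
}
```

A length of -32 therefore becomes SIZE_MAX - 31, which is close to 4 GiB where size_t is 32 bits wide and far beyond any buffer the IP stack could have allocated.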

The Fix

	/* Copy the data portions of all fragments into the new buffer. */
	fp = qp->fragments;
	while(fp) {
		if (fp->len < 0 || count+fp->len > skb->len) {
The hotfix dropped fragments with negative len.

The Comeback

The story doesn't end here. The hotfix checked for a negative value, but what about a fragment with a length of 0? Almost two years later, in March 1999, a check for zero-length fragments was added as follows:

	/* Copy the data portions of all fragments into the new buffer. */
	fp = qp->fragments;
	count = qp->ihlen;
	while(fp) {
		if ((fp->len <= 0) || ((count + fp->len) > skb->len))
It now drops both negative and zero-length fragments.
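The difference between the two checks can be put side by side in a small sketch. The functions below are hypothetical stand-ins for the two conditions quoted in this article, with plain integers in place of the kernel structures; each returns non-zero when the fragment would be rejected.

```c
/* len:   fragment length as stored in fp->len
 * count: number of bytes glued so far
 * total: size of the new buffer (skb->len in the excerpts) */

/* Condition introduced by the 2.1.63 hotfix. */
int rejected_by_hotfix(int len, int count, int total)
{
    return len < 0 || count + len > total;
}

/* Condition after the March 1999 change. */
int rejected_by_later_check(int len, int count, int total)
{
    return len <= 0 || count + len > total;
}
```

A length of -32 is rejected by both versions, while a length of 0 passes the hotfix and is caught only by the later check.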

A set of similar IP fragmentation vulnerabilities, including these ones, became known as the "Teardrop Attack" and was used for a while to crash computers and cause denial of service.