Incorrect integer overflow detection in C code

A widely used method of detecting integer overflows results in undefined behavior according to the C standard.

Who Should Read This Document

This advisory deals with with details of the C programming language. It is targeted at C programmers.

Systems Affected

Systems which run binaries compiled by C compilers which perform particular optimizations.

Currently, no such systems are known to RUS-CERT.

Attack Requirements And Impact

Attack requirements and impact depend on the application.

Description

According to the C standard (ISO/IEC 9899:1999, section 6.5, paragraph 5), overflow in integer expressions results in undefined behavior:

If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

For unsigned integer types, the C standard later on defines that arithmetic can never overflow (ibid, parapgraph 9):

[...] A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

However, for signed integer types, there is no such guarantee. As a result, it is not possible to check for signed integer arithmetic overflow after the overflow occured. This is problematic especially if such checks are introduced in order to prevent buffer overflows (see the example below).

RUS-CERT is not aware of a concrete example of a code/compiler pair which shows the problem. However, compilers will perform more and more optimizations (such as range tracking for variables), and this problem might become practical in the future.

Examples

The following source code excerpt comes from the OpenSSL library, after the fix for CAN-2002-0659 has been applied (comments in italics have been added):

static int asn1_get_length(unsigned char **pp, int *inf, long *rl, int max)
           {
           unsigned char *p= *pp;
           long ret=0;
           int i;
   
           if (max-- < 1) return(0);
           if (*p == 0x80)
                   {
                   *inf=1;
                   ret=0;
                   p++;
                   }
           else
                   {
                   *inf=0;
                   i= *p&0x7f;
                   if (*(p++) & 0x80)
                           {
                           if (i > sizeof(long))
                                   return 0;
                           if (max-- == 0) return(0);
/* (1) */                  while (i-- > 0)      
                                   {
/* (2) */                          ret<<=8L; 
                                   ret|= *(p++);
                                   if (max-- == 0) return(0);
                                   }
                           }
                   else
                           ret=i;
                   }
/* (3) */  if (ret < 0)         
                   return 0;
           *pp=p;
           *rl=ret;
           return(1);
           }

This routine is used to extract an encoded long value of varying length, stored with a length prefix and the actual data in big endian format. Suppose that the encoded integer is stored using the maximum number of bytes. During the last iteration of the while loop (1), assignment expression (2) copies the first byte to the most significant byte in ret. If the value of this byte exceeds 127, the range of values representable in type long is exceeded, at least on typical C implementations, leading to integer overflow and undefined behavior.

If such an overflow does not occur, the check (3) does not succeed, and the return statement ist not executed. However, after undefined behavior has occured, all behavior is allowed by the C standard, so a conforming implementation may choose not to execute the return in this case, either. As a result, the check (3) can be ommitted by a conforming C implementation. However, this check has been added to eliminate a security risk — therefore this is obviously dangerous.

How likely is it that actual compilers will perform such an optimization in the future (or already do)? Range tracking on the ret variable could inform compilers that only positive values are assigned to it. A range tracking feature does not seem too esoteric, so compiler vendors will probably implement it one day.

As suggested below, the correct approach uses an unsigned long (unsigned arithmetic can never overflow, but in this case, the wraparound is avoided, at least if objects of type unsigned long contain no padding bits). In addition, the check (3) is replaced with one against LONG_MAX. The assignment (4') does not trigger undefined behavior because the check (3') ensures that the value is representable in type long.

#include <limits.h>

static int asn1_get_length(unsigned char **pp, int *inf, long *rl, int max)
           {
           unsigned char *p= *pp;
           unsigned long ret=0;
           int i;
   
           if (max-- < 1) return(0);
           if (*p == 0x80)
                   {
                   *inf=1;
                   ret=0;
                   p++;
                   }
           else
                   {
                   *inf=0;
                   i= *p&0x7f;
                   if (*(p++) & 0x80)
                           {
                           if (i > sizeof(long))
                                   return 0;
                           if (max-- == 0) return(0);
/* (1') */                 while (i-- > 0)      
                                   {
/* (2') */                         ret<<=8L;     
                                   ret|= *(p++);
                                   if (max-- == 0) return(0);
                                   }
                           }
                   else
                           ret=i;
                   }
/* (3') */ if (ret > LONG_MAX)      
                   return 0;
           *pp=p;
/* (4') */ *rl=(long)ret;           
           return(1);
           }

The ap_get_chunk_size routine featured a similar overflow bug CAN-2002-0392), and again, the developers tried to check for the overflow after it had occured. However, in this case, it seems to be less likely that future compiler will optimize away the crucial check.

About RUS-CERT

RUS-CERT is the Computer Emergency Response Team located at the Computing Center (RUS) of the University of Stuttgart, Germany.

Revision History

This document was published on 2002-08-05.