Monday, April 25, 2016

Dangling pointers and undefined behavior

My previous blog post "C pointers are not hardware pointers" says that dangling pointers may invoke undefined behavior even if they are not dereferenced, as in the example
void foo(int *p, int *q)
    if (p == q)  // Undefined behavior!
I have got lots of questions and comments of the form "This cannot be right — the comparison must be valid as p is not modified by free(p)", so this blog post provides more details.

That the behavior is undefined follows from C11 6.2.4 "Storage durations of objects"
The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
and 7.22.3 "Memory management functions" that says that free ends the lifetime of objects
The lifetime of an allocated object extends from the allocation until the deallocation.

The reason for the standard to make this undefined behavior is that even simple operations on dangling pointers, such as assignment or comparison, may misbehave and result in exceptions or arbitrary values on some architectures.1 The C99 rationale has an example of how this can happen
Consider a hypothetical segmented architecture on which pointers comprise a segment descriptor and an offset. Suppose that segments are relatively small so that large arrays are allocated in multiple segments. While the segments are valid (allocated, mapped to real memory), the hardware, operating system, or C implementation can make these multiple segments behave like a single object: pointer arithmetic and relational operators use the defined mapping to impose the proper order on the elements of the array. Once the memory is deallocated, the mapping is no longer guaranteed to exist. Use of the segment descriptor might now cause an exception, or the hardware addressing logic might return meaningless data.

1. At least on architectures that existed when the first version of the standard was written.

No comments:

Post a Comment