Jump to content

Null pointer

From Wikipedia, the free encyclopedia

Incomputing,anull pointerornull referenceis a value saved for indicating that thepointerorreferencedoes not refer to a validobject.Programs routinely use null pointers to represent conditions such as the end of alistof unknown length or the failure to perform some action; this use of null pointers can be compared tonullable typesand to theNothingvalue in anoption type.

A null pointer should not be confused with anuninitialized pointer:a null pointer is guaranteed to compare unequal to any pointer that points to a valid object. However, in general, most languages do not offer such guarantee for uninitialized pointers. It might compare equal to other, valid pointers; or it might compare equal to null pointers. It might do both at different times; or the comparison might beundefined behaviour.Also, in languages offering such support, the correct use depends on the individual experience of each developer and linter tools. Even when used properly, null pointers aresemantically incomplete,since they do not offer the possibility to express the difference between a "Not Applicable" value versus a "Not known" value or versus a "Future" value.

Because a null pointer does not point to a meaningful object, an attempt to access the data stored at that (invalid) memory location may cause a run-time error or immediate program crash. This is thenull pointer error.It is one of the most common types of software weaknesses,[1]andTony Hoare,who introduced the concept, has referred to it as a "billion dollar mistake".

C

[edit]

InC,two null pointers of any type are guaranteed to compare equal.[2]The preprocessor macroNULLis defined as an implementation-defined null pointer constant in<stdlib.h>,[3]which inC99can be portably expressed as((void *)0),the integer value0convertedto the typevoid*(seepointer to void type).[4]The C standard does not say that the null pointer is the same as the pointer tomemory address0, though that may be the case in practice.Dereferencinga null pointer isundefined behaviorin C,[5]and a conforming implementation is allowed to assume that any pointer that is dereferenced is not null.

In practice, dereferencing a null pointer may result in an attempted read or write frommemorythat is not mapped, triggering asegmentation faultor memory access violation. This may manifest itself as a program crash, or be transformed into a softwareexceptionthat can be caught by program code. There are, however, certain circumstances where this is not the case. For example, inx86real mode,the address0000:0000is readable and also usually writable, and dereferencing a pointer to that address is a perfectly valid but typically unwanted action that may lead to undefined but non-crashing behavior in the application. There are occasions when dereferencing the pointer to address zeroisintentional and well-defined; for example,BIOScode written in C for 16-bit real-mode x86 devices may write theinterrupt descriptor table(IDT) at physical address 0 of the machine by dereferencing a null pointer for writing. It is also possible for the compiler to optimize away the null pointer dereference, avoiding a segmentation fault but causing other undesired behavior.[6]

C++

[edit]

In C++, while theNULLmacro was inherited from C, the integer literal for zero has been traditionally preferred to represent a null pointer constant.[7]However,C++11introduced the explicit null pointer constantnullptrand typenullptr_tto be used instead.

Other languages

[edit]

In some programming language environments (at least one proprietary Lisp implementation, for example),[citation needed]the value used as the null pointer (callednilinLisp) may actually be a pointer to a block of internal data useful to the implementation (but not explicitly reachable from user programs), thus allowing the same register to be used as a useful constant and a quick way of accessing implementation internals. This is known as thenilvector.

In languages with atagged architecture,a possibly null pointer can be replaced with atagged unionwhich enforces explicit handling of the exceptional case; in fact, a possibly null pointer can be seen as atagged pointerwith a computed tag.

Programming languages use different literals for thenull pointer.In Python, for example, a null value is calledNone.InPascalandSwift,a null pointer is callednil.InEiffel,it is called avoidreference.

Null dereferencing

[edit]

Because a null pointer does not point to a meaningful object, an attempt todereference(i.e., access the data stored at that memory location) a null pointer usually (but not always) causes a run-time error or immediate program crash.MITRElists the null pointer error as one of the most commonly exploited software weaknesses.[8]

  • InC,dereferencing a null pointer isundefined behavior.[5]Many implementations cause such code to result in the program being halted with anaccess violation,because the null pointer representation is chosen to be an address that is never allocated by the system for storing objects. However, this behavior is not universal. It is also not guaranteed, since compilers are permitted to optimize programs under the assumption that they are free of undefined behavior.
  • InDelphiand many other Pascal implementations, the constantnilrepresents a null pointer to the first address in memory which is also used to initialize managed variables. Dereferencing it raises an external OS exception which is mapped onto a PascalEAccessViolationexception instance if theSystem.SysUtilsunit is linked in theusesclause.
  • InJava,access to a null reference causes aNullPointerException(NPE), which can be caught by error handling code, but the preferred practice is to ensure that such exceptions never occur.
  • InLisp,nilis afirst class object.By convention,(first nil)isnil,as is(rest nil).So dereferencingnilin these contexts will not cause an error, but poorly written code can get into an infinite loop.
  • In.NET,access to null reference causes aNullReferenceExceptionto be thrown. Although catching these is generally considered bad practice, this exception type can be caught and handled by the program.
  • InObjective-C,messages may be sent to anilobject (which is a null pointer) without causing the program to be interrupted; the message is simply ignored, and the return value (if any) isnilor0,depending on the type.[9]
  • Before the introduction ofSupervisor Mode Access Prevention(SMAP), a null pointer dereference bug could be exploited by mappingpage zerointo the attacker'saddress spaceand hence causing the null pointer to point to that region. This could lead tocode executionin some cases.[10]

Mitigation

[edit]

While we could have languages with no nulls, most do have the possibility of nulls so there are techniques to avoid or aid debugging null pointer dereferences.[11]Bond et al.[11]suggest modifying the Java Virtual Machine (JVM) to keep track of null propagation.

We have three levels of handling null references, in order of effectiveness:

1. languages with no null;

2. languages that can statically analyse code to avoid the possibility of null dereference at run time;

3. if null dereference can occur at run time, tools that aid debugging.

Purefunctional languagesare an example of level 1 since no direct access is provided to pointers and all code and data is immutable. User code running ininterpretedor virtual-machine languages generally does not suffer the problem of null pointer dereferencing.

Where a language does provide or utilise pointers which could become void, it is possible to avoid runtime null dereferences by providingcompilation-time checkingviastatic analysisor other techniques, with syntactic assistance from language features such as those seen in theEiffel programming languagewith Void safety[12]to avoid null derefences,D,[13]andRust.[14]

In some languages analysis can be performed using external tools, but these are weak compared to direct language support with compiler checks since they are limited by the language definition itself.

The last resort of level 3 is when a null reference occurs at run time, debugging aids can help.

Alternatives to null pointers

[edit]

As a rule of thumb, for each type of struct or class, define some objects representing some state of the business logic replacing the undefined behaviour on null. For example, "future" to indicate a field inside a structure that will not be available right now (but for which we know in advance that in the future it will be defined), "not applicable" to indicate a field in a non-normalized structure, "error", "timeout" to indicate that the field could not be initialized (probably stopping normal execution of the full program, thread, request or command).

History

[edit]

In 2009,Tony Hoarestated[15][16] that he invented the null reference in 1965 as part of theALGOL Wlanguage. In that 2009 reference Hoare describes his invention as a "billion-dollar mistake":

I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

See also

[edit]

Notes

[edit]
  1. ^"CWE-476: NULL Pointer Dereference".MITRE.
  2. ^ISO/IEC 9899,clause 6.3.2.3, paragraph 4.
  3. ^ISO/IEC 9899,clause 7.17, paragraph 3:NULL... which expands to an implementation-defined null pointer constant...
  4. ^ISO/IEC 9899,clause 6.3.2.3, paragraph 3.
  5. ^abISO/IEC 9899,clause 6.5.3.2, paragraph 4, esp. footnote 87.
  6. ^Lattner, Chris (2011-05-13)."What Every C Programmer Should Know About Undefined Behavior #1/3".blog.llvm.org.Archivedfrom the original on 2023-06-14.Retrieved2023-06-14.
  7. ^Stroustrup, Bjarne(March 2001). "Chapter 5:
    Theconstqualifier (§5.4) prevents accidental redefinition ofNULLand ensures thatNULLcan be used where a constant is required. ".The C++ Programming Language(14th printing of 3rd ed.). United States and Canada: Addison–Wesley. p.88.ISBN0-201-88954-4.
  8. ^"CWE-476: NULL Pointer Dereference".MITRE.
  9. ^The Objective-C 2.0 Programming Language,section "Sending Messages to nil".
  10. ^"OS X exploitable kernel NULL pointer dereference in AppleGraphicsDeviceControl"
  11. ^abBond, Michael D.; Nethercote, Nicholas; Kent, Stephen W.; Guyer, Samuel Z.; McKinley, Kathryn S. (2007). "Tracking bad apples".Proceedings of the 22nd annual ACM SIGPLAN conference on Object oriented programming systems and applications - OOPSLA '07.p. 405.doi:10.1145/1297027.1297057.ISBN9781595937865.S2CID2832749.
  12. ^"Void-safety: Background, definition, and tools".Retrieved2021-11-24.
  13. ^Bartosz Milewski."SafeD – D Programming Language".Retrieved17 July2014.
  14. ^"Fearless Security: Memory Safety".Archivedfrom the original on 8 November 2020.Retrieved4 November2020.
  15. ^Tony Hoare(2009-08-25)."Null References: The Billion Dollar Mistake".InfoQ.
  16. ^Tony Hoare(2009-08-25)."Presentation:" Null References: The Billion Dollar Mistake "".InfoQ.

References

[edit]