printf is a C standard library function that formats text and writes it to standard output.
![Diagram illustrating syntax of printf function. The first argument to the function is a template string, which may contain format specifiers, which are introduced with the percent sign (%) character. Format specifiers instruct printf how to interpret and output values given in the corresponding arguments which follow the format string. printf replaces the format specifiers with the accordingly-interpreted contents of the remaining arguments, and outputs the result.](https://upload.wikimedia.org/wikipedia/commons/thumb/2/2c/Printf.svg/350px-Printf.svg.png)
The name printf is short for print formatted, where print refers to output to a printer, although the function is not limited to printer output.
The standard library provides many other similar functions that form a family of printf-like functions. These functions accept a format string parameter and a variable number of value parameters, which the function serializes per the format string and writes to an output stream or a string buffer.
The format string is encoded as a template language consisting of verbatim text and format specifiers that each specify how to serialize a value. As the format string is processed left to right, the next value is used for each format specifier found. A format specifier starts with a % character and has one or more following characters that specify how to serialize a value.
The format string syntax and semantics are the same for all of the functions in the printf-like family.
A mismatch between the format specifiers and the count and type of values can cause a crash or vulnerability.
The printf format string is complementary to the scanf format string, which provides formatted input (lexing, a.k.a. parsing). Both format strings provide relatively simple functionality compared to other template engines, lexers and parsers.
The formatting design has been copied in other programming languages.
History
1950s: Fortran
Early programming languages like Fortran used special statements with different syntax from other calculations to build formatting descriptions.[1] In this example, the format is specified on line 601, and the PRINT[a] command refers to it by line number:
      PRINT 601, IA, IB, AREA
 601  FORMAT (4H A= ,I5,5H  B= ,I5,8H  AREA= ,F10.2, 13H SQUARE UNITS)
Hereby:
- 4H indicates a string of 4 characters " A= " (H means Hollerith field);
- I5 indicates an integer field of width 5;
- F10.2 indicates a floating-point field of width 10 with 2 digits after the decimal point.
An output with input arguments 100, 200, and 1500.25 might look like this:
 A=   100  B=   200  AREA=    1500.25 SQUARE UNITS
1960s: BCPL and ALGOL 68
In 1967, BCPL appeared.[2] Its library included the writef routine.[3] An example application looks like this:
WRITEF( "%I2-QUEENS PROBLEM HAS %I5 SOLUTIONS*N", NUMQUEENS, COUNT)
Hereby:
- %I2 indicates an integer of width 2 (the order of the format specification's field width and type is reversed compared to C's printf);
- %I5 indicates an integer of width 5;
- *N is a BCPL language escape sequence representing a newline character (for which C uses the escape sequence \n).
In 1968, ALGOL 68 had a more function-like API, but still used special syntax (the $ delimiters surround special formatting syntax):
printf(($"Color "g", number1 "6d,", number2 "4zd,", hex "16r2d,", float "-d.2d,", unsigned value"-3d"."l$,
         "red", 123456, 89, BIN 255, 3.14, 250));
In contrast to Fortran, using normal function calls and data types simplifies the language and compiler, and allows the implementation of the input/output to be written in the same language.
These advantages were thought to outweigh the disadvantages (such as a complete lack of type safety in many instances) up until the 2000s, and in most newer languages of that era I/O is not part of the syntax.
Experience has since shown[4] that this design can have serious consequences, ranging from security exploits to hardware failures (e.g., a phone's networking capabilities being permanently disabled after trying to connect to an access point named "%p%s%s%s%s%n"[5]). Modern languages, such as C++20 and later, tend to include format specifications as a part of the language syntax,[6] which restores type safety in formatting to an extent and allows the compiler to detect some invalid combinations of format specifiers and data types at compile time.
1970s: C
In 1973, printf was included as a C standard library routine as part of Version 4 Unix.[7]
1990s: Shell command
In 1990, a printf shell command was attested as part of 4.3BSD-Reno. It is modeled after the C standard library function.[8]
In 1991, a printf command was included with GNU shellutils (now part of GNU Core Utilities).
2000s: -Wformat safety
The need to do something about the range of problems resulting from lack of type safety has prompted attempts to make the C++ compiler printf-aware.
The -Wformat option of GCC allows compile-time checks of printf calls, enabling the compiler to detect a subset of invalid calls (and issue either a warning or an error, stopping the compilation altogether, depending on other flags).[9]
Since the compiler is inspecting printf format specifiers, enabling this effectively extends the C++ syntax by making formatting a part of it.
2020s: C++20 Format Specifiers and C++23 print
As noted above, numerous issues[10] with printf()'s lack of type safety resulted in a revision[11] of the approach to formatting, and C++20 onwards includes format specifications in the language[12] to enable type-safe formatting.
The approach (and syntax) of C++20 std::format resulted from effectively incorporating Victor Zverovich's libfmt[13] API into the language specification[14] (Zverovich wrote[15] the first draft of the new format proposal); consequently, libfmt is an implementation of the C++20 format specification.
The formatting function has been combined with output in C++23, which provides[16] the std::print command as a replacement for printf().
As the format specification has become a part of the language syntax, a C++ compiler is able to prevent invalid combinations of types and format specifiers in many cases. Unlike the -Wformat option, this is not an optional feature.
The format specification of libfmt and std::format is, in itself, an extensible "mini-language" (referred to as such in the specification),[17] an example of a domain-specific language.
Incorporation of a separate, domain-specific mini-language specifically for formatting into the C++ language syntax for std::print, therefore, completes the historical cycle, bringing the state of the art (as of 2024) back to what it was in the case of FORTRAN's first PRINT implementation in the 1950s discussed at the beginning of this section.
Format specifier
Formatting of a value is specified as markup in the format string. For example, the following outputs Your age is and then the value of the variable age in decimal format.
printf("Your age is %d", age);
Syntax
The syntax for a format specifier is:
%[parameter][flags][width][.precision][length]type
Parameter field
The parameter field is optional. If included, then matching specifiers to values is not sequential. The numeric value n selects the n-th value parameter.
Character | Description |
---|---|
n$ | n is the index of the value parameter to serialize using this format specifier |
This is a POSIX extension; it is not part of C99.
This field allows for using the same value multiple times in a format string instead of having to pass the value multiple times. If a specifier includes this field, then subsequent specifiers must include it as well.
For example,
printf("%2$d %2$#x; %1$d %1$#x", 16, 17);
outputs: 17 0x11; 16 0x10.
This field is particularly useful for localizing messages to different natural languages that use different word orders.
In Microsoft Windows, support for this feature is via a different function, _printf_p.
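The localization use case can be sketched as follows. This is a minimal illustration that assumes a POSIX C library supporting the n$ parameter field; the helper name format_date and its us_order switch are hypothetical and exist only for this example.

```c
#include <stdio.h>

/* Hypothetical helper: writes a day and month name into buf in one of two
   word orders. The values are always passed in the same order; only the
   format string changes. Requires the POSIX n$ extension (not ISO C). */
int format_date(char *buf, size_t size, int us_order,
                int day, const char *month)
{
    if (us_order)
        return snprintf(buf, size, "%2$s %1$d", day, month); /* month first */
    return snprintf(buf, size, "%1$d %2$s", day, month);     /* day first */
}
```

Calling format_date(buf, sizeof buf, 1, 5, "March") fills buf with March 5, while passing 0 for us_order yields 5 March.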
Flags field
The flags field can be zero or more of (in any order):
Character | Description |
---|---|
- (minus) | Left-align the output of this placeholder. (The default is to right-align the output.) |
+ (plus) | Prepends a plus sign for positive signed-numeric types. positive = +, negative = -. (The default does not prepend anything in front of positive numbers.) |
(space) | Prepends a space for positive signed-numeric types. positive = (space), negative = -. This flag is ignored if the + flag exists. (The default does not prepend anything in front of positive numbers.) |
0 (zero) | When the width option is specified, prepends zeros for numeric types. (The default prepends spaces.) For example, printf("%4X", 3); produces "   3", while printf("%04X", 3); produces "0003". |
' (apostrophe) | The integer or exponent of a decimal has the thousands grouping separator applied. |
# (hash) | Alternate form: For g and G types, trailing zeros are not removed. For f, F, e, E, g, G types, the output always contains a decimal point. For o, x, X types, the text 0, 0x, 0X, respectively, is prepended to non-zero numbers. |
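As an illustration of the table above, the following sketch formats one value under several flags, separating the fields with | so that the padding is visible. The helper name show_flags is hypothetical; the apostrophe flag is left out because its output depends on the locale.

```c
#include <stdio.h>

/* Hypothetical helper: renders value with the default alignment and with
   the -, +, space, 0 and # flags, in that order. */
int show_flags(char *buf, size_t size, int value)
{
    return snprintf(buf, size, "%5d|%-5d|%+d|% d|%05d|%#x|%#o",
                    value, value, value, value, value,
                    (unsigned)value, (unsigned)value);
}
```

For the value 42, the buffer receives "   42|42   |+42| 42|00042|0x2a|052".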
Width field
The width field specifies the minimum number of characters to output. If the value can be represented in fewer characters, then the value is left-padded with spaces so that output is the number of characters specified. If the value requires more characters, then the output is longer than the specified width. A value is never truncated.
For example, printf("%3d", 12); specifies a width of 3 and outputs " 12", with a space on the left to make 3 characters. The call printf("%3d", 1234); outputs 1234, which is 4 characters long, since that is the minimum width for that value even though the width specified is 3.
If the width field is omitted, the output is the minimum number of characters for the value.
If the field is specified as *, then the width value is read from the list of values in the call.[18] For example, printf("%*d", 3, 10); outputs " 10", where the second parameter, 3, is the width (matches with *) and 10 is the value to serialize (matches with d).
Though not part of the width field, a leading zero is interpreted as the zero-padding flag mentioned above, and a negative value is treated as the positive value in conjunction with the left-alignment - flag also mentioned above.
The width field can be used to format values as a table (tabulated output). However, columns do not align if any value is wider than the specified width. For example, notice that the value in the last line (1234) does not fit in the first column of width 3, and therefore the column is not aligned.
  1   1
 12  12
123 123
1234 123
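A table row like those above can be produced with a width supplied at run time through *. The sketch below is illustrative only; the helper name format_row is an assumption of this example.

```c
#include <stdio.h>

/* Hypothetical helper: formats two integers as one table row, using the
   same run-time width for both columns. A negative width left-aligns. */
int format_row(char *buf, size_t size, int width, int a, int b)
{
    /* Each * consumes one int argument before the value it applies to. */
    return snprintf(buf, size, "%*d %*d", width, a, width, b);
}
```

With a width of 3, the values 12 and 12 give " 12  12", whereas 1234 and 123 give "1234 123" because a value is never truncated to fit the width.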
Precision field
The precision field usually specifies a maximum limit of the output, depending on the particular formatting type. For floating-point numeric types, it specifies the number of digits to the right of the decimal point to which the output should be rounded; for %g and %G it specifies the total number of significant digits (before and after the decimal, not including leading or trailing zeroes) to round to. For the string type, it limits the number of characters that should be output, after which the string is truncated.
The precision field may be omitted, given as a numeric integer value, or given as a dynamic value passed as another argument when indicated by an asterisk (*). For example, printf("%.*s", 3, "abcdef"); outputs abc.
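The effect of the precision field on different types can be compared side by side. This is a small sketch with a hypothetical helper name, show_precision, and sample values chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: applies a precision to a fixed-point value, a %g
   value, a string, and a string with a run-time precision. */
int show_precision(char *buf, size_t size)
{
    return snprintf(buf, size, "%.2f|%.6g|%.3s|%.*s",
                    3.14159,      /* two digits after the decimal point */
                    1234.5678,    /* six significant digits in total */
                    "abcdef",     /* at most three characters */
                    2, "abcdef"); /* precision taken from the argument 2 */
}
```

The resulting string is "3.14|1234.57|abc|ab".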
Length field
The length field can be omitted or be any of:
Character | Description |
---|---|
hh | For integer types, causes printf to expect an int-sized integer argument which was promoted from a char. |
h | For integer types, causes printf to expect an int-sized integer argument which was promoted from a short. |
l | For integer types, causes printf to expect a long-sized integer argument. For floating-point types, this is ignored. float arguments are always promoted to double when used in a varargs call.[19] |
ll | For integer types, causes printf to expect a long long-sized integer argument. |
L | For floating-point types, causes printf to expect a long double argument. |
z | For integer types, causes printf to expect a size_t-sized integer argument. |
j | For integer types, causes printf to expect an intmax_t-sized integer argument. |
t | For integer types, causes printf to expect a ptrdiff_t-sized integer argument. |
Platform-specific length options came to exist prior to widespread use of the ISO C99 extensions, including:
Characters | Description | Commonly found platforms |
---|---|---|
I | For signed integer types, causes printf to expect a ptrdiff_t-sized integer argument; for unsigned integer types, causes printf to expect a size_t-sized integer argument. | Win32/Win64 |
I32 | For integer types, causes printf to expect a 32-bit (double word) integer argument. | Win32/Win64 |
I64 | For integer types, causes printf to expect a 64-bit (quad word) integer argument. | Win32/Win64 |
q | For integer types, causes printf to expect a 64-bit (quad word) integer argument. | BSD |
ISO C99 includes the inttypes.h header file, which includes a number of macros for platform-independent printf coding. For example: printf("%" PRId64, t); specifies decimal format for a 64-bit signed integer. Since the macros evaluate to a string literal, and the compiler concatenates adjacent string literals, the expression "%" PRId64 compiles to a single string.
Macros include:
Macro | Description |
---|---|
PRId32 | Typically equivalent to I32d (Win32/Win64) or d |
PRId64 | Typically equivalent to I64d (Win32/Win64), lld (32-bit platforms) or ld (64-bit platforms) |
PRIi32 | Typically equivalent to I32i (Win32/Win64) or i |
PRIi64 | Typically equivalent to I64i (Win32/Win64), lli (32-bit platforms) or li (64-bit platforms) |
PRIu32 | Typically equivalent to I32u (Win32/Win64) or u |
PRIu64 | Typically equivalent to I64u (Win32/Win64), llu (32-bit platforms) or lu (64-bit platforms) |
PRIx32 | Typically equivalent to I32x (Win32/Win64) or x |
PRIx64 | Typically equivalent to I64x (Win32/Win64), llx (32-bit platforms) or lx (64-bit platforms) |
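A short sketch of these macros in use follows. The helper name format_i64 is hypothetical; the macros and fixed-width types come from the C99 headers inttypes.h and stdint.h.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: prints a 64-bit value in decimal and hexadecimal
   without hard-coding a platform-specific length modifier. */
int format_i64(char *buf, size_t size, int64_t value)
{
    /* Adjacent string literals are concatenated by the compiler, so the
       format below becomes a single string such as "value=%ld hex=%lx". */
    return snprintf(buf, size, "value=%" PRId64 " hex=%" PRIx64,
                    value, (uint64_t)value);
}
```

For the value 9000000000, which does not fit in 32 bits, the buffer receives "value=9000000000 hex=218711a00".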
Type field
The type field can be any of:
Character | Description |
---|---|
% | Prints a literal % character (this type does not accept any flags, width, precision, length fields). |
d, i | int as a signed integer. %d and %i are synonymous for output, but are different when used with scanf for input (where using %i will interpret a number as hexadecimal if it's preceded by 0x, and octal if it's preceded by 0). |
u | Print decimal unsigned int. |
f, F | double in normal (fixed-point) notation. f and F differ only in how the strings for an infinite number or NaN are printed (inf, infinity and nan for f; INF, INFINITY and NAN for F). |
e, E | double value in standard form (d.ddde±dd). An E conversion uses the letter E (rather than e) to introduce the exponent. The exponent always contains at least two digits; if the value is zero, the exponent is 00. In Windows, the exponent contains three digits by default, e.g. 1.5e002, but this can be altered by the Microsoft-specific _set_output_format function. |
g, G | double in either normal or exponential notation, whichever is more appropriate for its magnitude. g uses lower-case letters, G uses upper-case letters. This type differs slightly from fixed-point notation in that insignificant zeroes to the right of the decimal point are not included, and that the precision field specifies the total number of significant digits rather than the digits after the decimal. Also, the decimal point is not included on whole numbers. |
x, X | unsigned int as a hexadecimal number. x uses lower-case letters and X uses upper-case. |
o | unsigned int in octal. |
s | null-terminated string. |
c | char (character). |
p | void* (pointer to void) in an implementation-defined format. |
a, A | double in hexadecimal notation, starting with 0x or 0X. a uses lower-case letters, A uses upper-case letters.[20][21] (C++11's std::iostream class provides a hexfloat that works the same). |
n | Print nothing, but writes the number of characters written so far into an integer pointer parameter. In Java this prints a newline.[22] |
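Several of the types in the table can be exercised in one call, as in the sketch below. The helper name show_types and the sample values are assumptions of this example; the p, a and n types are omitted because their output is implementation-defined or has side effects.

```c
#include <stdio.h>

/* Hypothetical helper: formats one sample value per type character. */
int show_types(char *buf, size_t size)
{
    return snprintf(buf, size, "%d %u %x %X %o %c %s %.1f %.2e %g %%",
                    -7,         /* d: signed decimal */
                    7u,         /* u: unsigned decimal */
                    255u, 255u, /* x, X: hexadecimal */
                    8u,         /* o: octal */
                    'A',        /* c: character */
                    "str",      /* s: null-terminated string */
                    2.5,        /* f: fixed-point */
                    12345.678,  /* e: exponential */
                    0.5);       /* g: shortest suitable notation */
}
```

On a C library that prints two-digit exponents, the result is "-7 7 ff FF 10 A str 2.5 1.23e+04 0.5 %".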
Custom data type formatting
A common way to handle formatting with a custom data type is to format the custom data type value into a string, then use the %s specifier to include the serialized value in a larger message.
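This two-step approach might look like the following sketch. The struct point type and both function names are hypothetical and serve only to illustrate the pattern.

```c
#include <stdio.h>

/* Hypothetical custom type used only for this illustration. */
struct point { int x, y; };

/* Serializes a point into buf and returns buf, so that the call can be
   used directly as a %s argument. */
const char *point_to_string(const struct point *p, char *buf, size_t size)
{
    snprintf(buf, size, "(%d, %d)", p->x, p->y);
    return buf;
}

/* Embeds two serialized points in a larger message via %s. */
int describe_move(char *out, size_t size,
                  const struct point *from, const struct point *to)
{
    char a[32], b[32]; /* one scratch buffer per serialized value */
    return snprintf(out, size, "move from %s to %s",
                    point_to_string(from, a, sizeof a),
                    point_to_string(to, b, sizeof b));
}
```

Each serialized value needs its own scratch buffer, because both strings must remain valid until the outer call has finished.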
Some printf-like functions allow extensions to the escape-character-based mini-language, thus allowing the programmer to use a specific formatting function for non-builtin types. One is the (now deprecated) glibc's register_printf_function(). However, it is rarely used because it conflicts with static format string checking. Another is Vstr custom formatters, which allows adding multi-character format names.
Some applications (like the Apache HTTP Server) include their own printf-like function, and embed extensions into it. However, these all tend to have the same problems that register_printf_function() has.
The Linux kernel printk function supports a number of ways to display kernel structures using the generic %p specification, by appending additional format characters.[23] For example, %pI4 prints an IPv4 address in dotted-decimal form. This allows static format string checking (of the %p portion) at the expense of full compatibility with normal printf.
Family
Variants of printf provide the formatting features but with additional or slightly different behavior.
- fprintf outputs to a system file object instead of standard output.
- sprintf writes to a string buffer instead of standard output.
- snprintf provides a level of safety over sprintf, since the caller provides a length (n) parameter that specifies the maximum number of chars to write to the buffer.
For most printf-family functions, there is a variant that accepts va_list rather than a variable-length parameter list. For example, there are vfprintf, vsprintf, and vsnprintf.
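The va_list variants are what make it possible to write wrappers around the family. The sketch below combines snprintf and vsnprintf; the helper name format_tagged and its tag parameter are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: writes "[tag] " followed by the formatted message.
   Like snprintf, it returns the length the full output would have had, so
   a result >= size signals truncation. */
int format_tagged(char *buf, size_t size, const char *tag,
                  const char *fmt, ...)
{
    int used = snprintf(buf, size, "[%s] ", tag);
    if (used < 0 || (size_t)used >= size)
        return used; /* error, or the prefix alone was truncated */

    va_list args;
    va_start(args, fmt);
    /* Forward the caller's variable arguments as a va_list. */
    int rest = vsnprintf(buf + used, size - (size_t)used, fmt, args);
    va_end(args);
    return rest < 0 ? rest : used + rest;
}
```

With a 32-byte buffer, format_tagged(buf, sizeof buf, "net", "%d packets", 3) stores "[net] 3 packets" and returns 15. With an 8-byte buffer the same call stores the truncated "[net] 3" and still returns 15.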
Vulnerabilities
Format string attack
Extra value parameters are ignored, but if the format string has more format specifiers than value parameters passed, the behavior is undefined. For some C compilers, an extra format specifier results in consuming a value even though there isn't one. This can allow the format string attack. Generally, for C, arguments are passed on the stack. If too few arguments are passed, then printf can read past the end of the stack frame, thus allowing an attacker to read the stack.
Some compilers, like the GNU Compiler Collection, will statically check the format strings of printf-like functions and warn about problems (when using the flags -Wall or -Wformat). GCC will also warn about user-defined printf-style functions if the non-standard "format" __attribute__ is applied to the function.
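Applying the attribute to a user-defined function can be sketched as below. The function name format_message and the macro PRINTF_LIKE are hypothetical; the attribute itself is a GCC extension, so the macro expands to nothing on compilers that do not define __GNUC__.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical convenience macro around the non-standard GCC attribute. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)
#endif

/* The format string is parameter 3; the values start at parameter 4. */
int format_message(char *buf, size_t size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int format_message(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int n = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return n;
}
```

With the attribute in place and -Wformat enabled, a mismatched call such as format_message(buf, sizeof buf, "%s", 42) is expected to draw a compile-time warning, just as it would for printf itself.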
Uncontrolled format string exploit
The format string is often a string literal, which allows static analysis of the function call. However, the format string can be the value of a variable, which allows for dynamic formatting but also a security vulnerability known as an uncontrolled format string exploit.
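A common way to avoid this class of problem is to pass untrusted text as a value rather than as the format string. The following is a minimal sketch; the helper name echo_safely is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: copies untrusted text into buf without letting it
   act as a format string. */
int echo_safely(char *buf, size_t size, const char *untrusted)
{
    /* Unsafe alternative (shown only as a comment):
           snprintf(buf, size, untrusted);
       Any % sequences in the input would be interpreted as specifiers. */
    return snprintf(buf, size, "%s", untrusted);
}
```

Given the input "%p%s%s%s%s%n", the function stores exactly those characters, since the only format specifier processed is the literal %s.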
Memory write
Although an output function on the surface, printf allows writing to a memory location specified by an argument via %n. This functionality is occasionally used as a part of more elaborate format-string attacks.[24]
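The write performed by %n can be observed in a benign setting, as in this sketch. The helper name format_with_count is hypothetical, and the example assumes a C library that honors %n; some implementations disable or restrict it for security reasons.

```c
#include <stdio.h>

/* Hypothetical helper: formats a greeting and records, through %n, how
   many characters precede the name. */
int format_with_count(char *buf, size_t size, const char *name,
                      int *prefix_len)
{
    /* %n prints nothing; it stores the count of characters produced so
       far into the int that prefix_len points to. */
    return snprintf(buf, size, "hello, %n%s", prefix_len, name);
}
```

After a call with the name "world", buf holds "hello, world" and the integer behind prefix_len holds 7, the length of "hello, ".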
The %n functionality also makes printf accidentally Turing-complete even with a well-formed set of arguments. A game of tic-tac-toe written in the format string is a winner of the 27th IOCCC.[25]
Programming languages with printf
Excluded from the list below are languages that use format strings that deviate from the style in this article (such as AMPL and Elixir), languages that inherit their implementation from the JVM or other environment (such as Clojure and Scala), and languages that do not have a standard native printf implementation but have external libraries which emulate printf behavior (such as JavaScript).
Notable programming languages that include printf or printf-like functionality:
- awk[26]
- C
- C++
- D
- F#
- G (LabVIEW)
- GNU MathProg
- GNU Octave
- Go
- Haskell
- J
- Java (since version 1.5) and JVM languages
- Julia (via Printf standard library[27])
- Lua (string.format)
- Maple
- MATLAB
- Max (via the sprintf object)
- Mythryl
- Objective-C
- OCaml (via the Printf module)
- PARI/GP
- Perl
- PHP
- Python (via % operator)[28]
- R
- Raku (via printf, sprintf, and fmt)
- Red/System
- Ruby
- Tcl (via format command)
- Transact-SQL (via xp_sprintf)
- Vala (via print() and FileStream.printf())
See also
- "Hello, World!" program – A basic example program first featured in The C Programming Language (the "K&R Book"), which in the C example uses printf to output the message "Hello, World!"
- Format (Common Lisp) – function in Common Lisp that can produce formatted text using a format string similar to the printf format string
- C standard library – Standard library for the C programming language
- Format string attack – Type of software vulnerability
- std::iostream – C++ standard library header for input/output
- ML (programming language) – General purpose functional programming language
- printf debugging – Fixing defects in an engineered system
- printf (Unix) – Standard UNIX utility
- printk – Linux kernel C function
- scanf – Control parameter used in programming languages
- string interpolation – Replacing placeholders in a string with values
References
edit- ^abBackus, John Warner;Beeber, R. J.; Best, Sheldon F.; Goldberg, Richard; Herrick, Harlan L.; Hughes, R. A.; Mitchell, L. B.; Nelson, Robert A.;Nutt, Roy;Sayre, David;Sheridan, Peter B.; Stern, Harold; Ziller, Irving (15 October 1956).Sayre, David(ed.).The FORTRAN Automatic Coding System for the IBM 704 EDPM: Programmer's Reference Manual(PDF).New York, USA: Applied Science Division and Programming Research Department,International Business Machines Corporation.pp.26–30.Archived(PDF)from the original on 4 July 2022.Retrieved4 July2022.(2+51+1 pages)
- ^"BCPL".cl.cam.ac.uk.Retrieved19 March2018.
- ^Richards, Martin; Whitby-Strevens, Colin (1979).BCPL - the language and its compiler.Cambridge University Press. p.50.
- ^"Format String Attack".
- ^"iPhone Bug Breaks WiFi When You Join Hotspot With Unusual Name".
- ^"C++20 Standard format specification".
- ^McIlroy, M. D.(1987).A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986(PDF)(Technical report). CSTR. Bell Labs. 139.
- ^"printf (4.3+Reno BSD)".man.freebsd.org.Retrieved1 April2024.
- ^Free Software Foundation(2024)."3.8 Options to Request or Suppress Warnings".GCC 14.2 Manual.self-published.Retrieved12 February2025.
- ^"How Not to Code: Beware of printf".10 August 2016.
- ^"C++20 Format improvements proposal to enable compile-time checks".
- ^"C++20 std::format".
- ^"libfmt: a modern formatting library".
- ^"C++20 Text Formatting: An Introduction".
- ^"C++ Format Proposal History".
- ^"C++ print".
- ^"Format Specification Mini-Language".
- ^"printf".cplusplus.com.Retrieved10 June2020.
- ^"7.19.6.1".ISO/IEC 9899:1999(E): Programming Languages – C.ISO/IEC.1999. para. 7.
- ^Free Software Foundation."Table of Output Conversions".The GNU C Library Reference Manual.self-published. sec. 12.12.3.Retrieved17 March2014.
- ^ "printf" (%aadded in C99)
- ^"Formatting Numeric Print Output".The Java Tutorials.Oracle Inc.Retrieved19 March2018.
- ^Dunlap, Randy; Murray, Andrew (n.d.)."How to get printk format specifiers right".The Linux Kernel documentation.Linux Foundation.Archivedfrom the original on 6 February 2025.Retrieved12 February2025.
- ^El-Sherei, Saif (20 May 2013)."Format String Exploitation Tutorial"(PDF).Exploit Database.Contributions by Haroon meer; Sherif El Deeb; Corelancoder; Dominic Wang.OffSec Services Limited.Retrieved12 February2025.
- ^Carlini, Nicholas (2020)."printf machine".International Obfuscated C Code Contest.Judged by Leonid A. Broukhis and Landon Curt Noll. Landon Curt Noll.Retrieved12 February2025.
- ^"The Open Group Base Specifications Issue 7, 2018 edition", "POSIX awk", "Output Statements".pubs.opengroup.org.The Open Group.Retrieved29 May2022.
- ^"Printf Standard Library".The Julia Language Manual.Retrieved22 February2021.
- ^"Built-in Types:
printf
-style String Formatting ",The Python Standard Library,Python Software Foundation,retrieved24 February2021
External links
- C++ reference for std::fprintf
- gcc printf format specifications quick reference
- The Single UNIX Specification, Version 4 from The Open Group: print formatted output – System Interfaces Reference
- The Formatter specification in Java 1.5
- GNU Bash printf(1) builtin