tinyformat.h

A minimal type safe printf() replacement

tinyformat.h is a type safe printf replacement library in a single C++ header file. If you've ever wanted printf("%s", s) to just work regardless of the type of s, tinyformat might be for you. Design goals include:

  • Type safety and extensibility for user defined types.
  • C99 printf() compatibility, to the extent possible using std::ostream
  • POSIX extension for positional arguments
  • Simplicity and minimalism. A single header file to include and distribute with your projects.
  • Augment rather than replace the standard stream formatting mechanism
  • C++98 support, with optional C++11 niceties

Quickstart

To print a date to std::cout:

std::string weekday = "Wednesday";
const char* month = "July";
size_t day = 27;
long hour = 14;
int min = 44;

tfm::printf("%s, %s %d, %.2d:%.2d\n", weekday, month, day, hour, min);

POSIX extension for positional arguments is available. The ability to rearrange formatting arguments is an important feature for localization because the word order may vary in different languages.

The previous example adapted for German usage, with the arguments reordered:

tfm::printf("%1$s, %3$d. %2$s, %4$d:%5$.2d\n", weekday, month, day, hour, min);

The strange types here emphasize the type safety of the interface; for example, it is possible to print a std::string using the "%s" conversion, and a size_t using the "%d" conversion. A similar result could be achieved using either of the tfm::format() functions. One prints on a user provided stream:

tfm::format(std::cerr, "%s, %s %d, %.2d:%.2d\n",
            weekday, month, day, hour, min);

The other returns a std::string:

std::string date = tfm::format("%s, %s %d, %.2d:%.2d\n",
                               weekday, month, day, hour, min);
std::cout << date;

It is safe to use tinyformat inside a template function. For any type which has the usual stream insertion operator<< defined, the following will work as desired:

template<typename T>
void myPrint(const T& value)
{
    tfm::printf("My value is '%s'\n", value);
}

(The above is a compile error for types T without a stream insertion operator.)

Function reference

All user facing functions are defined in the namespace tinyformat. A namespace alias tfm is provided to encourage brevity, but can easily be disabled if desired.

Three main interface functions are available: an iostreams-based format(), a string-based format() and a printf() replacement. These functions can be thought of as C++ replacements for C's fprintf(), sprintf() and printf() functions respectively. All the interface functions can take an unlimited number of input arguments if compiled with C++11 variadic templates support. In C++98 mode, the number of arguments must be limited to some fixed upper bound, which is currently 16 as of version 1.3. Supporting more arguments is quite easy using the in-source code generator based on cog.py; see the source for details.

The format() function which takes a stream as the first argument is the main part of the tinyformat interface. stream is the output stream, formatString is a format string in C99 printf() format, and the values to be formatted have arbitrary types:

template<typename... Args>
void format(std::ostream& stream, const char* formatString,
            const Args&... args);

The second version of format() is a convenience function which returns a std::string rather than printing onto a stream. This function simply calls the main version of format() using a std::ostringstream, and returns the resulting string:

template<typename... Args>
std::string format(const char* formatString, const Args&... args);

Finally, printf() and printfln() are convenience functions which call format() with std::cout as the first argument; both have the same signature:

template<typename... Args>
void printf(const char* formatString, const Args&... args);

printfln() is the same as printf() but appends an additional newline for convenience - a concession to the author's tendency to forget the newline when using the library for simple logging.

Format strings and type safety

Tinyformat parses C99 format strings to guide the formatting process; please refer to any standard C99 printf documentation for format string syntax. In contrast to printf, tinyformat does not use the format string to decide on the type to be formatted, so this does not compromise the type safety: you may use any format specifier with any C++ type. The author suggests standardising on the %s conversion unless formatting numeric types.

Let's look at what happens when you execute the function call:

tfm::format(outStream, "%+6.4f", yourType);

First, the library parses the format string, and uses it to modify the state of outStream:

  1. The outStream formatting flags are cleared and the width, precision and fill reset to the default.
  2. The flag '+' means to prefix positive numbers with a '+'; tinyformat executes outStream.setf(std::ios::showpos).
  3. The number 6 gives the field width; execute outStream.width(6).
  4. The number 4 gives the precision; execute outStream.precision(4).
  5. The conversion specification character 'f' means that floats should be formatted with a fixed number of digits; this corresponds to executing outStream.setf(std::ios::fixed, std::ios::floatfield).

After all these steps, tinyformat executes:

outStream << yourType;

and finally restores the stream flags, precision and fill.

What happens if yourType isn't actually a floating point type? In this case the flags set above are probably irrelevant and will be ignored by the underlying std::ostream implementation. The field width of six may cause some padding in the output of yourType, but that's about it.
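The steps above can be reproduced by hand using only the standard library. The following is a minimal sketch rather than tinyformat's actual implementation: the helper name emulatePlus6Dot4f is hypothetical, and the exact set of flags cleared in step 1 is an assumption.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper (not part of tinyformat): applies by hand the stream
// state that the text above associates with "%+6.4f", then inserts the value.
template<typename T>
std::string emulatePlus6Dot4f(const T& value)
{
    std::ostringstream out;

    // Remember the state so that it can be restored afterwards.
    const std::ios::fmtflags savedFlags = out.flags();
    const std::streamsize savedPrecision = out.precision();
    const char savedFill = out.fill();

    // 1. Clear formatting flags; reset width, precision and fill.
    //    (Which flags get cleared is an assumption of this sketch.)
    out.unsetf(std::ios::adjustfield | std::ios::basefield |
               std::ios::floatfield | std::ios::showbase |
               std::ios::boolalpha | std::ios::showpoint |
               std::ios::showpos | std::ios::uppercase);
    out.width(0);
    out.precision(6);
    out.fill(' ');

    out.setf(std::ios::showpos);                      // 2. the '+' flag
    out.width(6);                                     // 3. field width
    out.precision(4);                                 // 4. precision
    out.setf(std::ios::fixed, std::ios::floatfield);  // 5. the 'f' conversion

    out << value;

    // Restore flags, precision and fill. The width needs no restoring here:
    // the standard inserters reset it to zero after use.
    out.flags(savedFlags);
    out.precision(savedPrecision);
    out.fill(savedFill);
    return out.str();
}
```

With a double such as 1.5 the result is "+1.5000". With a std::string such as "ab" only the field width has a visible effect, and the result is "ab" padded on the left to six characters.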

Special cases for "%p", "%c" and "%s"

Tinyformat normally uses operator<< to convert types to strings. However, the "%p" and "%c" conversions require special rules for robustness. Consider:

uint8_t* pixels = get_pixels(/*...*/);
tfm::printf("%p", pixels);

Clearly the intention here is to print a representation of the pointer to pixels, but since uint8_t is a character type the compiler would attempt to print it as a C string if we blindly fed it into operator<<. To counter this kind of madness, tinyformat tries to static_cast any type fed to the "%p" conversion into a const void* before printing. If this can't be done at compile time the library falls back to using operator<< as usual.

The "%c" conversion has a similar problem: it signifies that the given integral type should be converted into a char before printing. The solution is identical: attempt to convert the provided type into a char using static_cast if possible, and if not fall back to using operator<<.

The "%s" conversion sets the boolalpha flag on the formatting stream. This means that a bool variable printed with "%s" will come out as true or false rather than the 1 or 0 that you would otherwise get.
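The pitfalls behind these three special cases can be seen with plain iostreams. The sketch below uses only the standard library; all four function names are hypothetical and merely illustrate the casts and the flag described above, not the library's internals.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Streams a character pointer as-is: operator<< treats it as a C string.
std::string streamRaw(const unsigned char* p)
{
    std::ostringstream out;
    out << p;
    return out.str();
}

// "%p"-style handling: cast to const void* first so the address is printed.
// The textual form of the address is implementation defined.
std::string streamAsPointer(const unsigned char* p)
{
    std::ostringstream out;
    out << static_cast<const void*>(p);
    return out.str();
}

// "%c"-style handling: cast the integer to char so a character is printed.
std::string streamAsChar(int code)
{
    std::ostringstream out;
    out << static_cast<char>(code);
    return out.str();
}

// "%s"-style handling of bool: boolalpha switches between 1/0 and true/false.
std::string streamBool(bool value, bool useBoolAlpha)
{
    std::ostringstream out;
    if (useBoolAlpha)
        out.setf(std::ios::boolalpha);
    out << value;
    return out.str();
}
```

Streaming the raw character pointer yields the pointed-to text, while the cast version yields an address. On an ASCII platform streamAsChar(65) gives "A", and streamBool(true, true) gives "true" where streamBool(true, false) gives "1".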

Incompatibilities with C99 printf

Not all features of printf can be simulated simply using standard iostreams. Here's a list of known incompatibilities:

  • The "%a" and "%A" hexadecimal floating point conversions ignore precision, as stream output of hexfloat (introduced in C++11) ignores precision, always outputting the minimum number of digits required for exact representation. MSVC incorrectly honors stream precision, so we force precision to 13 in this case to guarantee lossless roundtrip conversion.
  • The precision for integer conversions cannot be supported by the iostreams state independently of the field width. (Note: this is only a problem for certain obscure integer conversions; float conversions like %6.4f work correctly.) In tinyformat the field width takes precedence, so the 4 in %6.4d will be ignored. However, if the field width is not specified, the width used internally is set equal to the precision and padded with zeros on the left. That is, a conversion like %.4d effectively becomes %04d internally. This isn't correct for every case (e.g., negative numbers end up with one less digit than desired) but it's about the closest simple solution within the iostream model.
  • The "%n" query specifier isn't supported to keep things simple and will result in a call to TINYFORMAT_ERROR.
  • The "%ls" conversion is not supported, and attempting to format a wchar_t array will cause a compile time error to minimise unexpected surprises. If you know the encoding of your wchar_t strings, you could write your own std::ostream insertion operator for them, and disable the compile time check by defining the macro TINYFORMAT_ALLOW_WCHAR_STRINGS. If you want to print the address of a wide character with the "%p" conversion, you should cast it to a void* before passing it to one of the formatting functions.
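The integer precision limitation can be made concrete by comparing C's "%.4d" with an iostream emulation along the lines described in the list above. This is a sketch under assumptions: both function names are hypothetical, and the use of internal adjustment (so that the zeros land after the sign) is a guess at a reasonable emulation rather than a statement about tinyformat's source.

```cpp
#include <cstdio>
#include <ios>
#include <sstream>
#include <string>

// What C's printf produces for "%.4d": at least four digits, sign not counted.
std::string cPrintfDot4d(int value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%.4d", value);
    return std::string(buffer);
}

// Hypothetical iostream emulation: the width is set equal to the precision
// and the field is padded with zeros, much like "%04d".
std::string streamDot4d(int value)
{
    std::ostringstream out;
    out.width(4);
    out.fill('0');
    // Assumption of this sketch: pad between the sign and the digits.
    out.setf(std::ios::internal, std::ios::adjustfield);
    out << value;
    return out.str();
}
```

For 42 both functions return "0042". For -42 the C version returns "-0042" while the stream version returns "-042", because the sign uses up one position of the four character field.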

Error handling

By default, tinyformat calls assert() if it encounters an error in the format string or number of arguments. This behaviour can be changed (for example, to throw an exception) by defining the TINYFORMAT_ERROR macro before including tinyformat.h, or editing the config section of the header.
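As a rough illustration of the exception variant, the sketch below defines the macro so that it throws std::runtime_error. It assumes the library invokes TINYFORMAT_ERROR with a single string describing the problem. The include is commented out and replaced by a hypothetical stand-in check (checkArgumentCount) so that the example compiles on its own.

```cpp
#include <stdexcept>

// Must appear before tinyformat.h is included. Assumption of this sketch:
// the library invokes the macro with one string describing the problem.
#define TINYFORMAT_ERROR(reason) throw std::runtime_error(reason)
// #include "tinyformat.h"   // left out so that the sketch compiles standalone

// Hypothetical stand-in for a check the library might perform internally.
void checkArgumentCount(int expected, int given)
{
    if (expected != given)
        TINYFORMAT_ERROR("wrong number of format arguments");
}

// Returns true when the stand-in check reports the mismatch as an exception.
bool mismatchThrows()
{
    try
    {
        checkArgumentCount(2, 1);
    }
    catch (const std::runtime_error&)
    {
        return true;
    }
    return false;
}
```

Throwing instead of asserting keeps the check active in builds compiled with NDEBUG and lets calling code recover from a bad format string.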

Formatting user defined types

User defined types with a stream insertion operator will be formatted using operator<<(std::ostream&, T) by default. The "%s" format specifier is suggested for user defined types, unless the type is inherently numeric.

For further customization, the user can override the formatValue() function to specify formatting independently of the stream insertion operator. If you override this function, the library will have already parsed the format specification and set the stream flags accordingly - see the source for details.

Wrapping tfm::format() inside a user defined format function

Suppose you wanted to define your own function which wraps tfm::format. For example, consider an error function taking an error code, which in C++11 might be written simply as:

template<typename... Args>
void error(int code, const char* fmt, const Args&... args)
{
    std::cerr << "error (code " << code << ")";
    tfm::format(std::cerr, fmt, args...);
}

Simulating this functionality in C++98 is pretty painful since it requires writing out a version of error() for each desired number of arguments. To make this bearable tinyformat comes with a set of macros which are used internally to generate the API, but which may also be used in user code.

The three macros TINYFORMAT_ARGTYPES(n), TINYFORMAT_VARARGS(n) and TINYFORMAT_PASSARGS(n) will generate a list of n argument types, type/name pairs and argument names respectively when called with an integer n between 1 and 16. We can use these to define a macro which generates the desired user defined function with n arguments. This should be followed by a call to TINYFORMAT_FOREACH_ARGNUM to generate the set of functions for all supported n:

#define MAKE_ERROR_FUNC(n)                                    \
template<TINYFORMAT_ARGTYPES(n)>                              \
void error(int code, const char* fmt, TINYFORMAT_VARARGS(n))  \
{                                                             \
    std::cerr << "error (code " << code << ")";               \
    tfm::format(std::cerr, fmt, TINYFORMAT_PASSARGS(n));      \
}
TINYFORMAT_FOREACH_ARGNUM(MAKE_ERROR_FUNC)

Sometimes it's useful to be able to pass a list of format arguments through to a non-template function. The FormatList class is provided as a way to do this by storing the argument list in a type-opaque way. For example:

template<typename... Args>
void error(int code, const char* fmt, const Args&... args)
{
    tfm::FormatListRef formatList = tfm::makeFormatList(args...);
    errorImpl(code, fmt, formatList);
}

What's interesting here is that errorImpl() is a non-template function so it could be separately compiled if desired. The FormatList instance can be used via a call to the vformat() function (the name chosen for semantic similarity to vprintf()):

void errorImpl(int code, const char* fmt, tfm::FormatListRef formatList)
{
    std::cerr << "error (code " << code << ")";
    tfm::vformat(std::cerr, fmt, formatList);
}

The construction of a FormatList instance is very lightweight - it defers all formatting and simply stores a couple of function pointers and a value pointer per argument. Since most of the actual work is done inside vformat(), any logic which causes an early exit of errorImpl() - filtering of verbose log messages based on error code for example - could be a useful optimization for programs using tinyformat. (A faster option would be to write any early bailout code inside error(), though this must be done in the header.)
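The idea of storing a value pointer plus a function pointer per argument can be illustrated without the library. The following is a simplified sketch of that type erasure technique, not tinyformat's FormatList: every name in it (ErasedArg, writeAs, makeErasedArg, writeAll, renderErasedDemo) is hypothetical, and it ignores format strings entirely.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for one stored argument: a pointer to the value plus
// a function pointer that knows how to stream the pointed-to type.
struct ErasedArg
{
    const void* value;
    void (*write)(std::ostream&, const void*);
};

// Recovers the static type and streams the value.
template<typename T>
void writeAs(std::ostream& out, const void* value)
{
    out << *static_cast<const T*>(value);
}

// Cheap to construct: no formatting happens here, only two pointers are stored.
template<typename T>
ErasedArg makeErasedArg(const T& value)
{
    ErasedArg arg = { &value, &writeAs<T> };
    return arg;
}

// Non-template consumer, so it could live in a separately compiled file.
void writeAll(std::ostream& out, const ErasedArg* args, std::size_t count,
              bool enabled)
{
    if (!enabled)
        return;  // early exit: none of the arguments is ever formatted
    for (std::size_t i = 0; i < count; ++i)
        args[i].write(out, args[i].value);
}

// Small driver showing both the normal path and the early exit.
std::string renderErasedDemo(bool enabled)
{
    int code = 42;
    std::string text = " failed";
    ErasedArg args[] = { makeErasedArg(code), makeErasedArg(text) };
    std::ostringstream out;
    writeAll(out, args, 2, enabled);
    return out.str();
}
```

Because the stored pointers refer to the caller's variables, the erased list must not outlive them. In this sketch the list is consumed within the same function that owns the values.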

Benchmarks

Compile time and code bloat

The script bloat_test.sh included in the repository tests whether tinyformat succeeds in avoiding compile time and code bloat for nontrivial projects. The idea is to include tinyformat.h into 100 translation units and use printf() five times in each to simulate a medium sized project. The resulting executable size and compile time (g++-4.8.2, linux ubuntu 14.04) is shown in the following tables, which can be regenerated using make bloat_test:

Non-optimized build

test name                 compiler wall time    executable size (stripped)
libc printf               1.8s                  48K (36K)
std::ostream              10.7s                 96K (76K)
tinyformat, no inlines    18.9s                 140K (104K)
tinyformat                21.1s                 220K (180K)
tinyformat, c++0x mode    20.7s                 220K (176K)
boost::format             70.1s                 844K (736K)

Optimized build (-O3 -DNDEBUG)

test name                 compiler wall time    executable size (stripped)
libc printf               2.3s                  40K (28K)
std::ostream              11.8s                 104K (80K)
tinyformat, no inlines    23.0s                 128K (104K)
tinyformat                32.9s                 128K (104K)
tinyformat, c++0x mode    34.0s                 128K (104K)
boost::format             147.9s                644K (600K)

For large projects it's arguably worthwhile to do separate compilation of the non-templated parts of tinyformat, as shown in the rows labelled "tinyformat, no inlines". These were generated by putting the implementation of vformat (detail::formatImpl() etc) into a separate file, tinyformat.cpp. Note that the results above can vary considerably with different compilers. For example, the -fipa-cp-clone optimization pass in g++-4.6 resulted in excessively large binaries. On the other hand, the g++-4.8 results are quite similar to using clang++-3.4.

Speed tests

The following speed test results were generated by building tinyformat_speed_test.cpp on an Intel core i7-2600K running Linux Ubuntu 14.04 with g++-4.8.2 using -O3 -DNDEBUG. In the test, the format string "%0.10f:%04d:%+g:%s:%p:%c:%%\n" is filled 2000000 times with output sent to /dev/null; for further details see the source and Makefile.

test name        run time
libc printf      1.20s
std::ostream     1.82s
tinyformat       2.08s
boost::format    9.04s

It's likely that tinyformat has an advantage over boost.format because it tries reasonably hard to avoid formatting into temporary strings, preferring instead to send the results directly to the stream buffer. Tinyformat cannot be faster than the iostreams because it uses them internally, but it comes acceptably close.

Rationale

Or, why did I reinvent this particularly well studied wheel?

Nearly every program needs text formatting in some form but in many cases such formatting is incidental to the main purpose of the program. In these cases, you really want a library which is simple to use but as lightweight as possible.

The ultimate in lightweight dependencies are the solutions provided by the C++ and C libraries. However, both the C++ iostreams and C's printf() have well known usability problems: iostreams are hopelessly verbose for complicated formatting and printf() lacks extensibility and type safety. For example:

// Verbose; hard to read, hard to type:
std::cout << std::setprecision(2) << std::fixed << 1.23456 << "\n";
// The alternative using a format string is much easier on the eyes
tfm::printf("%.2f\n", 1.23456);

// Type mismatch between "%s" and int: undefined behaviour, typically a segfault at runtime!
printf("%s", 1);
// The following is perfectly fine, and will result in "1" being printed.
tfm::printf("%s", 1);

On the other hand, there are plenty of excellent and complete libraries which solve the formatting problem in great generality (boost.format and fastformat come to mind, but there are many others). Unfortunately these kinds of libraries tend to be rather heavy dependencies, far too heavy for projects which need to do only a little formatting. Problems include:

  1. Having many large source files. This makes a heavy dependency unsuitable to bundle within other projects for convenience.
  2. Slow build times for every file using any sort of formatting (this is very noticeable with g++ and boost/format.hpp. I'm not sure about the various other alternatives.)
  3. Code bloat due to instantiating many templates

Tinyformat tries to solve these problems while providing formatting which is sufficiently general and fast for incidental day to day uses.

License

For minimum license-related fuss, tinyformat.h is distributed under the boost software license, version 1.0. (Summary: you must keep the license text on all source copies, but don't have to mention tinyformat when distributing binaries.)

Author and acknowledgements

Tinyformat is written and maintained by Chris Foster, with various contributions gratefully received from the community.

Originally the implementation was inspired by the way boost::format uses stream based formatting to simulate most of the printf() syntax, and Douglas Gregor's toy printf() in an early variadic template example.

Bugs

Here's a list of known bugs which are probably cumbersome to fix:

  • Field padding won't work correctly with complicated user defined types. For general types, the only way to do this correctly seems to be to format to a temporary string stream, check the length, and finally send to the output stream with padding if necessary. Doing this for all types would be quite inelegant because it implies extra allocations to make the temporary stream. A workaround is to add logic to operator<<() for composite user defined types so they are aware of the stream field width.
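One possible shape for the workaround mentioned above is sketched below, using only the standard library. The Point type and both helper functions are hypothetical. The operator<< composes the text in a temporary stream and then inserts it as a single string, so that the field width of the destination stream applies to the whole value.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical composite type used for illustration.
struct Point
{
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Width-aware insertion: build the text first, then insert it in one piece.
std::ostream& operator<<(std::ostream& out, const Point& p)
{
    std::ostringstream tmp;
    tmp.flags(out.flags());          // carry over flags such as showpos
    tmp.precision(out.precision());
    tmp << '(' << p.x << ',' << p.y << ')';
    return out << tmp.str();         // the field width applies here, once
}

// Formats a point into a field of the given width via the operator above.
std::string padded(const Point& p, int width)
{
    std::ostringstream out;
    out.width(width);
    out << p;
    return out.str();
}

// For contrast: inserting the pieces directly lets '(' consume the width.
std::string paddedNaively(const Point& p, int width)
{
    std::ostringstream out;
    out.width(width);
    out << '(' << p.x << ',' << p.y << ')';
    return out.str();
}
```

For Point(1, 2) and a width of 10, padded() yields the five characters "(1,2)" preceded by five spaces, whereas paddedNaively() pads only the opening parenthesis and produces a 14 character result. The cost is the extra temporary stream noted in the bug description.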