04.09.13

mozilla/IntegerPrintfMacros.h now provides PRId32 and friends macros, for printfing uint32_t and so on

Tags: , , , , , — Jeff @ 09:37

Printing numbers using printf

The printf family of functions take a format string, containing both regular text and special formatting specifiers, and at least as many additional arguments as there are formatting specifiers in the format string. Each formatting specifier is supposed to indicate the type of the corresponding argument. Then, via compiler-specific magic, that argument value is accessed and formatted as directed.

C originally only had char, short, int, and long integer types (in signed and unsigned versions). So the original set of format specifiers only supported interpreting arguments as one of those types.

Fixed-size integers

With the rise of <stdint.h>, it’s common to want to print a uint32_t, or an int64_t, or similar. But if you don’t know what type uint32_t is, how do you know what format specifier to use? C99 defines macros in <inttypes.h> that expand to suitable format specifiers. For example, if uint32_t is actually unsigned long, then the PRIu32 macro might be defined as "lu".

uint32_t u = 3141592654;
printf("u: %" PRIu32 "\n", u);

Unfortunately <inttypes.h> isn’t available everywhere. So for now, we have to reimplement it ourselves. The new mfbt header mfbt/IntegerPrintfMacros.h, available via #include "mozilla/IntegerPrintfMacros.h", provides all the PRI* macros exposed by <inttypes.h>: by delegating to that header when present, and by reimplementing it when not. Go use it. (Note that all Mozilla code has __STDC_LIMIT_MACROS, __STDC_FORMAT_MACROS, and __STDC_CONST_MACROS defined, so you don’t need to do anything special to get the macros — just #include "mozilla/IntegerPrintfMacros.h".)

Limitations

The implementations of <inttypes.h> in all the various standard libraries/compilers we care about don’t always provide definitions of these macros that are free of format string warnings. This is, of course, inconceivable. We can reimplement the header as needed to fix these problems, but it seemed best to avoid that til someone really, really cared.

<inttypes.h> also defines format specifiers for fixed-width integers, for use with the scanf family of functions that read a number from a string. IntegerPrintfMacros.h does not provide these macros. (At least, not everywhere. You are not granted any license to use them if they happen to be incidentally provided.) First, it’s actually impossible to implement the entire interface for the Microsoft C runtime library. (For example: no specifier will write a number into an unsigned char*; this is necessary to implement SCNu8.) Second, sscanf is a dangerous function, because if the number in the input string doesn’t fit in the target location, anything (undefined behavior, that is) can happen.

uint8_t u;
sscanf("256", "%" SCNu8, &u); // I just ate ALL YOUR COOKIES

IntegerPrintfMacros.h does implement imaxabs, imaxdiv, strtoimax, strtoumax, wcstoimax, and wcstoumax. I mention this only for completeness: I doubt any Mozilla code needs these.

08.12.11

Party like it’s 1999: <stdint.h> comes to Mozilla!

tl;dr

Need an integer type with guaranteed size? If you’re not defining a cross-file interface, #include "mozilla/StdInt.h"#include "mozilla/StandardInteger.h" and use uint32_t or any other type defined by <stdint.h>. (If you are defining an interface, use PRUint32 and similar — for now.) mozilla/StdInt.hmozilla/StandardInteger.h is a cross-platform implementation of <stdint.h>‘s functionality usable in any code.

Embedders may find that the mozilla/StdInt.hmozilla/StandardInteger.h typedefs conflict with ones they have already been using. To work around this conflict, write a stdint.h compatible with the embedding’s typedefs (more likely: adapt an existing implementation), then set the preprocessor variable MOZ_CUSTOM_STDINT_H to a quoted path to that reimplementation when mozilla/StdInt.hmozilla/StandardInteger.h is included. It may be simplest to add this to command line flags when invoking the compiler.

Fixed-size integer types

Fixed-size integer types are signed or unsigned types with exactly N bits. They contrast with the built-in C and C++ types (char, short, int, &c.) with compiler-dependent sizes. Fixed-size integer types are quite useful:

  • They work well when serializing an object to a sequence of bytes, where the size of a serialized item must be constant for correctness.
  • They minimize memory use in classes and structs, also making padding-based waste more obvious.
  • They work well in cross-platform APIs, eliminating the challenge of implementing correct behavior when types have different sizes across platforms.

Fixed-size integer types are useful in the same way size_t, off_t, and other non-built-in types are: they fit some problem domains better than built-in types.

C99 and C++11 finally standardized fixed-size integer types in <stdint.h> and <cstdint>. They define {u,}int{8,16,32,64}_t types, plus useful constants for their limits (INT8_MIN, INT8_MAX, UINT8_MAX, INT16_MIN, &c.). One would expect projects to quickly use these types, but it didn’t happen.

Old projects predating <stdint.h> have been particularly slow to adopt it. Many such projects already rolled their own non-<stdint.h>-named fixed-size integer types; switching would be a hassle. And not all compilers shipped <stdint.h>: Visual Studio didn’t have it until 2010! (ಠ_ಠ) Projects implementing their own <stdint.h>-compatible types posed another problem, because different projects’ implementations might be incompatible.

New projects fare better, but not always. Sometimes their dependence on old projects anchors them to the old, pre-<stdint.h> world.

Fixed-size integer types in Mozilla

Mozilla sits squarely in the old-project category, facing every rationale noted above for using its own types. These have long been NSPR‘s PR{Ui,I}nt{8,16,32,64} and SpiderMonkey’s {u,}int{8,16,32,64} types. But recently the landscape has changed.

As Mozilla has imported more external code, fixed-size integer types have proliferated. Most imported code uses <stdint.h>, with fallback typedefs for Visual Studio; some code (IPC code from Chromium) defines and uses {u,}int{8,16,32,64}. As type definitions have proliferated, surprising problems have arisen.

The woes of multiplicity

#include-order issues are the simplest problem. For example, the IPC uint32-style definitions are incompatible with SpiderMonkey’s definitions with some compilers. #include IPC and SpiderMonkey headers in the wrong order, and the compiler will error on incompatible typedefs. This problem is easy to diagnose but harder to resolve, and sometimes it causes considerable pain. Mass refactorings that add #includes have fallen afoul of this, with the least bad solution usually being to fix the first error, recompile, and repeat until done (twenty-odd times in one instance).

Worse than #include mis-ordering and conflict are linking problems. int and long could be 32-bit integers yet appear different when linked. Suppose a method taking an int32 argument defined int32 = int during compilation, but a user of it saw int32 = long during compilation. Each alone would compile. But beneath typedefs they’d be incompatible and wouldn’t link.

As we’ve imported more code in Mozilla, more and more developers have been bitten by these problems. We’ve reached a breaking point. We could use PRUint32, JSUint32, and other types which never trample upon each other. Yet no one likes them given the standardized types, and it’s not possible to change “upstream” code to such a scheme. Thus a second solution: use <stdint.h> definitions for everything.

Switching to <stdint.h>

Using the <stdint.h> types in Mozilla code is now as simple as #include "mozilla/StdInt.h"#include "mozilla/StandardInteger.h". mozilla/StdInt.hmozilla/StandardInteger.h implements the <stdint.h> interface even in the edge cases: for compilers not supporting it, and for embedders who can’t use the regular definitions. It works as follows:

  1. If the preprocessor definition MOZ_CUSTOM_STDINT_H is defined, then #include MOZ_CUSTOM_STDINT_H. Embedders who can’t use the default definitions should use this to adapt. (MOZ_CUSTOM_STDINT_H may also be passed into the Mozilla build system using an environment variable. Note that while the preprocessor definition must be a quoted path, the environment variable must be an unquoted path.)
  2. Otherwise, if the compiler doesn’t provide <stdint.h>, use a custom implementation. This is currently limited to Visual Studio prior to 2010, using an implementation imported from msinttypes on Google Code.
  3. Otherwise use <stdint.h>.

We’re only providing these types now, but shortly we’ll start switching code using non-<stdint.h> fixed-size integer types to use them. Adding mozilla/StdInt.hmozilla/StandardInteger.h is merely the first step toward removing the other fixed-size integer types (except when they’re necessary to interact with external libraries).

Conclusion

It’s been a dozen years since <stdint.h> was standardized. Now it finally comes to Mozilla. Let’s lower a barrier to hackability in Mozilla and start using it.