Extend PyUnicode_FromFormat() #98836

Closed
serhiy-storchaka opened this issue Oct 29, 2022 · 1 comment
Labels
topic-C-API topic-unicode type-feature A feature request or enhancement

Comments

serhiy-storchaka commented Oct 29, 2022

PyUnicode_FromFormat() and several other functions like PyErr_Format() support a subset of printf-like formatting with some extensions to support Python objects. It is a very limited subset: for example, %x is supported, but %lx and %X are not.

I propose adding support for more printf features:

  • Support for conversion specifiers o (octal) and X (uppercase hexadecimal).
  • Support for length modifiers j (intmax_t) and t (ptrdiff_t).
  • Length modifiers are now applied to all integer conversions.
  • Support for wchar_t C strings (%ls and %lV).
  • Support for variable width and precision *.
  • Support for flag - (left alignment).
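
For illustration, here is a minimal sketch of what these conversions produce in standard C snprintf, whose semantics the proposal follows. It does not call PyUnicode_FromFormat() itself, the function name demo_proposed_specifiers is made up for this example, and the Python-specific %ls and %lV are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats one sample of each proposed feature with
   standard snprintf. Returns the number of characters written (excluding
   the terminating NUL), as snprintf does. */
int demo_proposed_specifiers(char *buf, size_t size)
{
    intmax_t big = -42;   /* formatted with the j length modifier */
    ptrdiff_t diff = 7;   /* formatted with the t length modifier */

    return snprintf(buf, size,
                    "%o|%X|%lx|%jd|%td|%*d|%-4d|%.*s",
                    8u,           /* %o:   octal                        */
                    255u,         /* %X:   uppercase hexadecimal        */
                    255ul,        /* %lx:  length modifier with x       */
                    big,          /* %jd:  intmax_t                     */
                    diff,         /* %td:  ptrdiff_t                    */
                    5, 42,        /* %*d:  width taken from an argument */
                    7,            /* %-4d: left alignment               */
                    3, "abcdef"   /* %.*s: precision from an argument   */);
}
```

With a sufficiently large buffer this yields `10|FF|ff|-42|7|   42|7   |abc`.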

The following standard features are intentionally not implemented:

  • Support for floating point formatting. It is very rare in error messages and reprs, and you can always use sprintf to format into a fixed buffer.
  • Flags #, ' ' (a space) and +. # is ambiguous for octals: should we use the prefix 0 (as in C) or 0o (as in Python)? The latter two flags are simply rarely used (I initially implemented support for them, but then removed it because it is not worth it).
  • Length modifiers hh (signed char and unsigned char) and h (short and unsigned short). Values of these types are automatically promoted to int or unsigned int, and you can use explicit conversion to int or unsigned int in case of ambiguity.
  • %lc. Since %c already accepts integers outside of the range 0-255, and the difference between int and wint_t is subtle, I am not sure that there is any value in supporting it.
  • %n.
  • Width and precision are not supported in %c and %p.
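
The two workarounds mentioned above (formatting a float into a fixed buffer, and converting narrow integers explicitly) can be sketched in plain C as follows. The helper names format_ratio and format_small are hypothetical; in real code the resulting buffer would presumably be passed to a %s conversion of PyUnicode_FromFormat() or PyErr_Format().

```c
#include <stdio.h>

/* Hypothetical helper: pre-format a double into a caller-supplied fixed
   buffer, since floating point conversions are intentionally not supported. */
int format_ratio(char *buf, size_t size, double value)
{
    return snprintf(buf, size, "%.3f", value);
}

/* Hypothetical helper: char and short arguments are promoted to int (or
   unsigned int) in a variadic call anyway, so plain %d and %u suffice.
   The explicit casts only document the promotion. */
int format_small(char *buf, size_t size, signed char c, unsigned short s)
{
    return snprintf(buf, size, "%d %u", (int)c, (unsigned int)s);
}
```

For example, format_ratio(buf, sizeof buf, 0.5) fills buf with `0.500` in the default "C" locale.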

Unlike printf, unsupported modifiers (like %lc or %10c) raise SystemError instead of being silently ignored. This will allow adding new features later without breaking code that only works by accident.

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Oct 29, 2022
* Support for conversion specifiers o (octal) and X (uppercase hexadecimal).
* Support for length modifiers j (intmax_t) and t (ptrdiff_t).
* Length modifiers are now applied to all integer conversions.
* Support for wchar_t C strings (%ls and %lV).
* Support for variable width and precision (*).
* Support for flag - (left alignment).

serhiy-storchaka (Member, Author) commented

I am not sure that it is worth supporting ptrdiff_t if we support intmax_t. Removing it would save us 8 lines of code. Support for char and short integers would cost 10 lines of code each. Either is cheap.
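
To make the trade-off concrete, here is a small sketch in standard C (not the CPython implementation) showing that a ptrdiff_t value can be formatted either directly with %td or through a cast to intmax_t with %jd. The function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a pointer difference directly with %td. */
int format_offset_t(char *buf, size_t size, const char *start, const char *end)
{
    ptrdiff_t d = end - start;
    return snprintf(buf, size, "%td", d);
}

/* Hypothetical helper: the same value formatted through intmax_t and %jd,
   which is what callers would write if the t modifier were dropped. */
int format_offset_j(char *buf, size_t size, const char *start, const char *end)
{
    return snprintf(buf, size, "%jd", (intmax_t)(end - start));
}
```

Both functions produce the same text for any pair of pointers into the same array, so the t modifier is a convenience rather than a necessity.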

I am ready to discuss inclusion or exclusion of other formatting features.

serhiy-storchaka added a commit that referenced this issue May 21, 2023