Basic Unicode character/string support absent even in modern C++

Tom Honermann tom at honermann.net
Tue Apr 21 16:21:10 CDT 2020


Hi Shriramana.

The WG21 (C++ standard working group) SG16 (Unicode and text processing) 
study group is pursuing solutions for this issue.  I encourage you to 
reach out to them with your request and ideas.  See 
https://github.com/sg16-unicode/sg16.

Unfortunately, it is a difficult issue to address in a portable way for 
a variety of reasons.  A few of these are discussed with regard to 
char8_t at 
https://stackoverflow.com/questions/58878651/what-is-the-printf-formatting-character-for-char8-t/58895428#58895428.

Tom.

On 4/21/20 12:03 PM, Shriramana Sharma via Unicode wrote:
> char16_t and char32_t along with the corresponding string types 
> u16string and u32string were added in C++11:
>
> https://en.cppreference.com/w/cpp/language/types
> https://en.cppreference.com/w/cpp/string
>
> But to date one can't write any of them to cout. A simple cout << 
> u'அ' or cout << u"சொல்" doesn't work and produces umpteen lines of 
> obscure compiler errors.
>
> Some relevant threads:
> https://stackoverflow.com/q/6020314/1503120
> https://stackoverflow.com/q/5611759/1503120
>
> I really don't understand the point of having character and string 
> types if you can't print them!
>
> I don't accept the rationale (which seems to be mentioned in the top 
> answer to that first question) that there isn't so much demand for 
> writing to such an encoding.
>
> First of all, the encoding exists precisely because it's useful. 
> Second, this is about writing *from* that encoding to plain cout which 
> one assumes connects to a UTF-8 console. Or if that assumption isn't 
> acceptable, then resolve it! Let there be a proper encoding setting 
> for cout.
>
> It would seem that C++'s std::cout isn't really a "character" output 
> (or is it "console" output?), unlike Qt's QTextStream or Python's 
> sys.stdout. Those seem to handle Unicode just fine.
>
> If there's someone here with the wherewithal to get this C++ situation 
> fixed, my humble request to you to do so!
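
For readers who need a stopgap, the conversion the quoted message asks
for can be sketched by hand. This is only a minimal illustration, not a
standard facility: append_utf8 and to_utf8 are hypothetical helper
names, C++17 is assumed (for std::u16string_view and
std::u32string_view), and the result is only readable if whatever sits
behind std::cout interprets the bytes as UTF-8, which is exactly the
assumption the quoted message makes and which is not portable.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: append one code point to `out` as UTF-8.
// Surrogates and values above U+10FFFF are replaced with U+FFFD.
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) cp = 0xFFFD;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: UTF-32 -> UTF-8.
inline std::string to_utf8(std::u32string_view s) {
    std::string out;
    for (char32_t cp : s) append_utf8(out, cp);
    return out;
}

// Hypothetical helper: UTF-16 -> UTF-8, combining surrogate pairs.
// An unpaired surrogate falls through and becomes U+FFFD.
inline std::string to_utf8(std::u16string_view s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char32_t cp = s[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < s.size() &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
            ++i;  // consume the low surrogate as well
        }
        append_utf8(out, cp);
    }
    return out;
}
```

With these helpers, and <iostream> included, the examples from the
quoted message would be written as std::cout << to_utf8(u"சொல்") or
std::cout << to_utf8(U"அ"). The helpers only produce bytes; they do
nothing to tell the stream or the console which encoding is in use,
which is the harder part of the problem described above.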
