Basic Unicode character/string support absent even in modern C++

Shriramana Sharma samjnaa at gmail.com
Wed Apr 22 05:47:09 CDT 2020


Thanks for your reply. Glad to hear of this effort. Will look into it.

For now I have downloaded:

https://github.com/nemtrif/utfcpp/

and added the following code to my program, which now works:

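#include <ostream>
#include <string>
#include <utf8cpp/utf8.h>

// Convert UTF-16 to UTF-8 with utfcpp before writing to the narrow stream.
std::ostream & operator<<(std::ostream & os, const std::u16string & us)
{ return os << ::utf8::utf16to8(us); }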


On Wed, 22 Apr, 2020, 02:51 Tom Honermann, <tom at honermann.net> wrote:

> Hi Shriramana.
>
> The WG21 (C++ standard working group) SG16 (Unicode and text processing)
> study group is pursuing solutions for this issue.  I encourage you to
> reach out to them with your request and ideas.  See
> https://github.com/sg16-unicode/sg16.
>
> Unfortunately, it is a difficult issue to address in a portable way for
> a variety of reasons.  A few of these are discussed with regard to
> char8_t at
>
> https://stackoverflow.com/questions/58878651/what-is-the-printf-formatting-character-for-char8-t/58895428#58895428
> .
>
> Tom.
>
> On 4/21/20 12:03 PM, Shriramana Sharma via Unicode wrote:
> > char16_t and char32_t along with the corresponding string types
> > u16string and u32string were added in C++11:
> >
> > https://en.cppreference.com/w/cpp/language/types
> > https://en.cppreference.com/w/cpp/string
> >
> > But to date one can't write any of them to cout. A simple cout <<
> > u'அ' or cout << u"சொல்" doesn't work and throws umpteen lines of
> > obscure compiler errors.
> >
> > Some relevant threads:
> > https://stackoverflow.com/q/6020314/1503120
> > https://stackoverflow.com/q/5611759/1503120
> >
> > I really don't understand the point of having character and string
> > types if you can't print them!
> >
> > I don't accept the rationale (which seems to be mentioned in the top
> > answer to that first question) that there isn't so much demand for
> > writing to such an encoding.
> >
> > First of all, the encoding exists precisely because it's useful.
> > Second, this is about writing *from* that encoding to plain cout which
> > one assumes connects to a UTF-8 console. Or if that assumption isn't
> > acceptable, then resolve it! Let there be a proper encoding setting
> > for cout.
> >
> > It would seem that C++'s std::cout isn't really a "character" output
> > (or is it console output) unlike Qt's QTextStream or Python's
> > sys.stdout. Those seem to handle Unicode just fine.
> >
> > If there's someone here with the wherewithal to get this C++ situation
> > fixed, my humble request to you to do so!
>
>
>