Re: Problem with using char* to return string by reference
"Hendrik Schober" <Spamtrap@gmx.de> wrote in message
news:%23RqLMy4zIHA.4816@TK2MSFTNGP03.phx.gbl
There is, however, one problem with all this:
'std::basic_string<>' was not designed for multi-byte
encodings. Therefore, when you put multi-byte encoded
strings into it (and if we're talking Unicode, except
for UTF-32 all encodings are multi-byte, since Unicode
specifies >2^16 characters), you're on your own. (For
example, 'wstring::size()' always gives you the number
of 'wchar_t' objects in the string, which might be
larger than the number of displayable characters.)
Note that there's no one-to-one correspondence between "displayable
characters" and Unicode codepoints, what with combining diacritics,
ligatures, control characters and such.
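To illustrate the combining-diacritics case, here is a minimal sketch. The helper names are made up for this example and are not part of any library. Both strings would normally be displayed as the single character "é", yet they hold different numbers of codepoints:

```cpp
#include <string>

// Hypothetical helpers, for illustration only.

// "é" as one precomposed codepoint: U+00E9.
inline std::u32string precomposed_e_acute() {
    return U"\u00E9";
}

// "é" as two codepoints: U+0065 'e' followed by
// U+0301 COMBINING ACUTE ACCENT.
inline std::u32string decomposed_e_acute() {
    return U"e\u0301";
}
```

With UTF-32 each element is one codepoint, so 'size()' is 1 for the first string and 2 for the second, even though a reader sees one character in both cases.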
It can be argued that the size of the string in Unicode codepoints is
useless for all practical purposes. E.g. for memory allocation purposes
you want the size in bytes (or some other fixed-size units). For text
editing purposes you want the size in glyphs (which may be fewer than the
number of codepoints).
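A rough sketch of how these counts differ for a UTF-16 string. I use 'std::u16string' rather than 'std::wstring' here because the width of 'wchar_t' varies by platform; the function names are hypothetical, and the codepoint counter assumes well-formed UTF-16 (no unpaired surrogates):

```cpp
#include <cstddef>
#include <string>

// Size in bytes: the figure you want for allocation or I/O.
inline std::size_t size_in_bytes(const std::u16string& s) {
    return s.size() * sizeof(char16_t);
}

// Number of codepoints, assuming well-formed UTF-16.
// Every codepoint contributes exactly one code unit that is not a
// low (trailing) surrogate, so count those.
inline std::size_t count_code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (char16_t c : s) {
        if (c < 0xDC00 || c > 0xDFFF) {
            ++n;
        }
    }
    return n;
}
```

For a string such as u"a\U0001F600", 'size()' reports 3 code units while 'count_code_points' reports 2. Neither figure is the glyph count; getting that requires grapheme segmentation as described in the Unicode standard, which the C++ standard library does not provide.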
--
With best wishes,
Igor Tandetnik
With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going to
land, and it could be dangerous sitting under them as they fly
overhead. -- RFC 1925