Re: Convert CString to LONG.
 
"Control Freq" <nick@nhthomas.freeserve.co.uk> ha scritto nel messaggio 
news:bb766698-a861-4f6d-8bd5-c0840c6cba47@y21g2000hsf.googlegroups.com...
> I am confused by the UNICODE or non-UNICODE aspects.
It is a "history" problem.
In the past, people used ASCII to store strings: each character was stored 
in a 1-byte 'char'.
For English text that is plenty: a byte offers 256 (i.e. 2^8) possible values 
(plain ASCII actually uses only the first 128), and the English alphabet has 
just 26 letters (a-z, A-Z).
But, for example, we in Italy also have accented letters, like à, è, ù...
People from Germany have other symbols, people from Norway others, etc.
And Chinese and Japanese need *thousands* of symbols to write their texts...
So a single ASCII byte is simply not enough in these contexts...
Then, other encodings were defined. There were code-pages and other messy 
stuff.
For example, I recall that under MS-DOS I had the Italian code page, but the 
text-based user interfaces of American programs were drawn in a meaningless 
way, because characters with the most significant bit set (i.e. >= 128) were 
code-page dependent. So, say, in the American code page the value 200 was 
associated with a straight vertical double line (used to draw window borders 
in text mode), while in the Italian code page the same value was an accented 
letter; so instead of having a border like this:
 ||
 ||
 ||
the Italian code page rendered it as a column of accented letters (or 
something similar).
At a certain point in time, Unicode became the standard way to represent 
text.
But for backward compatibility with non-Unicode platforms like Windows 9x, 
the Microsoft APIs became "Unicode-aware", i.e. each API is available in both 
an ANSI and a Unicode version. Typically, the ANSI version ends with A and 
the Unicode version ends with W, e.g.
  DoSomethingA
  DoSomethingW
and the Windows header files use the UNICODE macro trick (UNICODE is checked 
by the Windows headers; the companion _UNICODE macro plays the same role for 
the C runtime and MFC), e.g.
#ifdef UNICODE
#define DoSomething DoSomethingW  /* Unicode */
#else
#define DoSomething DoSomethingA  /* ANSI */
#endif
So in your code you write DoSomething, but it actually expands to 
DoSomethingA or DoSomethingW depending on the build mode.
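This is not just a convention in made-up examples: the real SDK headers work 
exactly this way. For instance, <winuser.h> declares the SetWindowText pair 
roughly as follows (a simplified sketch, not the exact SDK text):
<code>
 BOOL WINAPI SetWindowTextA( HWND hWnd, LPCSTR  lpString );  /* const char *    */
 BOOL WINAPI SetWindowTextW( HWND hWnd, LPCWSTR lpString );  /* const wchar_t * */

 #ifdef UNICODE
 #define SetWindowText  SetWindowTextW
 #else
 #define SetWindowText  SetWindowTextA
 #endif
</code>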
The same trick is there for strings: you can have ANSI strings (const 
char *) and Unicode strings (const wchar_t *); using the TCHAR character 
type you can write code that compiles as char-based or wchar_t-based 
depending on the Unicode build mode. Conceptually (the real headers use a 
typedef, but the idea is the same):
#ifdef _UNICODE
#define TCHAR wchar_t
#else
#define TCHAR char
#endif
String literals in ANSI are written as "something", while in Unicode they 
have the L prefix: L"something".
So there is the _T() decorator, which expands to nothing in ANSI builds and 
prepends the L prefix in Unicode builds:
 _T("something") --> "something" (in ANSI builds)
 _T("something") --> L"something" (in Unicode builds)
> How should I make my code correct (compiling and running) in a UNICODE
> and non-UNICODE aware way. I presume I should be coding so that a
> #define _UNICODE would produce unicode aware binary, and without that
> definition it will produce a non-unicode aware binary, but the actual
> code will be the same.
It is easy to code in a Unicode-aware way if you use CString and the 
Microsoft non-standard extensions, like the TCHAR-generic versions of the C 
library routines: e.g. instead of sscanf, you should use _stscanf.
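These generic names come from the <tchar.h> header; conceptually (a 
simplified sketch of the idea, ignoring the MBCS variants), the mappings 
look like this:
<code>
 /* Simplified idea of <tchar.h> -- not the exact header text */
 #ifdef _UNICODE
 #define _tcslen   wcslen    /* string length  */
 #define _tcscmp   wcscmp    /* string compare */
 #define _stscanf  swscanf   /* formatted read */
 #else
 #define _tcslen   strlen
 #define _tcscmp   strcmp
 #define _stscanf  sscanf
 #endif
</code>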
For example:
<code>
 int ParseInt( const CString & str )
 {
      int n = 0;
      // _stscanf returns the number of fields successfully converted
      if ( _stscanf( str, _T("%d"), &n ) != 1 )
      {
           // ... handle the parsing error here
      }
      return n;
 }
  ...
 CString s = _T("1032");
 int n = ParseInt( s );
</code>
The above code is Unicode-aware: it compiles and runs fine in both ANSI/MBCS 
builds and Unicode builds (i.e. when _UNICODE and UNICODE are #defined).
In ANSI builds, the _T() decorator expands to nothing, _stscanf expands to 
sscanf, and CString is a char-based string (CStringA, in modern Visual C++).
So, in ANSI builds the code *automatically* (thanks to the preprocessor 
#defines and typedefs) becomes something like this:
<code>
 int ParseInt( const CStringA & str )
 {
      int n = 0;
      // sscanf returns the number of fields successfully converted
      if ( sscanf( str, "%d", &n ) != 1 )
      {
           // ... handle the parsing error here
      }
      return n;
 }
  ...
 CStringA s = "1032";
 int n = ParseInt( s );
</code>
Instead, in Unicode builds (UNICODE and _UNICODE #defined):
 TCHAR --> wchar_t
 CString --> CStringW
 _T("something") --> L"something"
 _stscanf() --> swscanf()
and the code becomes:
<code>
 int ParseInt( const CStringW & str )
 {
      int n = 0;
      // swscanf returns the number of fields successfully converted
      if ( swscanf( str, L"%d", &n ) != 1 )
      {
           // ... handle the parsing error here
      }
      return n;
 }
  ...
 CStringW s = L"1032";
 int n = ParseInt( s );
</code>
So, if you use CString, the _T() decorator, and the Microsoft Unicode-aware 
extensions to the C standard library, you can write code that compiles in 
both ANSI and Unicode builds.
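And since the original question was about converting a CString to a LONG, a 
variant based on _tcstol (the TCHAR-generic version of strtol, which expands 
to strtol or wcstol) could look like this; just a sketch, with the error 
handling only hinted at:
<code>
 #include <stdlib.h>
 #include <tchar.h>

 LONG ParseLong( const CString & str )
 {
      TCHAR * endPtr = NULL;
      // CString converts implicitly to const TCHAR *; base 10
      LONG value = _tcstol( str, &endPtr, 10 );
      // ... here you could check errno and *endPtr for parsing errors
      return value;
 }
  ...
 CString s = _T("1032");
 LONG n = ParseLong( s );
</code>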
The problem is with the C++ standard library and I/O streams, which do not 
have this TCHAR idea (the UNICODE and _UNICODE macros are Windows-only, not 
portable C++).
So, if you really want to use the C++ standard library, you may choose to 
write Unicode code from the start, i.e. using wchar_t, and wcout, wcerr and 
wcin instead of the narrow cout, cerr, cin...
However, it is possible to apply a similar TCHAR-style trick to the standard 
C++ library, too.
For example, for the STL string you may simulate the CString behaviour with 
something like this:
  typedef std::basic_string< TCHAR > TString;
In ANSI builds, TCHAR expands to char, so TString becomes 
std::basic_string< char >, i.e. std::string.
Instead, in Unicode builds, TCHAR expands to wchar_t, so TString becomes 
std::basic_string< wchar_t >, i.e. std::wstring.
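Putting it together (just a sketch; TString and TIStringStream are only 
example names), the ParseInt function above could be rewritten on top of the 
standard library like this:
<code>
 #include <sstream>
 #include <string>
 #include <tchar.h>

 typedef std::basic_string< TCHAR >        TString;
 typedef std::basic_istringstream< TCHAR > TIStringStream;

 int ParseInt( const TString & str )
 {
      TIStringStream input( str );
      int n = 0;
      input >> n;
      // ... check input.fail() for parsing errors
      return n;
 }
  ...
 TString s = _T("1032");
 int n = ParseInt( s );
</code>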
HTH,
Giovanni