I have a question about the Windows non-Unicode (A) string functions:

Do they always work on single-byte characters? Or are there localized versions of Windows
that use other non-ASCII encodings, such as DBCS (double-byte character sets, whose strings are NOT Unicode)
or something similar? Is DBCS handled by the A versions of the string functions?

I found this funny code snippet at MS for stepping back one character in a
DBCS string:
// psz points to the start of the string, pch to the current character.
pchTemp = pch - 1; // point to previous byte

// If *(pch-1) is a lead byte, it must actually
// be functioning as a trail byte, so return pch-2.
if (IsDBCSLeadByte(*pchTemp))
    return (pch - 2);

// Otherwise, step back until a non-lead-byte is found.
while (psz <= --pchTemp && IsDBCSLeadByte(*pchTemp))
    ;

// Now pchTemp+1 must point to the beginning of a character,
// so figure out whether we went back an even or an odd
// number of bytes and go back 1 or 2 bytes, respectively.
return (pch - 1 - ((pch - pchTemp) & 1));
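
To try the parity idea from the snippet above without a Windows machine, here is a minimal self-contained sketch. It is not the MS code: is_lead_byte() is a hypothetical stand-in for IsDBCSLeadByte() that assumes the Shift-JIS (code page 932) lead-byte ranges, dbcs_prev() is a made-up name, and the input is assumed to be a well-formed DBCS string with cur pointing at the start of a character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IsDBCSLeadByte(). Assumption: Shift-JIS
   (code page 932), where lead bytes are 0x81-0x9F and 0xE0-0xFC. */
static int is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Returns a pointer to the character that precedes cur in the DBCS
   string beginning at start, or start itself if there is none.
   Assumes the string is well formed and cur is a character start. */
const char *dbcs_prev(const char *start, const char *cur)
{
    const char *last;
    const char *p;

    assert(start != NULL && cur != NULL);
    if (cur <= start)
        return start;

    last = cur - 1;  /* the byte directly before cur */
    p = last;

    /* Walk back over the run of lead-byte values that ends just
       before last. The byte before that run is not a lead byte,
       so p ends up on a character boundary. */
    while (p > start && is_lead_byte((unsigned char)p[-1]))
        --p;

    /* The bytes in [p, last) pair up as lead/trail. If their count
       is odd, the final one is a real lead byte and last is its
       trail byte, so the previous character starts one byte earlier. */
    return last - ((last - p) & 1);
}
```

For the bytes 41 83 41 42 ("A", a double-byte character whose trail byte happens to be 0x41, then "B"), stepping back from offset 3 lands on offset 1 rather than on offset 2, which is the case a plain pch - 1 gets wrong.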
Posted on 2002-07-26 08:27:59 by beaster
I found this on my MSDN CD in the Visual Basic section (I never used DBCS myself):

You need to be aware of the issues when sorting and comparing DBCS text, because the Option Compare Text statement has a special behavior when used on DBCS strings. When you use the Option Compare Binary statement, comparisons are made according to a sort order derived from the internal binary representations of the characters. When you use the Option Compare Text statement, comparisons are made according to the case-insensitive textual sort order determined by the user's system locale.

In English "case-insensitive" means ignoring the differences between uppercase and lowercase. In a DBCS environment, this has additional implications. For example, some DBCS character sets (including Japanese, Traditional Chinese, and Korean) have two representations for the same character: a narrow-width letter and a wide-width letter. For example, there is a single-byte "A" and a double-byte "A." Although they are displayed with different character widths, Option Compare Text treats them as the same character. There are similar rules for each DBCS character set.

You need to be careful when you compare two strings. Even if the two strings are evaluated as the same using Like or StrComp, the exact characters in the strings can be different and the string length can be different, too.
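
As a rough illustration of why a homemade byte-by-byte comparison disagrees with a locale-aware one, the sketch below writes out the narrow and the wide "A" as raw bytes. The assumptions are mine, not MSDN's: the encoding is Shift-JIS (code page 932), where the full-width "A" is 0x82 0x60, and naive_equal() is a hypothetical homemade comparison. On Windows, a locale-aware call such as CompareString with the NORM_IGNOREWIDTH flag is meant to ignore this width difference, which a byte comparison cannot do.

```c
#include <assert.h>
#include <string.h>

/* Assumed Shift-JIS (code page 932) encodings, as raw bytes. */
const char narrow_a[] = { 0x41, 0x00 };              /* half-width "A" */
const char wide_a[]   = { (char)0x82, 0x60, 0x00 };  /* full-width "A" */

/* Hypothetical homemade comparison: the strings are "equal" only
   if every byte matches, like a binary compare. */
int naive_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

Here strlen(narrow_a) is 1 and strlen(wide_a) is 2, and naive_equal(narrow_a, wide_a) reports a mismatch, even though a text comparison in the sense described above treats the two as the same character.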
Posted on 2002-07-27 09:30:52 by Ernie
Thanks, I was afraid of this. So from now on I won't use my homemade lstrlen and lstrcmp functions.
Posted on 2002-07-29 03:58:33 by beaster