Understanding Strings in COM
System notes
To replicate the steps described in this article, you'll need Windows 95+ or Windows NT 4.0+ and Visual C++ 5.0 or higher.
As if ANSI and Unicode, char and wchar_t, were not enough, COM introduced several new string data types, and the differences between them and the conversions from one to another are not always obvious to the uninitiated. This article clarifies the situation for the benefit of raw COM, ATL and MFC programmers.
Strings, i.e. vectors of alphanumeric characters, are and have always been a fundamental data type in every programming language and platform. Whereas the computer itself prefers to deal with numbers, human beings prefer text messages to sequences of binary, hexadecimal or even decimal digits. This implies that whenever a piece of software needs to interact with the user (or signal some notable event), some kind of string handling is likely to come into play.
Until a few years ago strings were just strings, that is, arrays of single-byte elements (char in C/C++), each holding the ASCII code of one character. The biggest problem was distinguishing zero-terminated strings (also known as ASCIIZ) from non-zero-terminated arrays. Then came Unicode, a new character set which extended the size of each character from 8 to 16 bits, thus allowing for 65,536 theoretically distinct characters, enough to include Far Eastern symbols such as the standard kanji set. In C/C++ a new standard data type, wchar_t, was defined to store Unicode strings, and consequently the APIs of Unicode-aware Win32 operating systems that took strings as parameters had to be duplicated into ANSI and Unicode versions.
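The difference between the two character types can be sketched in a few lines of portable C++. This is only an illustration, not code from the article: the helper names (narrow_length, wide_bytes, widen_ascii and so on) are hypothetical, and the widening routine assumes plain 7-bit ASCII input. Note also that sizeof(wchar_t) is 2 with Windows compilers but is often 4 elsewhere, so the byte counts are expressed in terms of sizeof rather than hard-coded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Length, in characters, of a zero-terminated narrow (char) string.
std::size_t narrow_length(const char* s) { return std::strlen(s); }

// Length, in characters, of a zero-terminated wide (wchar_t) string.
std::size_t wide_length(const wchar_t* s) { return std::wcslen(s); }

// Storage in bytes, terminator included, for each kind of string.
std::size_t narrow_bytes(const char* s) {
    return (std::strlen(s) + 1) * sizeof(char);
}
std::size_t wide_bytes(const wchar_t* s) {
    return (std::wcslen(s) + 1) * sizeof(wchar_t);
}

// Widen a string one character at a time.
// Assumption: the input is 7-bit ASCII, so each char maps directly
// to the wide character with the same numeric value.
std::wstring widen_ascii(const std::string& s) {
    std::wstring out;
    out.reserve(s.size());
    for (unsigned char c : s) {
        assert(c < 0x80);  // this sketch does not handle code pages
        out.push_back(static_cast<wchar_t>(c));
    }
    return out;
}
```

The same text has the same length in characters in both forms, but the wide form needs sizeof(wchar_t) times as much storage. This is why code that assumes one byte per character breaks as soon as it is handed a wide string.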
Just as the Windows programming community was getting acquainted with this duplication and getting into the habit of not assuming anything a priori about the size of a character, COM took center stage with its burden of new types and aliases. If you are wondering what the functional difference is between an array of OLECHARs and a pointer to a BSTR, when and how it is necessary to convert a string to another type, and what degree of assistance ATL and MFC offer to the developer, this article is for you.
OLECHARs