Charset support
Detailed Description
OCILIB supports ANSI and Unicode charsets
Oracle introduced real Unicode support with Oracle8i, but only for user data. SQL and PL/SQL statements, metadata strings, database object names, and so on were still supported only in ANSI.
With Oracle 9i, Oracle provides full Unicode support.
So the available Unicode support depends on the Oracle library used at compile time or loaded at runtime.
OCILIB supports:
- ANSI (char)
- Unicode (wchar_t)
- Mixed charset: ANSI for metadata, Unicode for user data
OCILIB uses two types of strings:
- mtext: for metadata, SQL strings, object attributes.
- dtext: for input binds and output data
mtext and dtext are defined as char or wchar_t, depending on the charset option.
- Text macros
- MT() macro : "meta text" -> meta data and strings passed to OCI calls
- DT() macro : "data text" -> user input/output data
- Option OCI_CHARSET_ANSI
- dtext --> char
- DT(x) --> x
- mtext --> char
- MT(x) --> x
- Option OCI_CHARSET_UNICODE
- dtext --> wchar_t
- DT(x) --> L ## x
- mtext --> wchar_t
- MT(x) --> L ## x
- Option OCI_CHARSET_MIXED
- dtext --> wchar_t
- DT(x) --> L ## x
- mtext --> char
- MT(x) --> x
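The mapping above can be illustrated with a small sketch. This is not the actual OCILIB header, only one way the listed definitions could be written. Falling back to the ANSI option when none is selected, and the helper functions metadata_char_size() and data_char_size(), are assumptions made for this example.

```c
#include <stddef.h>
#include <wchar.h>

/* Sketch only: select one option at compile time, e.g. -DOCI_CHARSET_MIXED.
   Defaulting to ANSI when none is given is an assumption of this example. */
#if !defined(OCI_CHARSET_ANSI) && !defined(OCI_CHARSET_UNICODE) && \
    !defined(OCI_CHARSET_MIXED)
#define OCI_CHARSET_ANSI
#endif

#if defined(OCI_CHARSET_UNICODE)
  #define mtext wchar_t
  #define dtext wchar_t
  #define MT(x) L ## x
  #define DT(x) L ## x
#elif defined(OCI_CHARSET_MIXED)
  #define mtext char
  #define dtext wchar_t
  #define MT(x) x
  #define DT(x) L ## x
#else /* OCI_CHARSET_ANSI */
  #define mtext char
  #define dtext char
  #define MT(x) x
  #define DT(x) x
#endif

/* The same source lines compile unchanged under all three options. */
static const mtext sql[]  = MT("select name from users where id = :id");
static const dtext name[] = DT("Martin");

/* Hypothetical helpers reporting the character width currently in use. */
static size_t metadata_char_size(void) { return sizeof(mtext); }
static size_t data_char_size(void)     { return sizeof(dtext); }
```

Because literals are always wrapped in MT() or DT(), switching the charset option changes the character type of every string without touching the calling code.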
- Unicode and ISO C
ISO C:
- doesn't know anything about Unicode.
- makes wide character support tricky because the size of a wide character is not defined and is left to each implementation.
OCILIB uses char/wchar_t strings for public interface and internal storage.
For Unicode builds, OCILIB initializes OCI in UTF16 Unicode mode. Oracle implements this mode with a fixed-length, 2-byte UTF16 encoding.
So, on systems that implement wchar_t as 2-byte UTF16 (e.g. MS Windows), input strings are passed directly to Oracle and output strings are returned as they are.
On other systems (most Unix systems) that use UTF32 as the encoding (4-byte wchar_t), OCILIB uses:
- temporary buffers to pass metadata strings to OCI
- buffer expansion from UTF16 to UTF32 for user data strings:
- allocation based on sizeof(wchar_t)
- data filling based on sizeof(short) -> (UTF16 2 bytes)
- data expansion to sizeof(wchar_t).
The buffer expansion is done in place, so no extra buffer is required. This reduces the overhead of Unicode/ISO C handling on Unix systems.
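The in-place expansion can be sketched as follows. This is a minimal illustration, not OCILIB's actual code: the function name expand_utf16_in_place is hypothetical, and each 16-bit unit is assumed to be one complete character (the fixed-length mode described above), so surrogate pairs are not combined. The buffer must already be allocated for count wchar_t elements and filled with count 16-bit units at its start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: widen `count` 16-bit units stored at the start of
   `buffer` into `count` wchar_t values, reusing the same memory. */
static void expand_utf16_in_place(void *buffer, size_t count)
{
    unsigned char *bytes = (unsigned char *) buffer;
    size_t i;

    assert(buffer != NULL);

    /* Nothing to do where wchar_t is already 2 bytes wide. */
    if (sizeof(wchar_t) == sizeof(uint16_t))
        return;

    /* Walk from the last character to the first: the widened value for
       index i is written at or beyond the end of the 16-bit unit i, so no
       unread unit at a lower index is ever overwritten. */
    for (i = count; i-- > 0; )
    {
        uint16_t unit;
        wchar_t  wide;

        memcpy(&unit, bytes + i * sizeof unit, sizeof unit);
        wide = (wchar_t) unit;
        memcpy(bytes + i * sizeof wide, &wide, sizeof wide);
    }
}
```

Include the terminating zero in count so that the result is a valid wide string. A forward loop would not work here, because widening the first character would overwrite the second 16-bit unit before it had been read.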
- Charset mapping macros
The OCILIB main header file provides macros around the most common string functions of the C standard library.
These macros follow the model:
- mtsxxx() for mtext * typed strings
- dtsxxx() for dtext * typed strings
xxx is the standard C library string function name without the character type prefix (str/wcs).
List of available macros:
- mtsdup, dtsdup
- mtscpy, dtscpy
- mtsncpy, dtsncpy
- mtscat, dtscat
- mtsncat, dtsncat
- mtslen, dtslen
- mtscmp, dtscmp
- mtscasecmp, dtscasecmp
- mtsprintf, dtsprintf
- mtstol, dtstol
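As an illustration of the naming model, the sketch below shows how a few of these macros could map to the standard library under the mixed option (mtext as char, dtext as wchar_t). The definitions are assumptions written for this example and are not copied from the OCILIB header; total_length() is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Mixed option assumed for illustration: ANSI metadata, Unicode user data. */
#define mtext char
#define dtext wchar_t
#define MT(x) x
#define DT(x) L ## x

/* mtsxxx -> strxxx because mtext is char in this configuration. */
#define mtslen strlen
#define mtscmp strcmp
#define mtscpy strcpy

/* dtsxxx -> wcsxxx because dtext is wchar_t in this configuration. */
#define dtslen wcslen
#define dtscmp wcscmp
#define dtscpy wcscpy

/* Hypothetical helper: number of characters in a column name plus its value,
   written without knowing which character type each string uses. */
static size_t total_length(const mtext *column, const dtext *value)
{
    return mtslen(column) + dtslen(value);
}
```

Code written against the mtsxxx/dtsxxx names, such as total_length(MT("name"), DT("Martin")), keeps compiling when the charset option changes, since only the macro targets differ.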
Generated on Thu Jul 30 17:41:53 2009 for OCILIB (C Driver for Oracle) by Doxygen 1.5.4