KB Article #71171

Converting from UTF-8 to ISO8859-1

Problem

Fallback character request

Characters above U+00FF in Unicode (anything beyond x'c3bf' in UTF-8) cannot be converted to ISO8859-1, and an error message is displayed.
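The boundary can be checked with a short Python sketch. It uses only Python's built-in codecs, not the product's own converter, so it illustrates the encoding limit rather than the product's error message. The helper name fits_iso8859_1 is made up for this example.

```python
# Illustration only: Python's built-in codecs, not the product's converter.

def fits_iso8859_1(text: str) -> bool:
    """Return True if every character of text can be encoded as ISO 8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False

last_inside = "\u00ff"    # LATIN SMALL LETTER Y WITH DIAERESIS, last convertible code point
first_outside = "\u0100"  # LATIN CAPITAL LETTER A WITH MACRON, first non-convertible one

# U+00FF is the two-byte sequence C3 BF in UTF-8, matching x'c3bf' above.
utf8_of_last_inside = last_inside.encode("utf-8")

print(utf8_of_last_inside.hex())      # c3bf
print(fits_iso8859_1(last_inside))    # True
print(fits_iso8859_1(first_outside))  # False
```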

New functionality is needed so that the system automatically converts characters outside the ISO8859 range to a valid character in the target encoding.

Resolution

Apply patch CORE_211_03-05-29, which adds default fallback characters for most of the CSCs.

These can be enabled and configured for each CSC as follows:

a) To enable a fallback character, create the file $CORE_LOCAL/csc/CSC_NAME. This enables the default fallback character, which for every CSC is intended to be the '?' character (usually 3F). CSC_NAME can be windows, iso8859, ascii, ebcdic, etc. Please note that a separate fallback character for each part of a CSC is not supported; it is a single global character for the whole CSC.

b) To change the character, edit that file and write the character in hex digits (e.g. A0, 3F, or A1B0 for a 2-byte CSC). Do not use 00, or FFFF (for 2-byte character sets), since these are reserved. With a fallback character in place, a conversion from UCS4 (or a larger CSC) to the current CSC will in most cases no longer fail; instead of stopping at a character that has no equivalent, it continues successfully.
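The rules in (b) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patch's implementation: the function names parse_fallback and convert_with_fallback are made up, the fallback file is assumed to hold nothing but the hex digits, and the substitution is done with Python's own ISO 8859-1 codec rather than the product's converter.

```python
# Illustration only: hypothetical helpers, not part of the patch.

# Values the article lists as reserved: 00, and FFFF for 2-byte character sets.
RESERVED = {b"\x00", b"\xff\xff"}

def parse_fallback(hex_text: str, char_width: int = 1) -> bytes:
    """Turn fallback-file contents such as '3F' or 'A1B0' into raw bytes."""
    # bytes.fromhex raises ValueError on non-hex input or an odd digit count.
    raw = bytes.fromhex(hex_text.strip())
    if len(raw) != char_width:
        raise ValueError(f"expected {char_width} byte(s), got {len(raw)}")
    if raw in RESERVED:
        raise ValueError(f"{hex_text.strip()} is reserved")
    return raw

def convert_with_fallback(text: str, fallback: bytes,
                          encoding: str = "iso-8859-1") -> bytes:
    """Encode text, writing the fallback bytes for each unconvertible character."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            out += fallback  # character is outside the target range
    return bytes(out)

fallback = parse_fallback("3F")  # the default '?' character
converted = convert_with_fallback("caf\u00e9 \u0100", fallback)
print(converted)  # b'caf\xe9 ?'
```

Here U+00E9 is inside the ISO 8859-1 range and is converted normally, while U+0100 is outside it and is replaced by the fallback byte 3F.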

It is also the user's responsibility to make sure that the character is a legal one. If an illegal character is placed in the file, the program will not crash or report an error; it will try, as far as possible, to convert it to a legal one.
