Different applications use different encodings to process data, so encoding issues can occur when the source application and the destination application do not use the same encoding.
ECC Encoding
- SAP ECC saves data in the database with Unicode encoding. You can check whether your SAP application server uses little endian (4103) or big endian (4102) via t-code I18N -> I18N Customizing -> I18N System Configuration -> Display Current NLS config -> "Code Page of Application Server".
- When SAP ECC sends an XML file to PI, UTF-8 encoding (4110) is used.
- When SAP ECC writes a flat file to an application server folder, the encoding depends on the ABAP statement OPEN DATASET ... ENCODING ... (for example, with ENCODING DEFAULT or ENCODING UTF-8, UTF-8 is used).
- When SAP ECC uses a file port to write an IDoc flat file, the encoding is UTF-8 if the Unicode format checkbox is selected in the port (t-code WE21).
- When SAP ECC uses a file port to write an IDoc flat file, the relevant code page is used if Unicode format is not selected and Char. Set is specified in the port (t-code WE21).
- When SAP ECC uses a file port to write an IDoc flat file and neither Unicode format nor Char. Set is specified, the encoding is the same as with the ABAP addition ENCODING NON-UNICODE. In a Unicode SAP system, the encoding is then taken from table TCP0C by platform, language, and country. For Linux, English, and US, the encoding is ISO-8859-1 (1100). You can use the ABAP statements SET COUNTRY and SET LOCALE LANGUAGE to change the language and country in an ABAP session.
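The three file port rules above can be summarized as a small decision function. This is only a sketch in Python of the rules as described in this post, not SAP code: the function name and parameters are hypothetical, and the `non_unicode_default` argument stands in for the TCP0C lookup (1100 is the Linux/English/US value mentioned above).

```python
from typing import Optional

UTF8_CODE_PAGE = "4110"  # SAP code page number for UTF-8


def idoc_file_port_code_page(unicode_format: bool,
                             char_set: Optional[str] = None,
                             non_unicode_default: str = "1100") -> str:
    """Return the SAP code page a file port would use, following the rules above.

    non_unicode_default is an assumption: it represents the result of the
    TCP0C lookup by platform, language, and country.
    """
    if unicode_format:
        # "Unicode format" selected: the IDoc file is written in UTF-8.
        return UTF8_CODE_PAGE
    if char_set:
        # "Char. Set" specified: that code page is used.
        return char_set
    # Neither specified: same behavior as ENCODING NON-UNICODE.
    return non_unicode_default


print(idoc_file_port_code_page(True))
print(idoc_file_port_code_page(False, char_set="8400"))
print(idoc_file_port_code_page(False))
```

The order of the checks mirrors the order of the bullets: the Unicode format checkbox is evaluated first, then Char. Set, and only then the locale-dependent default.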
PI Encoding
- The PI sender file channel uses the specified file encoding to read the file if the file type is set to Text instead of Binary.
- You can convert a flat file to an XML file via the adapter module AF_Modules/MessageTransformBean and specify the encoding of the XML file by setting the parameter Transform.ContentType to text/plain;charset=utf-8.
- You can convert the encoding of an output file from UTF-8 to ISO-8859-1 via the adapter module AF_Modules/TextCodepageConversionBean with the parameter Conversion.charset set to iso-8859-1.
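To see what a UTF-8 to ISO-8859-1 conversion does at the byte level, here is a minimal Python sketch. It is not the implementation of TextCodepageConversionBean; the function name is hypothetical, and the handling of characters that do not exist in ISO-8859-1 is Python's behavior, which may differ from what the adapter module does.

```python
def convert_utf8_to_latin1(data: bytes, errors: str = "strict") -> bytes:
    """Decode UTF-8 bytes and re-encode them as ISO-8859-1."""
    text = data.decode("utf-8")
    return text.encode("iso-8859-1", errors=errors)


def to_hex(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in data)


source = "Gr\u00fc\u00dfe".encode("utf-8")        # "Grüße"
print(to_hex(source))                             # two bytes each for ü and ß
print(to_hex(convert_utf8_to_latin1(source)))     # one byte each for ü and ß

# A Cyrillic letter has no ISO-8859-1 representation. With errors="replace",
# Python substitutes "?"; with the default "strict" it raises an exception.
cyrillic = "\u0421".encode("utf-8")
print(convert_utf8_to_latin1(cyrillic, errors="replace"))
```

Characters outside ISO-8859-1 are lost in such a conversion, so it is only safe when the data is known to contain Western European characters only.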
Utilities
You can check the hex code and UTF-8 bytes for a character on the site listed in the References. Please note:
- The hex code uses Unicode big endian, such as 0421 for CYRILLIC CAPITAL LETTER ES (С).
- UTF-8 bytes must be separated by spaces if a character has multiple bytes, such as D0 A1.
- In some cases, UTF-8 bytes as Latin-1 characters bytes shows the same invalid characters as the destination application, if the source application uses UTF-8 encoding and the destination application uses an encoding like ISO-8859-1 to process the data.
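The same values can also be computed locally. The following Python sketch (the function name and dictionary keys are my own, not taken from the site) returns the Unicode name, the hex code, the UTF-8 bytes, and those bytes interpreted as Latin-1 characters for a single character.

```python
import unicodedata


def describe_char(ch: str) -> dict:
    """Return name, code point, UTF-8 bytes, and the Latin-1 misreading of one character."""
    utf8 = ch.encode("utf-8")
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "code_point_hex": f"{ord(ch):04X}",                   # big-endian notation
        "utf8_bytes": " ".join(f"{b:02X}" for b in utf8),
        "utf8_as_latin1": utf8.decode("iso-8859-1"),          # what a Latin-1 reader shows
    }


info = describe_char("\u0421")  # CYRILLIC CAPITAL LETTER ES
for key, value in info.items():
    print(f"{key}: {value}")
```

Note that Python returns the Latin-1 characters without a separating space, while the site prints one character per byte with spaces in between.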
You can use Notepad++ to open a file with different encodings and to convert it to another encoding.
You can use an application such as WinHex or Hex_edit to see the hex code of a text file.
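If no hex editor is at hand, a few lines of Python give a comparable view of the first bytes of a file. This is a minimal sketch: the function name and the `limit` parameter are hypothetical, and the sample file is created only for illustration.

```python
import os
import tempfile


def hex_dump(path: str, limit: int = 32) -> str:
    """Return the first `limit` bytes of a file as space-separated hex."""
    with open(path, "rb") as handle:   # binary mode: no decoding is applied
        data = handle.read(limit)
    return " ".join(f"{b:02X}" for b in data)


# Create a small UTF-8 sample file starting with a Cyrillic letter.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write("\u0421AP".encode("utf-8"))
    sample_path = tmp.name

dump = hex_dump(sample_path)
print(dump)
os.remove(sample_path)
```

Opening the file in binary mode matters here, because text mode would already apply an encoding and hide the raw bytes.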
Common Encoding/Code Pages
- 1100 for ISO-8859-1, which is similar to WINDOWS-1252/ANSI
- 4110 for UTF-8; the optional BOM (Byte Order Mark) for UTF-8 is EF BB BF
- 4102 for UTF-16BE
- 4103 for UTF-16LE
- 8400 for GB2312
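To compare these code pages outside of SAP, the numbers can be mapped to Python codec names. The mapping below is my own assumption of the closest Python codecs, not an official SAP table, and the helper names are hypothetical.

```python
import codecs

# Assumed mapping from SAP code page numbers to Python codec names.
SAP_CODE_PAGES = {
    "1100": "iso-8859-1",
    "4110": "utf-8",
    "4102": "utf-16-be",
    "4103": "utf-16-le",
    "8400": "gb2312",
}


def encode_for_code_page(text: str, code_page: str) -> bytes:
    """Encode text with the Python codec assumed for the given SAP code page."""
    return text.encode(SAP_CODE_PAGES[code_page])


def has_utf8_bom(data: bytes) -> bool:
    """Check whether the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# The same character produces different bytes under each Unicode code page.
for code_page in ("4110", "4102", "4103"):
    encoded = encode_for_code_page("\u0421", code_page)
    print(code_page, " ".join(f"{b:02X}" for b in encoded))

print(has_utf8_bom(b"\xef\xbb\xbfabc"))
```

When reading a UTF-8 file that may start with a BOM, Python's `utf-8-sig` codec strips the BOM automatically.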
One encoding issue in my work
- Issue: Below is an encoding issue I came across in my work. In the SAP ECC system, the character is displayed as С, but in the destination application it is displayed as Ð ¡ and cannot be recognized.
- Analysis: The special character CYRILLIC CAPITAL LETTER ES, which looks exactly the same as LATIN CAPITAL LETTER C, is used in SAP ECC. In debugging mode, the hex code is shown as 2104 because the SAP ECC system uses Unicode little endian. On the site, we looked up the hex code 0421, which is Unicode big endian, and found that the UTF-8 bytes for this character are D0 A1 and that UTF-8 bytes as Latin-1 characters bytes shows Ð ¡.
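The analysis can be reproduced in Python. This is a sketch with hypothetical helper names: it shows the little-endian hex code seen in the debugger, reverses the misreading, and lists non-ASCII characters in a string so that look-alike letters can be spotted. The repair step only works if the garbled text still contains the original bytes unchanged.

```python
import unicodedata


def utf16le_hex(ch: str) -> str:
    """Hex code of a character as stored in UTF-16 little endian."""
    return ch.encode("utf-16-le").hex().upper()


def repair_mojibake(garbled: str) -> str:
    """Undo a UTF-8 text that was wrongly decoded as ISO-8859-1."""
    return garbled.encode("iso-8859-1").decode("utf-8")


def find_non_ascii(text: str) -> list:
    """List (position, hex code, Unicode name) for every non-ASCII character."""
    return [
        (index, f"{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for index, ch in enumerate(text)
        if ord(ch) > 127
    ]


print(utf16le_hex("\u0421"))                       # value seen in the ABAP debugger
print(repair_mojibake("\u00d0\u00a1") == "\u0421")  # Ð¡ turns back into С
print(find_non_ascii("\u0421ode"))                 # looks like "Code", starts with Cyrillic ES
```

A check like `find_non_ascii` on the source data helps find look-alike characters before they reach a destination application that only supports ISO-8859-1.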
References
Check hex code and UTF-8 bytes for one character
SAP note 552464 - What is Big Endian / Little Endian? What Endian do I have?