Tuesday, July 16, 2013

Handling Special Characters

Introduction to Special Characters
A special character is any character that is not contained in the following list:

  • Digit
  • Letter
  • Extended letter
  • Hex digit
  • Language Specific Character
  • Characters that indicate the end of a line in a file

Digit:
A digit is a character.
Digit: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Letter:
A letter is a character.
Letter: A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z

Extended letter:
An Extended letter is a character.
Extended Letter: # | @ | $

Hex digit:
A Hex digit is a character.
Hex digit: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f

Language Specific Character:
A language-specific character is any letter that occurs in a northern, southern, or central European language and is not contained in the list of letters.
German umlauts: ä, ö, ü
French letters with a grave accent: à, è, ù

Note 1: If you have installed a UNICODE-enabled database, a language-specific character is a character that is not included in the ASCII-Code list from 0 to 127.

Note 2: For the standard special characters that are allowed in BW, refer to SAP Note 173241, “Allowed characters in the BW System”.


Issues due to Special Characters
When data is loaded from flat files or other sources, load errors often occur because the data contains special characters that the BI system does not support. These errors can be resolved in several ways. This document collects the errors caused by special characters, and the ways to fix them, from various threads on sdn.sap.com.

Common errors due to special characters include:
1. Error in SID generation
2. Error when assigning SID: Action VAL_SID_CONVERT table
3. Characteristic 0XXX / ZXXXX contains invalid characters (for example 2Q235-A100[060D, where [ is an invalid character for BI by default)
4. Activation of M records from Data Store object terminated
5. Error stating that #, @, $, %, ! etc. are not allowed while loading data
6. Error

To avoid these errors, one of the following steps has to be taken in the BI system, depending on the business requirement:

  • Including the required special characters
  • Excluding the unwanted special characters

Ways to handle Special Characters:

Special characters can be handled in BW either by including (allowing) them or by excluding them. The sections below explain both approaches in detail.


Including the required special characters

Including new characters globally:
Special characters that should be allowed across the system have to be maintained in transaction RSKC.
The RSKC screen shows the special characters that are already allowed in the BW system.

Note: “&”, “#” and “.” are already included in the system.

To include a new special character, enter it next to the existing characters and execute the transaction.
The characters are then stored in the table RSALLOWEDCHAR.
Note: In this example, “*” is added and the transaction is executed, which returns a success message.

The new special character “*” is now allowed globally in the system.


Including New Characters Locally:

Scenario:
Let us consider that the ZADDRESS field is loaded from the ECC system into a DSO in BW.
A few records might contain special (unique) characters.
These characters are not allowed in BW because they are not included in RSKC, and the business does not allow them to be included in RSKC either, because other fields or targets should not be able to use them.

Analysis:
As the special character comes directly from the source system, the BW system throws an error if the character is not included in RSKC. There is the option of writing a field routine in the transformation. However, even a routine cannot make the system accept the special character: a routine can only ignore or exclude the character, not allow it.

Conclusion:
So, it is not possible to allow that special (unique) character in the system locally.

Allowing Special Characters to be First Character
Generally, special characters are not allowed as the first character of a value. If characters like '#' or '!' have to be present as the first character, ALL_CAPITAL or ALL_CAPITAL_PLUSHEX can be entered in RSKC. If this is required for only one particular Info Object, a field routine can be written in the transformation.

Note: Using ALL_CAPITAL / ALL_CAPITAL_PLUSHEX has a wider impact, because it allows all the characters from the source system into BW. It also does not convert lower case letters into upper case letters.

Excluding the special characters
Enabling lower case letters option
For some Info Objects the “Lower case letters” option is not enabled. However, when data is extracted from the source system or from flat files, there is always a chance of receiving values in lower case.
In this scenario, the “Lower case letters” option can be enabled for the Info Object to avoid load errors in future.

Note: Changing the settings of an Info Object in an existing model affects many other objects, because the Info Object is already used in many places and all of those objects have to be transported again. This can be avoided by converting the lower case letters using Translate, as described in the next section.

Converting lower case letters using Translate
We can write a formula in the Transfer Routine to translate the data into upper case.
In our scenario, let us consider the Info Object “ZREPO_BY”, for which “Lowercase Letters” is not checked / enabled.

Here the source system is R/3, and users enter lower case letters only rarely and by mistake.

To rectify this, we can write a formula in the Transfer Routine of the Info Object, so that any lower case letters are converted into upper case letters while the data is loaded to the target.
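
The conversion itself can be as small as two statements. The following is a minimal sketch written as the body of a BW 7.x field routine; the source field name REPO_BY is an assumption made for illustration, and in a 3.x transfer routine the source structure would be TRAN_STRUCTURE rather than SOURCE_FIELDS.

```abap
* Sketch only: SOURCE_FIELDS-repo_by is an assumed source field name.
* Copy the incoming value and convert any lower case letters to upper
* case before the value reaches ZREPO_BY.
RESULT = SOURCE_FIELDS-repo_by.
TRANSLATE RESULT TO UPPER CASE.
```

Because the conversion happens in the routine, the Info Object definition itself does not have to be changed or transported again.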

Removing unexpected special characters:
Even if many special characters are included and the data is manipulated during loading, unexpected characters can still arrive from the source data.
These can be removed in the following ways:
  • Correcting the data in PSA before updating it into the target
  • Correcting through standard functions

Correction in PSA:
In general, the system throws the error only while updating the data to the target, whereas the PSA holds the data in the same structure as the source system. Once the error is thrown, the data can therefore be corrected directly in the PSA, and the corrected data can then be loaded into the higher level targets.

Correcting through standard functions:
Using Function Module:
The Function Module 'SCP_REPLACE_STRANGE_CHARS' can be used in a field routine or in the start routine to replace unwanted characters, either in a specified field or in all fields.

Code Sample:
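* Replace accented and other "strange" characters in the source field.
* replacement = 46 is the character code of "."
CALL FUNCTION 'SCP_REPLACE_STRANGE_CHARS'
  EXPORTING
    intext            = SOURCE_FIELDS-*Infoobject name*
    intext_lg         = 0
    inter_cp          = '0000'
    inter_base_cp     = '0000'
    in_cp             = '0000'
    replacement       = 46
  IMPORTING
    outtext           = RESULT
  EXCEPTIONS
    invalid_codepage  = 1
    codepage_mismatch = 2
    internal_error    = 3
    cannot_convert    = 4
    fields_not_type_c = 5
    OTHERS            = 6.
IF sy-subrc <> 0.
* Conversion failed: pass the source value through unchanged
  RESULT = SOURCE_FIELDS-*Infoobject name*.
ENDIF.
* Convert the cleaned value to upper case
TRANSLATE RESULT TO UPPER CASE.
Note: *Infoobject name* stands for the name of the Info Object whose values have to be replaced.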
Ex:
From : EXTRAÑO ÉTRANGE NÃO OÙ ÖFFNUNLAD JUN-MAY 
To : EXTRANO ETRANGE NAO OU OeFFNUNLAD JUN-MAY

The Function Module 'RSKC_CHAVL_CHECK' can be used to check for characters that are not allowed in BW. The ABAP REPLACE statement can then be used to replace each of them with a valid character.

Ex:
REPLACE ALL OCCURRENCES OF 'X' IN i_string WITH 'Y'.
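
Where the set of acceptable characters is known in advance, the replacement step can be approximated with one regular expression instead of one REPLACE per character. The sketch below is written as a field routine body and rests on several assumptions: a release that supports REGEX in REPLACE (SAP NetWeaver 7.0 or later), a hypothetical source field ZADDRESS, and an illustrative allowed set (upper case letters, digits, blank, “&”, “#” and “.”) that would have to match what RSKC actually permits in the target system.

```abap
* Sketch only: the field name and the allowed set are assumptions.
DATA lv_value TYPE c LENGTH 60.

lv_value = SOURCE_FIELDS-zaddress.
TRANSLATE lv_value TO UPPER CASE.

* Replace every character outside the allowed set with a blank.
* The replacement is a string literal (back quotes) so that the blank
* is kept; a trailing blank in a text field literal would be ignored.
REPLACE ALL OCCURRENCES OF REGEX '[^A-Z0-9 &#.]' IN lv_value WITH ` `.

RESULT = lv_value.
```

Replacing with a blank rather than deleting keeps the remaining characters in their original positions, which makes a cleaned value easier to compare with the source record.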

Removing Unexpected Characters while Loading from Flat File:
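While loading data from a flat file, the chances of getting non-printable characters are higher. To avoid this, a UNIX command can be run on the data file before the load. The command below replaces the control characters in the octal ranges 001-011, 013-037 and 177-237 with blanks and leaves the line feed (012) untouched. Here filename.bak is the original file and filename is an assumed name for the cleaned output.

Command: tr '\001-\011\013-\037\177-\237' '[ *]' < filename.bak > filename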

This SDN article also describes how to remove invalid characters while loading from a flat file source:
[http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/50db4398-2dea-2b10-1fab-e3195bb311dc?QuickLink=index&overridelayout=true]

Note: Invisible special characters such as TAB, CR and BACKSPACE have to be eliminated with ABAP routines in the transfer rules.
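
One way to write such a routine is to replace each invisible character explicitly, using the constants of the standard class CL_ABAP_CHAR_UTILITIES. This is a minimal sketch written as a field routine body; the source field name ZADDRESS is an assumption, and the list of characters would need to be extended for other control characters.

```abap
* Sketch only: SOURCE_FIELDS-zaddress is an assumed source field name.
RESULT = SOURCE_FIELDS-zaddress.

* Replace tab, line feed, carriage return and backspace with a blank.
REPLACE ALL OCCURRENCES OF cl_abap_char_utilities=>horizontal_tab
        IN RESULT WITH ` `.
REPLACE ALL OCCURRENCES OF cl_abap_char_utilities=>newline
        IN RESULT WITH ` `.
REPLACE ALL OCCURRENCES OF cl_abap_char_utilities=>cr_lf(1)
        IN RESULT WITH ` `.
REPLACE ALL OCCURRENCES OF cl_abap_char_utilities=>backspace
        IN RESULT WITH ` `.
```

The constant CR_LF holds two characters, so CR_LF(1) is used to address the carriage return on its own.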

Special Currency Indicators:

The symbols for the US Dollar ($), British Pound (£), Japanese Yen (¥) and the Euro (€) are not allowed by default. The symbol for the US Dollar ($) can be allowed in BW customizing. The other special currency indicators lead to errors in the system.
In the Info Object maintenance screens or in the transfer rules, you need to create a transfer routine for any currency fields and other characteristic values that have special currency indicators. This transfer routine converts the invalid characters into valid characters or character sets. This conversion is mandatory for currency fields.
The currency codes, into which you need to change the currencies, are stored in BW customizing under General Settings -> Currencies -> Check Currency Codes (or ISO4217). The key must agree with the keys in the currency tables. Most of the special currency indicators can be assigned to three-character currency codes.
For example, the $ dollar symbol is converted into USD for US dollars or AUD for Australian dollars.
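
Such a transfer routine can be sketched as a simple mapping from symbol to ISO 4217 code. The source field name CURR_SYMBOL is hypothetical, and mapping $ to USD is a business assumption, since the same symbol could equally stand for AUD.

```abap
* Sketch only: SOURCE_FIELDS-curr_symbol is an assumed source field, and
* mapping $ to USD is a business assumption (it could equally be AUD).
CASE SOURCE_FIELDS-curr_symbol.
  WHEN '$'.
    RESULT = 'USD'.
  WHEN '£'.
    RESULT = 'GBP'.
  WHEN '¥'.
    RESULT = 'JPY'.
  WHEN '€'.
    RESULT = 'EUR'.
  WHEN OTHERS.
*   Assume the value is already a three-character currency code
    RESULT = SOURCE_FIELDS-curr_symbol.
ENDCASE.
```

The resulting codes still have to agree with the keys in the currency tables mentioned above.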

