Friday 5 December 2014

Multi-Byte Character Issue and Resolution



Issue: The MBNA data warehouse and the target flat files could not retain Unicode characters, so we were asked to substitute them with plain text. A good example is the French trademark mark ᴹᴰ, as in MasterCardᴹᴰ: it is made up of two characters and takes 4 bytes (2 + 2).
Solution: We first tried to replace these characters in the transformer, but transformer functions such as trim, convert and replace could not identify them because of the extended encoding.
We therefore performed the conversion at the SQL query level, as shown below.
SELECT
  CASE
    WHEN INSTR(CPN.SHORT_NAME, UNISTR('\1D39\1D30'), 1, 1) > 0
    THEN REPLACE(CPN.SHORT_NAME, UNISTR('\1D39\1D30'), 'MD')
    ELSE CPN.SHORT_NAME
  END AS SHORT_NAME
FROM
  CREDIT_PRODUCT_NAME CPN
We used the hexadecimal code points of the superscript M and D, 1D39 and 1D30, so that unistr() can interpret them and convert them into the corresponding Unicode characters internally.
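To confirm that these two code points are really what the data contains, a quick check against DUAL can help. This is only a sketch: it assumes an Oracle database, and the byte count it reports depends on the national character set, so no fixed value is shown for it.

```sql
-- Build the two-character marker from its code points and inspect it.
-- ASCIISTR escapes non-ASCII characters back to \XXXX form, which makes
-- it easy to see which code points a value really holds.
SELECT UNISTR('\1D39\1D30')           AS marker,
       ASCIISTR(UNISTR('\1D39\1D30')) AS escaped,     -- \1D39\1D30
       LENGTH(UNISTR('\1D39\1D30'))   AS char_count,  -- 2 characters
       LENGTHB(UNISTR('\1D39\1D30'))  AS byte_count   -- depends on the character set
FROM   DUAL;
```

Running ASCIISTR(CPN.SHORT_NAME) on a few affected rows in the same way should show whether the stored characters match \1D39\1D30 or are some other look-alike.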
The query checks whether this character sequence exists in the data by testing that the position returned by INSTR is greater than 0; if it does, the sequence is replaced with the plain English text 'MD'.
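As a possible simplification, REPLACE returns its input unchanged when the search string is not found, so the CASE/INSTR guard is not strictly required. The sketch below uses a made-up sample value on DUAL rather than the real table, purely for illustration.

```sql
-- Hypothetical sample value: 'MasterCard' followed by superscript M and D.
-- REPLACE alone swaps the sequence for plain 'MD'; a value without the
-- sequence would pass through unchanged.
SELECT REPLACE('MasterCard' || UNISTR('\1D39\1D30'),
               UNISTR('\1D39\1D30'),
               'MD') AS short_name   -- MasterCardMD
FROM   DUAL;
```

Against the real table, the same idea would be REPLACE(CPN.SHORT_NAME, UNISTR('\1D39\1D30'), 'MD') AS SHORT_NAME. Keeping the CASE does no harm if the explicit check is preferred for readability.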

1 comment:

  1. Can we automate the identification of message handlers in a particular job, for example checking whether a particular job contains a message handler by using a script or another job instead of clicking through and checking manually?
