Hi,
I’m trying to decode (convert to UTF-8) some strings from medical data
that, in some cases, contain escape sequences.
I know the source encoding of these strings, but only in ‘ISO number’
form.
In some cases I have found a candidate encoding, but I need some help
mapping each ‘ISO number’ to the corresponding Ruby
‘encoding name’ so that I can convert the source properly.
The specific encodings used are:
- Japanese (tagged as ISO_IR 13), which includes:
  ISO-IR 13: G1 code element => JIS X 0201: Katakana
  ISO-IR 14: G0 code element => JIS X 0201: Romaji
- Thai (tagged as ISO_IR 166), which includes:
  ISO-IR 166: G1 code element (TIS 620-2533 (1990))
  ISO-IR 6: G0 code element (ISO 646)
- Japanese (tagged as ISO 2022 IR 13), which includes:
  ISO-IR 13: G1 code element => JIS X 0201: Katakana (ESC 02/09 04/09)
  ISO-IR 14: G0 code element => JIS X 0201: Romaji (ESC 02/08 04/10)
- Thai (tagged as ISO 2022 IR 166), which includes:
  ISO-IR 166: G1 code element (TIS 620-2533 (1990)) (ESC 02/13 05/04)
  ISO-IR 6: G0 code element (ISO 646) (ESC 02/08 04/02)
- Japanese (tagged as ISO 2022 IR 87), which includes:
  ISO-IR 87: G0 code element => JIS X 0208: Kanji (ESC 02/04 04/02)
- Japanese (tagged as ISO 2022 IR 159), which includes:
  ISO-IR 159: G0 code element => JIS X 0212: Supplementary Kanji set
  (ESC 02/04 02/08 04/04)
- Korean (tagged as ISO 2022 IR 149), which includes:
  ISO-IR 149: G1 code element => KS X 1001: Hangul and Hanja
  (ESC 02/04 02/09 04/03)
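
The escape sequences above are written in ISO 2022 column/row notation, where each "cc/rr" pair stands for one byte with the value column * 16 + row. A minimal sketch of turning that notation into the raw bytes to look for in the data (the helper name `escape_bytes` is only illustrative, not part of any library):

```ruby
# Convert ISO 2022 column/row notation (e.g. "02/08 04/02") into the raw
# escape sequence. Each "cc/rr" pair is one byte: column * 16 + row.
# `escape_bytes` is a hypothetical helper name used for illustration only.
def escape_bytes(notation)
  bytes = notation.split.map do |pair|
    column, row = pair.split("/").map(&:to_i)
    (column * 16) + row
  end
  # Prepend the ESC byte (0x1B); the result is a binary (ASCII-8BIT) string.
  ([0x1B] + bytes).pack("C*")
end

p escape_bytes("02/08 04/02")        # ISO-IR 6
p escape_bytes("02/13 05/04")        # ISO-IR 166
p escape_bytes("02/04 02/09 04/03")  # ISO-IR 149
```

For example, 02/08 04/02 becomes the bytes 0x28 0x42, i.e. ESC followed by "(B".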
The encodings I assume, in the same order, are:
- Shift_JIS
- TIS-620
- G0: Shift_JIS - G1: Shift_JIS
- G0: US-ASCII - G1: TIS-620
- G0: ISO-2022-JP
- G0: ISO-2022-JP
- G1: CP949
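
A minimal sketch of how these guesses could be tried out in Ruby, using only `String#force_encoding` and `String#encode`. The table simply restates the guesses above and is not a verified mapping; the names `GUESSED_ENCODINGS` and `to_utf8` are hypothetical. It collapses the G0/G1 pairs into a single encoding name, does not strip or interpret escape sequences for the single-byte and Korean sets, and does not check whether Ruby's ISO-2022-JP converter accepts the JIS X 0212 escape:

```ruby
# Hypothetical lookup table: tag => Ruby encoding name.
# These are the guesses from the list above, NOT a confirmed mapping.
GUESSED_ENCODINGS = {
  "ISO_IR 13"       => "Shift_JIS",
  "ISO_IR 166"      => "TIS-620",
  "ISO 2022 IR 13"  => "Shift_JIS",
  "ISO 2022 IR 166" => "TIS-620",     # assumes TIS-620 also covers the G0 (ASCII) half
  "ISO 2022 IR 87"  => "ISO-2022-JP",
  "ISO 2022 IR 159" => "ISO-2022-JP", # JIS X 0212 support is unverified here
  "ISO 2022 IR 149" => "CP949",
}.freeze

# Relabel the raw bytes with the guessed encoding, then transcode to UTF-8.
# A wrong guess shows up as an Encoding::InvalidByteSequenceError or
# Encoding::UndefinedConversionError instead of silently garbled text.
def to_utf8(raw, tag)
  name = GUESSED_ENCODINGS.fetch(tag)
  raw.dup.force_encoding(name).encode("UTF-8")
end

p to_utf8("\xB1\xB2".b, "ISO_IR 13")             # two half-width Katakana bytes
p to_utf8("\xA1\xA2".b, "ISO_IR 166")            # two Thai bytes
p to_utf8("\e$B4A;z\e(B".b, "ISO 2022 IR 87")    # Kanji between escape sequences
```

Letting `encode` raise on invalid input, rather than passing `invalid: :replace`, makes a wrong guess visible immediately.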
Is this mapping correct?
Thanks in advance
Enrico