Changing the Database Character Set with Export/Import Utilities

Recently there was a thread on the Oracle Forums about changing the database character set from WE8ISO8859P1 to AL32UTF8. The user wanted to upgrade his database from 9i to 10g and change the database character set as well. He was planning to use either the Export/Import utilities or the Database Upgrade Assistant for this. The question drew only a few responses (NLS being a dry topic for most DBAs 🙂 )

To keep things simple, we will limit the problem to changing the database character set only. Export/Import is considered a straightforward approach, but it should not be the first step.

As per the Globalization Support Guide, there are two steps to changing the character set:

1) Data Scanning

2) Data Conversion

Export/Import comes under the second step, i.e. Data Conversion. Data Scanning is a very important step for any character set change and should not be skipped. Skipping it can result in data loss or data corruption.

If you are planning to use Export/Import, the approach should be to run csscan first.

The csscan utility helps to identify whether data is Changeless (data is stored the same way in the new character set), Convertible (data has different code points in the new character set and needs to be converted), Truncation (converted data would no longer fit in the column and would be truncated, so the column needs to be modified) or Lossy (data is not understood by the new character set and will be lost on conversion).
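As a rough illustration of these four categories, here is a minimal Python sketch. It is not how csscan is implemented. It assumes that Python's `latin-1` and `utf-8` codecs are acceptable stand-ins for WE8ISO8859P1 and AL32UTF8, and the function name `classify` and the `max_bytes` parameter (the column width in bytes) are hypothetical.

```python
def classify(raw, src="latin-1", dst="utf-8", max_bytes=None):
    """Classify stored bytes roughly the way a character set scan would.

    raw       -- bytes as currently stored in the column
    src, dst  -- Python codec names standing in for the Oracle
                 character sets (an assumption, not an exact mapping)
    max_bytes -- column width in bytes, or None to skip the size check
    """
    try:
        # Interpret the bytes in the source character set,
        # then re-encode them in the target character set.
        converted = raw.decode(src).encode(dst)
    except UnicodeError:
        # The bytes are invalid in the source, or have no
        # representation in the target: they would be lost.
        return "Lossy"
    if max_bytes is not None and len(converted) > max_bytes:
        # The converted value no longer fits in the column.
        return "Truncation"
    if converted == raw:
        # Same bytes before and after: nothing to convert.
        return "Changeless"
    return "Convertible"


print(classify(b"Oracle"))                 # plain ASCII text
print(classify(b"caf\xe9"))                # e-acute is 1 byte in latin-1, 2 in utf-8
print(classify(b"caf\xe9", max_bytes=4))   # 5 bytes after conversion, column holds 4
print(classify(b"caf\xe9", src="ascii"))   # 8-bit byte stored in a 7-bit character set
```

The last call switches the source codec to `ascii` only to provoke the Lossy case: the stored byte is not valid in the declared source character set, so there is nothing reliable to convert from.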

If there is only Changeless and Convertible data, you can go ahead and use Export/Import.

If there is only Changeless data, you simply need to use the “ALTER DATABASE CHARACTER SET” statement or csalter (in 10g) to change the character set.

If you have Lossy data, you need to identify what that data is, correct it, and only then change the character set.
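The decision rules above can be summarised in a small Python sketch. The function name `choose_approach` and the returned strings are hypothetical and only restate the rules from this post. Treating Truncation as "fix the columns first" is an assumption based on the category description.

```python
def choose_approach(categories):
    """Map the categories reported by a scan to the next step.

    categories -- iterable of category names found in the scan report,
                  e.g. ["Changeless", "Convertible"]
    """
    found = set(categories)
    if "Lossy" in found:
        # Lossy data must be identified and corrected before any conversion.
        return "correct lossy data first"
    if "Truncation" in found:
        # Assumption: affected columns are modified before converting.
        return "modify columns first"
    if found <= {"Changeless"}:
        # Nothing needs converting, so only the character set
        # declaration has to change.
        return "alter database or csalter"
    # Only Changeless and Convertible data remain.
    return "export/import"


print(choose_approach(["Changeless", "Convertible"]))
print(choose_approach(["Changeless"]))
print(choose_approach(["Changeless", "Lossy"]))
```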

You should read the following Metalink notes and documentation before attempting to change the database character set:

Oracle® Database Globalization Support Guide, 10g Release 2 (10.2), Part Number B14225-02

Note 227338.1 – Character Set Scanner – Frequently Asked Questions

Note 257736.1 – Changing the Database Character Set – an extended overview

Note 260192.1 – Changing WE8ISO8859P1/WE8ISO8859P15 or WE8MSWIN1252 to (AL32)UTF8