Thursday 24 February 2011

CSV files in MS Excel and Oracle Open Office

My input file is created in MS Excel, it is 1 row of data with 3 columns (in the second column there is a newline character):


When I save this file as .csv in excel I get following structure:
COL1,COL2,COL3
³ó ?????,"2 äöü ??? lines
In 1 field ???",+123456

File is corrupted

When you open this file in Excel no multibytes and even lost "+" in phone number but the new line character properly inserted.



When I save the input Excel file as Unicode Text .txt the Unicode (MS Unicode ucs-2le) all the issues with .csv will be resolved (while saving).
COL1 COL2 COL3

łóś ещьул "2 äöü ыва lines
In 1 field ещь" +123456


Unfortunately while opening such file with Text Import Wizard:





In third step you can change type to Text (to rescue phone number):


After clicking "Finish" the result is dissapointing:

Multibytes and phone number is properly shown, but the newline is broken.

When you change the extension of the previous Unicode File from .txt to .csv and double click on it to open it with Excel you will have new line and multibytes but the phone will be corrupted:


When you try to open the same Unicode .csv file with Oracle Open Office:
File -> Open -> .csv file
In Import Wizard you set character set to Unicode (it's MS Unicode), delimiter to Tab and column types to Text (to rescue phone number):


Success story:



The question is:
"Why excel cannot properly open a CSV file created by itself?"

Or maybe there is a workaround (without modyfing the CSV file) to do so?

Oracle Open Office FTW!!!