Problem
When using Genie in Databricks to extract SQL query results and download them as CSV files, you notice Korean characters appear broken or garbled when the file is opened in Microsoft Excel.
Cause
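The CSV files you download from Genie are encoded in UTF-8 but do not include a BOM (Byte Order Mark) at the start of the file. Without a BOM, Excel does not detect UTF-8 when you open the file directly and instead falls back to a different default encoding (typically the system's legacy code page), so the Korean characters are misinterpreted.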
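To confirm that a missing BOM is the cause, you can inspect the first three bytes of the downloaded file. The following is a minimal sketch, not part of Genie or Excel; the function name `has_utf8_bom` is a hypothetical helper introduced here for illustration.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes (EF BB BF)."""
    with open(path, "rb") as f:
        # codecs.BOM_UTF8 is the three-byte sequence b"\xef\xbb\xbf".
        return f.read(3) == codecs.BOM_UTF8
```

If the function returns False for the file you downloaded, the file has no BOM, which matches the behavior described above.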
Solution
Instead of opening the CSV file directly (for example, by double-clicking it), import it from within Excel so you can set the encoding explicitly.
- Launch Excel.
- Navigate to the Data tab.
- Click on From Text/CSV.
- Select the CSV file you downloaded from Genie.
- In the data preview window, under File Origin, select 65001: Unicode (UTF-8).
- Ensure the delimiter is correctly identified (usually a comma).
- Click Load to import the data.
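If you need to open many files, or share them with people who open CSV files by double-clicking, one possible alternative is to rewrite the file with a BOM before opening it. The sketch below assumes the downloaded file is valid UTF-8 and relies on Excel typically recognizing a BOM-prefixed UTF-8 file; `add_utf8_bom` and the file names in the usage note are hypothetical and not part of Genie.

```python
def add_utf8_bom(src_path, dst_path):
    """Copy a UTF-8 CSV to dst_path with a UTF-8 BOM at the start."""
    # Reading with "utf-8-sig" strips an existing BOM, so it is never doubled.
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    # Writing with "utf-8-sig" emits the BOM bytes (EF BB BF) before the content.
    with open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        dst.write(text)
```

For example, `add_utf8_bom("genie_export.csv", "genie_export_bom.csv")` writes a copy that differs from the original only by the leading BOM.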