Problem with encoding in Excel file reader for Java

I have a fairly common problem. I have a Java program that imports an Excel spreadsheet, and one of the columns contains words with accents, cedillas, etc.

When I read the value into a variable, each of these special characters shows up as a black diamond. I have tried several solutions: Normalizer, getBytes() with every encoding I could think of, and something like this:

WorkbookSettings ws = new WorkbookSettings();  
ws.setEncoding("Cp1252");

But nothing worked :(

The main code (to understand the problem) is:

Workbook workbook = Workbook.getWorkbook(new File(diretorio, v_arquivo));
Sheet sheet = workbook.getSheet(0);
Cell[] celula;

// Start at row 1 to skip the header row.
for (int i = 1; i < sheet.getRows(); i++) {
    celula = sheet.getRow(i);
    // Column index 7 is read below, so the row needs at least 8 cells.
    if (celula.length > 7) {
        evento = celula[7].getContents().trim();
    }
}

And my evento string comes out as, for example, lacta??o. Thanks for your attention.
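
For reference, this symptom is what Java typically produces when bytes are decoded with a charset other than the one they were written in: each undecodable byte becomes U+FFFD, the replacement character, which many fonts draw as a black diamond. A minimal sketch, assuming (this was not confirmed for the actual file) that the text was stored as windows-1252 and decoded as UTF-8; the class name DecodeMismatchDemo and its decode helper are made up for illustration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeMismatchDemo {
    // "lactação", written with Unicode escapes so the encoding of this
    // source file does not matter.
    static final String WORD = "lacta\u00E7\u00E3o";

    // Hypothetical helper: decodes raw bytes with the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");

        // Assumption: the text was stored as windows-1252 bytes.
        byte[] stored = WORD.getBytes(cp1252);

        // Decoding with the wrong charset: the bytes for "ç" and "ã" are not
        // valid UTF-8, so they are replaced with U+FFFD.
        String wrong = decode(stored, StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in restores the word.
        String right = decode(stored, cp1252);

        System.out.println("wrong charset contains U+FFFD: " + wrong.contains("\uFFFD"));
        System.out.println("right charset round-trips:     " + right.equals(WORD));
    }
}
```

Once the replacement character is in the string, the original byte is gone, which would explain why calling getBytes() on the already-read value with different encodings cannot bring the accents back.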

P.S.: I'm new to the forum and still learning the formatting, so sorry for any mistakes.

  • POI should handle this automatically. How was the XLS file generated? Could the file itself be wrong? Can you share an Excel file that doesn't work?

  • The XLS file was generated from test data in a database (when pasting the data I even went through Notepad to strip any formatting coming from the database itself). I'll see if I can generate a file that can be shared.

1 answer

Nobody answered, but in case anyone runs into this problem in the future, I'll leave what I found here (I managed to work around it, in a way). My actual issue was that the value was used in an SQL query and the broken string did not match the rows I was selecting from the database, so I used the following function:

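import java.text.Normalizer;

// Decomposes the string (NFD), then replaces every non-ASCII character with "%".
public static String formatString(String s) {
    String temp = Normalizer.normalize(s, Normalizer.Form.NFD);
    return temp.replaceAll("[^\\p{ASCII}]", "%");
}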

What it does: only the special characters are replaced by "%". Since "%" is the wildcard in an SQL LIKE comparison, the word is then found in the table as expected. But as I said, this is just a workaround for the problem, not a real solution :x
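
To make the effect of the function concrete, here is a small sketch of what it returns for two inputs. The class name FormatStringDemo and the sample words are assumptions for illustration. When the string already contains replacement characters (as in the question), the NFD step changes nothing and each broken character simply becomes "%"; for a correctly decoded word, NFD keeps the base letter and only the combining accent becomes "%".

```java
import java.text.Normalizer;

class FormatStringDemo {
    // Same logic as the function above: decompose, then turn every
    // non-ASCII character into the SQL wildcard "%".
    static String formatString(String s) {
        String temp = Normalizer.normalize(s, Normalizer.Form.NFD);
        return temp.replaceAll("[^\\p{ASCII}]", "%");
    }

    public static void main(String[] args) {
        // The value as read in the question: two U+FFFD replacement characters.
        String broken = "lacta\uFFFD\uFFFDo";
        System.out.println(formatString(broken));   // prints lacta%%o

        // A correctly decoded "lactação" (escapes keep the source ASCII-only).
        String correct = "lacta\u00E7\u00E3o";
        System.out.println(formatString(correct));  // prints lactac%a%o
    }
}
```

The second result shows a limit of the approach: a pattern such as lactac%a%o requires a plain "c" before the wildcard, so whether it matches a stored "lactação" depends on how the database collation treats accents.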
