I am facing a somewhat strange problem. I have a MySQL 5.6 database with a table that has a longblob field storing HTML text in compressed (ZIP) form. When my website makes a request to the backend (I use Spring Boot 2.3.3, JPA and Java 11.0.9 Amazon Corretto), the backend fetches this database record, decompresses it and returns only the HTML text to my website.
This works perfectly on my local machine, but on the server, with the same version of Java, it does not: the backend converts accented characters to "strange characters".
This is an example of the text saved in the database:
<p style="text-indent:0pt;margin-top:0pt;margin-bottom:0pt;"><span style="color:#000000;font-weight:bold;"> SAIBAM</span><span style="color:#000000;"> quantos a presente </span><span style="color:#000000;font-weight:bold;">Escritura Pública de Cessão e Transferência de Posse</span><span style="color:#000000;"> virem que, sendo aos æData_lav1>, neste Distrito de Itaió, 2º do município e comarca de Itaiópolis, Estado de Santa Catarina, neste Ofício de Notas, sito às margens da Rodovia SC 477, s/n, perante mim, Tabelião de Notas, partes entre si, justas e contratadas a saber</span></p>
This is the result obtained on the server:
It can be verified that all accented letters have been converted to "strange characters".
I have tried changing the database connection to UTF-8, but it did not help:
connection...&useUnicode=yes&characterEncoding=UTF-8
spring.jpa.properties.hibernate.connection.characterEncoding=utf-8
spring.jpa.properties.hibernate.connection.CharSet=utf-8
spring.jpa.properties.hibernate.connection.useUnicode=true
This is the method that decompresses the data:
public String convertToEntityAttribute(byte[] compactado) {
    if (compactado == null) {
        return "";
    }
    final int BUFFER_SIZE = 1024;
    try {
        ByteArrayInputStream is = new ByteArrayInputStream(compactado);
        GZIPInputStream gis = new GZIPInputStream(is, BUFFER_SIZE);
        StringBuilder builder = new StringBuilder();
        byte[] data = new byte[BUFFER_SIZE];
        int bytesRead;
        while ((bytesRead = gis.read(data)) != -1) {
            builder.append(new String(data, 0, bytesRead, Charset.defaultCharset()));
        }
        gis.close();
        is.close();
        return builder.toString();
    } catch (IOException e) {
        return "";
    }
}
Does anyone have any idea what it might be?
Have you checked whether the server is sending the correct character encoding headers? It needs to tell the browser that the content is UTF-8. Judging by your screenshot, the browser does not know this and is trying to display the text as if it were Latin-1.
– bfavaretto
Hello @bfavaretto, I don't think that is the problem: I logged the result of builder.toString() just before the return in my decompression method, and the text already contains the strange characters at that point.
– Everton
Charset.defaultCharset(). Have you tried StandardCharsets.UTF_8? You seem to be setting UTF-8; if the server has another default, e.g. CP-1252, the conversion process will fail.
– Anthony Accioly
Hello @Anthonyaccioly, that was exactly it. I don't know how I hadn't thought of it before, haha. Thank you so much!
– Everton
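For anyone landing here with the same symptom, here is a minimal sketch of the fix discussed in the comments above, assuming the HTML was encoded as UTF-8 before it was compressed. It decodes with StandardCharsets.UTF_8 instead of Charset.defaultCharset(), and it decodes only once, after all bytes have been decompressed. Decoding each 1024-byte chunk separately, as the method in the question does, can also split a multi-byte UTF-8 character across two chunks. The class name GzipHtmlConverter and the compress helper are hypothetical and exist only to make the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical class name, used only for this illustration.
final class GzipHtmlConverter {

    // Decompresses GZIP data and decodes it as UTF-8, regardless of the
    // JVM's default charset (assumption: the text was stored as UTF-8).
    static String convertToEntityAttribute(byte[] compactado) {
        if (compactado == null) {
            return "";
        }
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compactado))) {
            // readAllBytes() is available since Java 9; decoding the whole
            // array at once avoids cutting a multi-byte character in half.
            byte[] raw = gis.readAllBytes();
            return new String(raw, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Same behavior as the method in the question: empty string on failure.
            return "";
        }
    }

    // Hypothetical helper that mimics how the record might have been stored.
    static byte[] compress(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the sample independent of the source file encoding.
        String original = "Escritura P\u00fablica de Cess\u00e3o e Transfer\u00eancia de Posse";
        String restored = convertToEntityAttribute(compress(original));
        System.out.println(restored.equals(original)); // prints "true"
    }
}
```

Because the charset is named explicitly, the result should no longer depend on whether the machine defaults to UTF-8 (as the local machine apparently did) or to something like CP-1252 (as the server possibly did).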