Character 'º' is accepted as UTF8 in an INSERT, but not when I use COPY

Asked 5 years, 9 months ago

I'm trying to migrate data from a Firebird database to PostgreSQL 12.0.

To generate a .csv file containing the table records from Firebird, I use FBExport:

./fbexport -Sc -D /opt/firebird/bin/measures.fdb -H localhost -U sysdba -P masterkey -F /home/dani/Documents/raw_cfgpar.out -Q "SELECT * FROM CFGPAR"

I created the database passing only its name, so every other parameter took its default value. That includes "encoding", which according to the documentation is inherited from the standard template template1 and should therefore be UTF8.

Create database measures;

The table contains integer and varchar columns.

When I try to load the file records with the COPY command, I get an error in the terminal:

COPY CFGPAR (CODIGO, PARNAME, PARVALUE, VRSCODIGO) from '/home/dani/Documents/raw_cfgpar.out' DELIMITER ',' CSV HEADER;
ERROR:  invalid byte sequence for encoding "UTF8": 0xba

This suggests that the character º is not accepted as UTF8.

But when I enter a record with INSERT from the terminal, the record is saved even though it contains the character º.

INSERT INTO cfgpar (codigo, parname, parvalue, vrscodigo) VALUES (55555, 'ºªã', 'ºº', 666);
INSERT 0 1

measures=# select * from cfgpar;
 codigo | parname | parvalue | vrscodigo
--------+---------+----------+-----------
  66666 | testeãº | teste    |       555
  55555 | ºªã     | ºº       |       666
(2 rows)

Do I need to convert the CSV file generated by FBExport to another format? If so, how do I convert it?

Have you tried making the encoding explicit? I think PostgreSQL may be using a different default encoding. So: COPY CFGPAR (CODIGO, PARNAME, PARVALUE, VRSCODIGO) from '/home/dani/Documents/raw_cfgpar.out' DELIMITER ',' CSV HEADER ENCODING 'UTF8'; https://www.postgresql.org/docs/9.1/sql-copy.html

– Wilson Faustino

2019/10/18 at 12:10
I tried it just now and the error remains the same: invalid byte sequence for encoding "UTF8": 0xba

– dwenzel

2019/10/18 at 12:37
According to the reference below, the byte 0xba is an ISO-8859-1 character, not valid UTF8 on its own; it suggests converting the file before the migration. Reference: https://stackoverflow.com/questions/25599816/unicodedecodeerror-utf8-codec-cant-decode-byte-0xba-in-position-1266-invali

– Wilson Faustino

2019/10/18 at 12:47
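
The conversion suggested in the comment above could be sketched with iconv. This is only a minimal sketch, assuming the file written by FBExport is encoded as ISO-8859-1; the thread does not confirm that (WIN1252 maps 0xba to the same character). The function name latin1_to_utf8 and the output path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: re-encode a file from ISO-8859-1 to UTF-8.
# Assumption: the input really is ISO-8859-1; the thread does not confirm this.
latin1_to_utf8() {
    # $1 = input file, $2 = output file
    iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"
}

# Example call (the output path is hypothetical):
# latin1_to_utf8 /home/dani/Documents/raw_cfgpar.out /home/dani/Documents/raw_cfgpar_utf8.out
```

COPY would then read the converted file instead of the original. Note also that the ENCODING option of COPY names the encoding of the file, not of the database, so under the same assumption ENCODING 'LATIN1' on the original file should let the server do the conversion itself.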
According to the Unicode standard: Unicode code point: U+00BA | 
Character: º | 
UTF-8: c2 ba (hex.) | 
Name: MASCULINE ORDINAL INDICATOR

– anonimo

2019/11/24 at 16:55
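
The byte values quoted above are consistent with the behaviour in the question: the INSERT presumably succeeds because the terminal sends º as the UTF-8 pair c2 ba, while the file holds the lone byte 0xba. A small sketch to check this on sample bytes, assuming iconv and od are available; the helper names hex_bytes and is_valid_utf8 are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting bytes; not part of the original thread.

# Print stdin as a contiguous lowercase hex string.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# Succeed only if stdin is valid UTF-8
# (iconv exits non-zero on an invalid input sequence).
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 > /dev/null 2>&1
}

# The lone byte 0xba (octal 272), as reported in the COPY error:
printf '\272' | is_valid_utf8 || echo "0xba alone is not valid UTF-8"

# The same byte after conversion from ISO-8859-1 (prints c2ba):
printf '\272' | iconv -f ISO-8859-1 -t UTF-8 | hex_bytes
echo
```

Running the same check on the real export file (for example, is_valid_utf8 < /home/dani/Documents/raw_cfgpar.out) would show whether it needs converting at all.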

No answers
